nonforking process copying

Discussion:

LRN

2011-10-07 05:29:14 UTC

I've been wondering about the way cygwin forks W32 processes.

One of the things W32 seems unable to do is let you delete an
executable file that is being executed (AFAIU, there's a file handle
opened somewhere within the W32 subsystem, preventing that).

One of the most prominent self-deleting techniques is to run another
process, have it wait for the main process, and once the main process
terminates, delete the main process's executable and then terminate
itself (special delete-on-close behaviour is used to remove the
executable of the second process). This is, again, due to the fact
that the main process can't delete its own executable while it is running.

When fork() runs, it usually creates the forked process based on the
same executable image file, which is (usually) replaced later by the
contents of another executable file (using exec()).

However, AFAICS, nothing really prevents me from creating a fork()
variant that creates a new process using a *copy* of the original
executable image file, and then closing the original. This way it
should be possible to release the handle the W32 subsystem keeps on the
original image file, so that it can be removed or replaced.

However we all remember the base addresses problem that arises when
forking in cygwin (and maybe other problems that I am not aware of),
so maybe forking is not a viable solution here (assuming that
arbitrary non-cygwin processes are going to be "moved" this way)? Are
matching base addresses for DLLs simply one of the POSIX requirements,
or is it a practical requirement (i.e. things break if this
requirement is not fulfilled)?

Corinna Vinschen

2011-10-07 08:30:07 UTC

Post by LRN
I've been wondering about the way cygwin forks W32 processes.
One of the things W32 seems unable to do is let you delete an
executable file that is being executed (AFAIU, there's a file handle
opened somewhere within the W32 subsystem, preventing that).
One of the most prominent self-deleting techniques is to run another
process, have it wait for the main process, and once the main process
terminates, delete the main process's executable and then terminate
itself (special delete-on-close behaviour is used to remove the
executable of the second process). This is, again, due to the fact
that the main process can't delete its own executable while it is running.
When fork() runs, it usually creates the forked process based on the
same executable image file, which is (usually) replaced later by the
contents of another executable file (using exec()).
However, AFAICS, nothing really prevents me from creating a fork()
variant that creates a new process using a *copy* of the original
executable image file, and then closing the original. This way it
should be possible to release the handle the W32 subsystem keeps on the
original image file, so that it can be removed or replaced.

When you fork, it's too late. The parent process already has an open
handle to the executable. The forked process just opens yet another handle.

If that's feasible at all, it would be in the exec functions. However,
this wouldn't work for the first process started in a Cygwin process
tree. A general implementation would have to be called at process
initialization.

I think that something like this could be made to work, but I have a
gut feeling that this would slow down Cygwin even more. We already
have enough complaints about Cygwin's performance as it is.

Post by LRN
However we all remember the base addresses problem that arises when
forking in cygwin (and maybe other problems that I am not aware of),
so maybe forking is not a viable solution here (assuming that
arbitrary non-cygwin processes are going to be "moved" this way)? Are
matching base addresses for DLLs simply one of the POSIX requirements,
or is it a practical requirement (i.e. things break if this
requirement is not fulfilled)?

DLLs have data sections. Data sections contain pointers. If you can't
exactly reproduce the memory state of these DLLs after fork, you are in
trouble.
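
A toy model of the point above, not Cygwin code: it assumes the child
receives a plain copy of a DLL's data section with no pointer fix-ups. The
"address spaces" are ordinary dicts, and the two-word image layout, the
addresses 0x1000 and 0x5000, and the helper names are all made up for
illustration.

```python
def load_image(memory, base):
    # Hypothetical two-word "DLL data section":
    #   base + 0: a pointer holding the absolute address of the word below
    #   base + 1: the state that pointer refers to
    memory[base] = base + 1
    memory[base + 1] = "dll-state"


def copy_image(src, src_base, dst, dst_base, size=2):
    # Verbatim copy of the section contents, with no pointer fix-ups.
    for offset in range(size):
        dst[dst_base + offset] = src[src_base + offset]


parent = {}
load_image(parent, 0x1000)

# Child whose DLL landed at the same base: the copied pointer is still valid.
same_base_child = {}
copy_image(parent, 0x1000, same_base_child, 0x1000)
ok = same_base_child[same_base_child[0x1000]]

# Child whose DLL landed elsewhere: the copied pointer still holds the
# parent's address, which is not mapped in this child.
moved_child = {}
copy_image(parent, 0x1000, moved_child, 0x5000)
stale_pointer = moved_child[0x5000]
dangling = stale_pointer not in moved_child
```

In the same-base child the pointer resolves to the copied state. In the
moved child it still names an address inside the parent's layout, which is
the kind of breakage described above.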

Corinna

--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
Please send mails regarding Cygwin to cygwin AT cygwin DOT com

LRN

2011-10-07 10:08:53 UTC

Post by Corinna Vinschen
When you fork, it's too late. The parent process already has an open
handle to the executable. The forked process just opens yet another handle.
If that's feasible at all, it would be in the exec functions. However,
this wouldn't work for the first process started in a Cygwin process
tree. A general implementation would have to be called at process
initialization.
I think that something like this could be made to work, but I have a
gut feeling that this would slow down Cygwin even more. We already
have enough complaints about Cygwin's performance as it is.

To clarify: I did not intend to propose this as something to be
implemented in Cygwin. I'm writing here because Cygwin developers tend
to know about such things (sorry if I was not supposed to talk about
not-directly-related-to-Cygwin things on this list).

Post by Corinna Vinschen
DLLs have data sections. Data sections contain pointers. If you can't
exactly reproduce the memory state of these DLLs after fork, you are in
trouble.

OK, that makes this approach pretty much useless.

Christopher Faylor

2011-10-07 12:20:33 UTC

Post by LRN
To clarify: I did not intend to propose this as something to be
implemented in Cygwin. I'm writing here because Cygwin developers tend
to know about such things (sorry if I was not supposed to talk about
not-directly-related-to-Cygwin things on this list).

Yes, the cygwin-developers list is intended solely for discussion of
Cygwin development. I didn't set up this mailing list as an expert forum
for people to discuss things that have nothing to do with Cygwin.
Regardless, if the idea were implemented, it would, as Corinna says,
slow Cygwin down significantly for a corner case.

Please find somewhere else to discuss this.

cgf

Earnie Boyd

2011-10-07 11:51:54 UTC

Why not just mark the executable file as "delete on close" and rename
it if you want to use the same file name? You can do both of these with
the file open. What brings you to want to ask this question?
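
A minimal sketch of the rename half of this suggestion only; the
delete-on-close half is Windows-specific and is not shown. The helper name
swap_in_place, the ".old" suffix and the file name tool.bin are
hypothetical. An ordinary open handle stands in for the running image, so
the demo runs as written on POSIX systems; on Windows, whether a file held
open through a regular handle can be renamed depends on the sharing mode it
was opened with.

```python
import os
import tempfile


def swap_in_place(path, new_bytes, suffix=".old"):
    # Hypothetical helper: move the in-use file aside, then create a new
    # file under the original name. Returns the path of the renamed file.
    aside = path + suffix
    os.rename(path, aside)
    with open(path, "wb") as fresh:
        fresh.write(new_bytes)
    return aside


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "tool.bin")
    with open(target, "wb") as f:
        f.write(b"old")

    # This handle stands in for the image that is still in use.
    with open(target, "rb") as in_use:
        aside = swap_in_place(target, b"new")
        old_seen = in_use.read()  # the handle follows the renamed file

    with open(target, "rb") as f:
        new_seen = f.read()       # the original name now has the new content
    aside_exists = os.path.exists(aside)
```

The renamed copy still has to be cleaned up once nothing uses it any more,
which is where the delete-on-close part of the suggestion would come in.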

Earnie