Discussion:
Protect fork() against dll- and exe-updates.
Michael Haubenwallner
2015-11-16 09:01:45 UTC
Permalink
Hi Corinna,
have reworked the hardlink-creation from scratch as discussed before,
now using /var/run/cygfork/ as the top-level hardlinks directory.
* At process start and during LoadLibrary, handles to all the loaded
dlls (including cygwin1.dll) and the main executable are opened.
* If they are not identical (any more), hardlinks to these dlls are
created in subdirectories into /var/run/cygfork/<sid>/.
This description puzzles me a bit. If the DLLs have changed, isn't
it too late to create the hardlink?!? I guess I'm missing something.
The hardlink is created using the handle opened during dll-initialization
(as we have discussed that before). Agreed there is the short time frame
between process start (where windows does open the dll) and opening the
file handle in dll-initialization, where the dll could be replaced - but
I'm not sure if that really will be that much of a problem.
* For the timing: Building cygwin-2.4.0-0.2 three times, the duration
difference is in the range of measuring fault - almost identical for
each possible variant vanilla,disabled,enabled,forced.
Wow. Really?
More thoughts?
Yes, but I'm not sure yet. I'm just wondering if /var/run or any
other such dir is such a good idea after all. I have this vague
notion that a subdir of the directory the DLL is installed in might
be safer in the log run (for DLLs in /bin a subdir of /bin, etc).
While I don't really care about /var/run, I doubt that /bin necessarily
is writeable for each user using a particular cygwin instance - or we
would have to create some cygfork directory with a=rwxt permission there
beforehand. Also, for the dll-redirection to work, all the linktime
dll's need to be available nearby app.exe and the app.exe.local file,
either as hardlink, copy, or symlink(?). Otherways we might have to
create some app.exe.manifest file - I've got lost in that docs...
More to discuss?
I'm a bit swamped ATM, but I'll try to take a look into your patch next
week or the week after. As for discussing, yes, I think so. Please
keep in mind that your patch is pretty intrusive, so there's certainly
something to discuss :}
ATM, there's one thing I'm unsure whether it is necessary at all, and how
to do if yes: Purging the bin - where the hardlinks get moved to.
I'm planning to fold in the cygwin-acl branch the week after and then
release 2.4.0. From there on we should create a feature branch from
mainline and apply your changes to perform further testing and tweaking
as necessary.
Sounds ok?
Absolutely fine with me!

Thanks a lot!
/haubi/
Corinna Vinschen
2015-11-16 10:22:27 UTC
Permalink
Post by Michael Haubenwallner
Hi Corinna,
have reworked the hardlink-creation from scratch as discussed before,
now using /var/run/cygfork/ as the top-level hardlinks directory.
* At process start and during LoadLibrary, handles to all the loaded
dlls (including cygwin1.dll) and the main executable are opened.
* If they are not identical (any more), hardlinks to these dlls are
created in subdirectories into /var/run/cygfork/<sid>/.
This description puzzles me a bit. If the DLLs have changed, isn't
it too late to create the hardlink?!? I guess I'm missing something.
The hardlink is created using the handle opened during dll-initialization
(as we have discussed that before).
That's what I missed. Duh.
Post by Michael Haubenwallner
Agreed there is the short time frame
between process start (where windows does open the dll) and opening the
file handle in dll-initialization, where the dll could be replaced - but
I'm not sure if that really will be that much of a problem.
* For the timing: Building cygwin-2.4.0-0.2 three times, the duration
difference is in the range of measuring fault - almost identical for
each possible variant vanilla,disabled,enabled,forced.
Wow. Really?
More thoughts?
Yes, but I'm not sure yet. I'm just wondering if /var/run or any
other such dir is such a good idea after all. I have this vague
notion that a subdir of the directory the DLL is installed in might
be safer in the log run (for DLLs in /bin a subdir of /bin, etc).
While I don't really care about /var/run, I doubt that /bin necessarily
is writeable for each user using a particular cygwin instance - or we
would have to create some cygfork directory with a=rwxt permission there
beforehand.
Same could happen with a subdir under /var/run in theory. Well, Setup
creates the dir with 1777 perms of course, but in a corp environment you
can't be sure installations are performed with Setup at all. But, yes,
I see the point.
Post by Michael Haubenwallner
Also, for the dll-redirection to work, all the linktime
dll's need to be available nearby app.exe and the app.exe.local file,
Ok. We're talking about pre-installed binaries in the first place,
right? I'm not yet clear on how this is going to handle binaries
installed to another drive...
Post by Michael Haubenwallner
either as hardlink, copy, or symlink(?). Otherways we might have to
create some app.exe.manifest file - I've got lost in that docs...
Heh, yeah. The manifest is pretty ugly, IMHO, especially the
compatibility stuff since Windows 8.1.
Post by Michael Haubenwallner
More to discuss?
I'm a bit swamped ATM, but I'll try to take a look into your patch next
week or the week after. As for discussing, yes, I think so. Please
keep in mind that your patch is pretty intrusive, so there's certainly
something to discuss :}
ATM, there's one thing I'm unsure whether it is necessary at all, and how
to do if yes: Purging the bin - where the hardlinks get moved to.
We look into that when we get to it.

Btw., I just skimmed your patch a bit and I stumbled over the usage of
GetFileInformationByHandle.

In dll_list::ctimename, please use NtQueryInformationFile with
FileBasicInformation instead. It allows to reduce the filetime
comparison to a single test using the QuadPart of the timestamp and
__small_swprintf simply uses a %016X.

Analogue in dll_list::update_forkables_needs. Do you really need to
check the volume serial number? Otherwise, just call
NtQueryInformationFile(FileInternalInformation).
Post by Michael Haubenwallner
I'm planning to fold in the cygwin-acl branch the week after and then
release 2.4.0. From there on we should create a feature branch from
mainline and apply your changes to perform further testing and tweaking
as necessary.
Sounds ok?
Absolutely fine with me!
Cool, thanks!


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat
Michael Haubenwallner
2015-11-16 18:07:47 UTC
Permalink
Post by Corinna Vinschen
Post by Michael Haubenwallner
Also, for the dll-redirection to work, all the linktime
dll's need to be available nearby app.exe and the app.exe.local file,
Ok. We're talking about pre-installed binaries in the first place,
right?
What do you call pre-installed binaries? It doesn't matter for the
binary when and by whom it was installed, either setup.exe - which
is non-cygwin, or some in-cygwin package manager - which relies on
the update-safe fork when replacing in-use binaries.
Post by Corinna Vinschen
I'm not yet clear on how this is going to handle binaries
installed to another drive...
For now, when NtSetInformationFile (FileLinkInformation) does fail
for whatever reason (I cannot think of anything else than another
drive), this file is not hardlinked and will be used from its original
location. Must admit I haven't tested this scenario.
Post by Corinna Vinschen
Post by Michael Haubenwallner
either as hardlink, copy, or symlink(?).
However, I can imagine two different cross-drive approaches:
* Either create some cygfork-directory on each drive to hold the
hardlinks (still requires NTFS), and put symlinks into /var/run/
* or do a file-copy using the pre-opened handle into /var/run/
(should work even with FAT, but would be slow).
Post by Corinna Vinschen
Post by Michael Haubenwallner
Otherways we might have to
create some app.exe.manifest file - I've got lost in that docs...
Heh, yeah. The manifest is pretty ugly, IMHO, especially the
compatibility stuff since Windows 8.1.
Post by Michael Haubenwallner
More to discuss?
ATM, there's one thing I'm unsure whether it is necessary at all, and how
to do if yes: Purging the bin - where the hardlinks get moved to.
We look into that when we get to it.
Ok then.
Post by Corinna Vinschen
Btw., I just skimmed your patch a bit and I stumbled over the usage of
GetFileInformationByHandle.
Is there something wrong with GetFileInformationByHandle I should know of?
It can return FileId, LastWriteTime, and FileSize in one call...
Post by Corinna Vinschen
In dll_list::ctimename, please use NtQueryInformationFile with
FileBasicInformation instead. It allows to reduce the filetime
comparison to a single test using the QuadPart of the timestamp and
__small_swprintf simply uses a %016X.
Actually the FileInformation queried in dll_list::ctimename is reused
in dll_list::update_forkables_needs - have added a comment now.
Post by Corinna Vinschen
Analogue in dll_list::update_forkables_needs. Do you really need to
check the volume serial number? Otherwise, just call
NtQueryInformationFile(FileInternalInformation).
The idea behind checking the volume serial number is that I'm unsure
whether dlls can be replaced by symlinks to other drives as well, so
the FileId probably could become equal for different files...

BTW: MSDN document for FILE_INTERNAL_INFORMATION structure tells about
IndexNumber as the member - and so does /usr/include/w32api/, but
winsup/cygwin/ntdll.h does name it FileId - caused me some confusion...
Is that intentionally?

Thanks!
/haubi/
Corinna Vinschen
2015-11-17 13:48:10 UTC
Permalink
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
Also, for the dll-redirection to work, all the linktime
dll's need to be available nearby app.exe and the app.exe.local file,
Ok. We're talking about pre-installed binaries in the first place,
right?
What do you call pre-installed binaries? It doesn't matter for the
binary when and by whom it was installed, either setup.exe - which
is non-cygwin, or some in-cygwin package manager - which relies on
the update-safe fork when replacing in-use binaries.
What I meant was binaries in the distro tree in contrast to out-of-tree
binaries. I was just thinking aloud about the need (or lack thereof) to
handle binaries installed on drives other than the Cygwin installation
drive.
Post by Michael Haubenwallner
Post by Corinna Vinschen
I'm not yet clear on how this is going to handle binaries
installed to another drive...
For now, when NtSetInformationFile (FileLinkInformation) does fail
for whatever reason (I cannot think of anything else than another
drive), this file is not hardlinked and will be used from its original
location. Must admit I haven't tested this scenario.
Ok, something to keep in mind.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
either as hardlink, copy, or symlink(?).
* Either create some cygfork-directory on each drive to hold the
hardlinks (still requires NTFS), and put symlinks into /var/run/
* or do a file-copy using the pre-opened handle into /var/run/
(should work even with FAT, but would be slow).
I realize that we don't have to handle this at all. The problem your
patch is supposed to handle are installation issues, which usually only
occur for installer-provided binaries, so only distro binaries should be
targeted.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
Otherways we might have to
create some app.exe.manifest file - I've got lost in that docs...
Heh, yeah. The manifest is pretty ugly, IMHO, especially the
compatibility stuff since Windows 8.1.
Post by Michael Haubenwallner
More to discuss?
ATM, there's one thing I'm unsure whether it is necessary at all, and how
to do if yes: Purging the bin - where the hardlinks get moved to.
We look into that when we get to it.
Ok then.
Post by Corinna Vinschen
Btw., I just skimmed your patch a bit and I stumbled over the usage of
GetFileInformationByHandle.
Is there something wrong with GetFileInformationByHandle I should know of?
It can return FileId, LastWriteTime, and FileSize in one call...
Not as such, no. It's just that Cygwin isn't using the Windows calls.
Under the hood the GetFileInformationByHandle is calling
NtQueryVolumeInformationFile(FileFsVolumeInformation) and
NtQueryInformationFile(FileAllInformation).

Sidenote:

We're not using FileAllInformation anymore because in my (very, very
long ago) testing, FileAllInformation needed a big buffer to store the
file's pathname. However, I just tried it again while looking into
this, and it turns out that you can call FileAllInformation with a
buffer of size FILE_ALL_INFORMATION (which only has room for the first
character of the path). It returns STATUS_BUFFER_OVERFLOW, but it also
returns all the data in the structure. Yay.

I think I did some tests way back when, which showed that
FileAllInformation could be slower than three single calls
FileBasicInformation, FileStandardInformation and
FileInternalInformation combined. But that was back in XP times. If my
test results are outdated and FileAllInformation is quick enough, we
should contemplate to use FileAllInformation instead of
FileNetworkOpenInforation in class path_conv. This may drop some
conditional code paths and generally simply the code a bit...
Post by Michael Haubenwallner
Post by Corinna Vinschen
In dll_list::ctimename, please use NtQueryInformationFile with
FileBasicInformation instead. It allows to reduce the filetime
comparison to a single test using the QuadPart of the timestamp and
__small_swprintf simply uses a %016X.
Actually the FileInformation queried in dll_list::ctimename is reused
in dll_list::update_forkables_needs - have added a comment now.
Thanks.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Analogue in dll_list::update_forkables_needs. Do you really need to
check the volume serial number? Otherwise, just call
NtQueryInformationFile(FileInternalInformation).
The idea behind checking the volume serial number is that I'm unsure
whether dlls can be replaced by symlinks to other drives as well, so
the FileId probably could become equal for different files...
Symlinks can't be used for various reasons, not the least of them
the fact that non-admin users can't create them by default.
Post by Michael Haubenwallner
BTW: MSDN document for FILE_INTERNAL_INFORMATION structure tells about
IndexNumber as the member - and so does /usr/include/w32api/, but
winsup/cygwin/ntdll.h does name it FileId - caused me some confusion...
Is that intentionally?
It's historical. FileId is the name of the field as used by Gary
Nebbett's book "Windows NT/2000 Native API Reference", heavily used when
MSDN had even less documentation of the NT API. Unfortunately there was
never a followup book :(

I'd be not too sad to get patches changing this to the currently blessed
name...


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat
Corinna Vinschen
2015-12-08 21:48:08 UTC
Permalink
Hi Michael,


sorry for the delay, but I have some other duties besides Cygwin and the
W10 1511 problem has thrown me back even more, so I only now find the
time to review your patch.

Subterfuge aside, I have a few questions and comments, see below.
Plesae note that I didn't apply the patch so far but just inspected it.
Post by Michael Haubenwallner
Post by Corinna Vinschen
I'm not yet clear on how this is going to handle binaries
installed to another drive...
For now, when NtSetInformationFile (FileLinkInformation) does fail
for whatever reason (I cannot think of anything else than another
drive), this file is not hardlinked and will be used from its original
location. Must admit I haven't tested this scenario.
Post by Corinna Vinschen
Post by Michael Haubenwallner
either as hardlink, copy, or symlink(?).
* Either create some cygfork-directory on each drive to hold the
hardlinks (still requires NTFS), and put symlinks into /var/run/
* or do a file-copy using the pre-opened handle into /var/run/
(should work even with FAT, but would be slow).
Indeed. Let's stick to the DLLs from the distro which are not supposed
to be on another drive. /var/run and hardlinks should do. This
functionality won't be supported on FAT and I don't see a reason to.
Post by Michael Haubenwallner
BTW: MSDN document for FILE_INTERNAL_INFORMATION structure tells about
IndexNumber as the member - and so does /usr/include/w32api/, but
winsup/cygwin/ntdll.h does name it FileId - caused me some confusion...
Is that intentionally?
Fixed now.

First off, before I dive into the patch:

Would you mind terribly to split this patch into a patchset so we
get a set of smaller patches as far as it makes sense? I'm a bit
reluctant to apply such a big patch in one go. I did this myself
a lot back in the pre-git CVS times, but the longer I work with git
the more I appreciate patches split into sets of smaller patches.
They are easier to review and *much* easier to handle when bisecting
the code in case of searching for a bug.

Another thing occured to me: Doesn't using this stuff break the output
of /proc/<PID>/maps?
Post by Michael Haubenwallner
+/* Use this buffer under loader lock conditions only. */
+PWCHAR
+dll::nt_max_path_buf ()
+{
+ static WCHAR NO_COPY buf[NT_MAX_PATH];
+ return buf;
+}
Shouldn't buf (with different name then perhaps) be a private static
member of dll_list instead? dll::nt_max_path_buf() could become an
inline function then.
Post by Michael Haubenwallner
+/* Return *nameptr probably prefixed with "\\??\\", but
+ update *nameptr to stay without the "\\??\\" prefix. */
+PWCHAR
+dll::to_ntname (PWCHAR *nameptr, WCHAR ntnamebuf[NT_MAX_PATH])
Why does dll::to_ntname need a second parameter if it's always called
on the buffer returned by dll::nt_max_path_buf?
Post by Michael Haubenwallner
+{
+ /* avoid using path_conv here: cygheap might not be
+ initialized when started from non-cygwin process,
+ or still might be frozen in_forkee */
+ PWCHAR ntname = *nameptr;
+ if ((*nameptr)[1] == L':')
What if Cygwin has been installed on a network drive with no drive
letter attached? In that case you'd have to deal with UNC paths,
but they are ignored here.
Post by Michael Haubenwallner
+ {
+ ntname = ntnamebuf;
+ memmove (ntname + 4, *nameptr,
+ sizeof (*nameptr) * min (NT_MAX_PATH - 4, wcslen (*nameptr) + 1));
+ wcsncpy (ntname, L"\\??\\", 4);
+ ntname[NT_MAX_PATH - 1] = L'\0';
+ *nameptr = ntname + 4;
+ }
+ else
+ if (!wcsncmp (*nameptr, L"\\??\\", 4))
Small style nit: if else in a single line, please.
Post by Michael Haubenwallner
+ *nameptr += 4;
+ return ntname;
+}
+ [...]
+HANDLE
+dll::ntopenfile (PWCHAR ntname, NTSTATUS *pstatus, ULONG openopts)
+{
+ NTSTATUS status;
+ if (!pstatus)
+ pstatus = &status;
+
+ UNICODE_STRING fn;
+ RtlInitUnicodeString (&fn, ntname);
+
+ OBJECT_ATTRIBUTES oa;
+ InitializeObjectAttributes (&oa, &fn, 0, NULL, NULL);
+
+ ACCESS_MASK access = FILE_READ_ATTRIBUTES;
+ if (openopts & FILE_DELETE_ON_CLOSE)
+ access |= DELETE;
+ if (openopts & FILE_DIRECTORY_FILE)
+ access |= FILE_LIST_DIRECTORY;
+
+ access |= SYNCHRONIZE;
+ openopts |= FILE_SYNCHRONOUS_IO_NONALERT;
+
+ HANDLE fh = NULL;
+ ULONG share = FILE_SHARE_VALID_FLAGS;
+ IO_STATUS_BLOCK iosb;
+ *pstatus = NtOpenFile (&fh, access, &oa, &iosb, share, openopts);
+ debug_printf ("%y = NtOpenFile (%p, a %xh, sh %xh, o %xh, io %y, '%W')",
+ *pstatus, fh, access, share, openopts, iosb.Status, fn.Buffer);
+
+ return fh;
Am I too paranoid? AFAICS, NtOpenFile doesn't give a guarantee that
the handle stays unchanged if the function fails. Wouldn't be something
like

return NT_SUCCESS (*pstatus) ? fh : NULL;

be safer here?
Post by Michael Haubenwallner
else
{
+ size_t forknamesize = forkable_namesize (type, name, modname);
+
/* FIXME: Change this to new at some point. */
- d = (dll *) cmalloc (HEAP_2_DLL, sizeof (*d) + (namelen * sizeof (*name)));
+ d = (dll *) cmalloc (HEAP_2_DLL, sizeof (*d) +
+ ((namelen + 1 + forknamesize) * sizeof (*name)));
Minor style:

d = (dll *) cmalloc (HEAP_2_DLL, sizeof (*d)
+ ((namelen + 1 + forknamesize) * sizeof (*name)));
Post by Michael Haubenwallner
@@ -236,13 +326,24 @@ dll_list::alloc (HINSTANCE h, per_process *p, dll_type type)
wcscpy (d->name, name);
d->modname = d->name + (modname - name);
d->handle = h;
- d->has_dtors = true;
+ d->has_dtors = type > DLL_SELF;
A small comment at this point could be helpful in the long run.
Post by Michael Haubenwallner
d->p = p;
d->ndeps = 0;
d->deps = NULL;
d->image_size = ((pefile*)h)->optional_hdr ()->SizeOfImage;
d->preferred_base = (void*) ((pefile*)h)->optional_hdr()->ImageBase;
d->type = type;
+ d->forkable_name = d->name + namelen + 1;
+ *d->forkable_name = L'\0';
+ d->fbi.FileAttributes = INVALID_FILE_ATTRIBUTES;
+ LARGE_INTEGER zero = { 0 };
+ d->fii.FileId = zero;
Oops, sorry, FileId -> NameIndex :}
Post by Michael Haubenwallner
+ PWCHAR ntname = dll::to_ntname (&name, dll::nt_max_path_buf ());
+ NTSTATUS status;
+ d->fhandle = dll::ntopenfile (ntname, &status);
+ if (!d->fhandle)
+ system_printf ("Unable (ntstatus %y) to open file %W",
+ status, ntname);
append (d);
if (type == DLL_LOAD)
loaded_dlls++;
@@ -396,7 +497,7 @@ dll_list::detach (void *retaddr)
if (!myself || in_forkee)
return;
guard (true);
- if ((d = find (retaddr)))
+ if ((d = find (retaddr)) && d->type != DLL_SELF)
dll_list::find is only ever used to find dlls with d->type != DLL_SELF.
Wouldn't it make sense to move this check into the method?
Post by Michael Haubenwallner
diff --git a/winsup/cygwin/dll_init.h b/winsup/cygwin/dll_init.h
index 8127d0b..12344de 100644
--- a/winsup/cygwin/dll_init.h
+++ b/winsup/cygwin/dll_init.h
@@ -43,6 +43,7 @@ struct per_module
typedef enum
{
DLL_NONE,
+ DLL_SELF, /* main-program.exe, cygwin1.dll */
DLL_LINK,
DLL_LOAD,
DLL_ANY
@@ -61,7 +62,13 @@ struct dll
DWORD image_size;
void* preferred_base;
PWCHAR modname;
- WCHAR name[1];
+ /* forkable { */
Please drop these bracketing comments, especially the braces in them.
If you want to emphasize them, just use an introductory comment and
an empty line at the end of the block.
Post by Michael Haubenwallner
diff --git a/winsup/cygwin/fork.cc b/winsup/cygwin/fork.cc
index 084f8f0..dd0f9a6 100644
--- a/winsup/cygwin/fork.cc
+++ b/winsup/cygwin/fork.cc
@@ -356,8 +351,19 @@ frok::parent (volatile char * volatile stack_here)
while (1)
{
+ dlls.request_forkables ();
I'm aware that hold_everything has been called but, is that safe?
request_forkables calls update_forkables_needs which in turn uses the
static dll::nt_max_path_buf buffer...
Post by Michael Haubenwallner
+/* Mangle full srcpath as one single filename into targetbuf,
+ optionally telling the last path separator to allow for
+ restoring the file's basename.
+ Return value is the number of characters mangled. */
+static int
+mangle_as_filename (PWCHAR targetbuf, PCWCHAR srcpath, PWCHAR *lastpathsep)
+{
+ PWCHAR target = targetbuf;
+ if (lastpathsep)
+ *lastpathsep = NULL;
+
+ for (; *srcpath; ++srcpath)
+ switch (*srcpath)
+ {
+ if (lastpathsep)
+ *lastpathsep = target;
+ *(target++) = L',';
+ break;
+ *(target++) = L'_';
This parens are not required. Please remove them.
Post by Michael Haubenwallner
+ break;
+ *(target++) = *srcpath;
+ break;
+ }
+ return target - targetbuf;
+}
So what happens with a path where the parent dir path is > NAME_MAX?
If I didn't miss something, this won't work for very long paths.
Post by Michael Haubenwallner
+/* Create the lastsepcount directories found in dirname, where
+ counting is done along path separators (including trailing ones).
+ Returns true when these directories exist afterwards, false otherways.
+ The dirname is used for the path-splitting. */
+static
+bool mkdirs (PWCHAR dirname, SECURITY_ATTRIBUTES *psec, int lastsepcount)
+{
+ bool success = true;
+ int i = lastsepcount;
+ for (--i; i > 0; --i)
+ {
+ PWCHAR lastsep = wcsrchr (dirname, L'\\');
+ if (!lastsep)
+ break;
+ *lastsep = L'\0';
+ }
+
+ for (++i; i <= lastsepcount; ++i)
+ {
+ if (success && (i == 0 || wcslen (wcsrchr (dirname, L'\\')) > 1))
+ {
+ BOOL ret = CreateDirectoryW (dirname, psec);
NtCreateFile?
Post by Michael Haubenwallner
+ if (!ret && GetLastError () != ERROR_ALREADY_EXISTS)
+ success = false;
+ debug_printf ("%d = CreateDirectoryW (%W) %E", ret, dirname);
+ }
+ if (i < lastsepcount)
+ dirname[wcslen (dirname)] = L'\\'; /* restore original value */
+ }
+ return success;
+}
[...]
+/* Nominate the hardlink to an individual DLL inside dirx_name,
+ that ends with the path separator (hence the "x" varname).
+ With NULL as dirx_name, never nominate the hardlink any more.
+ With "" as dirx_name, denominate the hardlink. */
+void
+dll::nominate_forkable (PCWCHAR dirx_name)
+{
+ if (!dirx_name)
+ {
+ debug_printf ("type %d disable %W", type, name);
+ forkable_name = NULL; /* never create a hardlink for this dll */
+ }
+
+ if (!forkable_name)
+ return;
+
+ wcscpy (forkable_name, dirx_name);
Suggestion: Use

PWCHAR p = wcpcpy (forkable_name, dirx_name);

You can use p later on...
Post by Michael Haubenwallner
+
+ if (!*forkable_name)
+ return; /* denominate */
+
+ if (type < DLL_LOAD)
+ wcscat (forkable_name, modname);
...here as in

wcpcpy (p, modname);
Post by Michael Haubenwallner
+ else
+ {
+ * mangle full path into one single directory name,
+ * just keep original filename intact. The original
+ * filename is necessary to serve as linked
+ * dependencies of dynamically loaded dlls. */
+ PWCHAR lastpathsep;
+ mangle_as_filename (forkable_name + wcslen (forkable_name), name,
+ &lastpathsep);
...and here as in

mangle_as_filename (p, name, &lastpathsep);

This avoids calling wcslen twice.
Post by Michael Haubenwallner
+ if (lastpathsep)
+ *lastpathsep = L'\\';
+ }
+}
+
+/* Create the nominated hardlink for one indivitual dll,
+ inside another subdirectory when dynamically loaded. */
+bool
+dll::create_forkable ()
+{
+ if (!forkable_name || !*forkable_name)
+ return true; /* disabled */
+
+ if (!fhandle)
+ return false;
+
+ PWCHAR last = NULL;
+ bool success = true;
+ if (type >= DLL_LOAD)
+ {
+ last = wcsrchr (forkable_name, L'\\');
+ if (!last)
+ return false;
+ *last = L'\0';
+ success = mkdirs (forkable_name, &sec_none_nih, 1);
What's the uppermost directory which will be created here? Is it safe to
use sec_none_nih for all of them?
Post by Michael Haubenwallner
+/* Into buf if not NULL, write the ntname of cygwin installation_root.
+ Return the number of characters (that would be) written. */
+int
+dll_list::rootname (PWCHAR buf)
+{
+ PWCHAR cygroot = cygheap->installation_root;
+ if (!buf)
+ return wcslen (cygroot);
+
+ dll::to_ntname (&cygroot, dll::nt_max_path_buf ());
+ wcscpy (buf, cygroot);
+ return wcslen (buf);
return wcpcpy (buf, cygroot) - buf;

Alternatively we could add a cygheap->installation_root_len member.
Post by Michael Haubenwallner
+}
+
+/* Into buf if not NULL, write the string representation of current user sid.
+ Return the number of characters (that would be) written. */
+int
+dll_list::sidname (PWCHAR buf)
+{
+ if (!buf)
+ return 128;
+
+ UNICODE_STRING sid;
+ WCHAR sidbuf[128+1];
+ RtlInitEmptyUnicodeString (&sid, sidbuf, sizeof (sidbuf));
+ RtlConvertSidToUnicodeString (&sid, cygheap->user.sid (), FALSE);
+ wcscpy (buf, sid.Buffer);
+ return wcslen (buf);
See above.
Post by Michael Haubenwallner
+int
+dll_list::ctimename (PWCHAR buf)
+{
+ if (!buf)
+ return 16;
+
+ LARGE_INTEGER zero = { 0 };
+ LARGE_INTEGER newest = { 0 };
+ /* Need by-handle-file-information for _all_ loaded dlls,
+ as most recent ctime forms the hardlinks directory. */
+ dll *d = &dlls.start;
+ while ((d = d->next))
+ {
+ if (!d->fhandle)
+ continue;
+ if (d->fbi.FileAttributes == INVALID_FILE_ATTRIBUTES)
+ {
+ /* Relevant FILE_BASIC_INFORMATION members are
+ valid in child when cygheap-copied from parent,
+ so caching them is fine.
+ The FILE_BASIC_INFORMATION queried here also is
+ used in dll_list::update_forkables_needs below. */
+ NTSTATUS status;
+ IO_STATUS_BLOCK iosb;
+ status = NtQueryInformationFile (d->fhandle, &iosb,
+ &d->fbi, sizeof (d->fbi),
+ FileBasicInformation);
+ if (!NT_SUCCESS (status))
+ {
+ system_printf ("WARNING: %y = NtQueryInformationFile (%p,"
+ " BasicInfo, io.Status %y, %W)",
+ status, d->fhandle, iosb.Status, d->name);
+ d->fbi.FileAttributes = 0;
+ d->fbi.LastWriteTime = zero;
+ }
+ }
+
+ if (d->fbi.LastWriteTime.QuadPart > newest.QuadPart)
+ newest = d->fbi.LastWriteTime;
LastWriteTime? This is supposed to check if a newer DLL replaced
an older in-use one, so shouldn't that be CreationTime?
Post by Michael Haubenwallner
+/* Create the nominated forkable hardlinks and directories as necessary,
+ as well as the .local file for dll-redirection. */
+bool
+dll_list::create_forkables ()
+{
+ bool success = true;
+
+ int lastsepcount = 1; /* we have trailing pathsep */
+ for (namepart const *part = forkable_nameparts; part->text; ++part)
+ if (part->create_dir)
+ ++lastsepcount;
+
+ PWCHAR buf = dll::nt_max_path_buf ();
+ wcsncpy (buf, forkables_dirx_name, NT_MAX_PATH);
+
+ if (!mkdirs (buf, &sec_none_nih, lastsepcount))
Again, sec_none_nih for the entire directory tree?
Post by Michael Haubenwallner
+ success = false;
+
+ if (success)
+ {
+ /* create the DotLocal file as empty file */
+ wcsncat (buf, main_executable->modname, NT_MAX_PATH);
+ wcsncat (buf, L".local", NT_MAX_PATH);
+ HANDLE hlocal = CreateFileW (buf, GENERIC_WRITE,
+ FILE_SHARE_READ, &sec_none_nih,
+ OPEN_ALWAYS, 0, NULL);
Please use NtCreateFile. We're using CreateFile only for the console
and in one instance for pipes.


What's missing is user documentation for this feature. A bit of
description what happens, what has to be done by the user, and what
limitations it has would be helpful.


Thanks,
Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat
Michael Haubenwallner
2016-01-26 16:23:42 UTC
Permalink
Hi Corinna,
Post by Corinna Vinschen
Hi Michael,
sorry for the delay, but I have some other duties besides Cygwin and the
W10 1511 problem has thrown me back even more, so I only now find the
time to review your patch.
Never mind, same here ;)
Post by Corinna Vinschen
Post by Michael Haubenwallner
BTW: MSDN document for FILE_INTERNAL_INFORMATION structure tells about
IndexNumber as the member - and so does /usr/include/w32api/, but
winsup/cygwin/ntdll.h does name it FileId - caused me some confusion...
Is that intentionally?
Fixed now.
Thanks!
Post by Corinna Vinschen
Would you mind terribly to split this patch into a patchset so we
get a set of smaller patches as far as it makes sense? I'm a bit
reluctant to apply such a big patch in one go. I did this myself
a lot back in the pre-git CVS times, but the longer I work with git
the more I appreciate patches split into sets of smaller patches.
They are easier to review and *much* easier to handle when bisecting
the code in case of searching for a bug.
Accepted, fine with me!
've thought of splitting the patch along something like:
* declare additional struct members
* define additional methods as noop
* call new noop methods in existing code
* allocate additional memory
* open additional handles
* implement preparation methods
* implement nomination methods
* synchronize (use semaphore)
* implement cleanup
* implement hardlink-creation
* activate hardlink-creation
* probably more...

Will post them to -patches then.
Post by Corinna Vinschen
Another thing occured to me: Doesn't using this stuff break the output
of /proc/<PID>/maps?
Whenever the original binary has been removed - it is moved to trashbin
actually, it turns out that /proc/<PID>/maps shows the trashbin-filename.
Not sure if you call this "broken" - or what would be "unbroken" here.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+/* Use this buffer under loader lock conditions only. */
+PWCHAR
+dll::nt_max_path_buf ()
+{
+ static WCHAR NO_COPY buf[NT_MAX_PATH];
+ return buf;
+}
Shouldn't buf (with different name then perhaps) be a private static
member of dll_list instead? dll::nt_max_path_buf() could become an
inline function then.
Fine with me.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+/* Return *nameptr probably prefixed with "\\??\\", but
+ update *nameptr to stay without the "\\??\\" prefix. */
+PWCHAR
+dll::to_ntname (PWCHAR *nameptr, WCHAR ntnamebuf[NT_MAX_PATH])
Why does dll::to_ntname need a second parameter if it's always called
on the buffer returned by dll::nt_max_path_buf?
This merely is to indicate that it does need some buffer. When removing
the second parameter, I'd rename to something like dll::buffered_ntname.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+{
+ /* avoid using path_conv here: cygheap might not be
+ initialized when started from non-cygwin process,
+ or still might be frozen in_forkee */
+ PWCHAR ntname = *nameptr;
+ if ((*nameptr)[1] == L':')
What if Cygwin has been installed on a network drive with no drive
letter attached? In that case you'd have to deal with UNC paths,
but they are ignored here.
Isn't the UNC prefix cut off right after GetModuleFileNameW in dll_list::alloc?

Actually, I didn't fully grok the subtleties along "\\?\", "\??\", "UNC\", "\\"
and their usage differences with functions from kernel32.dll and ntdll.dll yet.
Is it possible in every case that ntdll just may need the additional prefix "\??\",
while kernel32 takes the same path without that prefix?

I do think of storing only the ntdll-name in the structures, and have a non-
prefix name pointer into that ntdll-name buffer for use by CreateProcess&co then.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ else
+ if (!wcsncmp (*nameptr, L"\\??\\", 4))
Small style nit: if else in a single line, please.
ok.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ HANDLE fh = NULL;
+ ULONG share = FILE_SHARE_VALID_FLAGS;
+ IO_STATUS_BLOCK iosb;
+ *pstatus = NtOpenFile (&fh, access, &oa, &iosb, share, openopts);
+ debug_printf ("%y = NtOpenFile (%p, a %xh, sh %xh, o %xh, io %y, '%W')",
+ *pstatus, fh, access, share, openopts, iosb.Status, fn.Buffer);
+
+ return fh;
Am I too paranoid? AFAICS, NtOpenFile doesn't give a guarantee that
the handle stays unchanged if the function fails. Wouldn't be something
like
return NT_SUCCESS (*pstatus) ? fh : NULL;
be safer here?
Fine with me - you are the experienced one!
Post by Corinna Vinschen
Post by Michael Haubenwallner
- d = (dll *) cmalloc (HEAP_2_DLL, sizeof (*d) + (namelen * sizeof (*name)));
+ d = (dll *) cmalloc (HEAP_2_DLL, sizeof (*d) +
+ ((namelen + 1 + forknamesize) * sizeof (*name)));
d = (dll *) cmalloc (HEAP_2_DLL, sizeof (*d)
+ ((namelen + 1 + forknamesize) * sizeof (*name)));
ok.
Post by Corinna Vinschen
Post by Michael Haubenwallner
@@ -236,13 +326,24 @@ dll_list::alloc (HINSTANCE h, per_process *p, dll_type type)
wcscpy (d->name, name);
d->modname = d->name + (modname - name);
d->handle = h;
- d->has_dtors = true;
+ d->has_dtors = type > DLL_SELF;
A small comment at this point could be helpful in the long run.
Indeed.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ LARGE_INTEGER zero = { 0 };
+ d->fii.FileId = zero;
Oops, sorry, FileId -> NameIndex :}
IndexNumber, actually.
Post by Corinna Vinschen
Post by Michael Haubenwallner
@@ -396,7 +497,7 @@ dll_list::detach (void *retaddr)
if (!myself || in_forkee)
return;
guard (true);
- if ((d = find (retaddr)))
+ if ((d = find (retaddr)) && d->type != DLL_SELF)
dll_list::find is only ever used to find dlls with d->type != DLL_SELF.
Wouldn't it make sense to move this check into the method?
You mean into the dll_list::find method?
Fine with me.
Post by Corinna Vinschen
Post by Michael Haubenwallner
@@ -61,7 +62,13 @@ struct dll
DWORD image_size;
void* preferred_base;
PWCHAR modname;
- WCHAR name[1];
+ /* forkable { */
Please drop these bracketing comments, especially the braces in them.
If you want to emphasize them, just use an introductory comment and
an empty line at the end of the block.
ok.
Post by Corinna Vinschen
Post by Michael Haubenwallner
diff --git a/winsup/cygwin/fork.cc b/winsup/cygwin/fork.cc
index 084f8f0..dd0f9a6 100644
--- a/winsup/cygwin/fork.cc
+++ b/winsup/cygwin/fork.cc
@@ -356,8 +351,19 @@ frok::parent (volatile char * volatile stack_here)
while (1)
{
+ dlls.request_forkables ();
I'm aware that hold_everything has been called but, is that safe?
request_forkables calls update_forkables_needs which in turn uses the
static dll::nt_max_path_buf buffer...
While I have a vague idea only on what hold_everything actually does,
isn't one of the ideas for the static buffer to be useable even while
everything else (heap in particular) is on hold?
Especially as that static buffer does have the NO_COPY attribute...
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ *(target++) = L'_';
This parens are not required. Please remove them.
ok.
Post by Corinna Vinschen
So what happens with a path where the parent dir path is > NAME_MAX?
If I didn't miss something, this won't work for very long paths.
Reading through the docs I've got the impression that while NT_MAX_PATH is
to hold a very long path, a single path part is still limited to NAME_MAX,
but I may easily be wrong here.

Feels hard - but possible - to utilize the static buffer for recursive
NtQueryDirectoryFile instead. Or add a second static buffer?
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ for (++i; i <= lastsepcount; ++i)
+ {
+ if (success && (i == 0 || wcslen (wcsrchr (dirname, L'\\')) > 1))
+ {
+ BOOL ret = CreateDirectoryW (dirname, psec);
NtCreateFile?
ok.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ wcscpy (forkable_name, dirx_name);
Suggestion: Use
PWCHAR p = wcpcpy (forkable_name, dirx_name);
You can use p later on...
...and here as in
mangle_as_filename (p, name, &lastpathsep);
This avoids calling wcslen twice.
ok, 've not seen wcpcpy before.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+/* Create the nominated hardlink for one indivitual dll,
+ inside another subdirectory when dynamically loaded. */
+bool
+dll::create_forkable ()
+{
+ if (type >= DLL_LOAD)
+ {
+ last = wcsrchr (forkable_name, L'\\');
+ if (!last)
+ return false;
+ *last = L'\0';
+ success = mkdirs (forkable_name, &sec_none_nih, 1);
What's the uppermost directory which will be created here?
The idea is to control this via the 'create_dir' member of forkable_nameparts[].
Currently /var/run/cygfork must exist, so /var/run/cygfork/<sid> will be created first.
Post by Corinna Vinschen
Is it safe to use sec_none_nih for all of them?
As long as /var/run/cygfork exists with tmpdir-like permission, sec_none_nih
seems fine: For the directories created, ls -l shows 'drwxr-xr-x+' permissions.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+/* Into buf if not NULL, write the ntname of cygwin installation_root.
+ Return the number of characters (that would be) written. */
+int
+dll_list::rootname (PWCHAR buf)
+{
+ PWCHAR cygroot = cygheap->installation_root;
+ if (!buf)
+ return wcslen (cygroot);
+
+ dll::to_ntname (&cygroot, dll::nt_max_path_buf ());
+ wcscpy (buf, cygroot);
+ return wcslen (buf);
return wcpcpy (buf, cygroot) - buf;
ok.
Post by Corinna Vinschen
Alternatively we could add a cygheap->installation_root_len member.
Feels like subsequent optimization - beyond this feature patch for now.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+int
+dll_list::sidname (PWCHAR buf)
+{
+ wcscpy (buf, sid.Buffer);
+ return wcslen (buf);
See above.
ok.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+int
+dll_list::ctimename (PWCHAR buf)
+{
+ /* Relevant FILE_BASIC_INFORMATION members are
+ valid in child when cygheap-copied from parent,
+ so caching them is fine.
+ The FILE_BASIC_INFORMATION queried here also is
+ used in dll_list::update_forkables_needs below. */
+ NTSTATUS status;
+ IO_STATUS_BLOCK iosb;
+ status = NtQueryInformationFile (d->fhandle, &iosb,
+ &d->fbi, sizeof (d->fbi),
+ FileBasicInformation);
+ if (!NT_SUCCESS (status))
+ {
+ system_printf ("WARNING: %y = NtQueryInformationFile (%p,"
+ " BasicInfo, io.Status %y, %W)",
+ status, d->fhandle, iosb.Status, d->name);
+ d->fbi.FileAttributes = 0;
+ d->fbi.LastWriteTime = zero;
+ }
+ }
+
+ if (d->fbi.LastWriteTime.QuadPart > newest.QuadPart)
+ newest = d->fbi.LastWriteTime;
LastWriteTime? This is supposed to check if a newer DLL replaced
an older in-use one, so shouldn't that be CreationTime?
CreationTime bumps even when creating a hardlink, while LastWriteTime
seems to more properly tell about the last file-content modification.
Added a comment.
Post by Corinna Vinschen
Post by Michael Haubenwallner
+/* Create the nominated forkable hardlinks and directories as necessary,
+ as well as the .local file for dll-redirection. */
+bool
+dll_list::create_forkables ()
+{
+ if (!mkdirs (buf, &sec_none_nih, lastsepcount))
Again, sec_none_nih for the entire directory tree?
As I didn't fully grok the windows security too: What is your concern here?
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ success = false;
+
+ if (success)
+ {
+ /* create the DotLocal file as empty file */
+ wcsncat (buf, main_executable->modname, NT_MAX_PATH);
+ wcsncat (buf, L".local", NT_MAX_PATH);
wcpncat missing?
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ HANDLE hlocal = CreateFileW (buf, GENERIC_WRITE,
+ FILE_SHARE_READ, &sec_none_nih,
+ OPEN_ALWAYS, 0, NULL);
Please use NtCreateFile. We're using CreateFile only for the console
and in one instance for pipes.
ok.
Post by Corinna Vinschen
What's missing is user documentation for this feature. A bit of
description what happens, what has to be done by the user, and what
limitations it has would be helpful.
Agreed: Where to add such docs?

Thanks a lot!
/haubi/
Corinna Vinschen
2016-01-28 18:43:48 UTC
Permalink
Hi Michael,
Post by Michael Haubenwallner
[...]
Post by Corinna Vinschen
Would you mind terribly to split this patch into a patchset so we
get a set of smaller patches as far as it makes sense? I'm a bit
reluctant to apply such a big patch in one go. I did this myself
a lot back in the pre-git CVS times, but the longer I work with git
the more I appreciate patches split into sets of smaller patches.
They are easier to review and *much* easier to handle when bisecting
the code in case of searching for a bug.
Accepted, fine with me!
* declare additional struct members
* define additional methods as noop
* call new noop methods in existing code
* allocate additional memory
* open additional handles
* implement preparation methods
* implement nomination methods
* synchronize (use semaphore)
* implement cleanup
* implement hardlink-creation
* activate hardlink-creation
* probably more...
Will post them to -patches then.
Thanks, but splitting into a handful separately functional patches
should be sufficient.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Another thing occured to me: Doesn't using this stuff break the output
of /proc/<PID>/maps?
Whenever the original binary has been removed - it is moved to trashbin
actually, it turns out that /proc/<PID>/maps shows the trashbin-filename.
Not sure if you call this "broken" - or what would be "unbroken" here.
No, what I mean is, what does it show if the file has *not* been deleted
yet? The path to /usr/bin/foo or the path to /var/run/cygfork/<SID>/foo?
The latter case would be rather irritating.

This behaviour may not be that bad in case of running with a deleted
object, though. On Linux /proc/$PID/maps prints the name of a deleted
object with a "(deleted)" tag. Maybe we can get there, too. Do we have
the information from where the file has been originally loaded still
available at that point? I guess the answer is no...
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+/* Return *nameptr probably prefixed with "\\??\\", but
+ update *nameptr to stay without the "\\??\\" prefix. */
+PWCHAR
+dll::to_ntname (PWCHAR *nameptr, WCHAR ntnamebuf[NT_MAX_PATH])
Why does dll::to_ntname need a second parameter if it's always called
on the buffer returned by dll::nt_max_path_buf?
This merely is to indicate that it does need some buffer. When removing
the second parameter, I'd rename to something like dll::buffered_ntname.
ack
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+{
+ /* avoid using path_conv here: cygheap might not be
+ initialized when started from non-cygwin process,
+ or still might be frozen in_forkee */
+ PWCHAR ntname = *nameptr;
+ if ((*nameptr)[1] == L':')
What if Cygwin has been installed on a network drive with no drive
letter attached? In that case you'd have to deal with UNC paths,
but they are ignored here.
Isn't the UNC prefix cut off right after GetModuleFileNameW in
dll_list::alloc?
Not necessarily. GetModuleFileNameW returns \\?\ paths as well. It
depends on how the executable has been loaded.
Post by Michael Haubenwallner
Actually, I didn't fully grok the subtleties along "\\?\", "\??\",
"UNC\", "\\" and their usage differences with functions from
kernel32.dll and ntdll.dll yet. Is it possible in every case that
ntdll just may need the additional prefix "\??\", while kernel32 takes
the same path without that prefix?
It's not very complicated, just puzzeling at first:

Short Win32 paths:

Drive letter path: C:\foo\bar
UNC path: \\server\share\baz

Equivalent long Win32 paths:

Drive letter path: \\?\C:\foo\bar
UNC path: \\?\UNC\server\share\baz

Equivalent native NT paths:

Drive letter path: \??\C:\foo\bar
UNC path: \??\UNC\server\share\baz

Note 1: Long Win32 paths are identical to internal NT paths, except
for the first question mark replaced by a backslash. Why?
I have no idea. History, probably.

Note 2: Short Drive letter paths simply get a "\??\" prepended to be
converted to NT paths.

Note 3: UNC paths get their first backslash replace with "\??\UNC",
so one backslash is lost on the way to the native NT form.

Note 4: "\??\" is just the name of a(*) virtual directory in the native
NT namespace which contains symlinks to the actual devices. The
"winobj" tool from sysinternals is quite instructive. Exposure
to the Win32 world is performed by the QueryDosDevices and
DefineDosDevice calls.

(*) Actually more than one, what with system and user symlinks
merged into a single view...
Post by Michael Haubenwallner
I do think of storing only the ntdll-name in the structures, and have a non-
prefix name pointer into that ntdll-name buffer for use by CreateProcess&co then.
Good idea in general. Just the UNC path problem above is slightly in
the way.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ LARGE_INTEGER zero = { 0 };
+ d->fii.FileId = zero;
Oops, sorry, FileId -> NameIndex :}
IndexNumber, actually.
Duh, right.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
@@ -396,7 +497,7 @@ dll_list::detach (void *retaddr)
if (!myself || in_forkee)
return;
guard (true);
- if ((d = find (retaddr)))
+ if ((d = find (retaddr)) && d->type != DLL_SELF)
dll_list::find is only ever used to find dlls with d->type != DLL_SELF.
Wouldn't it make sense to move this check into the method?
You mean into the dll_list::find method?
Yep.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
diff --git a/winsup/cygwin/fork.cc b/winsup/cygwin/fork.cc
index 084f8f0..dd0f9a6 100644
--- a/winsup/cygwin/fork.cc
+++ b/winsup/cygwin/fork.cc
@@ -356,8 +351,19 @@ frok::parent (volatile char * volatile stack_here)
while (1)
{
+ dlls.request_forkables ();
I'm aware that hold_everything has been called but, is that safe?
request_forkables calls update_forkables_needs which in turn uses the
static dll::nt_max_path_buf buffer...
While I have a vague idea only on what hold_everything actually does,
isn't one of the ideas for the static buffer to be useable even while
everything else (heap in particular) is on hold?
Especially as that static buffer does have the NO_COPY attribute...
Yeah, right.
Post by Michael Haubenwallner
Post by Corinna Vinschen
So what happens with a path where the parent dir path is > NAME_MAX?
If I didn't miss something, this won't work for very long paths.
Reading through the docs I've got the impression that while NT_MAX_PATH is
to hold a very long path, a single path part is still limited to NAME_MAX,
but I may easily be wrong here.
Well, yes, NAME_MAX is the maximum length of a single path component.
But the comment preceeding mangle_as_filename says:

"Mangle full srcpath as one single filename ..."

The function replaces backslashes with commas. So, IIUC, an incoming
path consisting of, e.g., a dir with NAME_MAX length, a backslash, and
a filename with NAME_MAX length will be converted to an invalid path
with a single path component of 2 * NAME_MAX + 1 (comma) length. What
am I missing?
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ wcscpy (forkable_name, dirx_name);
Suggestion: Use
PWCHAR p = wcpcpy (forkable_name, dirx_name);
You can use p later on...
...and here as in
mangle_as_filename (p, name, &lastpathsep);
This avoids calling wcslen twice.
ok, 've not seen wcpcpy before.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/wcpcpy.html
Post by Michael Haubenwallner
Post by Corinna Vinschen
Is it safe to use sec_none_nih for all of them?
As long as /var/run/cygfork exists with tmpdir-like permission,
sec_none_nih seems fine: For the directories created, ls -l shows
'drwxr-xr-x+' permissions.
That's inherited from the parent. Hmm, ok, yeah, that sounds like
the right thing. We can revisit it later if a need arises for some
reason.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Alternatively we could add a cygheap->installation_root_len member.
Feels like subsequent optimization - beyond this feature patch for now.
Ok.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ if (d->fbi.LastWriteTime.QuadPart > newest.QuadPart)
+ newest = d->fbi.LastWriteTime;
LastWriteTime? This is supposed to check if a newer DLL replaced
an older in-use one, so shouldn't that be CreationTime?
CreationTime bumps even when creating a hardlink,
Not on my machine. Observe the "Birth" date in stat after creating a
hardlink to a file:

$ stat tsock.c | grep Birth
Birth: 2016-01-11 18:57:13.584871900 +0100
$ ln tsock.c tsock.link
$ stat tsock.c | grep Birth
$ stat tsock.c tsock.link | grep Birth
Birth: 2016-01-11 18:57:13.584871900 +0100
Birth: 2016-01-11 18:57:13.584871900 +0100

Are you confusing CreationTime with ChangeTime by any chance?
Post by Michael Haubenwallner
while LastWriteTime
seems to more properly tell about the last file-content modification.
LastWriteTime might be sufficient, but the file is actually recreated
when replacing it, it's not overwritten, that's why CreationTime sounds
more correct to me. Unfortunately CreationTime is as much unsafe
against tampering as is LastWriteTime, so never mind.

A bit awkward is the name of the method, though. Ctime is the POSIX
equivalent for ChangeTime, not for LastWriteTime. Either mtime when
using LastWriteTime or birthtime when using CreationTime would be
better, methinks.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+/* Create the nominated forkable hardlinks and directories as necessary,
+ as well as the .local file for dll-redirection. */
+bool
+dll_list::create_forkables ()
+{
+ if (!mkdirs (buf, &sec_none_nih, lastsepcount))
Again, sec_none_nih for the entire directory tree?
As I didn't fully grok the windows security too: What is your concern here?
Not sure. I was just mulling over the default perms resulting from
it being inadaquate. Let's just keep it in mind.
Post by Michael Haubenwallner
Post by Corinna Vinschen
What's missing is user documentation for this feature. A bit of
description what happens, what has to be done by the user, and what
limitations it has would be helpful.
Agreed: Where to add such docs?
Good question. The docs have grown a bit wild... maybe create a chapter
in pathnames.xml for now. We can move it around later.

There's one point I forgot when I reviewed the patch last year. You're
doing the check if /var/run/cygfork exists in dll_list::create_forkables
once per process, per loaded DLL. Why? In theory, if you found out
that you can't create a hardlink once, you're done.

What I'd like to see is that failing to hardlink a file the first time
*because /var/run/cygfork does not exist* results in setting a flag in
cygheap so that no other process in the process tree ever tries to use
this feature again. An explicit check for existence of the dir seems
necessary at some early stage in the code. Creating the directory and
exiting Cygwin processes is required to activate the feature then, but
that's ok. The impact on users not using this feature should be as low
as possible.

I'm still not overly excited about using the existence of the dir alone
as marker to activate this feature, but that can be added later...


Thanks,
Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat
Michael Haubenwallner
2016-03-30 18:37:25 UTC
Permalink
(back to this topic)
Post by Corinna Vinschen
Post by Michael Haubenwallner
[...]
Post by Corinna Vinschen
Would you mind terribly to split this patch into a patchset so we
get a set of smaller patches as far as it makes sense? I'm a bit
reluctant to apply such a big patch in one go. I did this myself
a lot back in the pre-git CVS times, but the longer I work with git
the more I appreciate patches split into sets of smaller patches.
They are easier to review and *much* easier to handle when bisecting
the code in case of searching for a bug.
Will post them to -patches then.
split and updated patches on the way now.
Post by Corinna Vinschen
Post by Michael Haubenwallner
Post by Corinna Vinschen
Another thing occured to me: Doesn't using this stuff break the output
of /proc/<PID>/maps?
Whenever the original binary has been removed - it is moved to trashbin
actually, it turns out that /proc/<PID>/maps shows the trashbin-filename.
Not sure if you call this "broken" - or what would be "unbroken" here.
No, what I mean is, what does it show if the file has *not* been deleted
yet? The path to /usr/bin/foo or the path to /var/run/cygfork/<SID>/foo?
The latter case would be rather irritating.
Haven't managed to see the /var/run/cygfork/ path in /proc/*/maps.
Post by Corinna Vinschen
This behaviour may not be that bad in case of running with a deleted
object, though. On Linux /proc/$PID/maps prints the name of a deleted
object with a "(deleted)" tag. Maybe we can get there, too. Do we have
the information from where the file has been originally loaded still
available at that point? I guess the answer is no...
When Windows moves some file into trash, IMHO there is some extra info
file containing the original path. But with Cygwin unlink() ?
Post by Corinna Vinschen
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+/* Return *nameptr probably prefixed with "\\??\\", but
+ update *nameptr to stay without the "\\??\\" prefix. */
+PWCHAR
+dll::to_ntname (PWCHAR *nameptr, WCHAR ntnamebuf[NT_MAX_PATH])
Why does dll::to_ntname need a second parameter if it's always called
on the buffer returned by dll::nt_max_path_buf?
This merely is to indicate that it does need some buffer. When removing
the second parameter, I'd rename to something like dll::buffered_ntname.
ack
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+{
+ /* avoid using path_conv here: cygheap might not be
+ initialized when started from non-cygwin process,
+ or still might be frozen in_forkee */
+ PWCHAR ntname = *nameptr;
+ if ((*nameptr)[1] == L':')
What if Cygwin has been installed on a network drive with no drive
letter attached? In that case you'd have to deal with UNC paths,
but they are ignored here.
Isn't the UNC prefix cut off right after GetModuleFileNameW in dll_list::alloc?
Not necessarily. GetModuleFileNameW returns \\?\ paths as well. It
depends on how the executable has been loaded.
Post by Michael Haubenwallner
Actually, I didn't fully grok the subtleties along "\\?\", "\??\",
"UNC\", "\\" and their usage differences with functions from
kernel32.dll and ntdll.dll yet. Is it possible in every case that
ntdll just may need the additional prefix "\??\", while kernel32 takes
the same path without that prefix?
Drive letter path: C:\foo\bar
UNC path: \\server\share\baz
Drive letter path: \\?\C:\foo\bar
UNC path: \\?\UNC\server\share\baz
Drive letter path: \??\C:\foo\bar
UNC path: \??\UNC\server\share\baz
Note 1: Long Win32 paths are identical to internal NT paths, except
for the first question mark replaced by a backslash. Why?
I have no idea. History, probably.
Note 2: Short Drive letter paths simply get a "\??\" prepended to be
converted to NT paths.
Note 3: UNC paths get their first backslash replace with "\??\UNC",
so one backslash is lost on the way to the native NT form.
Note 4: "\??\" is just the name of a(*) virtual directory in the native
NT namespace which contains symlinks to the actual devices. The
"winobj" tool from sysinternals is quite instructive. Exposure
to the Win32 world is performed by the QueryDosDevices and
DefineDosDevice calls.
(*) Actually more than one, what with system and user symlinks
merged into a single view...
Thanks, this is very helpful!
Post by Corinna Vinschen
Post by Michael Haubenwallner
I do think of storing only the ntdll-name in the structures, and have a non-
prefix name pointer into that ntdll-name buffer for use by CreateProcess&co then.
Good idea in general. Just the UNC path problem above is slightly in
the way.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ LARGE_INTEGER zero = { 0 };
+ d->fii.FileId = zero;
Oops, sorry, FileId -> NameIndex :}
IndexNumber, actually.
Duh, right.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
@@ -396,7 +497,7 @@ dll_list::detach (void *retaddr)
if (!myself || in_forkee)
return;
guard (true);
- if ((d = find (retaddr)))
+ if ((d = find (retaddr)) && d->type != DLL_SELF)
dll_list::find is only ever used to find dlls with d->type != DLL_SELF.
Wouldn't it make sense to move this check into the method?
You mean into the dll_list::find method?
Yep.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
diff --git a/winsup/cygwin/fork.cc b/winsup/cygwin/fork.cc
index 084f8f0..dd0f9a6 100644
--- a/winsup/cygwin/fork.cc
+++ b/winsup/cygwin/fork.cc
@@ -356,8 +351,19 @@ frok::parent (volatile char * volatile stack_here)
while (1)
{
+ dlls.request_forkables ();
I'm aware that hold_everything has been called but, is that safe?
request_forkables calls update_forkables_needs which in turn uses the
static dll::nt_max_path_buf buffer...
While I have a vague idea only on what hold_everything actually does,
isn't one of the ideas for the static buffer to be useable even while
everything else (heap in particular) is on hold?
Especially as that static buffer does have the NO_COPY attribute...
Yeah, right.
Post by Michael Haubenwallner
Post by Corinna Vinschen
So what happens with a path where the parent dir path is > NAME_MAX?
If I didn't miss something, this won't work for very long paths.
Reading through the docs I've got the impression that while NT_MAX_PATH is
to hold a very long path, a single path part is still limited to NAME_MAX,
but I may easily be wrong here.
Well, yes, NAME_MAX is the maximum length of a single path component.
"Mangle full srcpath as one single filename ..."
The function replaces backslashes with commas. So, IIUC, an incoming
path consisting of, e.g., a dir with NAME_MAX length, a backslash, and
a filename with NAME_MAX length will be converted to an invalid path
with a single path component of 2 * NAME_MAX + 1 (comma) length. What
am I missing?
Nice catch (out-of-coffee exception)!

I'm using the main executable's IndexNumber now, and the original
directory's IndexNumber for the directory holding the hardlink for
a dynamically loaded dll.
Post by Corinna Vinschen
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ wcscpy (forkable_name, dirx_name);
Suggestion: Use
PWCHAR p = wcpcpy (forkable_name, dirx_name);
You can use p later on...
...and here as in
mangle_as_filename (p, name, &lastpathsep);
This avoids calling wcslen twice.
ok, 've not seen wcpcpy before.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/wcpcpy.html
Post by Michael Haubenwallner
Post by Corinna Vinschen
Is it safe to use sec_none_nih for all of them?
As long as /var/run/cygfork exists with tmpdir-like permission,
sec_none_nih seems fine: For the directories created, ls -l shows
'drwxr-xr-x+' permissions.
That's inherited from the parent. Hmm, ok, yeah, that sounds like
the right thing. We can revisit it later if a need arises for some
reason.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Alternatively we could add a cygheap->installation_root_len member.
Feels like subsequent optimization - beyond this feature patch for now.
Ok.
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+ if (d->fbi.LastWriteTime.QuadPart > newest.QuadPart)
+ newest = d->fbi.LastWriteTime;
LastWriteTime? This is supposed to check if a newer DLL replaced
an older in-use one, so shouldn't that be CreationTime?
CreationTime bumps even when creating a hardlink,
Not on my machine. Observe the "Birth" date in stat after creating a
$ stat tsock.c | grep Birth
Birth: 2016-01-11 18:57:13.584871900 +0100
$ ln tsock.c tsock.link
$ stat tsock.c | grep Birth
$ stat tsock.c tsock.link | grep Birth
Birth: 2016-01-11 18:57:13.584871900 +0100
Birth: 2016-01-11 18:57:13.584871900 +0100
Are you confusing CreationTime with ChangeTime by any chance?
Post by Michael Haubenwallner
while LastWriteTime
seems to more properly tell about the last file-content modification.
LastWriteTime might be sufficient, but the file is actually recreated
when replacing it, it's not overwritten, that's why CreationTime sounds
more correct to me. Unfortunately CreationTime is as much unsafe
against tampering as is LastWriteTime, so never mind.
A bit awkward is the name of the method, though. Ctime is the POSIX
equivalent for ChangeTime, not for LastWriteTime. Either mtime when
using LastWriteTime or birthtime when using CreationTime would be
better, methinks.
I've renamed to lwtime (LastWriteTime) now. ctime is the initial name
when I started with ChangeTime, which bumps on hardlink creation - not
sure if I really tried CreationTime.
Post by Corinna Vinschen
Post by Michael Haubenwallner
Post by Corinna Vinschen
Post by Michael Haubenwallner
+/* Create the nominated forkable hardlinks and directories as necessary,
+ as well as the .local file for dll-redirection. */
+bool
+dll_list::create_forkables ()
+{
+ if (!mkdirs (buf, &sec_none_nih, lastsepcount))
Again, sec_none_nih for the entire directory tree?
As I didn't fully grok the windows security too: What is your concern here?
Not sure. I was just mulling over the default perms resulting from
it being inadaquate. Let's just keep it in mind.
Ok.
Post by Corinna Vinschen
Post by Michael Haubenwallner
Post by Corinna Vinschen
What's missing is user documentation for this feature. A bit of
description what happens, what has to be done by the user, and what
limitations it has would be helpful.
Agreed: Where to add such docs?
Good question. The docs have grown a bit wild... maybe create a chapter
in pathnames.xml for now. We can move it around later.
I've identified faq-api.xml and hightlights.xml, near the existing
fork docs...
Post by Corinna Vinschen
There's one point I forgot when I reviewed the patch last year. You're
doing the check if /var/run/cygfork exists in dll_list::create_forkables
once per process, per loaded DLL. Why? In theory, if you found out
that you can't create a hardlink once, you're done.
The idea is to re-check once per fork() call when it was available before,
to not necessarily break when /var/run/cygfork/ is removed...
Post by Corinna Vinschen
What I'd like to see is that failing to hardlink a file the first time
*because /var/run/cygfork does not exist* results in setting a flag in
cygheap so that no other process in the process tree ever tries to use
this feature again. An explicit check for existence of the dir seems
necessary at some early stage in the code. Creating the directory and
exiting Cygwin processes is required to activate the feature then, but
that's ok. The impact on users not using this feature should be as low
as possible.
Ok.
Post by Corinna Vinschen
I'm still not overly excited about using the existence of the dir alone
as marker to activate this feature, but that can be added later...
I could think of some flag to set in setup.exe, but still where to store
the flag except for the existence of the directory. I'm not sure if the
registry should be used for everything...

Thanks!
/haubi/
Corinna Vinschen
2016-04-05 08:35:37 UTC
Permalink
Post by Michael Haubenwallner
Post by Corinna Vinschen
This behaviour may not be that bad in case of running with a deleted
object, though. On Linux /proc/$PID/maps prints the name of a deleted
object with a "(deleted)" tag. Maybe we can get there, too. Do we have
the information from where the file has been originally loaded still
available at that point? I guess the answer is no...
When Windows moves some file into trash, IMHO there is some extra info
file containing the original path. But with Cygwin unlink() ?
We don't write the $I<tmpfilename> and our files are called something
along the lines of .cyg<...> rather than $R<tmpfilename>. We could
change this but while the original path name is in the $I file, the rest
of the file appears to be undocumented. No idea if we can reproduce this.
And the DLL info in Cygwin is not accessible from other processes. Hmm.
Post by Michael Haubenwallner
Post by Corinna Vinschen
I'm still not overly excited about using the existence of the dir alone
as marker to activate this feature, but that can be added later...
I could think of some flag to set in setup.exe, but still where to store
the flag except for the existence of the directory. I'm not sure if the
registry should be used for everything...
No, not the registry. The CYGWIN environment variable, perhaps.


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat
Michael Haubenwallner
2016-04-12 07:38:26 UTC
Permalink
Post by Corinna Vinschen
Post by Michael Haubenwallner
Post by Corinna Vinschen
I'm still not overly excited about using the existence of the dir alone
as marker to activate this feature, but that can be added later...
I could think of some flag to set in setup.exe, but still where to store
the flag except for the existence of the directory. I'm not sure if the
registry should be used for everything...
No, not the registry. The CYGWIN environment variable, perhaps.
We use 'env -i' to start some package management processes, which clobbers
the CYGWIN environment variable as well...

/haubi/

Loading...