Discussion:
data and bss tests in dll_list::alloc
Corinna Vinschen
2012-02-08 14:54:19 UTC
Hi,


I just fixed a typo in the fabort calls in dll_list::alloc. But in fact
I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests. The reason is simple. All DLL segments are
always loaded into adjacent addresses, always in the order given by
the DLL segment information.

Therefore, a single address comparison is sufficient to recognize a
situation in which a child DLL is not loaded to the same address as
in the parent.

And given that, we don't even have to compare data and bss addresses
at all. The HINSTANCE is the address of the module. Just compare it
to the stored d->handle and if they are not identical, we're done,
right?

Or am I missing something?


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat
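
For illustration, the single comparison proposed above might be sketched as follows. This is a minimal, portable sketch and not the Cygwin code: `dll_record`, its `handle` field, and `loaded_at_same_base` are hypothetical stand-ins for the real per-DLL structure and the `d->handle` check, and a plain `void *` replaces `HINSTANCE` so the example compiles outside Windows.

```cpp
#include <cassert>

// Hypothetical stand-in for a per-DLL record; the real structure has more
// fields. `handle` holds the module base address recorded in the parent.
struct dll_record
{
  void *handle;
};

// Returns true when the child mapped the DLL at the same base address as
// the parent. Assumption: every section keeps a fixed offset from the base,
// so a matching base implies matching data and bss addresses as well.
bool
loaded_at_same_base (const dll_record &d, void *child_base)
{
  return d.handle == child_base;
}
```

If the assumption about fixed section offsets holds, the separate data_start/data_end/bss_start/bss_end comparisons add no information beyond this one test.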
Christopher Faylor
2012-02-08 15:21:12 UTC
Post by Corinna Vinschen
I just fixed a typo in the fabort calls in dll_list::alloc. But in
fact I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests. The reason is simple. All DLL segments are
always loaded into adjacent addresses, always in the order given by the
DLL segment information.
If that is the case can we simplify the child_copy operation? That
would speed up fork slightly.
Post by Corinna Vinschen
And given that, we don't even have to compare data and bss addresses at
all. The HINSTANCE is the address of the module. Just compare it to
the stored d->handle and if they are not identical, we're done, right?
Or am I missing something?
Don't think so.

cgf
Corinna Vinschen
2012-02-08 15:41:52 UTC
Post by Christopher Faylor
Post by Corinna Vinschen
I just fixed a typo in the fabort calls in dll_list::alloc. But in
fact I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests. The reason is simple. All DLL segments are
always loaded into adjacent addresses, always in the order given by the
DLL segment information.
If that is the case can we simplify the child_copy operation? That
would speed up fork slightly.
I'm not sure, but if you're asking if we can only give a single address
to child_copy, then the answer is probably no. You can't rely on the
fact that data and bss segments are adjacent segments in the DLL, just
that adjacent segments in the DLL will be loaded into adjacent addresses
in the process's VM.


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat
Christopher Faylor
2012-02-08 16:26:13 UTC
Post by Corinna Vinschen
Post by Christopher Faylor
Post by Corinna Vinschen
I just fixed a typo in the fabort calls in dll_list::alloc. But in
fact I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests. The reason is simple. All DLL segments are
always loaded into adjacent addresses, always in the order given by the
DLL segment information.
If that is the case can we simplify the child_copy operation? That
would speed up fork slightly.
I'm not sure, but if you're asking if we can only give a single address
to child_copy, then the answer is probably no. You can't rely on the
fact that data and bss segments are adjacent segments in the DLL, just
that adjacent segments in the DLL will be loaded into adjacent addresses
in the process's VM.
I thought you were implying that data/bss load order was always the same.
It is, isn't it?

cgf
Ryan Johnson
2012-02-08 16:33:10 UTC
Post by Christopher Faylor
Post by Corinna Vinschen
Post by Christopher Faylor
Post by Corinna Vinschen
I just fixed a typo in the fabort calls in dll_list::alloc. But in
fact I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests. The reason is simple. All DLL segments are
always loaded into adjacent addresses, always in the order given by the
DLL segment information.
If that is the case can we simplify the child_copy operation? That
would speed up fork slightly.
I'm not sure, but if you're asking if we can only give a single address
to child_copy, then the answer is probably no. You can't rely on the
fact that data and bss segments are adjacent segments in the DLL, just
that adjacent segments in the DLL will be loaded into adjacent addresses
in the process's VM.
I thought you were implying that data/bss load order was always the same.
It is, isn't it?
My understanding is that the bss and data segments usually don't occupy
the same positions in the dll-as-file that they do in the
dll-as-mmapped-entity (what Corinna said), but that any two mapped
instances of a dll would put data/bss in consistent positions relative
to the dll's base (what CGF wonders). However, we'd want to confirm that
data/bss were actually adjacent before firing off a single memcpy.

Does that make sense or am I misunderstanding the issue?

Ryan
Christopher Faylor
2012-02-08 16:39:02 UTC
Post by Ryan Johnson
Post by Christopher Faylor
Post by Corinna Vinschen
Post by Christopher Faylor
Post by Corinna Vinschen
I just fixed a typo in the fabort calls in dll_list::alloc. But in
fact I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests. The reason is simple. All DLL segments are
always loaded into adjacent addresses, always in the order given by the
DLL segment information.
If that is the case can we simplify the child_copy operation? That
would speed up fork slightly.
I'm not sure, but if you're asking if we can only give a single address
to child_copy, then the answer is probably no. You can't rely on the
fact that data and bss segments are adjacent segments in the DLL, just
that adjacent segments in the DLL will be loaded into adjacent addresses
in the process's VM.
I thought you were implying that data/bss load order was always the same.
It is, isn't it?
My understanding is that the bss and data segments usually don't occupy
the same positions in the dll-as-file that they do in the
dll-as-mmapped-entity (what Corinna said), but that any two mapped
instances of a dll would put data/bss in consistent positions relative
to the dll's base (what CGF wonders). However, we'd want to confirm that
data/bss were actually adjacent before firing off a single memcpy.
Ok. So a simple optimization would be to detect when copied sections
were adjacent and coalesce them into one copy operation. I'll take a
look into doing that.
Post by Ryan Johnson
Does that make sense or am I misunderstanding the issue?
Makes sense. Thanks.

cgf
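
For illustration, the coalescing idea discussed above might be sketched like this. It is a hedged, portable sketch rather than the actual child_copy code: the `region` type and the `coalesce` function are hypothetical names, and the input is assumed to be sorted by start address and non-overlapping.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one memory range to copy into the child.
struct region
{
  std::uintptr_t start;  // address of the first byte
  std::uintptr_t end;    // one past the last byte
};

// Merge ranges whose end equals the start of the following range.
// Assumes the input is sorted by start address and does not overlap.
std::vector<region>
coalesce (const std::vector<region> &in)
{
  std::vector<region> out;
  for (const region &r : in)
    {
      if (!out.empty () && out.back ().end == r.start)
        out.back ().end = r.end;  // exactly adjacent: extend previous copy
      else
        out.push_back (r);        // gap before r: needs its own copy
    }
  return out;
}
```

With this strict test, two ranges are merged only when they touch exactly; any gap between them, however small, leaves them as separate copy operations.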
Christopher Faylor
2012-02-17 15:37:07 UTC
Post by Christopher Faylor
Post by Ryan Johnson
Post by Christopher Faylor
Post by Corinna Vinschen
Post by Christopher Faylor
Post by Corinna Vinschen
I just fixed a typo in the fabort calls in dll_list::alloc. But in
fact I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests. The reason is simple. All DLL segments are
always loaded into adjacent addresses, always in the order given by the
DLL segment information.
If that is the case can we simplify the child_copy operation? That
would speed up fork slightly.
I'm not sure, but if you're asking if we can only give a single address
to child_copy, then the answer is probably no. You can't rely on the
fact that data and bss segments are adjacent segments in the DLL, just
that adjacent segments in the DLL will be loaded into adjacent addresses
in the process's VM.
I thought you were implying that data/bss load order was always the same.
It is, isn't it?
My understanding is that the bss and data segments usually don't occupy
the same positions in the dll-as-file that they do in the
dll-as-mmapped-entity (what Corinna said), but that any two mapped
instances of a dll would put data/bss in consistent positions relative
to the dll's base (what CGF wonders). However, we'd want to confirm that
data/bss were actually adjacent before firing off a single memcpy.
Ok. So a simple optimization would be to detect when copied sections
were adjacent and coalesce them into one copy operation. I'll take a
look into doing that.
Just to close the record on this: I did reimplement child_copy to do
this but I never saw it actually work. This was apparently due to
section alignment since there would be a few bytes between the end
of data and the beginning of bss. So, rather than go to more effort
to figure this out, I just dropped the idea.

cgf
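
For illustration, the gap described above is what one would expect if each section starts on an aligned boundary while the recorded end of the previous section is its unpadded end. The following is a minimal sketch under that assumption; `align_up` and `gap_before_next` are hypothetical helper names, the alignment value in the usage note is only an example, and none of this is taken from the Cygwin sources.

```cpp
#include <cassert>
#include <cstdint>

// Round addr up to the next multiple of alignment.
// Assumes alignment is a power of two.
constexpr std::uintptr_t
align_up (std::uintptr_t addr, std::uintptr_t alignment)
{
  return (addr + alignment - 1) & ~(alignment - 1);
}

// Number of padding bytes between the unpadded end of one section and the
// aligned start of the section that follows it.
constexpr std::uintptr_t
gap_before_next (std::uintptr_t prev_end, std::uintptr_t alignment)
{
  return align_up (prev_end, alignment) - prev_end;
}
```

For example, with an alignment of 0x1000, a data section ending at 0x1ff4 leaves 12 bytes before the next section can start at 0x2000, so a strict "end equals start" adjacency test would not merge the two ranges.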

Corinna Vinschen
2012-02-08 16:33:16 UTC
Post by Christopher Faylor
Post by Corinna Vinschen
Post by Christopher Faylor
Post by Corinna Vinschen
I just fixed a typo in the fabort calls in dll_list::alloc. But in
fact I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests. The reason is simple. All DLL segments are
always loaded into adjacent addresses, always in the order given by the
DLL segment information.
If that is the case can we simplify the child_copy operation? That
would speed up fork slightly.
I'm not sure, but if you're asking if we can only give a single address
to child_copy, then the answer is probably no. You can't rely on the
fact that data and bss segments are adjacent segments in the DLL, just
that adjacent segments in the DLL will be loaded into adjacent addresses
in the process's VM.
I thought you were implying that data/bss load order was always the same.
It is, isn't it?
No, it's not. If you have a DLL with four segments:

.text
.data
.bss
.rdata

Then these segments are always loaded in the given order in adjacent
memory locations. But this:

.text
.data
.rdata
.bss

is a valid DLL, too. The segments will be loaded in the given order in
adjacent memory locations as well, but the .rdata segment might be read
only, for instance. If you try to copy it in child_copy it will fail.


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat
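
For illustration, the ordering problem described above might be detected as sketched below. This is a simplified, portable sketch: the `section` struct and `span_is_writable` are hypothetical, not Cygwin or Windows API names. Only the flag value is real; it matches `IMAGE_SCN_MEM_WRITE` from the PE/COFF section characteristics.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Same value as IMAGE_SCN_MEM_WRITE in the PE/COFF section flags.
constexpr std::uint32_t scn_mem_write = 0x80000000u;

// Hypothetical, simplified view of one loaded section.
struct section
{
  std::string name;
  std::uint32_t characteristics;  // PE section flags
};

// True when every section from `first` through `last` (inclusive, in load
// order) is writable, so a single copy could span all of them.
bool
span_is_writable (const std::vector<section> &secs,
                  const std::string &first, const std::string &last)
{
  bool inside = false;
  for (const section &s : secs)
    {
      if (s.name == first)
        inside = true;
      if (inside && !(s.characteristics & scn_mem_write))
        return false;  // a read-only section sits inside the span
      if (inside && s.name == last)
        return true;
    }
  return false;  // `first` and `last` were not found in that order
}
```

With the first layout (.text, .data, .bss, .rdata) the span from .data to .bss is writable throughout; with the second (.text, .data, .rdata, .bss) the check fails because .rdata lies between them.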
Christopher Faylor
2012-02-08 16:39:57 UTC
Post by Corinna Vinschen
Post by Christopher Faylor
Post by Corinna Vinschen
Post by Christopher Faylor
Post by Corinna Vinschen
I just fixed a typo in the fabort calls in dll_list::alloc. But in
fact I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests. The reason is simple. All DLL segments are
always loaded into adjacent addresses, always in the order given by the
DLL segment information.
If that is the case can we simplify the child_copy operation? That
would speed up fork slightly.
I'm not sure, but if you're asking if we can only give a single address
to child_copy, then the answer is probably no. You can't rely on the
fact that data and bss segments are adjacent segments in the DLL, just
that adjacent segments in the DLL will be loaded into adjacent addresses
in the process's VM.
I thought you were implying that data/bss load order was always the same.
It is, isn't it?
No, it's not. If you have a DLL with four segments:
.text
.data
.bss
.rdata
Then these segments are always loaded in the given order in adjacent
memory locations. But this:
.text
.data
.rdata
.bss
is a valid DLL, too. The segments will be loaded in the given order in
adjacent memory locations as well, but the .rdata segment might be read
only, for instance. If you try to copy it in child_copy it will fail.
I know it's a valid DLL but I'm wondering if it ever shows up in the wild.

But never mind, it's easy enough to detect in the code, so it's a moot
point.

cgf
Václav Zeman
2012-02-08 15:31:57 UTC
Post by Corinna Vinschen
Hi,
I just fixed a typo in the fabort calls in dll_list::alloc.  But in fact
I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests.  The reason is simple.  All DLL segments are
always loaded into adjacent addresses, always in the order given by
the DLL segment information.
Therefore, a single address comparison is sufficient to recognize a
situation in which a child DLL is not loaded to the same address as
in the parent.
And given that, we don't even have to compare data and bss addresses
at all.  The HINSTANCE is the address of the module.  Just compare it
to the stored d->handle and if they are not identical, we're done,
right?
Or am I missing something?
I think that this article about Windows 2000 loader supports that:
<http://msdn.microsoft.com/en-gb/magazine/cc301727.aspx>
"Now that LdrpMapDll has the section handle, it can actually load the DLL into the process's address. The DLL is brought in as a memory-mapped file through the services of NtMapViewOfSection."
My understanding is that the DLL sections are mapped in in the order
they are stored in PE executable headers, each adjacent to the
previous one.
--
VZ
Corinna Vinschen
2012-02-08 15:43:32 UTC
Post by Václav Zeman
Hi,
I just fixed a typo in the fabort calls in dll_list::alloc.  But in fact
I'm wondering if we really need the extensive data_start/data_end/
bss_start/bss_end tests.  The reason is simple.  All DLL segments are
always loaded into adjacent addresses, always in the order given by
the DLL segment information.
Therefore, a single address comparison is sufficient to recognize a
situation in which a child DLL is not loaded to the same address as
in the parent.
And given that, we don't even have to compare data and bss addresses
at all.  The HINSTANCE is the address of the module.  Just compare it
to the stored d->handle and if they are not identical, we're done,
right?
Or am I missing something?
I think that this article about Windows 2000 loader supports that:
<http://msdn.microsoft.com/en-gb/magazine/cc301727.aspx>
"Now that LdrpMapDll has the section handle, it can actually load the DLL into the process's address. The DLL is brought in as a memory-mapped file through the services of NtMapViewOfSection."
My understanding is that the DLL sections are mapped in in the order
they are stored in PE executable headers, each adjacent to the
previous one.
Yep, that's what I meant. I never saw a case where DLL segments were
loaded into arbitrary addresses spread over the process's VM. Having
a single load address in the PE/COFF header doesn't make much sense
then, and it's much more work for the loader as well.


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat