Discussion:
Potential fix: gdb/strace swallows stderr
Ryan Johnson
2011-08-18 00:57:49 UTC
Permalink
Hi all (attn: Corinna),

I've been looking into the way stderr disappears when a cygwin process
is started by gdb or strace (64-bit win7 box), and it looks like
cygheap->fdtab[2] somehow gets initialized to NULL. The EBADF which
results kills off the rest of the write "syscall."

Looking deeper shows that, in dtable::stdio_init, GetStdHandle() returns
the same value for stdout and stderr, but being_debugged() and
/* STD_ERROR_HANDLE has been observed to be the same as
STD_OUTPUT_HANDLE. We need separate handles (e.g. using pipes
to pass data from child to parent). */
/* CV 2008-10-17: Under debugger control, std fd's have been potentially
initialized in dtable::get_debugger_info (). In this case
init_std_file_from_handle is a no-op, so, even if out == err we don't
want to duplicate the handle since it will be unused. */
if (out == err && (!being_debugged () || !not_open (2)))
{
/* Since this code is not invoked for forked tasks, we don't have
to worry about the close-on-exec flag here. */
if (!DuplicateHandle (GetCurrentProcess (), out,
GetCurrentProcess (), &err,
0, TRUE, DUPLICATE_SAME_ACCESS))
{
/* If that fails, do this as a fall back. */
err = out;
system_printf ("couldn't make stderr distinct from stdout, %E");
}
}
init_std_file_from_handle (1, out);
init_std_file_from_handle (2, err);
Always duplicating the handle when out==err seems to fix the problem for
both gdb and strace, without harming non-traced execution. However, I
doubt that's the correct thing to do, since the other checks are clearly
not accidental. Calls to not_open(1) and not_open(2) both return 1, so I
wonder if an assumption has become invalid (e.g. did it used to be that
stderr should have already been opened but may have been closed already
as well, but now stderr has not even been opened yet?).

Corinna, can you dredge up any useful memories about the issue? The code
in dtable::get_debugger_info definitely runs (gdb prints "warning:
cYgstd 28cc69 d 3"), but std[][] remains empty, so none of the std
handles was initialized in that way.

So, which of the following changes, if any, is a proper fix? The first
assumes that the whole !not_open(2) thing has become completely bogus
(or always was), while the second is a more conservative workaround. The
third assumes that a reverse-sense boolean just slipped in unnoticed.
All three changes seem to behave correctly under my limited testing...

- if (out == err && (!being_debugged () || !not_open (2)))
+ if (out == err)
+ if (out == err && (!being_debugged () || (not_open (1) && not_open
(2)) || !not_open (2)))
+ if (out == err && (!being_debugged () || not_open (2)))

Based on the code comments, I suspect #2 is the correct fix -- stderr
must be usable if there's no debugger, if the debugger explicitly
initialized stderr (but to a duplicate handle that needs fixup), or --
this is the new case -- if the debugger didn't initialize any handles
(so stderr needs initialized with a duplicated handle).

Thoughts?
Ryan
if (!fh->open ((i ? (i == 2 ? O_RDWR : O_WRONLY) : O_RDONLY)
| O_BINARY, 0777))
release (i);
else
CloseHandle (h);
/* Copy to Windows' idea of a standard handle, otherwise
we have invalid standard handles when calling Windows
functions (small_printf and strace might suffer, too). */
SetStdHandle (std_consts[i], i ? fh->get_output_handle ()
: fh->get_handle ());
The indentation of the call to SetStdHandle (and the comment describing
it) suggests that it was intended to be part of the else clause, but
it's not. It's not causing problems in this particular case because it's
unreached, but it does look confusing.
Corinna Vinschen
2011-08-18 09:14:31 UTC
Permalink
Post by Ryan Johnson
Hi all (attn: Corinna),
I've been looking into the way stderr disappears when a cygwin
process is started by gdb or strace (64-bit win7 box), and it looks
like cygheap->fdtab[2] somehow gets initialized to NULL. The EBADF
which results kills off the rest of the write "syscall."
Looking deeper shows that, in dtable::stdio_init, GetStdHandle()
returns the same value for stdout and stderr, but being_debugged()
and not_open(2) both return 1, with the result that this code
/* STD_ERROR_HANDLE has been observed to be the same as
STD_OUTPUT_HANDLE. We need separate handles (e.g. using pipes
to pass data from child to parent). */
/* CV 2008-10-17: Under debugger control, std fd's have been potentially
initialized in dtable::get_debugger_info (). In this case
init_std_file_from_handle is a no-op, so, even if out == err we don't
want to duplicate the handle since it will be unused. */
if (out == err && (!being_debugged () || !not_open (2)))
{
/* Since this code is not invoked for forked tasks, we don't have
to worry about the close-on-exec flag here. */
if (!DuplicateHandle (GetCurrentProcess (), out,
GetCurrentProcess (), &err,
0, TRUE, DUPLICATE_SAME_ACCESS))
{
/* If that fails, do this as a fall back. */
err = out;
system_printf ("couldn't make stderr distinct from stdout, %E");
}
}
init_std_file_from_handle (1, out);
init_std_file_from_handle (2, err);
Always duplicating the handle when out==err seems to fix the problem
for both gdb and strace, without harming non-traced execution.
However, I doubt that's the correct thing to do, since the other
checks are clearly not accidental. Calls to not_open(1) and
not_open(2) both return 1, so I wonder if an assumption has become
invalid (e.g. did it used to be that stderr should have already been
opened but may have been closed already as well, but now stderr has
not even been opened yet?).
Corinna, can you dredge up any useful memories about the issue? The
Ha ha ha, huh huh, good joke. After three years, the comment is all
what's left of the entire scenario. :}
Post by Ryan Johnson
code in dtable::get_debugger_info definitely runs (gdb prints
"warning: cYgstd 28cc69 d 3"), but std[][] remains empty, so none of
the std handles was initialized in that way.
So, which of the following changes, if any, is a proper fix? The
first assumes that the whole !not_open(2) thing has become
completely bogus (or always was), while the second is a more
conservative workaround. The third assumes that a reverse-sense
boolean just slipped in unnoticed. All three changes seem to behave
correctly under my limited testing...
- if (out == err && (!being_debugged () || !not_open (2)))
+ if (out == err)
+ if (out == err && (!being_debugged () || (not_open (1) &&
not_open (2)) || !not_open (2)))
+ if (out == err && (!being_debugged () || not_open (2)))
Based on the code comments, I suspect #2 is the correct fix --
stderr must be usable if there's no debugger, if the debugger
explicitly initialized stderr (but to a duplicate handle that needs
fixup), or -- this is the new case -- if the debugger didn't
initialize any handles (so stderr needs initialized with a
duplicated handle).
I'm wondering how I could ever apply this. The !not_open(2) is just
plain wrong (looks like a copy-paste bug). If not_open(2), then we
*want* to initialize fd 2, no matter what. If !not_open(2), then fd 2
has been initialized and we don't want to create a useless handle.
So the condition is just upside down. I'll apply a patch.
Post by Ryan Johnson
if (!fh->open ((i ? (i == 2 ? O_RDWR : O_WRONLY) : O_RDONLY)
| O_BINARY, 0777))
release (i);
else
CloseHandle (h);
/* Copy to Windows' idea of a standard handle, otherwise
we have invalid standard handles when calling Windows
functions (small_printf and strace might suffer, too). */
SetStdHandle (std_consts[i], i ? fh->get_output_handle ()
: fh->get_handle ());
The indentation of the call to SetStdHandle (and the comment
describing it) suggests that it was intended to be part of the else
clause, but it's not. It's not causing problems in this particular
case because it's unreached, but it does look confusing.
Right, braces are missing. Looks like I had a really bad day on
2008-10-17. I fix that as well.

Thanks for the heads up.


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat
Ryan Johnson
2011-08-18 15:49:14 UTC
Permalink
Post by Corinna Vinschen
Post by Ryan Johnson
Looking deeper shows that, in dtable::stdio_init, GetStdHandle()
returns the same value for stdout and stderr, but being_debugged()
and not_open(2) both return 1, with the result that this code
/* STD_ERROR_HANDLE has been observed to be the same as
STD_OUTPUT_HANDLE. We need separate handles (e.g. using pipes
to pass data from child to parent). */
/* CV 2008-10-17: Under debugger control, std fd's have been potentially
initialized in dtable::get_debugger_info (). In this case
init_std_file_from_handle is a no-op, so, even if out == err we don't
want to duplicate the handle since it will be unused. */
Always duplicating the handle when out==err seems to fix the problem
for both gdb and strace, without harming non-traced execution.
However, I doubt that's the correct thing to do, since the other
checks are clearly not accidental. Calls to not_open(1) and
not_open(2) both return 1, so I wonder if an assumption has become
invalid (e.g. did it used to be that stderr should have already been
opened but may have been closed already as well, but now stderr has
not even been opened yet?).
Corinna, can you dredge up any useful memories about the issue? The
Ha ha ha, huh huh, good joke. After three years, the comment is all
what's left of the entire scenario. :}
I know that feeling all too well... the wording of the question was no
accident ;)
Post by Corinna Vinschen
Post by Ryan Johnson
code in dtable::get_debugger_info definitely runs (gdb prints
"warning: cYgstd 28cc69 d 3"), but std[][] remains empty, so none of
the std handles was initialized in that way.
So, which of the following changes, if any, is a proper fix? The
first assumes that the whole !not_open(2) thing has become
completely bogus (or always was), while the second is a more
conservative workaround. The third assumes that a reverse-sense
boolean just slipped in unnoticed. All three changes seem to behave
correctly under my limited testing...
- if (out == err&& (!being_debugged () || !not_open (2)))
+ if (out == err)
+ if (out == err&& (!being_debugged () || (not_open (1)&&
not_open (2)) || !not_open (2)))
+ if (out == err&& (!being_debugged () || not_open (2)))
Based on the code comments, I suspect #2 is the correct fix --
stderr must be usable if there's no debugger, if the debugger
explicitly initialized stderr (but to a duplicate handle that needs
fixup), or -- this is the new case -- if the debugger didn't
initialize any handles (so stderr needs initialized with a
duplicated handle).
I'm wondering how I could ever apply this. The !not_open(2) is just
plain wrong (looks like a copy-paste bug). If not_open(2), then we
*want* to initialize fd 2, no matter what. If !not_open(2), then fd 2
has been initialized and we don't want to create a useless handle.
So the condition is just upside down. I'll apply a patch.
Hmm. Looking at webcvs, it seems you went with this route:
if (out == err && (!being_debugged () || not_open (2)))

Maybe I'm just paranoid, but can it ever arise that out==err when
!being_debugged()? If so, is it correct to create a duplicate handle in
that case? I assume it is, given the code comment about needing
different handles. But then, if out==err because the debugger
initialized them that way, wouldn't we again need to duplicate the
handles to avoid whatever trouble is presumably caused by not
duplicating them?

Ryan
Corinna Vinschen
2011-08-18 16:02:50 UTC
Permalink
Post by Ryan Johnson
Post by Corinna Vinschen
I'm wondering how I could ever apply this. The !not_open(2) is just
plain wrong (looks like a copy-paste bug). If not_open(2), then we
*want* to initialize fd 2, no matter what. If !not_open(2), then fd 2
has been initialized and we don't want to create a useless handle.
So the condition is just upside down. I'll apply a patch.
if (out == err && (!being_debugged () || not_open (2)))
Maybe I'm just paranoid, but can it ever arise that out==err when
!being_debugged()?
Yes. The good old Windows telnet client was such a candidate.
Post by Ryan Johnson
If so, is it correct to create a duplicate handle
in that case? I assume it is, given the code comment about needing
different handles. But then, if out==err because the debugger
initialized them that way, wouldn't we again need to duplicate the
handles to avoid whatever trouble is presumably caused by not
duplicating them?
The idea is not to touch the handles if the debugger has set them up
that way. I know that there was that case with some GDB version
way back when, but looking into the GDB sources from CVS, the code
which handled the cYgstd string is not present.

Chris, wasn't some set_child_tty function part of GDB at one point?
Or was that Red Hat only and never has been pushed upstream?


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat
Christopher Faylor
2011-08-18 16:12:03 UTC
Permalink
Post by Corinna Vinschen
Chris, wasn't some set_child_tty function part of GDB at one point?
Or was that Red Hat only and never has been pushed upstream?
I added code to gdb which I think we decided was no longer needed.

However, see previous comments about... something. I forget what
they were.

cgf
Corinna Vinschen
2011-08-18 19:16:23 UTC
Permalink
Post by Christopher Faylor
Post by Corinna Vinschen
Chris, wasn't some set_child_tty function part of GDB at one point?
Or was that Red Hat only and never has been pushed upstream?
I added code to gdb which I think we decided was no longer needed.
However, see previous comments about... something. I forget what
they were.
Hmm. Doesn't that mean we can get rid of dtable::get_debugger_info()?
I mean, why do we need to make sure that an obscure fetaure in some old
versions of GDB which were never released to the public are still
supported?


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat
Corinna Vinschen
2011-08-18 19:18:53 UTC
Permalink
Post by Corinna Vinschen
Post by Christopher Faylor
Post by Corinna Vinschen
Chris, wasn't some set_child_tty function part of GDB at one point?
Or was that Red Hat only and never has been pushed upstream?
I added code to gdb which I think we decided was no longer needed.
However, see previous comments about... something. I forget what
they were.
Hmm. Doesn't that mean we can get rid of dtable::get_debugger_info()?
I mean, why do we need to make sure that an obscure fetaure in some old
versions of GDB which were never released to the public are still
^^^^
was
Post by Corinna Vinschen
supported?
Corinna
Christopher Faylor
2011-08-18 20:25:07 UTC
Permalink
Post by Corinna Vinschen
Post by Christopher Faylor
Post by Corinna Vinschen
Chris, wasn't some set_child_tty function part of GDB at one point?
Or was that Red Hat only and never has been pushed upstream?
I added code to gdb which I think we decided was no longer needed.
However, see previous comments about... something. I forget what
they were.
Hmm. Doesn't that mean we can get rid of dtable::get_debugger_info()?
I mean, why do we need to make sure that an obscure fetaure in some old
versions of GDB which were never released to the public are still
supported?
I was just investigating that while looking into the other problem you
reported to me in private email. I'll see if it can be deleted.

cgf

Loading...