Discussion:
crash on latest cygwin snapshot
Christopher Faylor
2012-07-02 16:01:57 UTC
[redirecting to cygwin-developers]
Sorry, Marco. Never mind. I duplicated this. No need to upload anything.
I'm still working on it.
it seems solved on 20120702 snapshots
Thanks for confirming. I suspect that this snapshot really only masks the
problem. The stack was getting corrupted by something but I was having a
really hard time figuring out what was doing it.

Compiling path.cc without optimization or passwd.cc without
-fomit-frame-pointer "fixed" the problem but clearly something is wrong.

I tried building Cygwin with stack probes and that made the DLL
unrunnable. I tried instrumenting it with -finstrument-functions and
that made the problem go away.
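
For readers unfamiliar with -finstrument-functions: with that flag GCC inserts calls to two user-supplied hooks, __cyg_profile_func_enter and __cyg_profile_func_exit, around every function. The following is only a minimal sketch of such hooks, not what was used in this debugging session; the depth counters g_depth and g_max_depth are hypothetical names added for illustration.

```cpp
#include <cassert>

// Hooks that GCC calls on function entry and exit when code is compiled
// with -finstrument-functions. They are marked no_instrument_function so
// that they are not instrumented themselves (which would recurse).
extern "C" {
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
}

// Hypothetical bookkeeping: current and deepest call nesting seen so far.
static int g_depth = 0;
static int g_max_depth = 0;

extern "C" void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    (void)this_fn;    // address of the function being entered
    (void)call_site;  // address the function was called from
    ++g_depth;
    if (g_depth > g_max_depth) g_max_depth = g_depth;
}

extern "C" void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    (void)this_fn;
    (void)call_site;
    --g_depth;
}
```

Because the extra calls change register allocation and stack layout in every function, it is plausible that a layout-sensitive corruption like this one stops reproducing once the flag is enabled.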

The problem is that something, somewhere along the line, replaces a
frame pointer in the stack with 0x10c (268) and the return address with
zero. So, eventually, when a function returns, %ebp is set to 0x10c and
the program jumps to address zero. That is one manifestation. In others
the 0x10c is still there but it is interpreted as a pointer to something
and dereferencing it causes a SEGV.

0x10c is 256 + 12 but I haven't been able to find that anywhere in the
source.
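
The failure mode described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Cygwin code: SavedFrame, Registers, simulate_return and corrupt are hypothetical names, and the model only covers the two stack words that a 32-bit x86 epilogue consumes.

```cpp
#include <cassert>
#include <cstdint>

// 0x10c is 268 decimal, i.e. 256 + 12.
static_assert(0x10c == 268 && 268 == 256 + 12, "0x10c == 256 + 12");

// The two words a function epilogue reads back from the stack.
struct SavedFrame {
    std::uint32_t saved_ebp;    // caller's frame pointer
    std::uint32_t return_addr;  // where execution resumes
};

struct Registers {
    std::uint32_t ebp;
    std::uint32_t eip;
};

// Models "leave; ret": restore %ebp from the frame, then jump to the
// stored return address.
Registers simulate_return(const SavedFrame &frame) {
    return Registers{frame.saved_ebp, frame.return_addr};
}

// Hypothetical stand-in for the unknown code that clobbers the frame.
void corrupt(SavedFrame &frame) {
    frame.saved_ebp = 0x10c;
    frame.return_addr = 0;
}
```

After corrupt() runs, simulate_return() yields ebp == 0x10c and eip == 0, which corresponds to the jump to address zero. The other manifestation is the case where the function does not return immediately and something dereferences the bogus 0x10c first.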

Your test triggered a problem which has existed in Cygwin since last November.
I noticed it in snapshots going back to 2011-11-14. But, when I tried to
build a version of Cygwin from before that time, it still manifested the
problem. I did change to gcc 4.5.3 around that time so I'm thinking that
either this version of gcc exposed a problem in Cygwin or Cygwin has exposed
a problem in this version of gcc. There was an odd problem in select() where
it seemed like constructors weren't being properly run on a local variable
when alloca was used in the same function.
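
The select() pattern mentioned above, a local object with a constructor in the same function as an alloca call, can be sketched roughly as follows. This is not the Cygwin select() code; Guard and constructor_ran_with_alloca are hypothetical names, and __builtin_alloca is the GCC/Clang builtin that alloca() normally maps to. On a correctly working compiler the function returns true.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical local type whose constructor must run before use.
struct Guard {
    int magic;
    Guard() : magic(0x5a5a) {}
};

// Mixes a constructed local with a dynamic stack allocation and reports
// whether the constructor's effect is still visible afterwards.
// n must be at least 1.
bool constructor_ran_with_alloca(std::size_t n) {
    Guard g;  // constructor should set magic to 0x5a5a
    char *scratch = static_cast<char *>(__builtin_alloca(n));
    std::memset(scratch, 0xff, n);  // touch the whole dynamic area
    return g.magic == 0x5a5a &&
           static_cast<unsigned char>(scratch[n - 1]) == 0xff;
}
```

A check like this only helps if it is built with the same compiler and optimization flags as the failing code, since the suspected problem depends on both.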

cgf
marco atzeri
2012-07-03 08:05:34 UTC
gcc-4.5 has an issue on Windows platforms (Cygwin and MinGW) with
incorrect optimization of function return values in C++ code.

In Octave I caught a fault that corrupted the stack:
http://savannah.gnu.org/bugs/?34210

The solution was to add a "volatile" declaration:
---------------------------------------------------------------
double
profile_data_accumulator::query_time (void) const
{
  octave_time now;

  // FIXME -- is this volatile declaration really needed?
  // See bug #34210 for additional details.
  volatile double dnow = now.double_value ();

  return dnow;
}
----------------------------------------------------------------
On other platforms the volatile was not needed.
The nightmare was that the fault did not occur while stepping through the code with gdb.
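
The shape of that workaround can be shown in a self-contained form. This is only a sketch: fake_time is a hypothetical stand-in for octave_time, and the snippet does not reproduce the gcc 4.5 fault. It only shows how the volatile local forces the returned value to be stored to memory and read back.

```cpp
#include <cassert>

// Hypothetical stand-in for octave_time: any class with a member
// function that returns a double by value.
class fake_time {
public:
    explicit fake_time(double seconds) : m_seconds(seconds) {}
    double double_value() const { return m_seconds; }

private:
    double m_seconds;
};

// Same shape as the Octave workaround: accesses to a volatile object
// may not be optimized away, so the compiler has to store the value
// to dnow and load it again before returning it.
double query_time_workaround(const fake_time &now) {
    volatile double dnow = now.double_value();
    return dnow;
}
```

The returned value is unchanged. Only the generated code differs, which is why a workaround like this hides a miscompilation rather than explaining it.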

I would not be surprised if Cygwin were hit by a similar issue.

Marco
Ryan Johnson
2012-07-03 11:31:21 UTC
Post by marco atzeri
gcc-4.5 has some issue on windows platform (cygwin and mingw)
for wrong optimization of function return value on C++ code.
I have also had issues with 4.5 (4.5.3 in particular) on Linux, where
destructors failed to run properly in optimized code. I now make a point
of upgrading (or self-compiling) to at least 4.6; I've actually had no
problems yet with 4.7 after using it for some months under both cygwin
and linux, though I haven't tried compiling a cygwin dll with it yet.

Ryan
Corinna Vinschen
2012-07-03 13:28:21 UTC
Post by Ryan Johnson
I have also had issues with 4.5 (4.5.3 in particular) on Linux,
where destructors failed to run properly in optimized code. I now
make a point of upgrading (or self-compiling) to at least 4.6; I've
actually had no problems yet with 4.7 after using it for some months
under both cygwin and linux, though I haven't tried compiling a
cygwin dll with it yet.
Dave, any chance we can get a gcc update to 4.6?


Thanks,
Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat
Christopher Faylor
2012-07-03 15:17:19 UTC
Post by Corinna Vinschen
Dave, any chance we can get a gcc update to 4.6?
Or, barring that, an experimental gcc 4.6 cross-compiler for Linux.

I will roll one myself if Yaakov doesn't already have one lurking in
the wings.

cgf
Yaakov (Cygwin/X)
2012-07-05 07:49:35 UTC
Post by Christopher Faylor
Post by Corinna Vinschen
Post by Ryan Johnson
Post by marco atzeri
gcc-4.5 has some issue on windows platform (cygwin and mingw)
for wrong optimization of function return value on C++ code.
I have also had issues with 4.5 (4.5.3 in particular) on Linux,
where destructors failed to run properly in optimized code. I now
make a point of upgrading (or self-compiling) to at least 4.6; I've
actually had no problems yet with 4.7 after using it for some months
under both cygwin and linux, though I haven't tried compiling a
cygwin dll with it yet.
Dave, any chance we can get a gcc update to 4.6?
AFAICS the last time Dave was heard from on the lists was March 24.
Post by Christopher Faylor
Or, barring that, an experimental gcc 4.6 cross-compiler for Linux.
I will roll one myself if Yaakov doesn't already have one lurking in
the wings.
I don't currently. But wouldn't 4.7.1 be preferable?


Yaakov
Corinna Vinschen
2012-07-05 08:08:02 UTC
Post by Yaakov (Cygwin/X)
Post by Christopher Faylor
Post by Corinna Vinschen
Post by Ryan Johnson
Post by marco atzeri
gcc-4.5 has some issue on windows platform (cygwin and mingw)
for wrong optimization of function return value on C++ code.
I have also had issues with 4.5 (4.5.3 in particular) on Linux,
where destructors failed to run properly in optimized code. I now
make a point of upgrading (or self-compiling) to at least 4.6; I've
actually had no problems yet with 4.7 after using it for some months
under both cygwin and linux, though I haven't tried compiling a
cygwin dll with it yet.
Dave, any chance we can get a gcc update to 4.6?
AFAICS the last time Dave was heard from on the lists was March 24.
We have gone through that situation already at one point. Eventually
Dave showed up and created a new package but we really thought he was
gone for many weeks.

Dave, if you read this: the limbo state in which nobody knows whether you
still maintain gcc, because you have been gone for many weeks, is pretty
frustrating. I know, life and all, but still...

Yaakov, I guess you're still willing to take over gcc maintainership if
Dave doesn't show up? If so, I think we should give Dave 2 weeks.
Vacation happens.
Post by Yaakov (Cygwin/X)
Post by Christopher Faylor
Or, barring that, an experimental gcc 4.6 cross-compiler for Linux.
I will roll one myself if Yaakov doesn't already have one lurking in
the wings.
I don't currently. But wouldn't 4.7.1 be preferable?
That would be cool. Are you prepared to roll out a 4.7.1 cross gcc
for testing?


Thanks,
Corinna
marco atzeri
2012-07-05 08:29:46 UTC
Post by Yaakov (Cygwin/X)
Post by Corinna Vinschen
Post by Ryan Johnson
Post by marco atzeri
gcc-4.5 has some issue on windows platform (cygwin and mingw)
for wrong optimization of function return value on C++ code.
I have also had issues with 4.5 (4.5.3 in particular) on Linux,
where destructors failed to run properly in optimized code. I now
make a point of upgrading (or self-compiling) to at least 4.6; I've
actually had no problems yet with 4.7 after using it for some months
under both cygwin and linux, though I haven't tried compiling a
cygwin dll with it yet.
Dave, any chance we can get a gcc update to 4.6?
AFAICS the last time Dave was heard from on the lists was March 24.
Dave still seems to be working on 4.7.1 for Cygwin:
http://article.gmane.org/gmane.comp.gcc.devel/127180/match=
