[Cygwin64] dash segfault

Peter Rosin

2013-03-08 22:13:15 UTC

Hi!

I doubt there is a shortage of obscure things to track down in the land
of 64-bit, but while building a package using the stuff from install/release
I noticed a segfault in dash when it ran a libtool script to generate a
dll. Retrying got the dll built correctly.

Fact is, I do see segfaults once in a while, but retrying has always helped
so far, so I haven't pursued it.

How do I set up a debugger to get more info than the below stackdump?

Cheers,
Peter

bash-4.1$ cat dash.exe.stackdump
Exception: STATUS_ACCESS_VIOLATION at rip=00180148035
rax=0000000000000000 rbx=000006FFFFF94A00 rcx=0000000000000043
rdx=00000001802B6DE8 rsi=000006FFFFF8F630 rdi=0000000000000043
r8 =0000000010454050 r9 =0000000010454040 r10=00000001800BEEC0
r11=0000000100410325 r12=0000000100419910 r13=00000000FFFFFFFB
r14=0000000000000004 r15=000006FFFFF948D0
rbp=0000000000000001 rsp=00000000002285F0
program=C:\Cygwin\opt\bin\dash.exe, pid 38688, thread main
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B

Corinna Vinschen

2013-03-09 12:50:15 UTC

Post by Peter Rosin
How do I set up a debugger to get more info than the below stackdump?

I added a 64 bit Cygwin GDB package to the install area a couple
of days ago. I guess a debug version of dash (especially built w/o
optimization) won't hurt either.

Corinna

--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat

Peter Rosin

2013-03-09 23:05:04 UTC

Post by Corinna Vinschen

I added a 64 bit Cygwin GDB package to the install area a couple
of days ago. I guess a debug version of dash (especially built w/o
optimization) won't hurt either.

Ok, I recompiled dash locally (.../configure CFLAGS=-g --prefix=/usr)
and used CYGWIN='error_start=C:\...\bin\dumper.exe' and got myself a
core file...

Not much appears to be going on though, suggestions are welcome...

Cheers,
Peter

bash-4.1$ gdb
GNU gdb (GDB) 7.5.50.20130305-cvs
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-cygwin".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
(gdb) target core ./ggi/default-shared/dash.exe.core
[New Thread 0x9ea0]
[New Thread 0xad50]
[New Thread 0xa5ec]
[New Thread 0xb1ac]
#0 0x00000001 in ?? ()
(gdb) bt
#0 0x00000001 in ?? ()
Cannot access memory at address 0x5
(gdb) thread 2
[Switching to thread 2 (Thread 0xad50)]
#0 0x00000000 in ?? ()
(gdb) bt
#0 0x00000000 in ?? ()
(gdb) thread 3
[Switching to thread 3 (Thread 0xa5ec)]
#0 0x00000000 in ?? ()
(gdb) bt
#0 0x00000000 in ?? ()
Cannot access memory at address 0x801cf358
(gdb) thread 4
[Switching to thread 4 (Thread 0xb1ac)]
#0 0x00000000 in ?? ()
(gdb) bt
#0 0x00000000 in ?? ()
(gdb)

Corinna Vinschen

2013-03-10 10:18:12 UTC

Post by Peter Rosin

Ok, I recompiled dash locally (.../configure CFLAGS=-g --prefix=/usr)
and used CYGWIN='error_start=C:\...\bin\dumper.exe' and got myself a
core file...
Not much appears to be going on though, suggestions are welcome...

Hmm. What about error_start=C:\...\gdb.exe? Maybe that gives you a bit
more "live" information.

Corinna

Corinna Vinschen

2013-03-10 11:45:02 UTC

Post by Corinna Vinschen

Hmm. What about error_start=C:\...\gdb.exe? Maybe that gives you a bit
more "live" information.

Btw., I just checked the RIP value in the stackdump output you sent.

Assuming you're using cygwin1.dll from the base package, this would be
ptmalloc3.cc, line 792. This in turn would point to a call of free() on
something not a valid pointer.

Assuming you're using cygwin1.dll from the cygwin-1.7.18-2.tar.bz2
package in the 64bit/release area, that would be malloc-private.h, line 88.

That would be a mutex_unlock call from within the ptmalloc3 code.

The missing stack is a pity, though, since that leaves us with no
trace about the circumstances. If you reproduce the same with a
non-optimized debug version of dash, does the stackdump contain a
stack backtrace?

Corinna

Corinna Vinschen

2013-03-10 12:03:02 UTC

Post by Corinna Vinschen

Btw., I just checked the RIP value in the stackdump output you sent.
Assuming you're using cygwin1.dll from the base package, this would be
ptmalloc3.cc, line 792. This in turn would point to a call of free() on
something not a valid pointer.
Assuming you're using cygwin1.dll from the cygwin-1.7.18-2.tar.bz2
package in the 64bit/release area, that would be malloc-private.h, line 88.
That would be a mutex_unlock call from within the ptmalloc3 code.
The missing stack is a pity, though, since that leaves us with no
trace about the circumstances. If you reproduce the same with a
non-optimized debug version of dash, does the stackdump contain a
stack backtrace?

And, another btw., you should definitely use the cygwin-1.7.18-2.tar.bz2
version. It fixes a serious bug present in the base package's Cygwin
DLL. FWIW, I've been trying to reproduce the problem for the last half hour
by repeating a libedit build over and over again, but I can't get it to
crash. I'm now going to send the mail in the hope that the crash
occurs right after I hit the send button.

[...just seconds pass...]

Yay, I have a crash right *before* I hit the send button. The threat
alone seems to have convinced dash that it's time to stop kidding around.

Corinna

Peter Rosin

2013-03-10 18:31:23 UTC

Post by Corinna Vinschen

And, another btw., you should definitely use the cygwin-1.7.18-2.tar.bz2
version. It fixes a serious bug present in the base package's Cygwin
DLL.

I got the below with gdb as error_start.

As to what cygwin1.dll I've got, this is the uname -a output:
CYGWIN_NT-6.1 PEDA-PC 1.7.18(0.263/5/3) 2013-03-07 13:54 x86_64 Cygwin

So I guess the one from install. However, I did untar the release/cygwin
one as well, but, I did use "tar xkf". I did it from 32-bit Cygwin with
a "find .... | xargs -n 1 tar xzf" invocation after mirroring the install
and release areas. I didn't really expect clashes...

I'm now going to install the dll from release/cygwin (for real) and retry.

Cheers,
Peter

Reading symbols from /usr/bin/dash.exe...done.
Attaching to program `/usr/bin/dash.exe', process 45916
[New Thread 45916.0xa0c8]
[New Thread 45916.0xad64]
[New Thread 45916.0xb344]
[New Thread 45916.0x811c]
(gdb) thread
[Current thread is 4 (Thread 45916.0x811c)]
(gdb) bt
#0 0x0000000076eb0531 in ntdll!DbgBreakPoint ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x0000000076f57ef8 in ntdll!DbgUiRemoteBreakin ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#2 0x0000000000000000 in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 45916.0xa0c8)]
#0 0x0000000076eb165a in ntdll!ZwDelayExecution ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb165a in ntdll!ZwDelayExecution ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b1203 in SleepEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000228958 in ?? ()
#3 0x0000000000000001 in ?? ()
#4 0x0000000000000000 in ?? ()
(gdb) thread 2
[Switching to thread 2 (Thread 45916.0xad64)]
#0 0x0000000076eb137a in ntdll!ZwReadFile ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb137a in ntdll!ZwReadFile ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b1a7a in ReadFile ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()
(gdb) thread 3
[Switching to thread 3 (Thread 45916.0xb344)]
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b10dc in WaitForSingleObjectEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()

And another one with a slightly different thread 1:

(gdb) thread 1
[Switching to thread 1 (Thread 34696.0xa148)]
#0 0x0000000076e8805d in ntdll!RtlReleaseSRWLockExclusive ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076e8805d in ntdll!RtlReleaseSRWLockExclusive ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x0000000000152000 in ?? ()
#2 0x0000002000001000 in ?? ()
#3 0x0000000001000000 in ?? ()
#4 0x0000000100000000 in ?? ()
#5 0x0000000000000000 in ?? ()
(gdb) thread 2

Peter Rosin

2013-03-10 19:20:23 UTC

Post by Peter Rosin

I got the below with gdb as error_start.
CYGWIN_NT-6.1 PEDA-PC 1.7.18(0.263/5/3) 2013-03-07 13:54 x86_64 Cygwin
So I guess the one from install. However, I did untar the release/cygwin
one as well, but, I did use "tar xkf". I did it from 32-bit Cygwin with
a "find .... | xargs -n 1 tar xzf" invocation after mirroring the install
and release areas. I didn't really expect clashes...
I'm now going to install the dll from release/cygwin (for real) and retry.

Ok, here's a crash with the dll from release, still with home-built dash
w/o -O2.

Cheers,
Peter

Reading symbols from /usr/bin/dash.exe...done.
Attaching to program `/usr/bin/dash.exe', process 45916
[New Thread 45916.0xb298]
[New Thread 45916.0xa19c]
[New Thread 45916.0x9730]
[New Thread 45916.0xb060]
[New Thread 45916.0xa584]
[New Thread 45916.0xaeb4]
(gdb) thread
[Current thread is 6 (Thread 45916.0xaeb4)]
(gdb) bt
#0 0x0000000076eb0531 in ntdll!DbgBreakPoint ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x0000000076f57ef8 in ntdll!DbgUiRemoteBreakin ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#2 0x0000000000000000 in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 45916.0xb298)]
#0 0x0000000076eb59db in ntdll!RtlEqualUnicodeString ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb59db in ntdll!RtlEqualUnicodeString ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x0000000000000000 in ?? ()
(gdb) thread 2
[Switching to thread 2 (Thread 45916.0xa19c)]
#0 0x0000000076eb137a in ntdll!ZwReadFile ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb137a in ntdll!ZwReadFile ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b1a7a in ReadFile ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()
(gdb) thread 3
[Switching to thread 3 (Thread 45916.0x9730)]
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b10dc in WaitForSingleObjectEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()
(gdb) thread 4
[Switching to thread 4 (Thread 45916.0xb060)]
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b10dc in WaitForSingleObjectEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()
(gdb) thread 5
[Switching to thread 5 (Thread 45916.0xa584)]
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b10dc in WaitForSingleObjectEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()
(gdb)

Peter Rosin

2013-03-10 19:38:21 UTC

Post by Peter Rosin
Ok, here's a crash with the dll from release, still with home-built dash
w/o -O2.

Below is another one, looking more similar to the ones I got with the
dll from the install area...

Every time that it has happened (or, I should say, every time I have
checked), it has been a libtool script linking a dll (either as a
library or as a module), that has crashed dash.

Cheers,
Peter

Reading symbols from /usr/bin/dash.exe...done.
Attaching to program `/usr/bin/dash.exe', process 40540
[New Thread 40540.0xb0e4]
[New Thread 40540.0xb0a0]
[New Thread 40540.0xa3ec]
[New Thread 40540.0xb3b0]
(gdb) thread
[Current thread is 4 (Thread 40540.0xb3b0)]
(gdb) bt
#0 0x0000000076eb0531 in ntdll!DbgBreakPoint ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x0000000076f57ef8 in ntdll!DbgUiRemoteBreakin ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#2 0x0000000000000000 in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 40540.0xb0e4)]
#0 0x0000000076eb154a in ntdll!ZwQueryVirtualMemory ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb154a in ntdll!ZwQueryVirtualMemory ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x00000000769c5cd1 in KERNEL32!GetEnvironmentStringsA ()
from /cygdrive/c/Windows/system32/kernel32.dll
#2 0x0000000000000000 in ?? ()
(gdb) thread 2
[Switching to thread 2 (Thread 40540.0xb0a0)]
#0 0x0000000076eb137a in ntdll!ZwReadFile ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb137a in ntdll!ZwReadFile ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b1a7a in ReadFile ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()
(gdb) thread 3
[Switching to thread 3 (Thread 40540.0xa3ec)]
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) bt
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b10dc in WaitForSingleObjectEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()
(gdb)

Peter Rosin

2013-03-10 20:10:43 UTC

Post by Peter Rosin

Below is another one, looking more similar to the ones I got with the
dll from the install area...
Every time that it has happened (or, I should say, every time I have
checked), it has been a libtool script linking a dll (either as a
library or as a module), that has crashed dash.

I of course only needed to whisper that for it to happen elsewhere, this
time when running my not-overly-complicated build script. Still dash though.

And now again in libtool --mode=install (not relinking).

I have also seen a couple of instances of dash simply exiting w/o triggering
error_start (sadly wasn't sane enough to harvest an exit-code).

Let me know if you want more backtraces, I get the feeling they're pretty
useless? I'd also appreciate further debugging tips.

Cheers,
Peter

Corinna Vinschen

2013-03-10 20:41:18 UTC

Post by Peter Rosin

I of course only needed to whisper that for it to happen elsewhere, this
time when running my not-overly-complicated build script. Still dash though.
And now again in libtool --mode=install (not relinking).
I have also seen a couple of instances of dash simply exiting w/o triggering
error_start (sadly wasn't sane enough to harvest an exit-code).
Let me know if you want more backtraces, I get the feeling they're pretty
useless? I'd also appreciate further debugging tips.

I don't know. I tried myself but didn't have much time (or fun) debugging
this more closely today. The most interesting snippet I got was another
stackdump with my local non-optimized Cygwin DLL which again pointed to
ptmalloc3.cc, line 792, so there seems to be a free() on an invalid
address. The rest of the information I could gather so far was not very
helpful either. I have this on my plate for tomorrow and most of next
week, but I would naturally appreciate it if others would help debug
this, too. It seems a rather tricky one. If worst comes to worst,
I'll rip out ptmalloc3, and we can try again with the old malloc code.

Corinna

Peter Rosin

2013-03-10 20:55:35 UTC

Post by Corinna Vinschen

I don't know. I tried myself but didn't have much time (or fun) debugging
this more closely today. The most interesting snippet I got was another
stackdump with my local non-optimized Cygwin DLL which again pointed to
ptmalloc3.cc, line 792, so there seems to be a free() on an invalid
address. The rest of the information I could gather so far was not very
helpful either. I have this on my plate for tomorrow and most of next
week, but I would naturally appreciate it if others would help debug
this, too. It seems a rather tricky one. If worst comes to worst,
I'll rip out ptmalloc3, and we can try again with the old malloc code.

I ripped out "CONFIG_SHELL=/bin/dash /bin/dash" from in front of my
configure invocations, and replaced /bin/sh with bash. I have now built
the project twice in a row w/o any crash. With dash, there seemed to
be at least one crash in every build attempt.

How sure are we that the dash memory allocations are correct? This
could easily be some use-after-free issue that was "safe" with the
old malloc...

Cheers,
Peter

Teemu Nätkinniemi

2013-03-10 21:05:24 UTC

Post by Peter Rosin
I ripped out "CONFIG_SHELL=/bin/dash /bin/dash" from in front of my
configure invocations, and replaced /bin/sh with bash. I have now built
the project twice in a row w/o any crash. With dash, there seemed to
be at least one crash in every build attempt.

I was about to ask whether you had tested your scripts with dash, as I
noticed that the tcsh + dash combination was very unstable, and it looks
like I'm not the only one having problems with dash.

Peter Rosin

2013-03-11 05:51:37 UTC

Post by Corinna Vinschen

I don't know. I tried myself but didn't have much time (or fun) debugging
this more closely today. The most interesting snippet I got was another
stackdump with my local non-optimized Cygwin DLL which again pointed to
ptmalloc3.cc, line 792, so there seems to be a free() on an invalid
address. The rest of the information I could gather so far was not very
helpful either. I have this on my plate for tomorrow and most of next
week, but I would naturally appreciate it if others would help debug
this, too. It seems a rather tricky one. If worst comes to worst,
I'll rip out ptmalloc3, and we can try again with the old malloc code.

I got what looks like a better backtrace, no time to look at it immediately
though. I did rebuild dash with .../configure CFLAGS=-g CPPFLAGS=-DDEBUG,
but don't know if that was instrumental in getting the backtrace. I will
let this one sit in gdb for a while, so if anyone wants me to examine
something specific, let me know.

Cheers,
Peter

Reading symbols from /usr/bin/dash.exe...done.
Attaching to program `/usr/bin/dash.exe', process 9636
[New Thread 9636.0xb268]
[New Thread 9636.0xb660]
[New Thread 9636.0x9c88]
[New Thread 9636.0xb4f0]
[New Thread 9636.0xb6e8]
[New Thread 9636.0xb608]
(gdb) t a a bt

Thread 6 (Thread 9636.0xb608):
#0 0x0000000076eb0531 in ntdll!DbgBreakPoint ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x0000000076f57ef8 in ntdll!DbgUiRemoteBreakin ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#2 0x0000000000000000 in ?? ()

Thread 5 (Thread 9636.0xb6e8):
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b10dc in WaitForSingleObjectEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()

Thread 4 (Thread 9636.0xb4f0):
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b10dc in WaitForSingleObjectEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()

Thread 3 (Thread 9636.0x9c88):
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b10dc in WaitForSingleObjectEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()

Thread 2 (Thread 9636.0xb660):
#0 0x0000000076eb137a in ntdll!ZwReadFile ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b1a7a in ReadFile ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()

Thread 1 (Thread 9636.0xb268):
#0 strlen (str=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/newlib/libc/string/strlen.c:68
#1 0x00000001800bf65e in strdup (s=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/winsup/cygwin/malloc_wrapper.cc:213
#2 0x00000001801114eb in _sigfe () from /usr/bin/cygwin1.dll
#3 0x0000000000229d70 in ?? ()
#4 0x0000000100416a31 in findvar (vpp=0x6fffff841c8,
name=0x6fffff841c8 "old_library=") at ../../src/var.c:700
#5 0x0000000100415dd7 in setvareq (s=0x6fffff841c8 "old_library=", flags=4)
at ../../src/var.c:298
#6 0x0000000100416474 in mklocal (name=0x6fffff841c8 "old_library=")
at ../../src/var.c:513
#7 0x00000001004040ed in evalcommand (cmd=0x6ffffea1900, flags=0)
at ../../src/eval.c:745
#8 0x000000010040321a in evaltree (n=0x6ffffea1900, flags=0)
at ../../src/eval.c:280
#9 0x000000010040321a in evaltree (n=0x6ffffea1900, flags=0)
at ../../src/eval.c:280
#10 0x00000001004031ce in evaltree (n=0x6ffffea1890, flags=0)
at ../../src/eval.c:269
#11 0x00000001004031ce in evaltree (n=0x6ffffe96448, flags=0)
at ../../src/eval.c:269
#12 0x0000000100403711 in evalcase (n=0x6ffffe95ab0, flags=0)
at ../../src/eval.c:434
#13 0x000000010040321a in evaltree (n=0x6ffffe95ab0, flags=0)
at ../../src/eval.c:280
#14 0x000000010040321a in evaltree (n=0x6ffffe95ab0, flags=0)
at ../../src/eval.c:280
#15 0x00000001004031ce in evaltree (n=0x6ffffe92d70, flags=0)
at ../../src/eval.c:269
#16 0x00000001004031ce in evaltree (n=0x6ffffe92c60, flags=0)
at ../../src/eval.c:269
#17 0x00000001004031ce in evaltree (n=0x6ffffe900a8, flags=0)
at ../../src/eval.c:269
#18 0x00000001004031ce in evaltree (n=0x6ffffe90030, flags=0)
at ../../src/eval.c:269
#19 0x000000010040486a in evalfun (func=0x6ffffe90010, argc=42,
argv=0x6fffffbfd10, flags=0) at ../../src/eval.c:948
#20 0x0000000100404514 in evalcommand (cmd=0x6fffffbbf18, flags=0)
at ../../src/eval.c:871
#21 0x000000010040321a in evaltree (n=0x6fffffbbf18, flags=0)
at ../../src/eval.c:280
#22 0x000000010040321a in evaltree (n=0x6fffffbbf18, flags=0)
at ../../src/eval.c:280
#23 0x000000010040c359 in cmdloop (top=1) at ../../src/main.c:238
#24 0x000000010040c229 in main (argc=46, argv=0x22a9c0)
at ../../src/main.c:178
(gdb)

Peter Rosin

2013-03-11 08:20:07 UTC

Post by Peter Rosin
#0 strlen (str=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/newlib/libc/string/strlen.c:68
#1 0x00000001800bf65e in strdup (s=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/winsup/cygwin/malloc_wrapper.cc:213
#2 0x00000001801114eb in _sigfe () from /usr/bin/cygwin1.dll
#3 0x0000000000229d70 in ?? ()
#4 0x0000000100416a31 in findvar (vpp=0x6fffff841c8,
name=0x6fffff841c8 "old_library=") at ../../src/var.c:700
#5 0x0000000100415dd7 in setvareq (s=0x6fffff841c8 "old_library=", flags=4)
at ../../src/var.c:298

Hmm, frames #4 and #5 don't match, as var.c:298 doesn't call findvar, it
calls memalloc.c:savestr, which is a wrapper around strdup (line 83). That
fits better with frames #1 and #0. So, the stack still seems trashed?

Anyway, inspired by frame #0, I wrote the following silly program:

#include <string.h>
int main(void)
{
return strlen((const char *)1);
}

and it too crashes into gdb without any usable backtrace. Maybe that
could be fixed before debugging the bigger problem?

Cheers,
Peter

Corinna Vinschen

2013-03-11 09:46:15 UTC

Post by Peter Rosin
#0 strlen (str=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/newlib/libc/string/strlen.c:68
#1 0x00000001800bf65e in strdup (s=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/winsup/cygwin/malloc_wrapper.cc:213

This doesn't look like the same problem as the one which crashes in
free(). But it might have the same reason. A pointer value of 1
indicates that some function returned a NULL pointer but the calling
function didn't check the return value. If you still have that in
GDB, can you check where the value is coming from?

Post by Peter Rosin
Hmm, frames #4 and #5 don't match, as var.c:298 doesn't call findvar, it
calls memalloc.c:savestr, which is a wrapper around strdup (line 83). That
fits better with frames #1 and #0. So, the stack still seems trashed?

It's possible that GDB chokes on missing debug or SEH information
in the _sigfe_strdup wrapper and/or the sigfe function, both of
which are naked assembler code. I guess we should fix that so that
GDB and Cygwin's stackdump code have fewer problems backtracing
through this assembler code.

Corinna


Peter Rosin

2013-03-11 10:57:56 UTC

Post by Corinna Vinschen

Post by Peter Rosin
#0 strlen (str=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/newlib/libc/string/strlen.c:68
#1 0x00000001800bf65e in strdup (s=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/winsup/cygwin/malloc_wrapper.cc:213

This doesn't look like the same problem as the one which crashes in
free(). But it might have the same reason. A pointer value of 1
indicates that some function returned a NULL pointer but the calling
function didn't check the return value. If you still have that in
GDB, can you check where the value is coming from?

It's still kicking in GDB, but I'm not sure how I'm going to find out
where the bogus 1 is coming from? Assuming that frame #5 is correct and
that it really is at var.c:298, that line is

s = savestr(s);

with s pointing to "old_library=" (0x6fffff841c8). savestr is a simple
wrapper around strdup, so anything replacing that pointer with 1 must
be coming from some non-obvious place. But it really is weird, because
the value that is transformed into 1 is passed in rcx and not on the
stack, so a trashed stack does not explain it (unless the stack is
trashed in a way that totally fools me).

I need more help to help out with this.

Cheers,
Peter

PS: here is the "bt full" output in case it helps.

(gdb) bt full
#0 strlen (str=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/newlib/libc/string/strlen.c:68
start = 0x1 <Address 0x1 out of bounds>
aligned_addr = <optimized out>
#1 0x00000001800bf65e in strdup (s=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/winsup/cygwin/malloc_wrapper.cc:213
p = <optimized out>
len = <optimized out>
#2 0x00000001801114eb in _sigfe () from /usr/bin/cygwin1.dll
No symbol table info available.
#3 0x0000000000229d70 in ?? ()
No symbol table info available.
#4 0x0000000100416a31 in findvar (vpp=0x6fffff841c8,
name=0x6fffff841c8 "old_library=") at ../../src/var.c:700
No locals.
#5 0x0000000100415dd7 in setvareq (s=0x6fffff841c8 "old_library=", flags=4)
at ../../src/var.c:298
vp = 0x6fffff8e940
vpp = 0x6fffffbaa10
#6 0x0000000100416474 in mklocal (name=0x6fffff841c8 "old_library=")
at ../../src/var.c:513
eq = 0x6fffff841d3 "="
lvp = 0x6fffff8fd90
vpp = 0x100423550 <vartab+176>
vp = 0x6fffff8e940
#7 0x00000001004040ed in evalcommand (cmd=0x6ffffea1900, flags=0)
at ../../src/eval.c:745
spp = 0x229e80
p = 0x229f70 ""
localvar_stop = 0x6fffffeafa0
redir_stop = 0x0
smark = {stackp = 0x6fffffbfc90, stacknxt = 0x6fffffbfe80 "test",
stacknleft = 16}
argp = 0x6ffffea1920
arglist = {list = 0x0, lastp = 0x229e90}
varlist = {list = 0x6fffff841d8, lastp = 0x6fffff841d8}
argv = 0x6fffffbfe88
argc = 0
sp = 0x0
cmdentry = {cmdtype = 2, u = {index = 4301952,
cmd = 0x10041a480 <bltin>, func = 0x10041a480 <bltin>}}
jp = 0x0
lastarg = 0x0
path = 0x1802e3af8 "PATH=/usr/bin"
spclbltin = 0
execcmd = 2269024
status = 0
nargv = 0x6fffffbfe88
#8 0x000000010040321a in evaltree (n=0x6ffffea1900, flags=0)
at ../../src/eval.c:280
checkexit = 0
evalfn = 0x100403e2e <evalcommand>
isor = 2
status = 1
#9 0x000000010040321a in evaltree (n=0x6ffffea1900, flags=0)
at ../../src/eval.c:280
checkexit = 0
evalfn = 0x100402f73 <evaltree>
isor = 2
status = 1791
#10 0x00000001004031ce in evaltree (n=0x6ffffea1890, flags=0)
at ../../src/eval.c:269
checkexit = 0
evalfn = 0x6fffffbfe6b
isor = 2
status = 0
#11 0x00000001004031ce in evaltree (n=0x6ffffe96448, flags=0)
at ../../src/eval.c:269
checkexit = 0
evalfn = 0x6fffffbfc90
isor = 2
status = 1791
#12 0x0000000100403711 in evalcase (n=0x6ffffe95ab0, flags=0)
at ../../src/eval.c:434
cp = 0x6ffffe96428
patp = 0x6ffffeb0180
arglist = {list = 0x6fffffbfe70, lastp = 0x6fffffbfe70}
smark = {stackp = 0x6fffffbfc90, stacknxt = 0x6fffffbfe68 "lib",
stacknleft = 40}
#13 0x000000010040321a in evaltree (n=0x6ffffe95ab0, flags=0)
at ../../src/eval.c:280
checkexit = 0
evalfn = 0x100403626 <evalcase>
isor = 2
status = 0
#14 0x000000010040321a in evaltree (n=0x6ffffe95ab0, flags=0)
at ../../src/eval.c:280
checkexit = 0
evalfn = 0x100402f73 <evaltree>
isor = 2
status = 0
#15 0x00000001004031ce in evaltree (n=0x6ffffe92d70, flags=0)
at ../../src/eval.c:269
checkexit = 0
evalfn = 0x10
isor = 2
status = 0
#16 0x00000001004031ce in evaltree (n=0x6ffffe92c60, flags=0)
at ../../src/eval.c:269
checkexit = 0
evalfn = 0x100000000
isor = 2
status = 0
#17 0x00000001004031ce in evaltree (n=0x6ffffe900a8, flags=0)
at ../../src/eval.c:269
checkexit = 0
evalfn = 0x0
isor = 2
status = 0
#18 0x00000001004031ce in evaltree (n=0x6ffffe90030, flags=0)
at ../../src/eval.c:269
checkexit = 0
evalfn = 0x100423fd9 <stackbase+505>
isor = 2
status = 1
#19 0x000000010040486a in evalfun (func=0x6ffffe90010, argc=42,
argv=0x6fffffbfd10, flags=0) at ../../src/eval.c:948
saveparam = {nparam = 41, malloc = 1 '\001', p = 0x6fffffb6230,
optind = 1, optoff = -1}
savehandler = 0x22a840
jmploc = {loc = {0, 0, 2270248, 2270384, 6445443304, 2280688, 0, 0,
0, 0, 4299179880, 2285608, 0 <repeats 20 times>}}
e = 0
savefuncline = 0
#20 0x0000000100404514 in evalcommand (cmd=0x6fffffbbf18, flags=0)
at ../../src/eval.c:871
localvar_stop = 0x0
redir_stop = 0x0
smark = {stackp = 0x6fffffbbef0,
stacknxt = 0x6fffffbbf50 "func_mode_link", stacknleft = 416}
argp = 0x0
arglist = {list = 0x6fffffbbf60, lastp = 0x6fffffbfcf8}
varlist = {list = 0x0, lastp = 0x22a5d0}
argv = 0x6fffffbfd10
argc = 42
sp = 0x0
cmdentry = {cmdtype = 1, u = {index = -1507312, cmd = 0x6ffffe90010,
func = 0x6ffffe90010}}
jp = 0x0
lastarg = 0x0
path = 0x1802e3afd "/usr/bin"
spclbltin = -1
execcmd = 0
status = 0
nargv = 0x6fffffbfe60
#21 0x000000010040321a in evaltree (n=0x6fffffbbf18, flags=0)
at ../../src/eval.c:280
checkexit = 0
evalfn = 0x100403e2e <evalcommand>
isor = 1
status = 0
#22 0x000000010040321a in evaltree (n=0x6fffffbbf18, flags=0)
at ../../src/eval.c:280
checkexit = 0
evalfn = 0x100402f73 <evaltree>
isor = 0
status = 1
#23 0x000000010040c359 in cmdloop (top=1) at ../../src/main.c:238
skip = 0
n = 0x6fffffbbf38
smark = {stackp = 0x100423de0 <stackbase>,
stacknxt = 0x100423de8 <stackbase+8> "{", stacknleft = 504}
inter = 0
status = 0
numeof = 0
#24 0x000000010040c229 in main (argc=46, argv=0x22a9c0)
at ../../src/main.c:178
shinit = 0x22ccf0 ""
state = 4
jmploc = {loc = {0, 2271680, 2271224, 2271360, 6445443304, 2280688,
0, 0, 0, 0, 4299210697, 2285608, 0 <repeats 20 times>}}
smark = {stackp = 0x100423de0 <stackbase>,
stacknxt = 0x100423de8 <stackbase+8> "{", stacknleft = 504}
login = 0
(gdb)

Corinna Vinschen

2013-03-11 12:32:03 UTC

Hi Peter,

Post by Peter Rosin

Post by Corinna Vinschen

Post by Peter Rosin
#0 strlen (str=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/newlib/libc/string/strlen.c:68
#1 0x00000001800bf65e in strdup (s=0x1 <Address 0x1 out of bounds>)
at /usr/src/debug/cygwin-1.7.18-2/winsup/cygwin/malloc_wrapper.cc:213

This doesn't look like the same problem as the one which crashes in
free(). But it might have the same reason. A pointer value of 1
indicates that some function returned a NULL pointer but the calling
function didn't check the return value. If you still have that in
GDB, can you check where the value is coming from?

It's still kicking in GDB, but I'm not sure how I'm going to find out
where the bogus 1 is coming from? Assuming that frame #5 is correct and
that it really is at var.c:298, that line is
s = savestr(s);
with s pointing to "old_library=" (0x6fffff841c8). savestr is a simple
wrapper around strdup, so anything replacing that pointer with 1 must
be coming from some non-obvious place. But it really is weird, because
the value that is transformed into 1 is passed in rcx and not on the
stack, so a trashed stack does not explain it (unless the stack is
trashed in a way that totally fools me).
I need more help to help out with this.

I've just uploaded a new 64 bit Cygwin DLL package 1.7.18-3. Can you
please restart testing with this version? I've been trying for about an
hour now to reproduce the problem, but so far I haven't succeeded, which
is a good sign, hopefully.

Corinna


Peter Rosin

2013-03-11 14:35:04 UTC

Post by Corinna Vinschen
I've just uploaded a new 64 bit Cygwin DLL package 1.7.18-3. Can you
please restart testing with this version? I've been trying for about an
hour now to reproduce the problem, but so far I haven't succeeded, which
is a good sign, hopefully.

Good news first: it seems to be less frequent! But there's still
something bad going on, so sorry, no cigar...

I've seen errors of three classes with -3. First, there's the
well-known limited-stack-bt crash (but this one looks a little
bit different for thread 2):

Thread 4 (Thread 9636.0xb830):
#0 0x0000000076eb0531 in ntdll!DbgBreakPoint ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x0000000076f57ef8 in ntdll!DbgUiRemoteBreakin ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#2 0x0000000000000000 in ?? ()

Thread 3 (Thread 9636.0xb5e4):
#0 0x0000000076eb135a in ntdll!ZwWaitForSingleObject ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b10dc in WaitForSingleObjectEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x0000000000000000 in ?? ()

Thread 2 (Thread 9636.0xb7b0):
#0 0x0000000076eb137a in ntdll!ZwReadFile ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b1a7a in ReadFile ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x000000000022ce00 in ?? ()
#3 0x00000001802da8f8 in fhandler_disk_file::isdevice ()
from /usr/bin/cygwin1.dll
#4 0x000000018006e390 in try_to_debug ()
at /usr/src/debug/cygwin-1.7.18-3/winsup/cygwin/exceptions.cc:500
#5 0x000000000062aa40 in ?? ()
#6 0x000000000062aaf0 in ?? ()
#7 0x00000000000000b0 in ?? ()
#8 0x0000000000000000 in ?? ()

Thread 1 (Thread 9636.0x85ac):
#0 0x0000000076eb165a in ntdll!ZwDelayExecution ()
from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1 0x000007fefd4b1203 in SleepEx ()
from /cygdrive/c/Windows/system32/KERNELBASE.dll
#2 0x00000000002285f8 in ?? ()
#3 0x0000000000000001 in ?? ()
#4 0x0000000000000000 in ?? ()

Second, there's the non-crash exit (where dash sometimes exits w/o
triggering error_start, this still happens with -3):

Basically, this time config.status didn't complete, and the configure
output ended with:

config.status: creating m4/Makefile
config.status: creating libgii.conf

but it is expected that it carries on with:

config.status: creating dist/Makefile
config.status: creating dist/rpm/Makefile
config.status: creating dist/rpm/libgii.spec
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands

That premature unexplained exit later killed make with:

...
Making all in input
make[2]: Entering directory `/cygdrive/c/Cygwin/home/peda/ggi/cyg64/gii/input'
Making all in directx
make[3]: Entering directory `/cygdrive/c/Cygwin/home/peda/ggi/cyg64/gii/input/directx'
Makefile:314: .deps/di.Plo: No such file or directory
Makefile:315: .deps/dxguid.Plo: No such file or directory
Makefile:316: .deps/input.Plo: No such file or directory
...

due to the deps not being there.

This bug is really troublesome, it does not give me a cozy feeling when
scripts only get half-done w/o any notice...

Third (which I haven't reported previously, but I have seen it with -2 as
well), sometimes gdb gets triggered by error_start, but it fails to attach
to the process. When this happens I see this in the gdb window (after the
boilerplate):

Reading symbols from /usr/bin/dash.exe...done.
Can't attach to process.
/cygdrive/c/Cygwin/home/peda/ggi/cyg64/ggi/default-shared/47784: No such file or directory.
(gdb)

Lastly, I have a tiny unrelated wishlist request of very low priority.
Could the w32api headers be updated to the latest from the mingw64 repo
the next time there's a gcc update? I had a small patch upstreamed that
will enable me to drop a local workaround. Thanks!

Cheers,
Peter

Corinna Vinschen

2013-03-11 15:39:30 UTC

Post by Peter Rosin

Post by Corinna Vinschen
I've just uploaded a new 64 bit Cygwin DLL package 1.7.18-3. Can you
please restart testing with this version? I've been trying for about an
hour now to reproduce the problem, but so far I haven't succeeded, which
is a good sign, hopefully.

Good news first: it seems to be less frequent! But there's still
something bad going on, so sorry, no cigar...

Too bad. Unfortunately the backtrace is not helpful.

Post by Peter Rosin
Second, there's the non-crash exit (where dash sometimes exits w/o
[...]
Third (which I haven't reported previously, but I have seen it with -2 as
well), sometimes gdb gets triggered by error_start, but it fails to attach
to the process. When this happens I see this in the gdb window (after the
Reading symbols from /usr/bin/dash.exe...done.
Can't attach to process.
/cygdrive/c/Cygwin/home/peda/ggi/cyg64/ggi/default-shared/47784: No such file or directory.
(gdb)

Hmm, maybe the process doesn't exist anymore for some reason.

Post by Peter Rosin
Lastly, I have a tiny unrelated wishlist request of very low priority.
Could the w32api headers be updated to the latest from the mingw64 repo
the next time there's a gcc update? I had a small patch upstreamed that
will enable me to drop a local workaround. Thanks!

I'll see to it for the next rebuild, but right now Mingw HEAD doesn't
build for me.

Can you please test something for me? Can you please try the files

ftp://ftp.cygwin.com/pub/cygwin/64bit/cygwin1.dbg.bz2
ftp://ftp.cygwin.com/pub/cygwin/64bit/cygwin1.dll.bz2

Bunzip them and install into /bin and try again. I would like to test
an assumption.

Thanks,
Corinna


Peter Rosin

2013-03-11 16:16:49 UTC

Post by Corinna Vinschen

Post by Peter Rosin

Post by Corinna Vinschen
I've just uploaded a new 64 bit Cygwin DLL package 1.7.18-3. Can you
please restart testing with this version? I've been trying for about an
hour now to reproduce the problem, but so far I haven't succeeded, which
is a good sign, hopefully.

Good news first: it seems to be less frequent! But there's still
something bad going on, so sorry, no cigar...

Too bad. Unfortunately the backtrace is not helpful.

Post by Peter Rosin
Second, there's the non-crash exit (where dash sometimes exits w/o
[...]
Third (which I haven't reported previously, but I have seen it with -2 as
well), sometimes gdb gets triggered by error_start, but it fails to attach
to the process. When this happens I see this in the gdb window (after the
Reading symbols from /usr/bin/dash.exe...done.
Can't attach to process.
/cygdrive/c/Cygwin/home/peda/ggi/cyg64/ggi/default-shared/47784: No such file or directory.
(gdb)

Hmm, maybe the process doesn't exist anymore for some reason.

Post by Peter Rosin
Lastly, I have a tiny unrelated wishlist request of very low priority.
Could the w32api headers be updated to the latest from the mingw64 repo
the next time there's a gcc update? I had a small patch upstreamed that
will enable me to drop a local workaround. Thanks!

I'll see to it for the next rebuild, but right now Mingw HEAD doesn't
build for me.
Can you please test something for me? Can you please try the files
ftp://ftp.cygwin.com/pub/cygwin/64bit/cygwin1.dbg.bz2
ftp://ftp.cygwin.com/pub/cygwin/64bit/cygwin1.dll.bz2
Bunzip them and install into /bin and try again. I would like to test
an assumption.

I got one new symptom: at one point there was this "Hangup" in the
middle of the output:

checking if we should build filter-save... yes
checking if we should build filter-tcp... yes
checking if we should build filter-tile... yes
Hangup
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating gii/Makefile

Also, later I noticed this in the output:

config.status: creating config.h
config.status: executing depfiles commands
Segmentation fault
config.status: executing libtool commands
exit status: 0

And then last, but not least, in another configure run:

checking for dlfcn.h... yes
checking for as... as
checking for dlltool... (cached) dlltool
checking for objdump... (cached) objdump
checking for objdir... .libs
exit status: 0

(exit status is written by my build script)

But no crashes into gdb yet, should I keep going for one of
those or did you get sufficient answers?

Cheers,
Peter

Corinna Vinschen

2013-03-11 16:43:07 UTC

Post by Peter Rosin

Post by Corinna Vinschen

Post by Peter Rosin

Post by Corinna Vinschen
I've just uploaded a new 64 bit Cygwin DLL package 1.7.18-3. Can you
please restart testing with this version? I've been trying for about an
hour now to reproduce the problem, but so far I haven't succeeded, which
is a good sign, hopefully.

Good news first: it seems to be less frequent! But there's still
something bad going on, so sorry, no cigar...

Too bad. Unfortunately the backtrace is not helpful.

Post by Peter Rosin
Second, there's the non-crash exit (where dash sometimes exits w/o
[...]
Third (which I haven't reported previously, but I have seen it with -2 as
well), sometimes gdb gets triggered by error_start, but it fails to attach
to the process. When this happens I see this in the gdb window (after the
Reading symbols from /usr/bin/dash.exe...done.
Can't attach to process.
/cygdrive/c/Cygwin/home/peda/ggi/cyg64/ggi/default-shared/47784: No such file or directory.
(gdb)

Hmm, maybe the process doesn't exist anymore for some reason.

Post by Peter Rosin
Lastly, I have a tiny unrelated wishlist request of very low priority.
Could the w32api headers be updated to the latest from the mingw64 repo
the next time there's a gcc update? I had a small patch upstreamed that
will enable me to drop a local workaround. Thanks!

I'll see to it for the next rebuild, but right now Mingw HEAD doesn't
build for me.
Can you please test something for me? Can you please try the files
ftp://ftp.cygwin.com/pub/cygwin/64bit/cygwin1.dbg.bz2
ftp://ftp.cygwin.com/pub/cygwin/64bit/cygwin1.dll.bz2
Bunzip them and install into /bin and try again. I would like to test
an assumption.

I got one new symptom, at one point there was this "Hangup" in the
checking if we should build filter-save... yes
checking if we should build filter-tcp... yes
checking if we should build filter-tile... yes
Hangup
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating gii/Makefile
config.status: creating config.h
config.status: executing depfiles commands
Segmentation fault

Any stackdump file?

Post by Peter Rosin
config.status: executing libtool commands
exit status: 0
checking for dlfcn.h... yes
checking for as... as
checking for dlltool... (cached) dlltool
checking for objdump... (cached) objdump
checking for objdir... .libs
exit status: 0
(exit status is written by my build script)
But no crashes into gdb yet, should I keep going for one of
those or did you get sufficient answers?

Just run this DLL for the time being.

Thanks,
Corinna


Peter Rosin

2013-03-11 17:54:02 UTC

Post by Corinna Vinschen

Post by Peter Rosin

Post by Corinna Vinschen

Post by Peter Rosin

Post by Corinna Vinschen
I've just uploaded a new 64 bit Cygwin DLL package 1.7.18-3. Can you
please restart testing with this version? I've been trying for about an
hour now to reproduce the problem, but so far I haven't succeeded, which
is a good sign, hopefully.

Good news first: it seems to be less frequent! But there's still
something bad going on, so sorry, no cigar...

Too bad. Unfortunately the backtrace is not helpful.

Post by Peter Rosin
Second, there's the non-crash exit (where dash sometimes exits w/o
[...]
Third (which I haven't reported previously, but I have seen it with -2 as
well), sometimes gdb gets triggered by error_start, but it fails to attach
to the process. When this happens I see this in the gdb window (after the
Reading symbols from /usr/bin/dash.exe...done.
Can't attach to process.
/cygdrive/c/Cygwin/home/peda/ggi/cyg64/ggi/default-shared/47784: No such file or directory.
(gdb)

Hmm, maybe the process doesn't exist anymore for some reason.

Post by Peter Rosin
Lastly, I have a tiny unrelated wishlist request of very low priority.
Could the w32api headers be updated to the latest from the mingw64 repo
the next time there's a gcc update? I had a small patch upstreamed that
will enable me to drop a local workaround. Thanks!

I'll see to it for the next rebuild, but right now Mingw HEAD doesn't
build for me.
Can you please test something for me? Can you please try the files
ftp://ftp.cygwin.com/pub/cygwin/64bit/cygwin1.dbg.bz2
ftp://ftp.cygwin.com/pub/cygwin/64bit/cygwin1.dll.bz2
Bunzip them and install into /bin and try again. I would like to test
an assumption.

I got one new symptom, at one point there was this "Hangup" in the
checking if we should build filter-save... yes
checking if we should build filter-tcp... yes
checking if we should build filter-tile... yes
Hangup
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating gii/Makefile
config.status: creating config.h
config.status: executing depfiles commands
Segmentation fault

Any stackdump file?

I don't think so. I had already removed that build dir, but in a
fresh build with (at least) three "Segmentation fault" messages,
"find . -name '*.stackdump'" came up empty.

Post by Corinna Vinschen

Post by Peter Rosin
config.status: executing libtool commands
exit status: 0
checking for dlfcn.h... yes
checking for as... as
checking for dlltool... (cached) dlltool
checking for objdump... (cached) objdump
checking for objdir... .libs
exit status: 0
(exit status is written by my build script)
But no crashes into gdb yet, should I keep going for one of
those or did you get sufficient answers?

Just run this DLL for the time being.

Will do.

Cheers,
Peter

Peter Rosin

2013-03-11 16:27:44 UTC

Post by Peter Rosin
Second, there's the non-crash exit (where dash sometimes exits w/o
Basically, this time config.status didn't complete, and the configure
config.status: creating m4/Makefile
config.status: creating libgii.conf
config.status: creating dist/Makefile
config.status: creating dist/rpm/Makefile
config.status: creating dist/rpm/libgii.spec
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands
...
Making all in input
make[2]: Entering directory `/cygdrive/c/Cygwin/home/peda/ggi/cyg64/gii/input'
Making all in directx
make[3]: Entering directory `/cygdrive/c/Cygwin/home/peda/ggi/cyg64/gii/input/directx'
Makefile:314: .deps/di.Plo: No such file or directory
Makefile:315: .deps/dxguid.Plo: No such file or directory
Makefile:316: .deps/input.Plo: No such file or directory
...
due to the deps not being there.
This bug is really troublesome, it does not give me a cozy feeling when
scripts only get half-done w/o any notice...

[I wrote most of this before you asked me to test some assumption]

I have now finally managed to collect the exit status after such
a premature exit. Zero (or it might not get propagated back to the
calling shell by configure). This time, configure output ended
with:

checking for as... as
checking for dlltool... (cached) dlltool
checking for objdump... (cached) objdump
checking for objdir... .libs

The next expected line (which didn't appear) is:

checking if gcc supports -fno-rtti -fno-exceptions... no

But it seems to happen more often when configure runs config.status
near the end. Oh, right, I sometimes see a fourth class of errors,

config.status: creating config.h
config.status: executing depfiles commands
Segmentation fault
config.status: executing libtool commands

I.e., sometimes I get a Segmentation fault message when the process
terminates prematurely.

I'm also sad to report that while the gdb triggers have gone down
in frequency with -3, the premature exits have gone up, and I have
yet to see a clean run with either 1.7.18-2 or -3...

It might be time to reveal exactly what I'm doing, even if it isn't
anything special...

mkdir ggi
cd ggi
cvs -d:pserver:anonymous-***@public.gmane.org:/cvsroot/ggi login
cvs -z3 -d:pserver:anonymous-***@public.gmane.org:/cvsroot/ggi co -P ggi-core
mkdir cyg64
cd cyg64
cat <<'EOF' > build.sh
#!/bin/sh

CONFOPTS=--disable-static

auto=yes
get=
agen=
full=
conf=
make=auto
inst=auto

log ()
{
  file=$1

  echo =========================== | tee -a "$file"
  echo "$2" | tee -a "$file"
  echo =========================== | tee -a "$file"
}

fixauto ()
{
  if test $auto = no; then
    test "$make" = auto && make=
    test "$inst" = auto && inst=
  fi
}

dolib ()
{
  lib=$1
  src=$2
  tla=$3

  test "$tla" || tla=cvs

  fixauto

  dir=`pwd`
  logfile=$dir/$lib.log
  rm -rf $logfile
  log $logfile $lib
  mkdir $lib 2> /dev/null
  cd $lib
  cd $src
  test "$get" && log $logfile "$tla up"
  test "$get" && $tla up 2>&1 | tee -a $logfile
  test "$agen" && log $logfile "./autogen.sh"
  test "$agen" && ./autogen.sh 2>&1 | tee -a $logfile
  cd $dir
  test "$full" && log $logfile "rm -rf $lib"
  test "$full" && rm -rf $lib 2>&1 | tee -a $logfile
  test "$full" && mkdir $lib
  cd $lib
  test "$conf" && log $logfile "$src/configure $cacheopt $CONFOPTS"
  test "$conf" && eval "CONFIG_SHELL=/usr/bin/dash /usr/bin/dash $src/configure $cacheopt $CONFOPTS; echo "'"exit status: $?"' 2>&1 | tee -a $logfile
  test "$make" && log $logfile "make"
  test "$make" && make 2>&1 | tee -a $logfile
  test "$inst" && log $logfile "make install"
  test "$inst" && make install 2>&1 | tee -a $logfile
  cd $dir
}

for opt in $@; do

  case $opt in
  --all)
    auto=no
    get=yes
    agen=yes
    full=yes
    conf=yes
    make=yes
    inst=yes
    ;;

  --get) auto=no; get=yes ;;
  --noget) auto=no; get= ;;
  --autogen) auto=no; agen=yes ;;
  --noautogen) auto=no; agen= ;;
  --full) auto=no; full=yes ;;
  --nofull) auto=no; full= ;;
  --conf) auto=no; conf=yes ;;
  --noconf) auto=no; conf= ;;
  --make) auto=no; make=yes ;;
  --nomake) auto=no; make= ;;
  --inst) auto=no; inst=yes ;;
  --noinst) auto=no; inst= ;;

  test)
    fixauto

    test "$get" && echo cvs up
    test "$agen" && echo autogen.sh
    test "$full" && echo rm -rf
    test "$conf" && echo "configure $CONFOPTS"
    test "$make" && echo make
    test "$inst" && echo make install
    ;;

  gg) dolib gg ../../ggi-core/libgg ;;
  gii) dolib gii ../../ggi-core/libgii ;;
  ggi) dolib ggi ../../ggi-core/libggi ;;
  wmh) dolib wmh ../../misc/libggiwmh ;;
  gcp) dolib gcp ../../misc/libggigcp ;;
  misc) dolib misc ../../lowlevel/libggimisc ;;
  gic) dolib gic ../../ggi-libs/libggigic ;;

  galloc) dolib galloc ../../ggi-core/libgalloc ;;
  blt) dolib blt ../../lowlevel/libggiblt ;;
  buf) dolib buf ../../lowlevel/libggibuf ;;
  ovl) dolib ovl ../../lowlevel/libggiovl ;;
  bse) dolib bse ../../ggi-libs/libggibse ;;
  gpf) dolib gpf ../../ggi-libs/libggigpf ;;
  3d) dolib 3d ../../highlevel/libggi3d ;;
  gl) dolib gl ../../highlevel/libggigl ;;
  video) dolib video ../../highlevel/libvideo ;;
  xmi) dolib xmi ../../highlevel/libxmi ;;
  svga) dolib svga ../../wrappers/svgalib ;;

  widget) dolib widget ../../widget svn ;;

  esac

done
EOF
chmod +x build.sh
# here you need to go to each of ggi-core/libgg ggi-core/libgii
# and ggi-core/libggi, and run their respective autogen.sh
# scripts from some environment that has autotools, and then
# back to Cygwin-64 with
./build.sh --full --conf --make --inst gg gii ggi

And this last command is what I'm running again and again. It
will install libgg, libgii and libggi under /usr/local. You
should be able to run a demo with ggi/programs/demos/flying_ggis
if you manage to get it built.

Ahhhhrrrgg, you also need to patch the w32api headers with this
patch for it to build:
https://sourceforge.net/tracker/?func=detail&atid=983356&aid=3605977&group_id=202880

Cheers,
Peter

Corinna Vinschen

2013-03-11 16:46:27 UTC

Post by Peter Rosin

Post by Peter Rosin
Second, there's the non-crash exit (where dash sometimes exits w/o
Basically, this time config.status didn't complete, and the configure
config.status: creating m4/Makefile
config.status: creating libgii.conf
config.status: creating dist/Makefile
config.status: creating dist/rpm/Makefile
config.status: creating dist/rpm/libgii.spec
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands
...
Making all in input
make[2]: Entering directory `/cygdrive/c/Cygwin/home/peda/ggi/cyg64/gii/input'
Making all in directx
make[3]: Entering directory `/cygdrive/c/Cygwin/home/peda/ggi/cyg64/gii/input/directx'
Makefile:314: .deps/di.Plo: No such file or directory
Makefile:315: .deps/dxguid.Plo: No such file or directory
Makefile:316: .deps/input.Plo: No such file or directory
...
due to the deps not being there.
This bug is really troublesome, it does not give me a cozy feeling when
scripts only get half-done w/o any notice...

[I wrote most of this before you asked me to test some assumption]
I have now finally managed to collect the exit status after such
a premature exit. Zero (or it might not get propagated back to the
calling shell by configure). This time, configure output ended
checking for as... as
checking for dlltool... (cached) dlltool
checking for objdump... (cached) objdump
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
But it seems to happen more often when configure runs config.status
near the end. Oh, right, I sometimes see a fourth class of errors,

Oh no, please don't. This is getting confusing since I don't know
where to start anymore. Can we try to stick with one error at a time
please?

Post by Peter Rosin
It might be time to reveal exactly what I'm doing, even if it isn't
anything special...

I'll try this at some point this week.

Corinna

--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat

Peter Rosin

2013-03-11 18:02:36 UTC

Post by Corinna Vinschen

Post by Peter Rosin

Post by Peter Rosin
Second, there's the non-crash exit (where dash sometimes exits w/o
Basically, this time config.status didn't complete, and the configure
config.status: creating m4/Makefile
config.status: creating libgii.conf
config.status: creating dist/Makefile
config.status: creating dist/rpm/Makefile
config.status: creating dist/rpm/libgii.spec
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands
...
Making all in input
make[2]: Entering directory `/cygdrive/c/Cygwin/home/peda/ggi/cyg64/gii/input'
Making all in directx
make[3]: Entering directory `/cygdrive/c/Cygwin/home/peda/ggi/cyg64/gii/input/directx'
Makefile:314: .deps/di.Plo: No such file or directory
Makefile:315: .deps/dxguid.Plo: No such file or directory
Makefile:316: .deps/input.Plo: No such file or directory
...
due to the deps not being there.
This bug is really troublesome, it does not give me a cozy feeling when
scripts only get half-done w/o any notice...

[I wrote most of this before you asked me to test some assumption]
I have now finally managed to collect the exit status after such
a premature exit. Zero (or it might not get propagated back to the
calling shell by configure). This time, configure output ended
checking for as... as
checking for dlltool... (cached) dlltool
checking for objdump... (cached) objdump
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
But it seems to happen more often when configure runs config.status
near the end. Oh, right, I sometimes see a fourth class of errors,

Oh no, please don't. This is getting confusing since I don't know
where to start anymore. Can we try to stick with one error at a time
please?

Ok, let's call it classes of symptoms then, because I don't know how
to distinguish the different errors if there indeed are more than one
error.

The classes are:

1. crash into gdb, but limited bt info.
2. premature exit, no message, exit code zero (I think).
3. crash into gdb, but failing to attach to process.
4. premature exit, "Segmentation fault", exit code 0 (I think).
5. premature exit, "Hangup", exit code 0 (I think).
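
One way to tell classes 2, 4 and 5 apart without watching the terminal is to wrap each run, keep its output, and record the exit status together with any crash message. The sketch below is only an illustration: the function name classify_run and the log file are made up here, and it assumes the "Segmentation fault" or "Hangup" text ends up in the captured output.

```shell
#!/bin/sh
# classify_run LOGFILE COMMAND...
# Runs COMMAND, stores stdout+stderr in LOGFILE, then prints the exit
# status and the first crash message (if any) found in the log.
# (Hypothetical helper, not part of build.sh or Cygwin.)
classify_run() {
    log=$1; shift
    "$@" > "$log" 2>&1
    status=$?
    if grep -q 'Segmentation fault' "$log"; then
        msg='Segmentation fault'      # symptom class 4
    elif grep -q 'Hangup' "$log"; then
        msg='Hangup'                  # symptom class 5
    else
        msg='no message'              # class 2, or a run that really finished
    fi
    echo "exit=$status message=$msg"
}
```

It could be used as "classify_run build.log ./build.sh --full --conf --make --inst gg gii ggi". A clean run and a class 2 exit both report "exit=0 message=no message", so the end of the log still has to be checked by hand to see whether the build actually got to the end.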

Cheers,
Peter

Corinna Vinschen

2013-03-13 10:45:54 UTC

Hi Peter,

Post by Peter Rosin

Post by Corinna Vinschen
Oh no, please don't. This is getting confusing since I don't know
where to start anymore. Can we try to stick with one error at a time
please?

Ok, let's call it classes of symptoms then, because I don't know how
to distinguish the different errors if there indeed are more than one
error.
1. crash into gdb, but limited bt info.
2. premature exit, no message, exit code zero (I think).
3. crash into gdb, but failing to attach to process.
4. premature exit, "Segmentation fault", exit code 0 (I think).
5. premature exit, "Hangup", exit code 0 (I think).

I've just uploaded a cygwin-1.7.18-4 package to
ftp://ftp.cygwin.com/pub/64bit/release/cygwin, which is supposed to
fix at least the worst of it. Would you mind giving it a whirl?

Thanks,
Corinna

--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat

Peter Rosin

2013-03-13 14:29:17 UTC

Post by Corinna Vinschen
Hi Peter,

Post by Peter Rosin

Post by Corinna Vinschen
Oh no, please don't. This is getting confusing since I don't know
where to start anymore. Can we try to stick with one error at a time
please?

Ok, let's call it classes of symptoms then, because I don't know how
to distinguish the different errors if there indeed are more than one
error.
1. crash into gdb, but limited bt info.
2. premature exit, no message, exit code zero (I think).
3. crash into gdb, but failing to attach to process.
4. premature exit, "Segmentation fault", exit code 0 (I think).
5. premature exit, "Hangup", exit code 0 (I think).

I've just uploaded a cygwin-1.7.18-4 package to
ftp://ftp.cygwin.com/pub/64bit/release/cygwin, which is supposed to

(that's ftp://ftp.cygwin.com/pub/cygwin/64bit/release/cygwin)

Post by Corinna Vinschen
fix at least the worst of it. Would you mind giving it a whirl?

Might you have fixed all of it? No crap so far after three builds
anyway, so, from here it looks like whatever you did for the 1.7.18-4
update nailed it. Kudos!

But wait, let's see how sending this mail affects things...

Cheers,
Peter

Corinna Vinschen

2013-03-13 14:47:28 UTC

Post by Peter Rosin

Post by Corinna Vinschen
Hi Peter,

Post by Peter Rosin

Post by Corinna Vinschen
Oh no, please don't. This is getting confusing since I don't know
where to start anymore. Can we try to stick with one error at a time
please?

Ok, let's call it classes of symptoms then, because I don't know how
to distinguish the different errors if there indeed are more than one
error.
1. crash into gdb, but limited bt info.
2. premature exit, no message, exit code zero (I think).
3. crash into gdb, but failing to attach to process.
4. premature exit, "Segmentation fault", exit code 0 (I think).
5. premature exit, "Hangup", exit code 0 (I think).

I've just uploaded a cygwin-1.7.18-4 package to
ftp://ftp.cygwin.com/pub/64bit/release/cygwin, which is supposed to

(that's ftp://ftp.cygwin.com/pub/cygwin/64bit/release/cygwin)

Post by Corinna Vinschen
fix at least the worst of it. Would you mind giving it a whirl?

Might you have fixed all of it? No crap so far after three builds
anyway, so, from here it looks like whatever you did for the 1.7.18-4
update nailed it. Kudos!

Kudos to Kai in the first place. He was the one who suddenly realized
that a function call embedded into hand-crafted assembler code will
overwrite the arguments given to any arbitrary Cygwin function.
(Un)Fortunately the inlaid function isn't called very often so this
results in a kind of random pattern of unexpected arguments to a Cygwin
function.

Post by Peter Rosin
But wait, let's see how sending this mail affects things...

Yes, this is an important test. The aforementioned patch was definitely
required, but there's probably more strange stuff lurking in dark
corners of the code...

Corinna

--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat

Corinna Vinschen

2013-03-15 15:51:51 UTC

Hi Peter,
Hi all other 64 bit Cygwin testers,

Post by Corinna Vinschen

Post by Peter Rosin

Post by Corinna Vinschen
I've just uploaded a cygwin-1.7.18-4 package to
ftp://ftp.cygwin.com/pub/64bit/release/cygwin, which is supposed to

(that's ftp://ftp.cygwin.com/pub/cygwin/64bit/release/cygwin)

Post by Corinna Vinschen
fix at least the worst of it. Would you mind giving it a whirl?

Might you have fixed all of it? No crap so far after three builds
anyway, so, from here it looks like whatever you did for the 1.7.18-4
update nailed it. Kudos!

Kudos to Kai in the first place. He was the one who suddenly realized
that a function call embedded into hand-crafted assembler code will
overwrite the arguments given to any arbitrary Cygwin function.
(Un)Fortunately the inlaid function isn't called very often so this
results in a kind of random pattern of unexpected arguments to a Cygwin
function.

Post by Peter Rosin
But wait, let's see how sending this mail affects things...

Yes, this is an important test. The aforementioned patch was definitely
required, but there's probably more strange stuff lurking in dark
corners of the code...

I just uploaded a new 64 bit Cygwin package 1.7.18-7.

I'm cautiously hopeful that this finally fixes the random crashes we all
encountered in various scenarios. Today it occurred to me that all "my"
crashes happen in forked processes. I discussed this with Kai and while
looking into this issue, Kai pointed out that the pseudo relocator might
be called too late in a forked child. That seemed to be exactly our
issue, since the too-late call would end up relocating already relocated
data, so data and bss segment might contain random pointers in a forked
child.

After moving the call to an earlier point, the crashes were gone. While
I was at it, I also made a few other changes, mostly centered around
reliability issues, as well as stack alignment issues in the
auto-generated assembler code.
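
To make the "relocating already relocated data" point concrete, here is a toy model in shell arithmetic. It is only a sketch: the pointer value and the delta are invented numbers, and relocate is a made-up stand-in, not Cygwin's pseudo relocator. It just shows that applying the same fix-up a second time pushes a stored pointer past its intended target.

```shell
#!/bin/sh
# Toy model only: none of these numbers or names come from Cygwin's sources.
link_time_ptr=$((0x100401000))  # hypothetical pointer stored in .data at link time
delta=$((0x10000))              # hypothetical offset the relocator has to add

# relocate VALUE: what a fix-up does to one stored pointer.
relocate() {
    echo $(( $1 + delta ))
}

once=$(relocate "$link_time_ptr")  # intended result: fixed up exactly once
twice=$(relocate "$once")          # the failure mode: fixing up an already fixed-up value

printf 'relocated once:  0x%x\n' "$once"    # 0x100411000
printf 'relocated twice: 0x%x\n' "$twice"   # 0x100421000
```

In a forked child the data and bss contents come from the already relocated parent, so a relocator run on top of them would correspond to the "twice" case above.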

So I'd like to urge you to update to the latest 1.7.18-7 package and try
your various scenarios again over the weekend.

Thanks to all of you for your endurance,
Corinna

--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat

Corinna Vinschen

2013-03-15 16:29:09 UTC

Post by Corinna Vinschen
I just uploaded a new 64 bit Cygwin package 1.7.18-7.
I'm cautiously hopeful that this finally fixes the random crashes we all
encountered in various scenarios. Today it occurred to me that all "my"
crashes happen in forked processes. I discussed this with Kai and while
looking into this issue, Kai pointed out that the pseudo relocator might
be called too late in a forked child. That seemed to be exactly our
issue, since the too-late call would end up relocating already relocated
data, so data and bss segment might contain random pointers in a forked
child.
After moving the call to an earlier point, the crashes were gone. While
I was at it, I also made a few other changes, mostly centered around
reliability issues, as well as stack alignment issues in the
auto-generated assembler code.
So I'd like to urge you to update to the latest 1.7.18-7 package and try
your various scenarios again over the weekend.

Or not. Please wait a while. I'll upload a -8 package soon. I found
an issue with perl right after uploading and I have a strange problem
with bash right now. Talk about reliability changes... sigh.

Corinna

--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat

Ryan Johnson

2013-03-10 20:11:07 UTC

Post by Peter Rosin

Post by Peter Rosin

Post by Corinna Vinschen

Post by Corinna Vinschen

Post by Corinna Vinschen

Post by Peter Rosin

Post by Corinna Vinschen

Post by Peter Rosin
Hi!
I doubt there is a shortage of obscure things to track down in the land
of 64-bit, but while building a package using the stuff from install/release
I noticed a segfault in dash when it ran a libtool script to generate a
dll. Retrying got the dll built correctly.
Fact is, I do see segfaults once in a while, but retrying has always helped
so far, so I haven't pursued it.
How do I set up a debugger to get more info than the below stackdump?

I added a 64 bit Cygwin GDB package to the install area a couple
of days ago. I guess a debug version of dash (especially built w/o
optimization) won't hurt either.

Ok, I recompiled dash locally (.../configure CFLAGS=-g --prefix=/usr)
and used CYGWIN='error_start=C:\...\bin\dumper.exe' and got myself a
core file...
Not much appears to be going on though, suggestions are welcome...

Hmm. What about error_start=C:\...\gdb.exe? Maybe that gives you a bit
more "live" information.

Btw., I just checked the RIP value in the stackdump output you sent.
Assuming you're using cygwin1.dll from the base package, this would be
ptmalloc3.cc, line 792. This in turn would point to a call of free() on
something not a valid pointer.
Assuming you're using cygwin1.dll from the cygwin-1.7.18-2.tar.bz2
package in the 64bit/release area, that would be malloc-private.h, line 88.
That would be a mutex_unlock call from within the ptmalloc3 code.
The missing stack is a pity, though, since that leaves us with no
trace about the circumstances. If you reproduce the same with a
non-optimized debug version of dash, does the stackdump contain a
stack backtrace?

And, another btw., you should definitely use the cygwin-1.7.18-2.tar.bz2
version. It fixes a serious bug present in the base package's Cygwin
DLL.

I got the below with gdb as error_start.
CYGWIN_NT-6.1 PEDA-PC 1.7.18(0.263/5/3) 2013-03-07 13:54 x86_64 Cygwin
So I guess the one from install. However, I did untar the release/cygwin
one as well, but, I did use "tar xkf". I did it from 32-bit Cygwin with
a "find .... | xargs -n 1 tar xzf" invocation after mirroring the install
and release areas. I didn't really expect clashes...
I'm now going to install the dll from release/cygwin (for real) and retry.

Ok, here's a crash with the dll from release, still with home-built dash
w/o -O2.

BTW, gdb's "t a a bt" command is your friend (thread apply all: bt)
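
For a saved core file, the same command can also be run non-interactively. A minimal sketch, assuming gdb is on the PATH; the helper name bt_all and the file names in the usage line are made up for illustration:

```shell
#!/bin/sh
# bt_all PROGRAM [CORE-OR-PID]
# Starts gdb in batch mode, runs "thread apply all bt" (the long form of
# "t a a bt"), prints the backtraces of all threads and exits.
# (Hypothetical wrapper; -batch and -ex are standard gdb options.)
bt_all() {
    gdb -batch -ex 'thread apply all bt' "$@"
}
```

Usage would look like "bt_all ./dash.exe dash.exe.core > bt.txt", which makes it easy to attach the full output to a mail.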

Ryan

Peter Rosin

2013-03-10 20:15:36 UTC

Post by Ryan Johnson

Post by Peter Rosin

Post by Peter Rosin
So I guess the one from install. However, I did untar the release/cygwin
one as well, but, I did use "tar xkf". I did it from 32-bit Cygwin with
a "find .... | xargs -n 1 tar xzf" invocation after mirroring the install

It seems xzf is etched deeply into the fingers, that should have been:
"find .... | xargs -n 1 tar xkf"
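
The k matters here: it tells tar not to replace files that already exist, which would explain why the DLL from install survived unpacking the release tarball. A small sketch with made-up file contents in a scratch directory (GNU tar may additionally report an error for the skipped file, hence the "|| true"):

```shell
#!/bin/sh
# Demonstration only; the archives and contents below are invented.
work=$(mktemp -d)
cd "$work" || exit 1

mkdir install release dest
echo "from install" > install/cygwin1.dll
echo "from release" > release/cygwin1.dll
tar cf install.tar -C install cygwin1.dll
tar cf release.tar -C release cygwin1.dll

# First archive is unpacked normally.
tar xf install.tar -C dest
# Second archive is unpacked with k: the existing file is kept as is.
tar xkf release.tar -C dest 2>/dev/null || true

result=$(cat dest/cygwin1.dll)
echo "$result"    # prints "from install"
```

Without the k, the second extraction would overwrite the file and the result would be "from release".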

Post by Ryan Johnson

Post by Peter Rosin

Post by Peter Rosin
and release areas. I didn't really expect clashes...
I'm now going to install the dll from release/cygwin (for real) and retry.

Ok, here's a crash with the dll from release, still with home-built dash
w/o -O2.

BTW, gdb's "t a a bt" command is your friend (thread apply all: bt)

Thanks for the hint!

Cheers,
Peter
