On 26/03/2025 18:10, Camm Maguire via Cygwin-apps wrote:
Greetings, and thanks so much for the feedback!

I watched the build, and it hung on a single step early on for 6h.
Forking children is not reliable here; I've seen the problem come
and go on my machine.

Here is the relevant code, back from several years ago when cygwin was
32bit.  The first branch should work unmodified, but posix_spawnp fails
with 'no error', hence the ifdef.  I have likewise seen the
CreateProcess fail, but much less frequently.
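
For reference, one way a spawn failure can end up reported as 'no error': POSIX specifies that posix_spawnp returns the error number as its return value and does not require it to set errno, so printing strerror(errno) after a failure can show a stale or zero value. Whether that is what happens in gcl is only a guess, since the code in question isn't shown here. A minimal sketch, where run_child is a hypothetical helper and not anything from gcl:

```c
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper (not from gcl): run argv[0], looked up via PATH,
   and wait for it.  Returns the child's exit status, or -1 on failure. */
static int run_child(char *const argv[])
{
    pid_t pid;

    /* posix_spawnp returns an error number directly; it is not required
       to set errno, so report rc rather than errno. */
    int rc = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (rc != 0) {
        fprintf(stderr, "posix_spawnp(%s): %s\n", argv[0], strerror(rc));
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }

    /* Treat death by signal as a failure too. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with something like { "ar", "x", "libfoo.a", NULL }, this would at least print the real error number if the spawn itself fails.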

At the point of failure, the program was trying to run 'ar x libfoo.a'
in a child and simply hung.

How sure are you about that?  The last lines of the build log are:

ar x libpre_gcl.a $(ar t libpre_gcl.a |grep ^gcl_)
/cygdrive/d/a/scallywag/playground/gcl27-2.7.0-1.x86_64/src/gcl/gcl/unixport/raw_pre_gcl.exe /cygdrive/d/a/scallywag/playground/gcl27-2.7.0-1.x86_64/src/gcl/gcl/unixport/ -libdir /cygdrive/d/a/scallywag/playground/gcl27-2.7.0-1.x86_64/src/gcl/gcl/ < foo
The assertion !mbrk(v) on line 410 of alloc.c in function alloc_page failed: No error

The assertion failure seems relevant.

Locally, I get the same thing, and attaching to raw_pre_gcl with gdb I can obtain the following backtrace:

#0  0x00007ffb1542e044 in ntdll!ZwWaitForMultipleObjects () from 
/cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#1  0x00007ffb12f33ec0 in WaitForMultipleObjectsEx () from 
/cygdrive/c/WINDOWS/System32/KERNELBASE.dll
#2  0x00007ffb12f33dbe in WaitForMultipleObjects () from 
/cygdrive/c/WINDOWS/System32/KERNELBASE.dll
#3  0x00007ffac1da603c in cygwait (object=<optimized out>, 
timeout=timeout@entry=0x7fffab7b0, mask=mask@entry=32) at 
/usr/src/debug/cygwin-3.6.0-1/winsup/cygwin/cygwait.cc:79
#4  0x00007ffac1e1afc7 in cygwait (mask=32, howlong=60000, h=<optimized out>) 
at /usr/src/debug/cygwin-3.6.0-1/winsup/cygwin/local_includes/cygwait.h:45
#5  sig_send (p=<optimized out>, p@entry=0x0, si=..., 
tls=tls@entry=0x7ffface00) at 
/usr/src/debug/cygwin-3.6.0-1/winsup/cygwin/sigproc.cc:846
#6  0x00007ffac1dbf7fb in exception::handle (e=0x7fffac770, frame=<optimized out>, 
in=<optimized out>, dispatch=<optimized out>) at 
/usr/src/debug/cygwin-3.6.0-1/winsup/cygwin/exceptions.cc:825
#7  0x00007ffb154328bf in ntdll!.chkstk () from 
/cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#8  0x00007ffb153e2554 in ntdll!RtlRaiseException () from 
/cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#9  0x00007ffb154313ce in ntdll!KiUserExceptionDispatcher () from 
/cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#10 0x00000001004799db in assert_error (a=a@entry=0x1004c0208 <__FUNCTION__.3+1968> "!mbrk(v)", 
l=l@entry=410, f=f@entry=0x1004c0200 <__FUNCTION__.3+1960> "alloc.c",
    n=n@entry=0x1004c07f0 <__FUNCTION__.9> "alloc_page") at error.c:44
#11 0x000000010045e57f in alloc_page (n=<optimized out>) at alloc.c:410
#12 0x0000000100462201 in gcl_init_alloc (cs_start=<optimized out>) at 
alloc.c:1368
#13 0x00000001004623fc in malloc_internal (size=<optimized out>, 
size@entry=248) at alloc.c:1682
#14 0x0000000100462a65 in malloc (size=248) at alloc.c:1699
#15 calloc (nelem=<optimized out>, elsize=<optimized out>) at alloc.c:1782
#16 0x00007ffac1ed0222 in calloc (nmemb=1, size=248) at 
/usr/src/debug/cygwin-3.6.0-1/winsup/cygwin/mm/malloc_wrapper.cc:133
#17 0x00007ffac1e01907 in pthread::init_mainthread () at 
/usr/src/debug/cygwin-3.6.0-1/winsup/cygwin/thread.cc:369
#18 0x00007ffac1da711f in dll_crt0_1 () at 
/usr/src/debug/cygwin-3.6.0-1/winsup/cygwin/dcrt0.cc:849
#19 0x00007ffac1da5d05 in _cygtls::call2 (this=0x7ffface00, func=0x7ffac1da7010 
<dll_crt0_1(void*)>, arg=0x0, buf=buf@entry=0x7fffacdf0) at 
/usr/src/debug/cygwin-3.6.0-1/winsup/cygwin/cygtls.cc:41
#20 0x00007ffac1da5dba in _cygtls::call (func=<optimized out>, arg=<optimized 
out>) at /usr/src/debug/cygwin-3.6.0-1/winsup/cygwin/cygtls.cc:28
#21 0x0000000000000000 in ?? ()

Which I'm loosely interpreting as:

The executable raw_pre_gcl provides its own versions of the malloc routines.

These end up getting called during DLL initialization, before main is called (see [1])

Maybe they aren't prepared for that, because an assertion fails?

assert_error() tries to gcl_abort(), which segfaults.

Then something horrible goes wrong inside Cygwin, maybe because we're trying to send a SIGSEGV before we've initialized properly, and we end up hanging?

[1] https://cygwin.com/faq.html#faq.programming.own-malloc
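
To illustrate what "prepared for that" could mean: a replacement allocator that may be entered before main has to set itself up on first use, and it is safer for it to return NULL than to assert while the runtime is still starting. The following is only a sketch under those assumptions. It is not GCL's allocator, and early_safe_malloc, allocator_init and the static arena are all made-up names:

```c
#include <stddef.h>

/* Hypothetical stand-in for an application allocator; not GCL's code.
   The arena has static storage, so it is usable before main() runs. */
static _Alignas(16) unsigned char arena[1 << 16];
static size_t arena_used;
static int allocator_ready;   /* zero until the first call */

static void allocator_init(void)
{
    arena_used = 0;
    allocator_ready = 1;
}

/* Bump allocator that initializes itself on first use, so it does not
   matter whether the first caller is main() or DLL start-up code. */
void *early_safe_malloc(size_t size)
{
    if (!allocator_ready)
        allocator_init();

    /* Round the request up to a multiple of 16 bytes. */
    size_t aligned = (size + 15) & ~(size_t)15;

    /* Report exhaustion (or overflow) to the caller instead of asserting. */
    if (aligned < size || aligned > sizeof arena - arena_used)
        return NULL;

    void *p = arena + arena_used;
    arena_used += aligned;
    return p;
}
```

The point is just the lazy-initialization guard and the NULL return; a real allocator would also need free, realloc and locking.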
