On Wed, Nov 27, 2013 at 10:27:20AM -0700, Eric Blake wrote:
> [adding glibc]
>
> On 11/27/2013 09:58 AM, Michal Privoznik wrote:
> > Hey guys,
> >
> > I've just discovered a bug, well a hang in conftest. This is what I ran:
> >
> > libvirt.git $ git clean -fxd; ./autogen.sh --system
> >
> > and all looked good until this:
> >
> > checking whether readlink signature is correct... yes
> > checking whether readlink handles trailing slash correctly... yes
> > checking for working re_compile_pattern...
> >
> > When the configure script hang and didn't continue. Attaching a debugger to
> > hanging conftest process showed:
> >
> > > __lll_lock_wait_private () at
> > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:93
> > 93 ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file
> > or directory.
> > (gdb) bt
> > #0 __lll_lock_wait_private () at
> > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:93
> > #1 0x0000003274e7fbd3 in _L_lock_11326 () at malloc.c:5236
> > #2 0x0000003274e7dd55 in __GI___libc_malloc (bytes=53) at malloc.c:2921
>
> Sounds like glibc is trying to obtain the malloc lock...
>
> > #3 0x0000003274a0533a in local_strdup (s=0x7feab7a0bf21
> > "/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.2/libgcc_s.so.1") at dl-load.c:162
> > #4 0x0000003274a08588 in _dl_map_object
> > (loader=loader@entry=0x7feab7c5c658, name=name@entry=0x3274f60e30
> > "libgcc_s.so.1", type=type@entry=2, trace_mode=trace_mode@entry=0,
> > mode=mode@entry=-1879048191, nsid=<optimized out>)
> > at dl-load.c:2249
> > #5 0x0000003274a12a2c in dl_open_worker (a=a@entry=0x7fff58d2b7c0) at
> > dl-open.c:225
> > #6 0x0000003274a0e8c4 in _dl_catch_error
> > (objname=objname@entry=0x7fff58d2b7b0,
> > errstring=errstring@entry=0x7fff58d2b7b8,
> > mallocedp=mallocedp@entry=0x7fff58d2b7af,
> > operate=operate@entry=0x3274a12900 <dl_open_worker>,
> > args=args@entry=0x7fff58d2b7c0) at dl-error.c:178
> > #7 0x0000003274a124b1 in _dl_open (file=0x3274f60e30 "libgcc_s.so.1",
> > mode=-2147483647, caller_dlopen=<optimized out>, nsid=-2, argc=1,
> > argv=0x7fff58d2c9d8, env=0x7fff58d2c9e8) at dl-open.c:639
> > #8 0x0000003274f1b202 in do_dlopen (ptr=ptr@entry=0x7fff58d2b9e0) at
> > dl-libc.c:89
> > #9 0x0000003274a0e8c4 in _dl_catch_error (objname=0x7fff58d2b9c0,
> > errstring=0x7fff58d2b9c8, mallocedp=0x7fff58d2b9bf, operate=0x3274f1b1c0
> > <do_dlopen>, args=0x7fff58d2b9e0) at dl-error.c:178
> > #10 0x0000003274f1b29f in dlerror_run (operate=operate@entry=0x3274f1b1c0
> > <do_dlopen>, args=args@entry=0x7fff58d2b9e0) at dl-libc.c:48
> > #11 0x0000003274f1b311 in __GI___libc_dlopen_mode
> > (name=name@entry=0x3274f60e30 "libgcc_s.so.1", mode=mode@entry=-2147483647)
> > at dl-libc.c:165
> > #12 0x0000003274ef7895 in init () at
> > ../sysdeps/x86_64/../ia64/backtrace.c:53
> > #13 0x0000003274ef79e5 in __GI___backtrace
> > (array=array@entry=0x7fff58d2bc80, size=size@entry=64) at
> > ../sysdeps/x86_64/../ia64/backtrace.c:104
> > #14 0x0000003274e74364 in __libc_message (do_abort=do_abort@entry=2,
> > fmt=fmt@entry=0x3274f65470 "*** glibc detected *** %s: %s: 0x%s ***\n") at
> > ../sysdeps/unix/sysv/linux/libc_fatal.c:178
>
> ...in order to report malloc arena corruption...
>
> > #15 0x0000003274e79d2e in malloc_printerr (action=3, str=0x3274f6248b
> > "malloc(): memory corruption", ptr=<optimized out>) at malloc.c:5007
> > #16 0x0000003274e7b7e4 in _int_malloc (av=0x32751a1620 <main_arena>,
> > bytes=<optimized out>) at malloc.c:3555
> > #17 0x0000003274e7ed90 in __libc_calloc (n=216713008672, n@entry=88,
> > elem_size=0, elem_size@entry=1) at malloc.c:3274
>
> ...detected while the malloc lock is already held. That explains the
> deadlock. Sounds like a glibc bug worth fixing (if it isn't already) -
> if glibc is going to go to the effort of informing the user about
> memory corruption, it should not use malloc() in the attempt.

I think there's a bug report for this out there already. I believe the
best way to fix this would be to make malloc use mmap if it finds that
the heap is corrupt and the program is exiting.

For a workaround, you could export MALLOC_CHECK_ to 2 so that the
program only aborts and does not try to print a backtrace.

Siddhesh
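
The workaround could look roughly like the sketch below. It assumes a
POSIX shell and a glibc of that era, where MALLOC_CHECK_=2 means "abort
on detected heap corruption without printing the diagnostic"; newer
glibc releases may treat the variable differently. The autogen.sh
invocation is the one from the original report, and the guard around it
is only there so the snippet does nothing outside a source checkout.

```shell
# Workaround sketch (assumption: older glibc semantics for MALLOC_CHECK_).
# Value 2 asks malloc to abort right away on detected corruption, which
# skips the message/backtrace path that re-entered malloc and deadlocked.
export MALLOC_CHECK_=2

# Re-run the step that hung; command taken from the original report.
# The guard keeps this harmless when run outside the libvirt checkout.
if [ -x ./autogen.sh ]; then
    ./autogen.sh --system
fi
```

With this set, the conftest program should die with SIGABRT instead of
hanging. The underlying heap corruption is still there and still needs
to be tracked down.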