Ludovic Courtès <l...@gnu.org> wrote:

> Through a dichotomy I tried to see how far it goes.  The info I have so
> far is that ld.so errors out from elf/rtld.c:563 (line 565 is not
> reached):
>
>   558:   if (bootstrap_map.l_addr || ! bootstrap_map.l_info[VALIDX(DT_GNU_PRELINKED)])
>   559:     {
>   560:       /* Relocate ourselves so we can do normal function calls and
>   561:          data access using the global offset table.  */
>   562:
>   563:       ELF_DYNAMIC_RELOCATE (&bootstrap_map, 0, 0, 0);
>   564:     }
>   565:   bootstrap_map.l_relocated = 1;
>   ...
>   578:   __rtld_malloc_init_stubs ();

Via brute force¹, I found that ‘__assert_fail’ is hit, with its first
argument in $eax being:

--8<---------------cut here---------------start------------->8---
db> x/c 0x28604,80
ELF32_R_TYPE (reloc->r_info) == R_386_RELATIVE\000\000map->l_in
fo[VERSYMIDX (DT_VERSYM)] != NULL\000\000Fatal glibc error: Too many audit mo
--8<---------------cut here---------------end--------------->8---

This comes from i386/dl-machine.h:

--8<---------------cut here---------------start------------->8---
auto inline void
__attribute ((always_inline))
elf_machine_rel_relative (Elf32_Addr l_addr, const Elf32_Rel *reloc,
                          void *const reloc_addr_arg)
{
  Elf32_Addr *const reloc_addr = reloc_addr_arg;
  assert (ELF32_R_TYPE (reloc->r_info) == R_386_RELATIVE);
  *reloc_addr += l_addr;
}
--8<---------------cut here---------------end--------------->8---

How can we get there?  Looking at ‘_dl_start’, it could be that
‘elf_machine_load_address’ returns a bogus value and we end up reading
wrong ELF data?  Or it could be memory corruption somewhere.  Or…?

Thing is, it’s not fully deterministic (happens 9 times out of 10 with
KVM, never happens without KVM).

Ideas?  :-)

Ludo’.

¹ Building with ‘-fno-optimize-sibling-calls’ didn’t help get nicer
  backtraces, but that’s prolly because all that early relocation code
  is inlined.
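
To make the failing check concrete, here is a minimal standalone sketch of what that assertion tests.  It is not glibc code: ‘demo_rel’, ‘DEMO_R_TYPE’, ‘DEMO_R_386_RELATIVE’, ‘demo_rel_relative’ and ‘first_non_relative’ are hypothetical names that mirror the <elf.h> layout of a 32-bit REL entry (type in the low 8 bits of ‘r_info’, R_386_RELATIVE being 8).  The scanner only illustrates how an entry read from the wrong place, or a corrupted one, would trip the check; it says nothing about which of the hypotheses above is the right one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for Elf32_Rel from <elf.h> (assumption: same layout).  */
typedef struct
{
  uint32_t r_offset;   /* Address the relocation applies to.  */
  uint32_t r_info;     /* Symbol index (high 24 bits), type (low 8 bits).  */
} demo_rel;

/* Mirrors ELF32_R_TYPE and R_386_RELATIVE from <elf.h>.  */
#define DEMO_R_TYPE(info)     ((info) & 0xff)
#define DEMO_R_386_RELATIVE   8

/* Same shape as elf_machine_rel_relative: check the type, then add the
   load address to the word being relocated.  */
static void
demo_rel_relative (uint32_t l_addr, const demo_rel *reloc,
                   uint32_t *reloc_addr)
{
  assert (DEMO_R_TYPE (reloc->r_info) == DEMO_R_386_RELATIVE);
  *reloc_addr += l_addr;
}

/* Hypothetical helper: return the index of the first entry whose type
   is not R_386_RELATIVE, or -1 if every entry passes the check.  */
static long
first_non_relative (const demo_rel *rel, size_t count)
{
  for (size_t i = 0; i < count; i++)
    if (DEMO_R_TYPE (rel[i].r_info) != DEMO_R_386_RELATIVE)
      return (long) i;
  return -1;
}
```

If the relative relocation table were read at a wrong address, ‘first_non_relative’ would likely report an early index, whereas a single corrupted entry would show up as one isolated index in an otherwise clean table.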