Hi!

Samuel Thibault <samuel.thiba...@gnu.org> writes:

> Ludovic Courtès, on Fri, 07 Oct 2022 00:10:15 +0200, wrote:

[...]

>> Of course, the thing boots just fine on that machine when using
>> ‘exec.static’.
>
> Uh. At least you have a workaround :)

Yup.  :-)

>> So the issue might be somewhere in ld.so, or triggered by ld.so.
>> Any debugging tricks here?
>
> maybe for a start check with
>
> show map $map2
>
> what is actually mapped, whether it's just ld.so, or also exec, etc.

On the first trap (page fault) I see:

--8<---------------cut here---------------start------------->8---
db> show all tasks
 ID TASK     NAME [THREADS]
  0 f5f7cf00 gnumach [7]
  1 f5f7ce40 ext2fs [6]
  2 f5f7cd80 exec [1]
db> show map $map2
Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
size=290816,resident:290816,wired=0
version=14
 map entry 0xf625ec08: start=0x0, end=0x1000
 prot=1/7/copy, object=0xf5f6a7d0, offset=0x0
  Object 0xf5f6a7d0: size=0x1000, 1 references
  1 resident pages, 0 absent pages, 0 paging ops
  memory object=0x0 (offset=0x0),control=0x0, name=0xf5938968
  uninitialized,temporary
  internal,copy_strategy=0
  shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625ebb0: start=0x1000, end=0x26000
 prot=5/7/copy, object=0xf5f6ad70, offset=0x0
  Object 0xf5f6ad70: size=0x25000, 1 references
  37 resident pages, 0 absent pages, 0 paging ops
  memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
  uninitialized,temporary
  internal,copy_strategy=0
  shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625eb58: start=0x26000, end=0x34000
 prot=1/7/copy, object=0xf5f6ad20, offset=0x0
  Object 0xf5f6ad20: size=0xe000, 1 references
  14 resident pages, 0 absent pages, 0 paging ops
  memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
  uninitialized,temporary
  internal,copy_strategy=0
  shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625eb00: start=0x34000, end=0x37000
 prot=3/7/copy, object=0xf5f6acd0, offset=0x0
  Object 0xf5f6acd0: size=0x3000, 1 references
  3 resident pages, 0 absent pages, 0 paging ops
  memory object=0x0 (offset=0x0),control=0x0, name=0xf5f826e0
  uninitialized,temporary
  internal,copy_strategy=0
  shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625eaa8: start=0xbfff0000, end=0xc0000000
 prot=3/7/copy, object=0xf5f6ac80, offset=0x0
  Object 0xf5f6ac80: size=0x10000, 1 references
  16 resident pages, 0 absent pages, 0 paging ops
  memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82690
  uninitialized,temporary
  internal,copy_strategy=0
  shadow=0x0 (offset=0x0),copy=0x0
--8<---------------cut here---------------end--------------->8---

The mappings appear to match the PT_LOAD segments of ld.so:

--8<---------------cut here---------------start------------->8---
$ objdump -x /gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1|head -16

/gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1:     file format elf32-i386
/gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1
architecture: i386, flags 0x00000150:
HAS_SYMS, DYNAMIC, D_PAGED
start address 0x000011b0

Program Header:
    LOAD off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**12
         filesz 0x00000dd8 memsz 0x00000dd8 flags r--
    LOAD off    0x00001000 vaddr 0x00001000 paddr 0x00001000 align 2**12
         filesz 0x000244a1 memsz 0x000244a1 flags r-x
    LOAD off    0x00026000 vaddr 0x00026000 paddr 0x00026000 align 2**12
         filesz 0x0000d5e8 memsz 0x0000d5e8 flags r--
    LOAD off    0x00033f60 vaddr 0x00034f60 paddr 0x00034f60 align 2**12
         filesz 0x00001910 memsz 0x00001a6c flags rw-
--8<---------------cut here---------------end--------------->8---

… so ‘exec_load’ is doing its job, it seems.
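
The comparison between the map entries and the PT_LOAD segments can be checked mechanically. Below is a minimal sketch, not part of the original debugging session, that rounds each segment to page boundaries and compares the result with the first four entries of the ‘show map’ output. It assumes a 4 KiB page size, a load base of 0 for ld.so (which is what the map entries suggest), and the usual Mach protection bits (read=1, write=2, execute=4); the names ‘page_range’, ‘prot_bits’ and ‘expected_entries’ are made up for illustration.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages on i386

# Mach VM protection bits (read=1, write=2, execute=4).
VM_PROT_READ, VM_PROT_WRITE, VM_PROT_EXECUTE = 1, 2, 4

# (vaddr, memsz, flags) copied from the objdump output above.
PT_LOAD = [
    (0x00000000, 0x00000DD8, "r--"),
    (0x00001000, 0x000244A1, "r-x"),
    (0x00026000, 0x0000D5E8, "r--"),
    (0x00034F60, 0x00001A6C, "rw-"),
]

# (start, end, current protection) copied from `show map $map2`,
# leaving out the stack entry at 0xbfff0000.
MAP_ENTRIES = [
    (0x00000, 0x01000, 1),
    (0x01000, 0x26000, 5),
    (0x26000, 0x34000, 1),
    (0x34000, 0x37000, 3),
]


def page_range(vaddr, memsz):
    """Return the page-aligned [start, end) range covering a segment."""
    start = vaddr & ~(PAGE_SIZE - 1)
    end = (vaddr + memsz + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)
    return start, end


def prot_bits(flags):
    """Translate objdump-style flags such as 'r-x' into Mach protection bits."""
    bits = 0
    if flags[0] == "r":
        bits |= VM_PROT_READ
    if flags[1] == "w":
        bits |= VM_PROT_WRITE
    if flags[2] == "x":
        bits |= VM_PROT_EXECUTE
    return bits


def expected_entries(segments):
    """Compute the map entries one would expect for segments loaded at base 0."""
    return [(*page_range(v, m), prot_bits(f)) for v, m, f in segments]


if __name__ == "__main__":
    for exp, act in zip(expected_entries(PT_LOAD), MAP_ENTRIES):
        status = "ok" if exp == act else "MISMATCH"
        print(f"expected {exp[0]:#x}-{exp[1]:#x} prot={exp[2]}  "
              f"actual {act[0]:#x}-{act[1]:#x} prot={act[2]}  {status}")
```

With the values above, every computed range and protection agrees with the corresponding map entry, which is consistent with the observation that the mappings match the ld.so segments.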

I also tried to set up a breakpoint on ‘task_terminate’ to see what’s
going on when the exec task vanishes:

--8<---------------cut here---------------start------------->8---
module 0: ext2fs --multiboot-command-line=${kernel-command-line} --host-priv-por
t=${host-port} --device-master-port=${device-port} --exec-server-task=${exec-tas
k} --store-type=typed --x-xattr-translator-records ${root} $(task-create) $(task
-resume)
module 1: exec /gnu/store/0na7ic689glydngxyb7pazjixz9b6629-hurd-0.9-1.91a5167/hu
rd/exec $(exec-task=task-create)
2 multiboot modules
task loaded: ext2fs --multiboot-command-line=root=device:hd0s1 root=3367134b-cf6
6-e06b-ee4b-4b933367134b gnu.system=/gnu/store/g7kssjfrxqjpbr6r31idiasgglfph2y5-
system gnu.load=/gnu/store/g7kssjfrxqjpbr6r31idiasgglfph2y5-system/boot --host-p
riv-port=1 --device-master-port=2 --exec-server-task=3 --store-type=typed --x-xa
ttr-translator-records device:hd0s1
task loaded: exec /gnu/store/0na7ic689glydngxyb7pazjixz9b6629-hurd-0.9-1.91a5167
/hurd/exec

start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1] execKernel Breakpoint trap, eip 0xc10305c1
Breakpoint at  task_terminate:  pushl   %ebp
db> show all threads
TASK        THREADS
  0 gnumach (f5f7cf00): 7 threads:
              0 (f5f7be18) .W..N. 0xc11dac04
              1 (f5f7bcd0) R..O..(idle_thread_continue)
              2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
              3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
              4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
              5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
              6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
  1 ext2fs (f5f7ce40): 6 threads:
              0 (f5f7b520) .W.O.F(mach_msg_continue) 0
              1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
              2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
              3 (f5f7b000) .W.O..(mach_msg_continue) 0
              4 (f67d4e20) .W.O..(mach_msg_receive_continue) 0
              5 (f67d4cd8) .W.O..(mach_msg_continue) 0
  2 exec (f5f7cd80): (f5f7b3d8) R.....
db> trace/t 0xf5f7b3d8
task_terminate(f625eb10,0,f5f7cd80,f5f7b3d8,c11da940)
exception_try_task(1,1,bffefffc,ffffffff,c1202b4c)+0x58
exception(1,1,bffefffc,c10096da,f5957fbc)+0x7a
interrupted_pc(1,1,bffefffc,c102ce99,c1202b40)
trap_name(1,f5957f80,f5f73f4c,f5f73f58)
vm_fault(f5f6ff30,bffef000,3,0,0,c1008ee4,f5f82550,fb7d9000)+0x74a
user_trap(f5f7a718)+0x2df
>>>>> Page fault (14) at 0x1000 <<<<<
>>>>> user space <<<<<
db> show map $map2
Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
size=290816,resident:290816,wired=0
version=14
 map entry 0xf625ec08: start=0x0, end=0x1000
 prot=1/7/copy, object=0xf5f6a7d0, offset=0x0
  Object 0xf5f6a7d0: size=0x1000, 1 references
  1 resident pages, 0 absent pages, 0 paging ops
  memory object=0x0 (offset=0x0),control=0x0, name=0xf5938968
  uninitialized,temporary
  internal,copy_strategy=0
  shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625ebb0: start=0x1000, end=0x26000
 prot=5/7/copy, object=0xf5f6ad70, offset=0x0
  Object 0xf5f6ad70: size=0x25000, 1 references
  37 resident pages, 0 absent pages, 0 paging ops
  memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
  uninitialized,temporary
  internal,copy_strategy=0
  shadow=0x0 (offset=0x0),copy=0x0
--8<---------------cut here---------------end--------------->8---

It says “page fault at 0x1000” but there is apparently a valid mapping
at that address.

Funny thing: if I set a breakpoint on ‘read_exec’ and continue each time
it’s hit, the ‘exec’ process starts just fine.  Could it be a
synchronization issue somewhere?

Thanks,
Ludo’.