On Tue, Mar 9, 2021 at 8:36 PM Dmitry Vyukov <dvyu...@google.com> wrote:
> FWIW the code looks reasonable:
>
> All code
> ========
>    0:   00 00                   add    %al,(%rax)
>    2:   00 00                   add    %al,(%rax)
>    4:   41 57                   push   %r15
>    6:   41 56                   push   %r14
>    8:   41 55                   push   %r13
>    a:   41 54                   push   %r12
>    c:   55                      push   %rbp
>    d:   53                      push   %rbx
>    e:   89 fd                   mov    %edi,%ebp
>   10:   48 81 ec 48 01 00 00    sub    $0x148,%rsp
>   17:   64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
>   1e:   00 00
>   20:   48 89 84 24 38 01 00    mov    %rax,0x138(%rsp)
>   27:   00
>   28:   31 c0                   xor    %eax,%eax
>   2a:*  e8 f5 bf f7 ff          callq  0xfffffffffff7c024   <-- trapping instruction
>   2f:   83 f8 01                cmp    $0x1,%eax
>   32:   0f 84 b7 00 00 00       je     0xef
>   38:   48                      rex.W
>   39:   8d                      .byte 0x8d
>   3a:   9c                      pushfq
>   3b:   40                      rex
>
> This is a PC-relative call to a reasonable address, right?
> I wonder if it always traps on this instruction or not. Maybe the
> executable is corrupted and has a page missing in the image or
> something similar. But also if we suspect a badly corrupted image, is
> it worth pursuing it?...

I copied over a new systemd binary from a fresh disk image generated
using tools/create-image.sh in syzkaller (debootstrap) and the bug was
still reproducible.

root@sandbox:~# md5sum /lib/systemd/systemd
12b20bfd8321ef7884b4dbf974a91213  /lib/systemd/systemd
root@sandbox:~# md5sum /lib/systemd/systemd_orig
12b20bfd8321ef7884b4dbf974a91213  /lib/systemd/systemd_orig
root@sandbox:~# gcc -pthread hax.c -o repro
root@sandbox:~# ./repro
[ 115.515840] got to 221
[ 115.515853] got to 183
[ 115.516400] got to 201
[ 115.516935] got to 208
[ 115.517475] got to 210
[ 115.521008] got to 270
[ 115.544984] systemd[1]: segfault at 7ffe972adfb8 ip 00005560fb079466 sp 00007ffe972adfc0 error 6 in systemd[5560fafcd000+ed000]
[ 115.546554] Code: 00 00 00 00 41 57 41 56 41 55 41 54 55 53 89 fd 48 81 ec 48 01 00 00 64 48 8b 04 25 28 00 00 00 48 89 84 24 38 01 00 00 31 c0 <e8> f5 bf f7 ff 83 f8 01 0f 84 b7 00 00 00 48 8d 9c 240
[ 115.548575] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 115.549352] CPU: 0 PID: 1 Comm: systemd Not tainted 5.11.2+ #22
[ 115.549994] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1 04/01/2014
[ 115.550834] Call Trace:
[ 115.551090]  dump_stack+0xb2/0xe4
[ 115.551438]  panic+0x196/0x502
[ 115.551798]  do_exit.cold+0x70/0x108
[ 115.552170]  do_group_exit+0x78/0x120
[ 115.552552]  get_signal+0x22e/0xd60
[ 115.552916]  arch_do_signal_or_restart+0xef/0x890
[ 115.553407]  exit_to_user_mode_prepare+0x102/0x190
[ 115.553920]  irqentry_exit_to_user_mode+0x9/0x20
[ 115.554412]  irqentry_exit+0x19/0x30
[ 115.554781]  exc_page_fault+0xc3/0x240
[ 115.555168]  ? asm_exc_page_fault+0x8/0x30
[ 115.555626]  asm_exc_page_fault+0x1e/0x30
[ 115.556092] RIP: 0033:0x5560fb079466
[ 115.556476] Code: 00 00 00 00 41 57 41 56 41 55 41 54 55 53 89 fd 48 81 ec 48 01 00 00 64 48 8b 04 25 28 00 00 00 48 89 84 24 38 01 00 00 31 c0 <e8> f5 bf f7 ff 83 f8 01 0f 84 b7 00 00 00 48 8d 9c 240
[ 115.558399] RSP: 002b:00007ffe972adfc0 EFLAGS: 00010246
[ 115.558947] RAX: 0000000000000000 RBX: 00005560fcaa7f40 RCX: 00007ff6fb1c22e3
[ 115.559720] RDX: 00007ffe972ae140 RSI: 00007ffe972ae270 RDI: 0000000000000007
[ 115.560475] RBP: 0000000000000007 R08: 431bde82d7b634db R09: 000000000000000b
[ 115.561219] R10: 00000000ffffffff R11: 0000000000000246 R12: 00007ffe97aad190
[ 115.561963] R13: 0000000000000001 R14: ffffffffffffffff R15: 0000000000000002
[ 115.562768] Kernel Offset: disabled
[ 115.563148] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

For sanity, I created a new disk image altogether, made a replica of
the image and ran syzkaller on the first copy of the image to find a
new reproducer for this bug.

[NEW IMAGE]              [NEW IMAGE REPLICA]
Used by syzkaller        Used for testing the reproducer manually

After discovering the new reproducer for this fresh image, I triggered
the new reproducer on the *untainted* replica of the image and the bug
was reproducible. This would invalidate the assumption that the
image/binaries on the image are corrupted.
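
On the quoted question of whether the call goes to a reasonable address, the target can be recomputed from the values in the segfault line. This is only a sketch, assuming the usual x86-64 `call rel32` encoding (target = address of next instruction + signed 32-bit displacement) and the standard page-fault error-code bits (bit 0 = page present, bit 1 = write, bit 2 = user mode). The variable names are illustrative, not taken from any tool.

```python
import struct

# Values copied from the "segfault at ..." line in the log above.
fault_addr = 0x7ffe972adfb8                    # faulting address
ip = 0x5560fb079466                            # faulting instruction pointer
sp = 0x7ffe972adfc0                            # stack pointer at the fault
error_code = 6                                 # "error 6"
map_base, map_size = 0x5560fafcd000, 0xed000   # systemd[base+size]

# Trapping instruction bytes from the Code: line: <e8> f5 bf f7 ff
insn = bytes.fromhex("e8f5bff7ff")
rel32 = struct.unpack("<i", insn[1:])[0]       # signed little-endian displacement
target = ip + len(insn) + rel32                # absolute call target

# Is the target inside the systemd text mapping reported by the kernel?
target_in_mapping = map_base <= target < map_base + map_size

# Decode the page-fault error code (assumed standard x86 bit layout).
page_present = bool(error_code & 0x1)
is_write = bool(error_code & 0x2)
is_user = bool(error_code & 0x4)

# A call pushes an 8-byte return address, i.e. it writes to sp - 8.
fault_is_return_address_push = fault_addr == sp - 8

print(f"call target        : {target:#x}")
print(f"inside mapping     : {target_in_mapping}")
print(f"present/write/user : {page_present}/{is_write}/{is_user}")
print(f"fault at sp - 8    : {fault_is_return_address_push}")
```

With the logged values the target works out to 0x5560faff5460, which lies inside the systemd mapping, and the faulting address equals sp - 8 with a write to a not-present page. That would be consistent with the fault happening on the push of the return address onto the stack rather than on the fetch of the call target, though this is an inference from the log and not something confirmed here.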