On Thu, 2024-11-21 at 22:03 +0100, Salvatore Bonaccorso wrote:
> Control: tags -1 + moreinfo
> 
> Hi Benjamin,
> 
> On Wed, Nov 20, 2024 at 02:22:42AM +0100, Benjamin Drung wrote:
> > Package: linux
> > Version: 6.11.9-1
> > Severity: normal
> > X-Debbugs-Cc: bdr...@debian.org
> > 
> > Dear Maintainer,
> > 
> > Running the dracut test TEST-60-NFS on Debian unstable with
> > linux-image-6.11.9-amd64 fails with the following kernel crash:
> > 
> > ```
> > [   15.600535] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state 
> > recovery directory
> > [   15.602863] NFSD: Using legacy client tracking operations.
> > [   15.603059] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state 
> > recovery directory
> > [   15.603569] ------------[ cut here ]------------
> > [   15.603706] kernel BUG at fs/nfsd/nfs4recover.c:534!
> > [   15.604360] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> > [   15.604743] CPU: 0 UID: 0 PID: 471 Comm: rpc.nfsd Not tainted 
> > 6.11.9-amd64 #1  Debian 6.11.9-1
> > [   15.605019] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> > 1.16.3-debian-1.16.3-2 04/01/2014
> > [   15.605337] RIP: 0010:nfsd4_legacy_tracking_init+0x17d/0x1b0 [nfsd]
> > [   15.606083] Code: 19 48 89 de 48 c7 c7 10 90 9c c0 e8 6d fb ff ff 89 c5 
> > 85 c0 0f 85 30 60 00 00 48 c7 c7 c0 af a3 c0 31 ed e8 25 b0 ca d2 eb 07 
> > <0f> 0b bd f4 ff ff ff 48 8b 44 24 08 65 48 2b 04 25 28 00 00 00 75
> > [   15.606343] RSP: 0018:ff345c4e803fbb60 EFLAGS: 00010286
> > [   15.606343] RAX: 0000000000000049 RBX: ff2fd43447182000 RCX: 
> > 0000000000000003
> > [   15.606343] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 
> > 0000000000000001
> > [   15.606343] RBP: ffffffff9525dd40 R08: 0000000000000000 R09: 
> > ff345c4e803fb9f0
> > [   15.606343] R10: ffffffff946b41e8 R11: 0000000000000003 R12: 
> > ff2fd43447182000
> > [   15.606343] R13: ff2fd43447182000 R14: ff2fd43469336c00 R15: 
> > ff2fd43447182000
> > [   15.606343] FS:  00007fe05a5e9740(0000) GS:ff2fd4347ce00000(0000) 
> > knlGS:0000000000000000
> > [   15.606343] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   15.606343] CR2: 0000559addf39db0 CR3: 000000002836e000 CR4: 
> > 0000000000751ef0
> > [   15.606343] PKRU: 55555554
> > [   15.606343] Call Trace:
> > [   15.606343]  <TASK>
> > [   15.606343]  ? __die_body.cold+0x19/0x27
> > [   15.606343]  ? die+0x2e/0x50
> > [   15.606343]  ? do_trap+0xca/0x110
> > [   15.606343]  ? do_error_trap+0x6a/0x90
> > [   15.606343]  ? nfsd4_legacy_tracking_init+0x17d/0x1b0 [nfsd]
> > [   15.606343]  ? exc_invalid_op+0x50/0x70
> > [   15.606343]  ? nfsd4_legacy_tracking_init+0x17d/0x1b0 [nfsd]
> > [   15.606343]  ? asm_exc_invalid_op+0x1a/0x20
> > [   15.606343]  ? nfsd4_legacy_tracking_init+0x17d/0x1b0 [nfsd]
> > [   15.606343]  nfsd4_client_tracking_init+0x57/0x1b0 [nfsd]
> > [   15.606343]  nfs4_state_start_net+0x2f9/0x3a0 [nfsd]
> > [   15.606343]  nfsd_svc+0x1b9/0x340 [nfsd]
> > [   15.606343]  write_threads+0xfc/0x1c0 [nfsd]
> > [   15.606343]  ? __pfx_write_threads+0x10/0x10 [nfsd]
> > [   15.606343]  nfsctl_transaction_write+0x4d/0x80 [nfsd]
> > [   15.606343]  vfs_write+0xfe/0x460
> > [   15.606343]  ksys_write+0x6d/0xf0
> > [   15.606343]  do_syscall_64+0x82/0x190
> > [   15.606343]  ? syscall_exit_to_user_mode+0x4d/0x210
> > [   15.606343]  ? do_syscall_64+0x8e/0x190
> > [   15.606343]  ? __x64_sys_getdents64+0xfa/0x130
> > [   15.606343]  ? __pfx_filldir64+0x10/0x10
> > [   15.606343]  ? syscall_exit_to_user_mode+0x4d/0x210
> > [   15.606343]  ? do_syscall_64+0x8e/0x190
> > [   15.606343]  ? __count_memcg_events+0x58/0xf0
> > [   15.606343]  ? count_memcg_events.constprop.0+0x1a/0x30
> > [   15.606343]  ? handle_mm_fault+0x1bb/0x2c0
> > [   15.606343]  ? do_user_addr_fault+0x36c/0x620
> > [   15.606343]  ? exc_page_fault+0x7e/0x180
> > [   15.606343]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [   15.606343] RIP: 0033:0x7fe05a6f0210
> > [   15.606343] Code: 2c 0e 00 64 c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 
> > 0f 1f 84 00 00 00 00 00 80 3d 59 ae 0e 00 00 74 17 b8 01 00 00 00 0f 05 
> > <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
> > [   15.606343] RSP: 002b:00007fff649d2b08 EFLAGS: 00000202 ORIG_RAX: 
> > 0000000000000001
> > [   15.606343] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 
> > 00007fe05a6f0210
> > [   15.606343] RDX: 0000000000000002 RSI: 000056540dbbb340 RDI: 
> > 0000000000000003
> > [   15.606343] RBP: 000056540dbbb340 R08: 0000000000000064 R09: 
> > 00000000ffffffff
> > [   15.606343] R10: 0000000000000000 R11: 0000000000000202 R12: 
> > 0000000000020000
> > [   15.606343] R13: 000056540dbb7116 R14: 000056543353a2a0 R15: 
> > 0000000000000000
> > [   15.606343]  </TASK>
> > [   15.606343] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace ext4 
> > crc16 mbcache jbd2 crc32c_generic sd_mod ahci libahci libata virtio_scsi 
> > scsi_mod crc32_pclmul crc32c_intel scsi_common virtio_net net_failover 
> > failover i6300esb watchdog sunrpc qemu_fw_cfg virtio_rng autofs4
> > [   15.618032] ---[ end trace 0000000000000000 ]---
> > [   15.618166] RIP: 0010:nfsd4_legacy_tracking_init+0x17d/0x1b0 [nfsd]
> > [   15.618718] Code: 19 48 89 de 48 c7 c7 10 90 9c c0 e8 6d fb ff ff 89 c5 
> > 85 c0 0f 85 30 60 00 00 48 c7 c7 c0 af a3 c0 31 ed e8 25 b0 ca d2 eb 07 
> > <0f> 0b bd f4 ff ff ff 48 8b 44 24 08 65 48 2b 04 25 28 00 00 00 75
> > [   15.619086] RSP: 0018:ff345c4e803fbb60 EFLAGS: 00010286
> > [   15.619198] RAX: 0000000000000049 RBX: ff2fd43447182000 RCX: 
> > 0000000000000003
> > [   15.619336] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 
> > 0000000000000001
> > [   15.619472] RBP: ffffffff9525dd40 R08: 0000000000000000 R09: 
> > ff345c4e803fb9f0
> > [   15.619609] R10: ffffffff946b41e8 R11: 0000000000000003 R12: 
> > ff2fd43447182000
> > [   15.619746] R13: ff2fd43447182000 R14: ff2fd43469336c00 R15: 
> > ff2fd43447182000
> > [   15.619888] FS:  00007fe05a5e9740(0000) GS:ff2fd4347ce00000(0000) 
> > knlGS:0000000000000000
> > [   15.620045] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   15.620158] CR2: 0000559addf39db0 CR3: 000000002836e000 CR4: 
> > 0000000000751ef0
> > [   15.620296] PKRU: 55555554
> > [   15.620469] Kernel panic - not syncing: Fatal exception
> > [   15.621342] Kernel Offset: 0x11a00000 from 0xffffffff81000000 
> > (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > ```
> > 
> > This crash is 100% reproducible and I can easily test different kernels.
> > The TEST-60-NFS works fine on Ubuntu oracular.
> > linux-image-6.12-rc6-amd64 6.12~rc6-1~exp1 from experimental is affected
> > as well.
> 
> Just to be clear, is this something you freshly hit with those versions,
> or was the problem present before? If you have a last good version,
> would you be able to bisect the changes to identify the culprit
> introducing the issue?

I hit this bug when I tried to introduce the NFS autopkgtest. I don't
know of a good version in Debian. I pushed the
upstream-dracut-network-nfs autopkgtest for dracut to the debian-nfs
branch:
https://salsa.debian.org/debian/dracut/-/commits/debian-nfs?ref_type=heads
Test:
https://salsa.debian.org/debian/dracut/-/commit/a5b1da9ff33d412cc886408c3e6cafec265d6e29
So you should be able to reproduce it.

The same test case upstream-dracut-network-nfs works on Ubuntu with
linux 6.11.0-8.8:
https://autopkgtest.ubuntu.com/results/autopkgtest-plucky/plucky/amd64/d/dracut/20241121_232300_a5f72@/log.gz

> I have so far not found an already known regression report specific to
> this recently, but there is a report from back in August that we found at
> https://lore.kernel.org/all/23faefd973c63f9b0ec8a735acb1ff1409776163.ca...@linuxfoundation.org/

Yes, that looks similar.

> In any case since you can reliably reproduce the issue, can you please
> report it to upstream (linux-nfs list and relevant maintainers)?

I can do that.

-- 
Benjamin Drung
Debian & Ubuntu Developer
