I just tried 6.12.16 and, as with 6.12.15, the system locks up (it becomes completely unresponsive and needs a hard reset) when attempting a large transfer from a network share to the local computer, so this is probably a separate issue. If there is anything more I can do to capture this behavior, please let me know.
Paul

On Sun, Feb 23, 2025 at 10:36 AM Salvatore Bonaccorso <car...@debian.org> wrote:

> Control: tags -1 + moreinfo
>
> Hi Paul,
>
> On Sat, Feb 22, 2025 at 04:26:50PM -0500, Paul DeKraker wrote:
> > Source: linux
> > Severity: important
> > Tags: upstream
> > X-Debbugs-Cc: pdekraker+deb...@gmail.com
> >
> > Dear Maintainer,
> >
> > I am experiencing an issue where my system completely locks up when attempting
> > a large network file transfer from a mounted SMB share. When copying a file
> > above 1 GB I am consistently experiencing this behavior. I tried going back to
> > the 6.12.3 kernel, which is the oldest I have on the system, and the problem is
> > there as well. Looking at the dump below, my guess is that it was introduced
> > with 6.12 and netfs/read_collect.c. I have been unable to get a dump with
> > 6.12.15, but the behavior is consistent. The transfer starts, but after a few
> > seconds the whole system locks up.
> >
> >
> > 2/22/25 8:38 AM ------------[ cut here ]------------
> > 2/22/25 8:38 AM WARNING CPU: 4 PID: 291 at fs/netfs/read_collect.c:110
> > netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> > 2/22/25 8:38 AM Modules linked in ccm nls_utf8 cifs cifs_arc4
> > nls_ucs2_utils cifs_md4 dns_resolver netfs snd_seq_dummy snd_hrtimer snd_seq
> > snd_seq_device xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4
> > xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> > nf_defrag_ipv4 nf_tables bridge stp llc rfcomm cmac algif_hash algif_skcipher
> > af_alg overlay qrtr bnep amd_atl intel_rapl_msr intel_rapl_common sunrpc
> > edac_mce_amd binfmt_misc kvm_amd snd_hda_codec_realtek snd_hda_codec_generic
> > snd_hda_scodec_component kvm snd_hda_codec_hdmi nls_ascii crct10dif_pclmul
> > nls_cp437 snd_hda_intel crc32_pclmul ghash_clmulni_intel snd_intel_dspcfg btusb
> > snd_intel_sdw_acpi sha512_ssse3 vfat fat snd_hda_codec sha256_ssse3 btrtl
> > sha1_ssse3 btintel snd_hda_core aesni_intel btbcm ahci btmtk snd_hwdep gf128mul
> > r8169 crypto_simd libahci snd_pcm bluetooth cryptd realtek snd_timer libata
> > rapl sp5100_tco snd watchdog wmi_bmof mdio_devres gigabyte_wmi soundcore
> > i2c_piix4 pcspkr i2c_smbus rfkill libphy scsi_mod ccp k10temp
> > 2/22/25 8:38 AM scsi_common button lm92 msr dm_mod parport_pc ppdev lp
> > parport efi_pstore configfs nfnetlink ip_tables x_tables autofs4 ext4 mbcache
> > jbd2 razerkbd(OE) efivarfs raid10 raid456 libcrc32c crc32c_generic
> > async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1
> > raid0 md_mod evdev joydev razermouse(OE) hid_generic usbhid hid amdgpu video
> > amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper
> > drm_buddy drm_display_helper xhci_pci xhci_hcd drm_kms_helper drm nvme usbcore
> > cec rc_core nvme_core crc32c_intel crc16 usb_common wmi gpio_amdpt gpio_generic
> > 2/22/25 8:38 AM CPU 4 UID: 0 PID: 291 Comm: kworker/4:2 Tainted: G OE
> > 6.12.3-amd64 #1 Debian 6.12.3-1
> > 2/22/25 8:38 AM Tainted [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> > 2/22/25 8:38 AM Hardware name Gigabyte Technology Co., Ltd. B550M AORUS
> > PRO-P/B550M AORUS PRO-P, BIOS F13 07/08/2021
> > 2/22/25 8:38 AM Workqueue cifsiod smb2_readv_worker [cifs]
> > 2/22/25 8:38 AM RIP 0010:netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> > 2/22/25 8:38 AM Code 43 28 48 39 c8 0f 84 04 02 00 00 4c 89 40 58 0f 1f 44
> > 00 00 0f 1f 44 00 00 48 8b 43 78 48 89 43 68 48 89 43 70 e9 6e fe ff ff <0f> 0b
> > 49 8b 47 70 48 8b 74 24 30 8b 7c 24 38 41 0f b7 97 96 00 00
> > 2/22/25 8:38 AM RSP 0018:ffffab8240b07dd8 EFLAGS: 00010246
> > 2/22/25 8:38 AM RAX 0000000000000000 RBX: 0000000000000000 RCX:
> > 000000003b200000
> > 2/22/25 8:38 AM RDX 000000003b600000 RSI: 000000003b600000 RDI:
> > ffffdd69cfb90000
> > 2/22/25 8:38 AM RBP 0000000000000004 R08: 0000000000000002 R09:
> > 0000000000400000
> > 2/22/25 8:38 AM R10 0000000000000008 R11: 0000000000000008 R12:
> > ffff9cbba2abdaa8
> > 2/22/25 8:38 AM R13 0000000000200000 R14: 000000003b400000 R15:
> > ffff9cbd00ce2280
> > 2/22/25 8:38 AM FS 0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> > knlGS:0000000000000000
> > 2/22/25 8:38 AM CS 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > 2/22/25 8:38 AM CR2 00007f724be0412c CR3: 000000010a98e000 CR4:
> > 0000000000f50ef0
> > 2/22/25 8:38 AM PKRU 55555554
> > 2/22/25 8:38 AM Call Trace
> > 2/22/25 8:38 AM <TASK>
> > 2/22/25 8:38 AM ? netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> > 2/22/25 8:38 AM ? __warn.cold+0x93/0xf6
> > 2/22/25 8:38 AM ? netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> > 2/22/25 8:38 AM ? report_bug+0xff/0x140
> > 2/22/25 8:38 AM ? handle_bug+0x58/0x90
> > 2/22/25 8:38 AM ? exc_invalid_op+0x17/0x70
> > 2/22/25 8:38 AM ? asm_exc_invalid_op+0x1a/0x20
> > 2/22/25 8:38 AM ? netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> > 2/22/25 8:38 AM ? netfs_consume_read_data.isra.0+0x48b/0xb50 [netfs]
> > 2/22/25 8:38 AM ? finish_task_switch.isra.0+0x97/0x2c0
> > 2/22/25 8:38 AM netfs_read_subreq_terminated+0x2ab/0x3f0 [netfs]
> > 2/22/25 8:38 AM process_one_work+0x177/0x330
> > 2/22/25 8:38 AM worker_thread+0x252/0x390
> > 2/22/25 8:38 AM ? __pfx_worker_thread+0x10/0x10
> > 2/22/25 8:38 AM kthread+0xd2/0x100
> > 2/22/25 8:38 AM ? __pfx_kthread+0x10/0x10
> > 2/22/25 8:38 AM ret_from_fork+0x34/0x50
> > 2/22/25 8:38 AM ? __pfx_kthread+0x10/0x10
> > 2/22/25 8:38 AM ret_from_fork_asm+0x1a/0x30
> > 2/22/25 8:38 AM </TASK>
> > 2/22/25 8:38 AM ---[ end trace 0000000000000000 ]---
> > 2/22/25 8:38 AM netfs R=0000003e[2] s=3b200000-3b7fffff
> > ctl=400000/600000/600000 sl=4
> > 2/22/25 8:38 AM netfs folioq: orders=09090909
> > 2/22/25 8:38 AM BUG kernel NULL pointer dereference, address:
> > 0000000000000000
> > 2/22/25 8:38 AM #PF supervisor write access in kernel mode
> > 2/22/25 8:38 AM #PF error_code(0x0002) - not-present page
> > 2/22/25 8:38 AM PGD 0 P4D 0
> > 2/22/25 8:38 AM Oops Oops: 0002 [#1] PREEMPT SMP NOPTI
> > 2/22/25 8:38 AM CPU 4 UID: 0 PID: 291 Comm: kworker/4:2 Tainted: G W OE
> > 6.12.3-amd64 #1 Debian 6.12.3-1
> > 2/22/25 8:38 AM Tainted [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> > 2/22/25 8:38 AM Hardware name Gigabyte Technology Co., Ltd. B550M AORUS
> > PRO-P/B550M AORUS PRO-P, BIOS F13 07/08/2021
> > 2/22/25 8:38 AM Workqueue cifsiod smb2_readv_worker [cifs]
> > 2/22/25 8:38 AM RIP 0010:netfs_consume_read_data.isra.0+0x2db/0xb50 [netfs]
> > 2/22/25 8:38 AM Code c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f e9 e9 86 41 e8 8b
> > 6c 24 38 48 8b 44 24 28 48 89 f3 49 2b 5f 60 49 89 5f 78 4c 8b 6c e8 08 <f0> 41
> > 80 4d 00 08 48 8b 44 24 30 48 8b 80 58 02 00 00 a9 00 00 00
> > 2/22/25 8:38 AM RSP 0018:ffffab8240b07dd8 EFLAGS: 00010206
> > 2/22/25 8:38 AM RAX ffff9cbdf4cb7200 RBX: 0000000000600000 RCX:
> > 0000000000000027
> > 2/22/25 8:38 AM RDX 0000000000000000 RSI: 000000003b800000 RDI:
> > ffff9cc93ee21780
> > 2/22/25 8:38 AM RBP 0000000000000004 R08: 0000000000000000 R09:
> > ffffab8240b07c50
> > 2/22/25 8:38 AM R10 ffffffffab4b42c8 R11: 0000000000000003 R12:
> > ffff9cbba2abdaa8
> > 2/22/25 8:38 AM R13 0000000000000000 R14: 000000003b600000 R15:
> > ffff9cbd00ce2280
> > 2/22/25 8:38 AM FS 0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> > knlGS:0000000000000000
> > 2/22/25 8:38 AM CS 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > 2/22/25 8:38 AM CR2 0000000000000000 CR3: 000000010a98e000 CR4:
> > 0000000000f50ef0
> > 2/22/25 8:38 AM PKRU 55555554
> > 2/22/25 8:38 AM Call Trace
> > 2/22/25 8:38 AM <TASK>
> > 2/22/25 8:38 AM ? __die_body.cold+0x19/0x27
> > 2/22/25 8:38 AM ? page_fault_oops+0x15a/0x2d0
> > 2/22/25 8:38 AM ? exc_page_fault+0x7e/0x180
> > 2/22/25 8:38 AM ? asm_exc_page_fault+0x26/0x30
> > 2/22/25 8:38 AM ? netfs_consume_read_data.isra.0+0x2db/0xb50 [netfs]
> > 2/22/25 8:38 AM ? finish_task_switch.isra.0+0x97/0x2c0
> > 2/22/25 8:38 AM netfs_read_subreq_terminated+0x2ab/0x3f0 [netfs]
> > 2/22/25 8:38 AM process_one_work+0x177/0x330
> > 2/22/25 8:38 AM worker_thread+0x252/0x390
> > 2/22/25 8:38 AM ? __pfx_worker_thread+0x10/0x10
> > 2/22/25 8:38 AM kthread+0xd2/0x100
> > 2/22/25 8:38 AM ? __pfx_kthread+0x10/0x10
> > 2/22/25 8:38 AM ret_from_fork+0x34/0x50
> > 2/22/25 8:38 AM ? __pfx_kthread+0x10/0x10
> > 2/22/25 8:38 AM ret_from_fork_asm+0x1a/0x30
> > 2/22/25 8:38 AM </TASK>
> > 2/22/25 8:38 AM Modules linked in ccm nls_utf8 cifs cifs_arc4
> > nls_ucs2_utils cifs_md4 dns_resolver netfs snd_seq_dummy snd_hrtimer snd_seq
> > snd_seq_device xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4
> > xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> > nf_defrag_ipv4 nf_tables bridge stp llc rfcomm cmac algif_hash algif_skcipher
> > af_alg overlay qrtr bnep amd_atl intel_rapl_msr intel_rapl_common sunrpc
> > edac_mce_amd binfmt_misc kvm_amd snd_hda_codec_realtek snd_hda_codec_generic
> > snd_hda_scodec_component kvm snd_hda_codec_hdmi nls_ascii crct10dif_pclmul
> > nls_cp437 snd_hda_intel crc32_pclmul ghash_clmulni_intel snd_intel_dspcfg btusb
> > snd_intel_sdw_acpi sha512_ssse3 vfat fat snd_hda_codec sha256_ssse3 btrtl
> > sha1_ssse3 btintel snd_hda_core aesni_intel btbcm ahci btmtk snd_hwdep gf128mul
> > r8169 crypto_simd libahci snd_pcm bluetooth cryptd realtek snd_timer libata
> > rapl sp5100_tco snd watchdog wmi_bmof mdio_devres gigabyte_wmi soundcore
> > i2c_piix4 pcspkr i2c_smbus rfkill libphy scsi_mod ccp k10temp
> > 2/22/25 8:38 AM scsi_common button lm92 msr dm_mod parport_pc ppdev lp
> > parport efi_pstore configfs nfnetlink ip_tables x_tables autofs4 ext4 mbcache
> > jbd2 razerkbd(OE) efivarfs raid10 raid456 libcrc32c crc32c_generic
> > async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1
> > raid0 md_mod evdev joydev razermouse(OE) hid_generic usbhid hid amdgpu video
> > amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper
> > drm_buddy drm_display_helper xhci_pci xhci_hcd drm_kms_helper drm nvme usbcore
> > cec rc_core nvme_core crc32c_intel crc16 usb_common wmi gpio_amdpt gpio_generic
> > 2/22/25 8:38 AM CR2 0000000000000000
> > 2/22/25 8:38 AM ---[ end trace 0000000000000000 ]---
> > 2/22/25 8:38 AM RIP 0010:netfs_consume_read_data.isra.0+0x2db/0xb50 [netfs]
> > 2/22/25 8:38 AM Code c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f e9 e9 86 41 e8 8b
> > 6c 24 38 48 8b 44 24 28 48 89 f3 49 2b 5f 60 49 89 5f 78 4c 8b 6c e8 08 <f0> 41
> > 80 4d 00 08 48 8b 44 24 30 48 8b 80 58 02 00 00 a9 00 00 00
> > 2/22/25 8:38 AM RSP 0018:ffffab8240b07dd8 EFLAGS: 00010206
> > 2/22/25 8:38 AM RAX ffff9cbdf4cb7200 RBX: 0000000000600000 RCX:
> > 0000000000000027
> > 2/22/25 8:38 AM RDX 0000000000000000 RSI: 000000003b800000 RDI:
> > ffff9cc93ee21780
> > 2/22/25 8:38 AM RBP 0000000000000004 R08: 0000000000000000 R09:
> > ffffab8240b07c50
> > 2/22/25 8:38 AM R10 ffffffffab4b42c8 R11: 0000000000000003 R12:
> > ffff9cbba2abdaa8
> > 2/22/25 8:38 AM R13 0000000000000000 R14: 000000003b600000 R15:
> > ffff9cbd00ce2280
> > 2/22/25 8:38 AM FS 0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> > knlGS:0000000000000000
> > 2/22/25 8:38 AM CS 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > 2/22/25 8:38 AM CR2 0000000000000000 CR3: 000000010a98e000 CR4:
> > 0000000000f50ef0
> > 2/22/25 8:38 AM PKRU 55555554
> > 2/22/25 8:38 AM note kworker/4:2[291] exited with irqs disabled
>
> Thanks for the report. At first this sounded very much like
>
> https://lore.kernel.org/all/CANT5p=qBwjBm-D8soFVVtswGEfmMtQXVW83=tnfutvyhefq...@mail.gmail.com/
> which has a fix, c8b90d40d5bb ("netfs: Fix non-contiguous donation
> between completed reads"), which OTOH has landed in 6.13-rc7 and
> 6.12.11.
>
> So can you confirm: with the most recent kernel in unstable you no longer
> get the above trace, but you still observe stalls when transferring a large
> file? In that case this might be orthogonal to the above.
>
> Regards,
> Salvatore
>
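
As a rough sketch of the version reasoning in this thread: the quoted reply states that fix c8b90d40d5bb landed in 6.12.11, so within the 6.12.y series 6.12.3 predates it, while 6.12.15 and 6.12.16 should already contain it. The helper names below (`parse_version`, `has_netfs_fix`) are hypothetical and only illustrate that comparison; they do not inspect a running kernel, and they deliberately cover only plain 6.12.y version strings.

```python
# Assumption taken from the quoted reply: within the 6.12.y stable series,
# the fix c8b90d40d5bb first appears in 6.12.11.
FIXED_IN_6_12 = (6, 12, 11)


def parse_version(version):
    """Turn a plain 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_netfs_fix(version):
    """Return True if a 6.12.y version is at or after the fixed release."""
    parsed = parse_version(version)
    # Only 6.12.y is handled here; other series are reported as False.
    return parsed[:2] == FIXED_IN_6_12[:2] and parsed >= FIXED_IN_6_12


if __name__ == "__main__":
    # Versions mentioned in this thread.
    for version in ("6.12.3", "6.12.15", "6.12.16"):
        print(version, has_netfs_fix(version))
```

Under that assumption, the trace captured on 6.12.3 could still be the already-fixed bug, whereas a lockup on 6.12.15 or 6.12.16 would have to come from something else.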