Control: tags -1 + moreinfo

Hi Paul,

On Sat, Feb 22, 2025 at 04:26:50PM -0500, Paul DeKraker wrote:
> Source: linux
> Severity: important
> Tags: upstream
> X-Debbugs-Cc: pdekraker+deb...@gmail.com
>
> Dear Maintainer,
>
> I am experiencing an issue where my system completely locks up when attempting
> a large network file transfer from a mounted smb share. When copying a file
> above 1 GB I am consistienly experiencing this behavior. I tried going back to
> the 6.12.3 kernel which is the oldest I have on the system and the probelm is
> there as well. Looking at the dump below my guess is that it was introduced
> with 6.12 and netfs/read_collect.c. I have been unable to get a dump with
> 6.12.15, but the behavior is consistient. The transfer starts, but after a few
> seconds the whole system locks up.
>
>
> 2/22/25 8:38 AM ------------[ cut here ]------------
> 2/22/25 8:38 AM WARNING CPU: 4 PID: 291 at fs/netfs/read_collect.c:110
> netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> 2/22/25 8:38 AM Modules linked in ccm nls_utf8 cifs cifs_arc4
> nls_ucs2_utils cifs_md4 dns_resolver netfs snd_seq_dummy snd_hrtimer snd_seq
> snd_seq_device xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
> nf_reject_ipv4
> xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> nf_defrag_ipv4 nf_tables bridge stp llc rfcomm cmac algif_hash algif_skcipher
> af_alg overlay qrtr bnep amd_atl intel_rapl_msr intel_rapl_common sunrpc
> edac_mce_amd binfmt_misc kvm_amd snd_hda_codec_realtek snd_hda_codec_generic
> snd_hda_scodec_component kvm snd_hda_codec_hdmi nls_ascii crct10dif_pclmul
> nls_cp437 snd_hda_intel crc32_pclmul ghash_clmulni_intel snd_intel_dspcfg
> btusb
> snd_intel_sdw_acpi sha512_ssse3 vfat fat snd_hda_codec sha256_ssse3 btrtl
> sha1_ssse3 btintel snd_hda_core aesni_intel btbcm ahci btmtk snd_hwdep
> gf128mul
> r8169 crypto_simd libahci snd_pcm bluetooth cryptd realtek snd_timer libata
> rapl sp5100_tco snd watchdog wmi_bmof mdio_devres gigabyte_wmi soundcore
> i2c_piix4 pcspkr i2c_smbus rfkill libphy scsi_mod ccp k10temp
> 2/22/25 8:38 AM scsi_common button lm92 msr dm_mod parport_pc ppdev lp
> parport efi_pstore configfs nfnetlink ip_tables x_tables autofs4 ext4 mbcache
> jbd2 razerkbd(OE) efivarfs raid10 raid456 libcrc32c crc32c_generic
> async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1
> raid0 md_mod evdev joydev razermouse(OE) hid_generic usbhid hid amdgpu video
> amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper
> drm_buddy drm_display_helper xhci_pci xhci_hcd drm_kms_helper drm nvme usbcore
> cec rc_core nvme_core crc32c_intel crc16 usb_common wmi gpio_amdpt
> gpio_generic
> 2/22/25 8:38 AM CPU 4 UID: 0 PID: 291 Comm: kworker/4:2 Tainted: G OE
> 6.12.3-amd64 #1 Debian 6.12.3-1
> 2/22/25 8:38 AM Tainted [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> 2/22/25 8:38 AM Hardware name Gigabyte Technology Co., Ltd. B550M AORUS
> PRO-P/B550M AORUS PRO-P, BIOS F13 07/08/2021
> 2/22/25 8:38 AM Workqueue cifsiod smb2_readv_worker [cifs]
> 2/22/25 8:38 AM RIP 0010:netfs_consume_read_data.isra.0+0x67f/0xb50
> [netfs]
> 2/22/25 8:38 AM Code 43 28 48 39 c8 0f 84 04 02 00 00 4c 89 40 58 0f 1f 44
> 00 00 0f 1f 44 00 00 48 8b 43 78 48 89 43 68 48 89 43 70 e9 6e fe ff ff <0f>
> 0b
> 49 8b 47 70 48 8b 74 24 30 8b 7c 24 38 41 0f b7 97 96 00 00
> 2/22/25 8:38 AM RSP 0018:ffffab8240b07dd8 EFLAGS: 00010246
> 2/22/25 8:38 AM RAX 0000000000000000 RBX: 0000000000000000 RCX:
> 000000003b200000
> 2/22/25 8:38 AM RDX 000000003b600000 RSI: 000000003b600000 RDI:
> ffffdd69cfb90000
> 2/22/25 8:38 AM RBP 0000000000000004 R08: 0000000000000002 R09:
> 0000000000400000
> 2/22/25 8:38 AM R10 0000000000000008 R11: 0000000000000008 R12:
> ffff9cbba2abdaa8
> 2/22/25 8:38 AM R13 0000000000200000 R14: 000000003b400000 R15:
> ffff9cbd00ce2280
> 2/22/25 8:38 AM FS 0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> knlGS:0000000000000000
> 2/22/25 8:38 AM CS 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 2/22/25 8:38 AM CR2 00007f724be0412c CR3: 000000010a98e000 CR4:
> 0000000000f50ef0
> 2/22/25 8:38 AM PKRU 55555554
> 2/22/25 8:38 AM Call Trace
> 2/22/25 8:38 AM <TASK>
> 2/22/25 8:38 AM ? netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> 2/22/25 8:38 AM ? __warn.cold+0x93/0xf6
> 2/22/25 8:38 AM ? netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> 2/22/25 8:38 AM ? report_bug+0xff/0x140
> 2/22/25 8:38 AM ? handle_bug+0x58/0x90
> 2/22/25 8:38 AM ? exc_invalid_op+0x17/0x70
> 2/22/25 8:38 AM ? asm_exc_invalid_op+0x1a/0x20
> 2/22/25 8:38 AM ? netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> 2/22/25 8:38 AM ? netfs_consume_read_data.isra.0+0x48b/0xb50 [netfs]
> 2/22/25 8:38 AM ? finish_task_switch.isra.0+0x97/0x2c0
> 2/22/25 8:38 AM netfs_read_subreq_terminated+0x2ab/0x3f0 [netfs]
> 2/22/25 8:38 AM process_one_work+0x177/0x330
> 2/22/25 8:38 AM worker_thread+0x252/0x390
> 2/22/25 8:38 AM ? __pfx_worker_thread+0x10/0x10
> 2/22/25 8:38 AM kthread+0xd2/0x100
> 2/22/25 8:38 AM ? __pfx_kthread+0x10/0x10
> 2/22/25 8:38 AM ret_from_fork+0x34/0x50
> 2/22/25 8:38 AM ? __pfx_kthread+0x10/0x10
> 2/22/25 8:38 AM ret_from_fork_asm+0x1a/0x30
> 2/22/25 8:38 AM </TASK>
> 2/22/25 8:38 AM ---[ end trace 0000000000000000 ]---
> 2/22/25 8:38 AM netfs R=0000003e[2] s=3b200000-3b7fffff
> ctl=400000/600000/600000 sl=4
> 2/22/25 8:38 AM netfs folioq: orders=09090909
> 2/22/25 8:38 AM BUG kernel NULL pointer dereference, address:
> 0000000000000000
> 2/22/25 8:38 AM #PF supervisor write access in kernel mode
> 2/22/25 8:38 AM #PF error_code(0x0002) - not-present page
> 2/22/25 8:38 AM PGD 0 P4D 0
> 2/22/25 8:38 AM Oops Oops: 0002 [#1] PREEMPT SMP NOPTI
> 2/22/25 8:38 AM CPU 4 UID: 0 PID: 291 Comm: kworker/4:2 Tainted: G W OE
> 6.12.3-amd64 #1 Debian 6.12.3-1
> 2/22/25 8:38 AM Tainted [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> 2/22/25 8:38 AM Hardware name Gigabyte Technology Co., Ltd. B550M AORUS
> PRO-P/B550M AORUS PRO-P, BIOS F13 07/08/2021
> 2/22/25 8:38 AM Workqueue cifsiod smb2_readv_worker [cifs]
> 2/22/25 8:38 AM RIP 0010:netfs_consume_read_data.isra.0+0x2db/0xb50
> [netfs]
> 2/22/25 8:38 AM Code c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f e9 e9 86 41 e8 8b
> 6c 24 38 48 8b 44 24 28 48 89 f3 49 2b 5f 60 49 89 5f 78 4c 8b 6c e8 08 <f0>
> 41
> 80 4d 00 08 48 8b 44 24 30 48 8b 80 58 02 00 00 a9 00 00 00
> 2/22/25 8:38 AM RSP 0018:ffffab8240b07dd8 EFLAGS: 00010206
> 2/22/25 8:38 AM RAX ffff9cbdf4cb7200 RBX: 0000000000600000 RCX:
> 0000000000000027
> 2/22/25 8:38 AM RDX 0000000000000000 RSI: 000000003b800000 RDI:
> ffff9cc93ee21780
> 2/22/25 8:38 AM RBP 0000000000000004 R08: 0000000000000000 R09:
> ffffab8240b07c50
> 2/22/25 8:38 AM R10 ffffffffab4b42c8 R11: 0000000000000003 R12:
> ffff9cbba2abdaa8
> 2/22/25 8:38 AM R13 0000000000000000 R14: 000000003b600000 R15:
> ffff9cbd00ce2280
> 2/22/25 8:38 AM FS 0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> knlGS:0000000000000000
> 2/22/25 8:38 AM CS 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 2/22/25 8:38 AM CR2 0000000000000000 CR3: 000000010a98e000 CR4:
> 0000000000f50ef0
> 2/22/25 8:38 AM PKRU 55555554
> 2/22/25 8:38 AM Call Trace
> 2/22/25 8:38 AM <TASK>
> 2/22/25 8:38 AM ? __die_body.cold+0x19/0x27
> 2/22/25 8:38 AM ? page_fault_oops+0x15a/0x2d0
> 2/22/25 8:38 AM ? exc_page_fault+0x7e/0x180
> 2/22/25 8:38 AM ? asm_exc_page_fault+0x26/0x30
> 2/22/25 8:38 AM ? netfs_consume_read_data.isra.0+0x2db/0xb50 [netfs]
> 2/22/25 8:38 AM ? finish_task_switch.isra.0+0x97/0x2c0
> 2/22/25 8:38 AM netfs_read_subreq_terminated+0x2ab/0x3f0 [netfs]
> 2/22/25 8:38 AM process_one_work+0x177/0x330
> 2/22/25 8:38 AM worker_thread+0x252/0x390
> 2/22/25 8:38 AM ? __pfx_worker_thread+0x10/0x10
> 2/22/25 8:38 AM kthread+0xd2/0x100
> 2/22/25 8:38 AM ? __pfx_kthread+0x10/0x10
> 2/22/25 8:38 AM ret_from_fork+0x34/0x50
> 2/22/25 8:38 AM ? __pfx_kthread+0x10/0x10
> 2/22/25 8:38 AM ret_from_fork_asm+0x1a/0x30
> 2/22/25 8:38 AM </TASK>
> 2/22/25 8:38 AM Modules linked in ccm nls_utf8 cifs cifs_arc4
> nls_ucs2_utils cifs_md4 dns_resolver netfs snd_seq_dummy snd_hrtimer snd_seq
> snd_seq_device xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
> nf_reject_ipv4
> xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> nf_defrag_ipv4 nf_tables bridge stp llc rfcomm cmac algif_hash algif_skcipher
> af_alg overlay qrtr bnep amd_atl intel_rapl_msr intel_rapl_common sunrpc
> edac_mce_amd binfmt_misc kvm_amd snd_hda_codec_realtek snd_hda_codec_generic
> snd_hda_scodec_component kvm snd_hda_codec_hdmi nls_ascii crct10dif_pclmul
> nls_cp437 snd_hda_intel crc32_pclmul ghash_clmulni_intel snd_intel_dspcfg
> btusb
> snd_intel_sdw_acpi sha512_ssse3 vfat fat snd_hda_codec sha256_ssse3 btrtl
> sha1_ssse3 btintel snd_hda_core aesni_intel btbcm ahci btmtk snd_hwdep
> gf128mul
> r8169 crypto_simd libahci snd_pcm bluetooth cryptd realtek snd_timer libata
> rapl sp5100_tco snd watchdog wmi_bmof mdio_devres gigabyte_wmi soundcore
> i2c_piix4 pcspkr i2c_smbus rfkill libphy scsi_mod ccp k10temp
> 2/22/25 8:38 AM scsi_common button lm92 msr dm_mod parport_pc ppdev lp
> parport efi_pstore configfs nfnetlink ip_tables x_tables autofs4 ext4 mbcache
> jbd2 razerkbd(OE) efivarfs raid10 raid456 libcrc32c crc32c_generic
> async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1
> raid0 md_mod evdev joydev razermouse(OE) hid_generic usbhid hid amdgpu video
> amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper
> drm_buddy drm_display_helper xhci_pci xhci_hcd drm_kms_helper drm nvme usbcore
> cec rc_core nvme_core crc32c_intel crc16 usb_common wmi gpio_amdpt
> gpio_generic
> 2/22/25 8:38 AM CR2 0000000000000000
> 2/22/25 8:38 AM ---[ end trace 0000000000000000 ]---
> 2/22/25 8:38 AM RIP 0010:netfs_consume_read_data.isra.0+0x2db/0xb50
> [netfs]
> 2/22/25 8:38 AM Code c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f e9 e9 86 41 e8 8b
> 6c 24 38 48 8b 44 24 28 48 89 f3 49 2b 5f 60 49 89 5f 78 4c 8b 6c e8 08 <f0>
> 41
> 80 4d 00 08 48 8b 44 24 30 48 8b 80 58 02 00 00 a9 00 00 00
> 2/22/25 8:38 AM RSP 0018:ffffab8240b07dd8 EFLAGS: 00010206
> 2/22/25 8:38 AM RAX ffff9cbdf4cb7200 RBX: 0000000000600000 RCX:
> 0000000000000027
> 2/22/25 8:38 AM RDX 0000000000000000 RSI: 000000003b800000 RDI:
> ffff9cc93ee21780
> 2/22/25 8:38 AM RBP 0000000000000004 R08: 0000000000000000 R09:
> ffffab8240b07c50
> 2/22/25 8:38 AM R10 ffffffffab4b42c8 R11: 0000000000000003 R12:
> ffff9cbba2abdaa8
> 2/22/25 8:38 AM R13 0000000000000000 R14: 000000003b600000 R15:
> ffff9cbd00ce2280
> 2/22/25 8:38 AM FS 0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> knlGS:0000000000000000
> 2/22/25 8:38 AM CS 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 2/22/25 8:38 AM CR2 0000000000000000 CR3: 000000010a98e000 CR4:
> 0000000000f50ef0
> 2/22/25 8:38 AM PKRU 55555554
> 2/22/25 8:38 AM note kworker/4:2[291] exited with irqs disabled

Thanks for the report. At first this sounded very much like
https://lore.kernel.org/all/CANT5p=qBwjBm-D8soFVVtswGEfmMtQXVW83=tnfutvyhefq...@mail.gmail.com/
which is fixed by c8b90d40d5bb ("netfs: Fix non-contiguous donation
between completed reads"). That fix, however, already landed in
6.13-rc7 and 6.12.11.

So can you confirm: with the most recent kernel in unstable you no
longer get the above trace, but you still observe stalls when
transferring a large file? In that case this might be unrelated to
the above issue.

Regards,
Salvatore