Control: tags -1 + moreinfo

Hi Paul,

On Sat, Feb 22, 2025 at 04:26:50PM -0500, Paul DeKraker wrote:
> Source: linux
> Severity: important
> Tags: upstream
> X-Debbugs-Cc: pdekraker+deb...@gmail.com
> 
> Dear Maintainer,
> 
> I am experiencing an issue where my system completely locks up when attempting
> a large network file transfer from a mounted smb share. When copying a file
> above 1 GB I am consistently experiencing this behavior. I tried going back to
> the 6.12.3 kernel which is the oldest I have on the system and the problem is
> there as well. Looking at the dump below my guess is that it was introduced
> with 6.12 and netfs/read_collect.c.  I have been unable to get a dump with
> 6.12.15, but the behavior is consistent. The transfer starts, but after a few
> seconds the whole system locks up.
> 
> 
> 2/22/25 8:38 AM         ------------[ cut here ]------------
> 2/22/25 8:38 AM WARNING CPU: 4 PID: 291 at fs/netfs/read_collect.c:110
> netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> 2/22/25 8:38 AM Modules linked in       ccm nls_utf8 cifs cifs_arc4
> nls_ucs2_utils cifs_md4 dns_resolver netfs snd_seq_dummy snd_hrtimer snd_seq
> snd_seq_device xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT 
> nf_reject_ipv4
> xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> nf_defrag_ipv4 nf_tables bridge stp llc rfcomm cmac algif_hash algif_skcipher
> af_alg overlay qrtr bnep amd_atl intel_rapl_msr intel_rapl_common sunrpc
> edac_mce_amd binfmt_misc kvm_amd snd_hda_codec_realtek snd_hda_codec_generic
> snd_hda_scodec_component kvm snd_hda_codec_hdmi nls_ascii crct10dif_pclmul
> nls_cp437 snd_hda_intel crc32_pclmul ghash_clmulni_intel snd_intel_dspcfg 
> btusb
> snd_intel_sdw_acpi sha512_ssse3 vfat fat snd_hda_codec sha256_ssse3 btrtl
> sha1_ssse3 btintel snd_hda_core aesni_intel btbcm ahci btmtk snd_hwdep 
> gf128mul
> r8169 crypto_simd libahci snd_pcm bluetooth cryptd realtek snd_timer libata
> rapl sp5100_tco snd watchdog wmi_bmof mdio_devres gigabyte_wmi soundcore
> i2c_piix4 pcspkr i2c_smbus rfkill libphy scsi_mod ccp k10temp
> 2/22/25 8:38 AM         scsi_common button lm92 msr dm_mod parport_pc ppdev lp
> parport efi_pstore configfs nfnetlink ip_tables x_tables autofs4 ext4 mbcache
> jbd2 razerkbd(OE) efivarfs raid10 raid456 libcrc32c crc32c_generic
> async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1
> raid0 md_mod evdev joydev razermouse(OE) hid_generic usbhid hid amdgpu video
> amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper
> drm_buddy drm_display_helper xhci_pci xhci_hcd drm_kms_helper drm nvme usbcore
> cec rc_core nvme_core crc32c_intel crc16 usb_common wmi gpio_amdpt 
> gpio_generic
> 2/22/25 8:38 AM CPU     4 UID: 0 PID: 291 Comm: kworker/4:2 Tainted: G OE
> 6.12.3-amd64 #1 Debian 6.12.3-1
> 2/22/25 8:38 AM Tainted [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> 2/22/25 8:38 AM Hardware name   Gigabyte Technology Co., Ltd. B550M AORUS
> PRO-P/B550M AORUS PRO-P, BIOS F13 07/08/2021
> 2/22/25 8:38 AM Workqueue       cifsiod smb2_readv_worker [cifs]
> 2/22/25 8:38 AM RIP     0010:netfs_consume_read_data.isra.0+0x67f/0xb50 
> [netfs]
> 2/22/25 8:38 AM Code    43 28 48 39 c8 0f 84 04 02 00 00 4c 89 40 58 0f 1f 44
> 00 00 0f 1f 44 00 00 48 8b 43 78 48 89 43 68 48 89 43 70 e9 6e fe ff ff <0f> 
> 0b
> 49 8b 47 70 48 8b 74 24 30 8b 7c 24 38 41 0f b7 97 96 00 00
> 2/22/25 8:38 AM RSP     0018:ffffab8240b07dd8 EFLAGS: 00010246
> 2/22/25 8:38 AM RAX     0000000000000000 RBX: 0000000000000000 RCX:
> 000000003b200000
> 2/22/25 8:38 AM RDX     000000003b600000 RSI: 000000003b600000 RDI:
> ffffdd69cfb90000
> 2/22/25 8:38 AM RBP     0000000000000004 R08: 0000000000000002 R09:
> 0000000000400000
> 2/22/25 8:38 AM R10     0000000000000008 R11: 0000000000000008 R12:
> ffff9cbba2abdaa8
> 2/22/25 8:38 AM R13     0000000000200000 R14: 000000003b400000 R15:
> ffff9cbd00ce2280
> 2/22/25 8:38 AM FS      0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> knlGS:0000000000000000
> 2/22/25 8:38 AM CS      0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 2/22/25 8:38 AM CR2     00007f724be0412c CR3: 000000010a98e000 CR4:
> 0000000000f50ef0
> 2/22/25 8:38 AM PKRU    55555554
> 2/22/25 8:38 AM Call Trace
> 2/22/25 8:38 AM         <TASK>
> 2/22/25 8:38 AM         ? netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> 2/22/25 8:38 AM         ? __warn.cold+0x93/0xf6
> 2/22/25 8:38 AM         ? netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> 2/22/25 8:38 AM         ? report_bug+0xff/0x140
> 2/22/25 8:38 AM         ? handle_bug+0x58/0x90
> 2/22/25 8:38 AM         ? exc_invalid_op+0x17/0x70
> 2/22/25 8:38 AM         ? asm_exc_invalid_op+0x1a/0x20
> 2/22/25 8:38 AM         ? netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> 2/22/25 8:38 AM         ? netfs_consume_read_data.isra.0+0x48b/0xb50 [netfs]
> 2/22/25 8:38 AM         ? finish_task_switch.isra.0+0x97/0x2c0
> 2/22/25 8:38 AM         netfs_read_subreq_terminated+0x2ab/0x3f0 [netfs]
> 2/22/25 8:38 AM         process_one_work+0x177/0x330
> 2/22/25 8:38 AM         worker_thread+0x252/0x390
> 2/22/25 8:38 AM         ? __pfx_worker_thread+0x10/0x10
> 2/22/25 8:38 AM         kthread+0xd2/0x100
> 2/22/25 8:38 AM         ? __pfx_kthread+0x10/0x10
> 2/22/25 8:38 AM         ret_from_fork+0x34/0x50
> 2/22/25 8:38 AM         ? __pfx_kthread+0x10/0x10
> 2/22/25 8:38 AM         ret_from_fork_asm+0x1a/0x30
> 2/22/25 8:38 AM         </TASK>
> 2/22/25 8:38 AM         ---[ end trace 0000000000000000 ]---
> 2/22/25 8:38 AM netfs   R=0000003e[2] s=3b200000-3b7fffff
> ctl=400000/600000/600000 sl=4
> 2/22/25 8:38 AM netfs   folioq: orders=09090909
> 2/22/25 8:38 AM BUG     kernel NULL pointer dereference, address:
> 0000000000000000
> 2/22/25 8:38 AM #PF     supervisor write access in kernel mode
> 2/22/25 8:38 AM #PF     error_code(0x0002) - not-present page
> 2/22/25 8:38 AM         PGD 0 P4D 0
> 2/22/25 8:38 AM Oops    Oops: 0002 [#1] PREEMPT SMP NOPTI
> 2/22/25 8:38 AM CPU     4 UID: 0 PID: 291 Comm: kworker/4:2 Tainted: G W OE
> 6.12.3-amd64 #1 Debian 6.12.3-1
> 2/22/25 8:38 AM Tainted [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> 2/22/25 8:38 AM Hardware name   Gigabyte Technology Co., Ltd. B550M AORUS
> PRO-P/B550M AORUS PRO-P, BIOS F13 07/08/2021
> 2/22/25 8:38 AM Workqueue       cifsiod smb2_readv_worker [cifs]
> 2/22/25 8:38 AM RIP     0010:netfs_consume_read_data.isra.0+0x2db/0xb50 
> [netfs]
> 2/22/25 8:38 AM Code    c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f e9 e9 86 41 e8 8b
> 6c 24 38 48 8b 44 24 28 48 89 f3 49 2b 5f 60 49 89 5f 78 4c 8b 6c e8 08 <f0> 
> 41
> 80 4d 00 08 48 8b 44 24 30 48 8b 80 58 02 00 00 a9 00 00 00
> 2/22/25 8:38 AM RSP     0018:ffffab8240b07dd8 EFLAGS: 00010206
> 2/22/25 8:38 AM RAX     ffff9cbdf4cb7200 RBX: 0000000000600000 RCX:
> 0000000000000027
> 2/22/25 8:38 AM RDX     0000000000000000 RSI: 000000003b800000 RDI:
> ffff9cc93ee21780
> 2/22/25 8:38 AM RBP     0000000000000004 R08: 0000000000000000 R09:
> ffffab8240b07c50
> 2/22/25 8:38 AM R10     ffffffffab4b42c8 R11: 0000000000000003 R12:
> ffff9cbba2abdaa8
> 2/22/25 8:38 AM R13     0000000000000000 R14: 000000003b600000 R15:
> ffff9cbd00ce2280
> 2/22/25 8:38 AM FS      0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> knlGS:0000000000000000
> 2/22/25 8:38 AM CS      0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 2/22/25 8:38 AM CR2     0000000000000000 CR3: 000000010a98e000 CR4:
> 0000000000f50ef0
> 2/22/25 8:38 AM PKRU    55555554
> 2/22/25 8:38 AM Call Trace
> 2/22/25 8:38 AM         <TASK>
> 2/22/25 8:38 AM         ? __die_body.cold+0x19/0x27
> 2/22/25 8:38 AM         ? page_fault_oops+0x15a/0x2d0
> 2/22/25 8:38 AM         ? exc_page_fault+0x7e/0x180
> 2/22/25 8:38 AM         ? asm_exc_page_fault+0x26/0x30
> 2/22/25 8:38 AM         ? netfs_consume_read_data.isra.0+0x2db/0xb50 [netfs]
> 2/22/25 8:38 AM         ? finish_task_switch.isra.0+0x97/0x2c0
> 2/22/25 8:38 AM         netfs_read_subreq_terminated+0x2ab/0x3f0 [netfs]
> 2/22/25 8:38 AM         process_one_work+0x177/0x330
> 2/22/25 8:38 AM         worker_thread+0x252/0x390
> 2/22/25 8:38 AM         ? __pfx_worker_thread+0x10/0x10
> 2/22/25 8:38 AM         kthread+0xd2/0x100
> 2/22/25 8:38 AM         ? __pfx_kthread+0x10/0x10
> 2/22/25 8:38 AM         ret_from_fork+0x34/0x50
> 2/22/25 8:38 AM         ? __pfx_kthread+0x10/0x10
> 2/22/25 8:38 AM         ret_from_fork_asm+0x1a/0x30
> 2/22/25 8:38 AM         </TASK>
> 2/22/25 8:38 AM Modules linked in       ccm nls_utf8 cifs cifs_arc4
> nls_ucs2_utils cifs_md4 dns_resolver netfs snd_seq_dummy snd_hrtimer snd_seq
> snd_seq_device xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT 
> nf_reject_ipv4
> xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> nf_defrag_ipv4 nf_tables bridge stp llc rfcomm cmac algif_hash algif_skcipher
> af_alg overlay qrtr bnep amd_atl intel_rapl_msr intel_rapl_common sunrpc
> edac_mce_amd binfmt_misc kvm_amd snd_hda_codec_realtek snd_hda_codec_generic
> snd_hda_scodec_component kvm snd_hda_codec_hdmi nls_ascii crct10dif_pclmul
> nls_cp437 snd_hda_intel crc32_pclmul ghash_clmulni_intel snd_intel_dspcfg 
> btusb
> snd_intel_sdw_acpi sha512_ssse3 vfat fat snd_hda_codec sha256_ssse3 btrtl
> sha1_ssse3 btintel snd_hda_core aesni_intel btbcm ahci btmtk snd_hwdep 
> gf128mul
> r8169 crypto_simd libahci snd_pcm bluetooth cryptd realtek snd_timer libata
> rapl sp5100_tco snd watchdog wmi_bmof mdio_devres gigabyte_wmi soundcore
> i2c_piix4 pcspkr i2c_smbus rfkill libphy scsi_mod ccp k10temp
> 2/22/25 8:38 AM         scsi_common button lm92 msr dm_mod parport_pc ppdev lp
> parport efi_pstore configfs nfnetlink ip_tables x_tables autofs4 ext4 mbcache
> jbd2 razerkbd(OE) efivarfs raid10 raid456 libcrc32c crc32c_generic
> async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1
> raid0 md_mod evdev joydev razermouse(OE) hid_generic usbhid hid amdgpu video
> amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper
> drm_buddy drm_display_helper xhci_pci xhci_hcd drm_kms_helper drm nvme usbcore
> cec rc_core nvme_core crc32c_intel crc16 usb_common wmi gpio_amdpt 
> gpio_generic
> 2/22/25 8:38 AM CR2     0000000000000000
> 2/22/25 8:38 AM         ---[ end trace 0000000000000000 ]---
> 2/22/25 8:38 AM RIP     0010:netfs_consume_read_data.isra.0+0x2db/0xb50 
> [netfs]
> 2/22/25 8:38 AM Code    c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f e9 e9 86 41 e8 8b
> 6c 24 38 48 8b 44 24 28 48 89 f3 49 2b 5f 60 49 89 5f 78 4c 8b 6c e8 08 <f0> 
> 41
> 80 4d 00 08 48 8b 44 24 30 48 8b 80 58 02 00 00 a9 00 00 00
> 2/22/25 8:38 AM RSP     0018:ffffab8240b07dd8 EFLAGS: 00010206
> 2/22/25 8:38 AM RAX     ffff9cbdf4cb7200 RBX: 0000000000600000 RCX:
> 0000000000000027
> 2/22/25 8:38 AM RDX     0000000000000000 RSI: 000000003b800000 RDI:
> ffff9cc93ee21780
> 2/22/25 8:38 AM RBP     0000000000000004 R08: 0000000000000000 R09:
> ffffab8240b07c50
> 2/22/25 8:38 AM R10     ffffffffab4b42c8 R11: 0000000000000003 R12:
> ffff9cbba2abdaa8
> 2/22/25 8:38 AM R13     0000000000000000 R14: 000000003b600000 R15:
> ffff9cbd00ce2280
> 2/22/25 8:38 AM FS      0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> knlGS:0000000000000000
> 2/22/25 8:38 AM CS      0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 2/22/25 8:38 AM CR2     0000000000000000 CR3: 000000010a98e000 CR4:
> 0000000000f50ef0
> 2/22/25 8:38 AM PKRU    55555554
> 2/22/25 8:38 AM note    kworker/4:2[291] exited with irqs disabled

Thanks for the report. At first this sounded very much like
https://lore.kernel.org/all/CANT5p=qBwjBm-D8soFVVtswGEfmMtQXVW83=tnfutvyhefq...@mail.gmail.com/
which was fixed by c8b90d40d5bb ("netfs: Fix non-contiguous donation
between completed reads"). That fix, however, already landed in
6.13-rc7 and 6.12.11.

So can you confirm: with the most recent kernel in unstable, you no
longer get the above trace, but you still observe stalls when
transferring a large file? If so, this might be orthogonal to the
issue above.
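
If it helps when comparing kernels, here is a rough sketch for checking
whether a given release string should already carry c8b90d40d5bb, based
only on the two versions named above (6.13-rc7 and 6.12.11). The function
names are made up for illustration, and the parsing assumes a `uname -r`
style string such as "6.12.3-amd64"; series older than 6.12 are outside
what this message covers and are simply reported as not fixed.

```python
import re


def parse_release(release):
    """Split a `uname -r` style string into ((major, minor, patch), rc).

    rc is None for non-rc kernels. Assumes strings shaped like
    '6.12.3-amd64' or '6.13-rc7-amd64' (an assumption, not a spec).
    """
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor = int(m.group(1)), int(m.group(2))
    patch = int(m.group(3) or 0)
    rc = int(m.group(4)) if m.group(4) else None
    return (major, minor, patch), rc


def has_netfs_fix(release):
    """Return True if the release is at or past a version named above."""
    (major, minor, patch), rc = parse_release(release)
    if (major, minor) == (6, 12):
        # Stable series: fix landed in 6.12.11.
        return patch >= 11
    if (major, minor) == (6, 13):
        # Mainline: fix landed in 6.13-rc7, so final 6.13 and later have it.
        return rc is None or rc >= 7
    if (major, minor) > (6, 13):
        return True
    # Older series are not covered by this message; report False.
    return False
```

For the running kernel, `has_netfs_fix(platform.release())` (after
`import platform`) would give the answer; for the 6.12.3 kernel in the
trace it returns False, and for 6.12.15 it returns True.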

Regards,
Salvatore
