I just tried 6.12.16 and, as with 6.12.15, the system locks up (becomes
completely unresponsive) and needs a hard reset when I attempt a large
transfer from a network share to the local computer, so this is probably a
separate issue. If there is anything more I can do to capture this
behavior, please let me know.

Paul

On Sun, Feb 23, 2025 at 10:36 AM Salvatore Bonaccorso <car...@debian.org>
wrote:

> Control: tags -1 + moreinfo
>
> Hi Paul,
>
> On Sat, Feb 22, 2025 at 04:26:50PM -0500, Paul DeKraker wrote:
> > Source: linux
> > Severity: important
> > Tags: upstream
> > X-Debbugs-Cc: pdekraker+deb...@gmail.com
> >
> > Dear Maintainer,
> >
> > I am experiencing an issue where my system completely locks up when
> > attempting a large network file transfer from a mounted SMB share. When
> > copying a file above 1 GB I consistently experience this behavior. I tried
> > going back to the 6.12.3 kernel, which is the oldest I have on the system,
> > and the problem is there as well. Looking at the dump below, my guess is
> > that it was introduced with 6.12 and netfs/read_collect.c. I have been
> > unable to get a dump with 6.12.15, but the behavior is consistent. The
> > transfer starts, but after a few seconds the whole system locks up.
> >
> >
> > 2/22/25 8:38 AM         ------------[ cut here ]------------
> > 2/22/25 8:38 AM WARNING CPU: 4 PID: 291 at fs/netfs/read_collect.c:110
> > netfs_consume_read_data.isra.0+0x67f/0xb50 [netfs]
> > 2/22/25 8:38 AM Modules linked in       ccm nls_utf8 cifs cifs_arc4
> > nls_ucs2_utils cifs_md4 dns_resolver netfs snd_seq_dummy snd_hrtimer
> snd_seq
> > snd_seq_device xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
> nf_reject_ipv4
> > xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> > nf_defrag_ipv4 nf_tables bridge stp llc rfcomm cmac algif_hash
> algif_skcipher
> > af_alg overlay qrtr bnep amd_atl intel_rapl_msr intel_rapl_common sunrpc
> > edac_mce_amd binfmt_misc kvm_amd snd_hda_codec_realtek
> snd_hda_codec_generic
> > snd_hda_scodec_component kvm snd_hda_codec_hdmi nls_ascii
> crct10dif_pclmul
> > nls_cp437 snd_hda_intel crc32_pclmul ghash_clmulni_intel
> snd_intel_dspcfg btusb
> > snd_intel_sdw_acpi sha512_ssse3 vfat fat snd_hda_codec sha256_ssse3 btrtl
> > sha1_ssse3 btintel snd_hda_core aesni_intel btbcm ahci btmtk snd_hwdep
> gf128mul
> > r8169 crypto_simd libahci snd_pcm bluetooth cryptd realtek snd_timer
> libata
> > rapl sp5100_tco snd watchdog wmi_bmof mdio_devres gigabyte_wmi soundcore
> > i2c_piix4 pcspkr i2c_smbus rfkill libphy scsi_mod ccp k10temp
> > 2/22/25 8:38 AM         scsi_common button lm92 msr dm_mod parport_pc
> ppdev lp
> > parport efi_pstore configfs nfnetlink ip_tables x_tables autofs4 ext4
> mbcache
> > jbd2 razerkbd(OE) efivarfs raid10 raid456 libcrc32c crc32c_generic
> > async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq
> raid1
> > raid0 md_mod evdev joydev razermouse(OE) hid_generic usbhid hid amdgpu
> video
> > amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched
> drm_suballoc_helper
> > drm_buddy drm_display_helper xhci_pci xhci_hcd drm_kms_helper drm nvme
> usbcore
> > cec rc_core nvme_core crc32c_intel crc16 usb_common wmi gpio_amdpt
> gpio_generic
> > 2/22/25 8:38 AM CPU     4 UID: 0 PID: 291 Comm: kworker/4:2 Tainted: G OE
> > 6.12.3-amd64 #1 Debian 6.12.3-1
> > 2/22/25 8:38 AM Tainted [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> > 2/22/25 8:38 AM Hardware name   Gigabyte Technology Co., Ltd. B550M AORUS
> > PRO-P/B550M AORUS PRO-P, BIOS F13 07/08/2021
> > 2/22/25 8:38 AM Workqueue       cifsiod smb2_readv_worker [cifs]
> > 2/22/25 8:38 AM RIP     0010:netfs_consume_read_data.isra.0+0x67f/0xb50
> [netfs]
> > 2/22/25 8:38 AM Code    43 28 48 39 c8 0f 84 04 02 00 00 4c 89 40 58 0f
> 1f 44
> > 00 00 0f 1f 44 00 00 48 8b 43 78 48 89 43 68 48 89 43 70 e9 6e fe ff ff
> <0f> 0b
> > 49 8b 47 70 48 8b 74 24 30 8b 7c 24 38 41 0f b7 97 96 00 00
> > 2/22/25 8:38 AM RSP     0018:ffffab8240b07dd8 EFLAGS: 00010246
> > 2/22/25 8:38 AM RAX     0000000000000000 RBX: 0000000000000000 RCX:
> > 000000003b200000
> > 2/22/25 8:38 AM RDX     000000003b600000 RSI: 000000003b600000 RDI:
> > ffffdd69cfb90000
> > 2/22/25 8:38 AM RBP     0000000000000004 R08: 0000000000000002 R09:
> > 0000000000400000
> > 2/22/25 8:38 AM R10     0000000000000008 R11: 0000000000000008 R12:
> > ffff9cbba2abdaa8
> > 2/22/25 8:38 AM R13     0000000000200000 R14: 000000003b400000 R15:
> > ffff9cbd00ce2280
> > 2/22/25 8:38 AM FS      0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> > knlGS:0000000000000000
> > 2/22/25 8:38 AM CS      0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > 2/22/25 8:38 AM CR2     00007f724be0412c CR3: 000000010a98e000 CR4:
> > 0000000000f50ef0
> > 2/22/25 8:38 AM PKRU    55555554
> > 2/22/25 8:38 AM Call Trace
> > 2/22/25 8:38 AM         <TASK>
> > 2/22/25 8:38 AM         ? netfs_consume_read_data.isra.0+0x67f/0xb50
> [netfs]
> > 2/22/25 8:38 AM         ? __warn.cold+0x93/0xf6
> > 2/22/25 8:38 AM         ? netfs_consume_read_data.isra.0+0x67f/0xb50
> [netfs]
> > 2/22/25 8:38 AM         ? report_bug+0xff/0x140
> > 2/22/25 8:38 AM         ? handle_bug+0x58/0x90
> > 2/22/25 8:38 AM         ? exc_invalid_op+0x17/0x70
> > 2/22/25 8:38 AM         ? asm_exc_invalid_op+0x1a/0x20
> > 2/22/25 8:38 AM         ? netfs_consume_read_data.isra.0+0x67f/0xb50
> [netfs]
> > 2/22/25 8:38 AM         ? netfs_consume_read_data.isra.0+0x48b/0xb50
> [netfs]
> > 2/22/25 8:38 AM         ? finish_task_switch.isra.0+0x97/0x2c0
> > 2/22/25 8:38 AM         netfs_read_subreq_terminated+0x2ab/0x3f0 [netfs]
> > 2/22/25 8:38 AM         process_one_work+0x177/0x330
> > 2/22/25 8:38 AM         worker_thread+0x252/0x390
> > 2/22/25 8:38 AM         ? __pfx_worker_thread+0x10/0x10
> > 2/22/25 8:38 AM         kthread+0xd2/0x100
> > 2/22/25 8:38 AM         ? __pfx_kthread+0x10/0x10
> > 2/22/25 8:38 AM         ret_from_fork+0x34/0x50
> > 2/22/25 8:38 AM         ? __pfx_kthread+0x10/0x10
> > 2/22/25 8:38 AM         ret_from_fork_asm+0x1a/0x30
> > 2/22/25 8:38 AM         </TASK>
> > 2/22/25 8:38 AM         ---[ end trace 0000000000000000 ]---
> > 2/22/25 8:38 AM netfs   R=0000003e[2] s=3b200000-3b7fffff
> > ctl=400000/600000/600000 sl=4
> > 2/22/25 8:38 AM netfs   folioq: orders=09090909
> > 2/22/25 8:38 AM BUG     kernel NULL pointer dereference, address:
> > 0000000000000000
> > 2/22/25 8:38 AM #PF     supervisor write access in kernel mode
> > 2/22/25 8:38 AM #PF     error_code(0x0002) - not-present page
> > 2/22/25 8:38 AM         PGD 0 P4D 0
> > 2/22/25 8:38 AM Oops    Oops: 0002 [#1] PREEMPT SMP NOPTI
> > 2/22/25 8:38 AM CPU     4 UID: 0 PID: 291 Comm: kworker/4:2 Tainted: G W
> OE
> > 6.12.3-amd64 #1 Debian 6.12.3-1
> > 2/22/25 8:38 AM Tainted [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> > 2/22/25 8:38 AM Hardware name   Gigabyte Technology Co., Ltd. B550M AORUS
> > PRO-P/B550M AORUS PRO-P, BIOS F13 07/08/2021
> > 2/22/25 8:38 AM Workqueue       cifsiod smb2_readv_worker [cifs]
> > 2/22/25 8:38 AM RIP     0010:netfs_consume_read_data.isra.0+0x2db/0xb50
> [netfs]
> > 2/22/25 8:38 AM Code    c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f e9 e9 86 41
> e8 8b
> > 6c 24 38 48 8b 44 24 28 48 89 f3 49 2b 5f 60 49 89 5f 78 4c 8b 6c e8 08
> <f0> 41
> > 80 4d 00 08 48 8b 44 24 30 48 8b 80 58 02 00 00 a9 00 00 00
> > 2/22/25 8:38 AM RSP     0018:ffffab8240b07dd8 EFLAGS: 00010206
> > 2/22/25 8:38 AM RAX     ffff9cbdf4cb7200 RBX: 0000000000600000 RCX:
> > 0000000000000027
> > 2/22/25 8:38 AM RDX     0000000000000000 RSI: 000000003b800000 RDI:
> > ffff9cc93ee21780
> > 2/22/25 8:38 AM RBP     0000000000000004 R08: 0000000000000000 R09:
> > ffffab8240b07c50
> > 2/22/25 8:38 AM R10     ffffffffab4b42c8 R11: 0000000000000003 R12:
> > ffff9cbba2abdaa8
> > 2/22/25 8:38 AM R13     0000000000000000 R14: 000000003b600000 R15:
> > ffff9cbd00ce2280
> > 2/22/25 8:38 AM FS      0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> > knlGS:0000000000000000
> > 2/22/25 8:38 AM CS      0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > 2/22/25 8:38 AM CR2     0000000000000000 CR3: 000000010a98e000 CR4:
> > 0000000000f50ef0
> > 2/22/25 8:38 AM PKRU    55555554
> > 2/22/25 8:38 AM Call Trace
> > 2/22/25 8:38 AM         <TASK>
> > 2/22/25 8:38 AM         ? __die_body.cold+0x19/0x27
> > 2/22/25 8:38 AM         ? page_fault_oops+0x15a/0x2d0
> > 2/22/25 8:38 AM         ? exc_page_fault+0x7e/0x180
> > 2/22/25 8:38 AM         ? asm_exc_page_fault+0x26/0x30
> > 2/22/25 8:38 AM         ? netfs_consume_read_data.isra.0+0x2db/0xb50
> [netfs]
> > 2/22/25 8:38 AM         ? finish_task_switch.isra.0+0x97/0x2c0
> > 2/22/25 8:38 AM         netfs_read_subreq_terminated+0x2ab/0x3f0 [netfs]
> > 2/22/25 8:38 AM         process_one_work+0x177/0x330
> > 2/22/25 8:38 AM         worker_thread+0x252/0x390
> > 2/22/25 8:38 AM         ? __pfx_worker_thread+0x10/0x10
> > 2/22/25 8:38 AM         kthread+0xd2/0x100
> > 2/22/25 8:38 AM         ? __pfx_kthread+0x10/0x10
> > 2/22/25 8:38 AM         ret_from_fork+0x34/0x50
> > 2/22/25 8:38 AM         ? __pfx_kthread+0x10/0x10
> > 2/22/25 8:38 AM         ret_from_fork_asm+0x1a/0x30
> > 2/22/25 8:38 AM         </TASK>
> > 2/22/25 8:38 AM Modules linked in       ccm nls_utf8 cifs cifs_arc4
> > nls_ucs2_utils cifs_md4 dns_resolver netfs snd_seq_dummy snd_hrtimer
> snd_seq
> > snd_seq_device xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
> nf_reject_ipv4
> > xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> > nf_defrag_ipv4 nf_tables bridge stp llc rfcomm cmac algif_hash
> algif_skcipher
> > af_alg overlay qrtr bnep amd_atl intel_rapl_msr intel_rapl_common sunrpc
> > edac_mce_amd binfmt_misc kvm_amd snd_hda_codec_realtek
> snd_hda_codec_generic
> > snd_hda_scodec_component kvm snd_hda_codec_hdmi nls_ascii
> crct10dif_pclmul
> > nls_cp437 snd_hda_intel crc32_pclmul ghash_clmulni_intel
> snd_intel_dspcfg btusb
> > snd_intel_sdw_acpi sha512_ssse3 vfat fat snd_hda_codec sha256_ssse3 btrtl
> > sha1_ssse3 btintel snd_hda_core aesni_intel btbcm ahci btmtk snd_hwdep
> gf128mul
> > r8169 crypto_simd libahci snd_pcm bluetooth cryptd realtek snd_timer
> libata
> > rapl sp5100_tco snd watchdog wmi_bmof mdio_devres gigabyte_wmi soundcore
> > i2c_piix4 pcspkr i2c_smbus rfkill libphy scsi_mod ccp k10temp
> > 2/22/25 8:38 AM         scsi_common button lm92 msr dm_mod parport_pc
> ppdev lp
> > parport efi_pstore configfs nfnetlink ip_tables x_tables autofs4 ext4
> mbcache
> > jbd2 razerkbd(OE) efivarfs raid10 raid456 libcrc32c crc32c_generic
> > async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq
> raid1
> > raid0 md_mod evdev joydev razermouse(OE) hid_generic usbhid hid amdgpu
> video
> > amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched
> drm_suballoc_helper
> > drm_buddy drm_display_helper xhci_pci xhci_hcd drm_kms_helper drm nvme
> usbcore
> > cec rc_core nvme_core crc32c_intel crc16 usb_common wmi gpio_amdpt
> gpio_generic
> > 2/22/25 8:38 AM CR2     0000000000000000
> > 2/22/25 8:38 AM         ---[ end trace 0000000000000000 ]---
> > 2/22/25 8:38 AM RIP     0010:netfs_consume_read_data.isra.0+0x2db/0xb50
> [netfs]
> > 2/22/25 8:38 AM Code    c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f e9 e9 86 41
> e8 8b
> > 6c 24 38 48 8b 44 24 28 48 89 f3 49 2b 5f 60 49 89 5f 78 4c 8b 6c e8 08
> <f0> 41
> > 80 4d 00 08 48 8b 44 24 30 48 8b 80 58 02 00 00 a9 00 00 00
> > 2/22/25 8:38 AM RSP     0018:ffffab8240b07dd8 EFLAGS: 00010206
> > 2/22/25 8:38 AM RAX     ffff9cbdf4cb7200 RBX: 0000000000600000 RCX:
> > 0000000000000027
> > 2/22/25 8:38 AM RDX     0000000000000000 RSI: 000000003b800000 RDI:
> > ffff9cc93ee21780
> > 2/22/25 8:38 AM RBP     0000000000000004 R08: 0000000000000000 R09:
> > ffffab8240b07c50
> > 2/22/25 8:38 AM R10     ffffffffab4b42c8 R11: 0000000000000003 R12:
> > ffff9cbba2abdaa8
> > 2/22/25 8:38 AM R13     0000000000000000 R14: 000000003b600000 R15:
> > ffff9cbd00ce2280
> > 2/22/25 8:38 AM FS      0000000000000000(0000) GS:ffff9cc93ee00000(0000)
> > knlGS:0000000000000000
> > 2/22/25 8:38 AM CS      0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > 2/22/25 8:38 AM CR2     0000000000000000 CR3: 000000010a98e000 CR4:
> > 0000000000f50ef0
> > 2/22/25 8:38 AM PKRU    55555554
> > 2/22/25 8:38 AM note    kworker/4:2[291] exited with irqs disabled
>
> Thanks for the report. This very much sounded at first like
>
> https://lore.kernel.org/all/CANT5p=qBwjBm-D8soFVVtswGEfmMtQXVW83=tnfutvyhefq...@mail.gmail.com/
> which has a fix c8b90d40d5bb ("netfs: Fix non-contiguous donation
> between completed reads") which OTOH has landed in 6.13-rc7 and
> 6.12.11.
>
> So can you confirm: with the most recent kernel in unstable you no longer
> get the above trace, but you still observe stalls when transferring a large
> file? In that case this might be orthogonal to the above.
>
> Regards,
> Salvatore
>
