Public bug reported:

Steps to reproduce:
1. Power on the NVMe-oF enclosure.
2. Discover and connect the drives.
3. Create two zpools, one from the even-numbered drives and one from the odd-numbered drives.
4. Start IO on both pools.

(A minimal command-line sketch of these steps is included after the dmesg log below.)

Observations:

1. A call trace is observed while running ZFS IO. "failed to send request -5" is logged and the affected drives enter a continuous reconnect loop (a monitoring sketch is also included after the log).
2. The issue is seen on Ubuntu 24.04.1 with kernel 6.8.0-49-generic.

From the kernel ring buffer logs (dmesg):

[Tue Feb 11 05:25:55 2025] ------------[ cut here ]------------
[Tue Feb 11 05:25:55 2025] WARNING: CPU: 10 PID: 114873 at net/core/skbuff.c:7006 skb_splice_from_iter+0x139/0x370
[Tue Feb 11 05:25:55 2025] Modules linked in: nvme_tcp nvme_keyring nvme xt_tcpudp nft_compat nf_tables qrtr cfg80211 binfmt_misc zfs(PO) spl(O) intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel dell_wmi dell_smbios dell_wmi_descriptor kvm video mgag200 ledtrig_audio irqbypass sparse_keymap dcdbas joydev input_leds mei_me i2c_algo_bit mei acpi_power_meter rapl intel_cstate lpc_ich ipmi_ssif mac_hid acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mxm_wmi sch_fq_codel dm_multipath nvme_fabrics msr nvme_core nvme_auth efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 mlx5_ib ib_uverbs macsec ib_core mlx5_core crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 mlxfw psample tls pci_hyperv_intf tg3 pata_acpi wmi hid_generic usbhid hid aesni_intel
[Tue Feb 11 05:25:55 2025] crypto_simd cryptd
[Tue Feb 11 05:25:55 2025] CPU: 10 PID: 114873 Comm: kworker/10:2H Tainted: P O 6.8.0-49-generic #49-Ubuntu
[Tue Feb 11 05:25:55 2025] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.7.1 01/22/2018
[Tue Feb 11 05:25:55 2025] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
[Tue Feb 11 05:25:55 2025] RIP: 0010:skb_splice_from_iter+0x139/0x370
[Tue Feb 11 05:25:55 2025] Code: 39 e1 48 8b 53 08 49 0f 47 cc 49 89 cd f6 c2 01 0f 85 c0 01 00 00 66 90 48 89 da 48 8b 12 80 e6 08 0f 84 8e 00 00 00 4d 89 fe <0f> 0b 49 c7 c0 fb ff ff ff 48 8b 85 68 ff ff ff 41 01 46 70 41 01
[Tue Feb 11 05:25:55 2025] RSP: 0018:ffffb216769d7a38 EFLAGS: 00010202
[Tue Feb 11 05:25:55 2025] RAX: 0000000000000000 RBX: fffff74820347000 RCX: 0000000000001000
[Tue Feb 11 05:25:55 2025] RDX: 0017ffffc0000840 RSI: 0000000000000000 RDI: 0000000000000000
[Tue Feb 11 05:25:55 2025] RBP: ffffb216769d7ae0 R08: 0000000000000000 R09: 0000000000000000
[Tue Feb 11 05:25:55 2025] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000001000
[Tue Feb 11 05:25:55 2025] R13: 0000000000001000 R14: ffff9c22fccbfe00 R15: ffff9c22fccbfe00
[Tue Feb 11 05:25:55 2025] FS: 0000000000000000(0000) GS:ffff9c347f680000(0000) knlGS:0000000000000000
[Tue Feb 11 05:25:55 2025] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Tue Feb 11 05:25:55 2025] CR2: 00007d79ae7af000 CR3: 0000002a226e4001 CR4: 00000000003706f0
[Tue Feb 11 05:25:55 2025] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Tue Feb 11 05:25:55 2025] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Tue Feb 11 05:25:55 2025] Call Trace:
[Tue Feb 11 05:25:55 2025] <TASK>
[Tue Feb 11 05:25:55 2025] ? show_regs+0x6d/0x80
[Tue Feb 11 05:25:55 2025] ? __warn+0x89/0x160
[Tue Feb 11 05:25:55 2025] ? skb_splice_from_iter+0x139/0x370
[Tue Feb 11 05:25:55 2025] ? report_bug+0x17e/0x1b0
[Tue Feb 11 05:25:55 2025] ? handle_bug+0x51/0xa0
[Tue Feb 11 05:25:55 2025] ? exc_invalid_op+0x18/0x80
[Tue Feb 11 05:25:55 2025] ? asm_exc_invalid_op+0x1b/0x20
[Tue Feb 11 05:25:55 2025] ? skb_splice_from_iter+0x139/0x370
[Tue Feb 11 05:25:55 2025] ? skb_splice_from_iter+0xd5/0x370
[Tue Feb 11 05:25:55 2025] tcp_sendmsg_locked+0x352/0xd70
[Tue Feb 11 05:25:55 2025] ? tcp_push+0x159/0x190
[Tue Feb 11 05:25:55 2025] ? tcp_sendmsg_locked+0x9c4/0xd70
[Tue Feb 11 05:25:55 2025] tcp_sendmsg+0x2c/0x50
[Tue Feb 11 05:25:55 2025] inet_sendmsg+0x42/0x80
[Tue Feb 11 05:25:55 2025] sock_sendmsg+0x118/0x150
[Tue Feb 11 05:25:55 2025] nvme_tcp_try_send_data+0x16e/0x4d0 [nvme_tcp]
[Tue Feb 11 05:25:55 2025] nvme_tcp_try_send+0x23c/0x300 [nvme_tcp]
[Tue Feb 11 05:25:55 2025] nvme_tcp_io_work+0x40/0xe0 [nvme_tcp]
[Tue Feb 11 05:25:55 2025] process_one_work+0x178/0x350
[Tue Feb 11 05:25:55 2025] worker_thread+0x306/0x440
[Tue Feb 11 05:25:55 2025] ? __pfx_worker_thread+0x10/0x10
[Tue Feb 11 05:25:55 2025] kthread+0xf2/0x120
[Tue Feb 11 05:25:55 2025] ? __pfx_kthread+0x10/0x10
[Tue Feb 11 05:25:55 2025] ret_from_fork+0x47/0x70
[Tue Feb 11 05:25:55 2025] ? __pfx_kthread+0x10/0x10
[Tue Feb 11 05:25:55 2025] ret_from_fork_asm+0x1b/0x30
[Tue Feb 11 05:25:55 2025] </TASK>
[Tue Feb 11 05:25:55 2025] ---[ end trace 0000000000000000 ]---
[Tue Feb 11 05:25:55 2025] nvme nvme8: failed to send request -5
[Tue Feb 11 05:26:25 2025] nvme nvme8: I/O tag 5 (9005) type 4 opcode 0x2 (I/O Cmd) QID 11 timeout
[Tue Feb 11 05:26:25 2025] nvme nvme8: starting error recovery
[Tue Feb 11 05:26:25 2025] nvme nvme8: I/O tag 6 (c006) type 4 opcode 0x1 (I/O Cmd) QID 11 timeout
[Tue Feb 11 05:26:25 2025] nvme nvme8: I/O tag 7 (d007) type 4 opcode 0x2 (I/O Cmd) QID 11 timeout
[Tue Feb 11 05:26:25 2025] nvme nvme8: I/O tag 11 (700b) type 4 opcode 0x1 (I/O Cmd) QID 11 timeout
[Tue Feb 11 05:26:25 2025] nvme nvme8: I/O tag 12 (300c) type 4 opcode 0x1 (I/O Cmd) QID 11 timeout
[Tue Feb 11 05:26:25 2025] nvme nvme32: failed to send request -5
[Tue Feb 11 05:26:25 2025] nvme nvme8: Reconnecting in 10 seconds...
[Tue Feb 11 05:26:25 2025] nvme nvme32: starting error recovery
[Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:25 2025] nvme nvme32: Reconnecting in 10 seconds...
[Tue Feb 11 05:26:36 2025] nvme nvme8: queue_size 128 > ctrl sqsize 16, clamping down
[Tue Feb 11 05:26:36 2025] nvme nvme8: creating 16 I/O queues.
[Tue Feb 11 05:26:36 2025] nvme nvme32: queue_size 128 > ctrl sqsize 16, clamping down
[Tue Feb 11 05:26:36 2025] nvme nvme32: creating 16 I/O queues.
[Tue Feb 11 05:26:36 2025] nvme nvme8: mapped 16/0/0 default/read/poll queues.
[Tue Feb 11 05:26:36 2025] nvme nvme8: Successfully reconnected (1 attempt)
[Tue Feb 11 05:26:36 2025] nvme nvme8: failed to send request -5
[Tue Feb 11 05:26:36 2025] nvme nvme32: mapped 16/0/0 default/read/poll queues.
[Tue Feb 11 05:26:36 2025] nvme nvme8: starting error recovery
[Tue Feb 11 05:26:36 2025] nvme_ns_head_submit_bio: 55 callbacks suppressed
[Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
[Tue Feb 11 05:26:36 2025] nvme nvme32: Successfully reconnected (1 attempt)
[Tue Feb 11 05:26:36 2025] nvme nvme8: reading non-mdts-limits failed: -4

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New
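For reference, a minimal command-line sketch of reproduction steps 2-4
above (not from the original report). It assumes an NVMe/TCP setup
driven with nvme-cli plus the standard ZFS tools; the target address
and port, device names, pool names, and fio parameters are placeholders
chosen for illustration:

  # Discover and connect all NVMe-oF targets over TCP
  # (address and port are placeholders).
  nvme discover -t tcp -a 192.168.1.100 -s 4420
  nvme connect-all -t tcp -a 192.168.1.100 -s 4420

  # Build one pool from the even-numbered drives and one from the
  # odd-numbered drives (device names are examples).
  zpool create evenpool nvme0n1 nvme2n1 nvme4n1 nvme6n1
  zpool create oddpool  nvme1n1 nvme3n1 nvme5n1 nvme7n1

  # Drive sustained IO against both pools, e.g. with fio (pools are
  # mounted at /evenpool and /oddpool by default).
  fio --name=evenio --directory=/evenpool --rw=randwrite --bs=128k \
      --size=4G --numjobs=4 --time_based --runtime=600 &
  fio --name=oddio --directory=/oddpool --rw=randwrite --bs=128k \
      --size=4G --numjobs=4 --time_based --runtime=600 &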
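A similar sketch for watching the failure signature while the IO runs
(the -5 in "failed to send request -5" is -EIO). nvme8 is the
controller instance seen in this log, and the sysfs state attribute is
the one NVMe fabrics controllers normally expose:

  # Follow the kernel log for the send failure and the reconnect loop.
  dmesg -w | grep -E 'failed to send request|starting error recovery|Reconnecting|no usable path'

  # Watch the controller cycle through live/connecting/resetting
  # during error recovery.
  watch -n 1 cat /sys/class/nvme/nvme8/state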
https://bugs.launchpad.net/bugs/2098056

Title:
  RAID getting corrupted while running ZFS IOs