Happy to provide either hardware or help testing a solution if needed!

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2015414

Title: 5.15.0-69 ice driver deadlocks with bonded e810 NICs

Status in linux package in Ubuntu: Confirmed

Bug description:

The ice driver in the 5.15.0-69 kernel deadlocks on rtnl_lock() when adding e810 NICs to a bond interface. Booting with `sysctl.hung_task_panic=1` and `sysctl.hung_task_all_cpu_backtrace=1` added to the kernel command-line shows (among lots of other output):

```
[ 244.980100] INFO: task kworker/6:1:182 blocked for more than 120 seconds.
[ 244.988431] Not tainted 5.15.0-69-generic #76-Ubuntu
[ 244.995279] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 245.004826] task:kworker/6:1 state:D stack: 0 pid: 182 ppid: 2 flags:0x00004000
[ 245.015017] Workqueue: events linkwatch_event
[ 245.020734] Call Trace:
[ 245.024144] <TASK>
[ 245.027137] __schedule+0x24e/0x590
[ 245.031848] schedule+0x69/0x110
[ 245.036228] schedule_preempt_disabled+0xe/0x20
[ 245.042066] __mutex_lock.constprop.0+0x267/0x490
[ 245.047993] __mutex_lock_slowpath+0x13/0x20
[ 245.053432] mutex_lock+0x38/0x50
[ 245.057714] rtnl_lock+0x15/0x20
[ 245.061901] linkwatch_event+0xe/0x30
[ 245.066571] process_one_work+0x228/0x3d0
[ 245.071607] worker_thread+0x53/0x420
[ 245.076260] ? process_one_work+0x3d0/0x3d0
[ 245.081493] kthread+0x127/0x150
[ 245.085592] ? set_kthread_struct+0x50/0x50
[ 245.090769] ret_from_fork+0x1f/0x30
[ 245.095266] </TASK>
```

and

```
[ 245.530629] INFO: task ifenslave:849 blocked for more than 121 seconds.
[ 245.540433] Not tainted 5.15.0-69-generic #76-Ubuntu
[ 245.549050] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 245.558960] task:ifenslave state:D stack: 0 pid: 849 ppid: 847 flags:0x00004002
[ 245.570930] Call Trace:
[ 245.576175] <TASK>
[ 245.581018] __schedule+0x24e/0x590
[ 245.587445] schedule+0x69/0x110
[ 245.593631] schedule_timeout+0x103/0x140
[ 245.600573] __wait_for_common+0xab/0x150
[ 245.607526] ? usleep_range_state+0x90/0x90
[ 245.614743] wait_for_completion+0x24/0x30
[ 245.621903] flush_workqueue+0x133/0x3e0
[ 245.628887] ib_cache_cleanup_one+0x21/0xf0 [ib_core]
[ 245.637083] __ib_unregister_device+0x79/0xc0 [ib_core]
[ 245.645398] ib_unregister_device+0x27/0x40 [ib_core]
[ 245.653541] irdma_ib_unregister_device+0x4b/0x70 [irdma]
[ 245.662105] irdma_remove+0x1f/0x70 [irdma]
[ 245.669446] auxiliary_bus_remove+0x1d/0x40
[ 245.676688] __device_release_driver+0x1a8/0x2a0
[ 245.684241] device_release_driver+0x29/0x40
[ 245.691416] bus_remove_device+0xde/0x150
[ 245.698396] device_del+0x19c/0x400
[ **712178] ice_lag_link.isra.0+0xdd/0xf0 [ice]
m] (3 of 5) A start job is runni[ 245.720683] ice_lag_changeupper_event+0xe1/0x130 [ice]
ng for…rk interfaces (3min 47s[ 245.729739] ice_lag_event_handler+0x5b/0x150 [ice]
 / 5min 3s) [ 245.738525] raw_notifier_call_chain+0x46/0x60
[ 245.746006] call_netdevice_notifiers_info+0x52/0xa0
[ 245.754123] __netdev_upper_dev_link+0x1b7/0x310
[ 245.761658] netdev_master_upper_dev_link+0x3e/0x60
[ 245.769627] bond_enslave+0xc3a/0x1720 [bonding]
[ 245.777398] ? sscanf+0x4e/0x70
[ 245.783375] bond_option_slaves_set+0xca/0x170 [bonding]
[ 245.791738] __bond_opt_set+0xbd/0x1a0 [bonding]
[ 245.799505] __bond_opt_set_notify+0x30/0xb0 [bonding]
[ 245.807860] bond_opt_tryset_rtnl+0x56/0xa0 [bonding]
[ 245.816062] bonding_sysfs_store_option+0x52/0xa0 [bonding]
[ 245.824750] dev_attr_store+0x14/0x30
[ 245.831443] sysfs_kf_write+0x3b/0x50
[ 245.837979] kernfs_fop_write_iter+0x138/0x1c0
[ 245.845469] new_sync_write+0x111/0x1a0
[ 245.852210] vfs_write+0x1d5/0x270
[ 245.858429] ksys_write+0x67/0xf0
[ 245.864624] __x64_sys_write+0x19/0x20
[ 245.871288] do_syscall_64+0x59/0xc0
[ 245.877715] ? handle_mm_fault+0xd8/0x2c0
[ 245.884566] ? do_user_addr_fault+0x1e7/0x670
[ 245.891990] ? filp_close+0x60/0x70
[ 245.898452] ? exit_to_user_mode_prepare+0x37/0xb0
[ 245.906272] ? irqentry_exit_to_user_mode+0x9/0x20
[ 245.914042] ? irqentry_exit+0x1d/0x30
[ 245.920703] ? exc_page_fault+0x89/0x170
[ 245.927555] entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 245.935763] RIP: 0033:0x7f1e86855a37
[ 245.942153] RSP: 002b:00007fff8da477a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 245.953034] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 00007f1e86855a37
[ 245.963554] RDX: 000000000000000a RSI: 0000556eff580510 RDI: 0000000000000001
[ 245.972468] RBP: 0000556eff580510 R08: 0000556eff582c5a R09: 0000000000000000
[ 245.983048] R10: 0000556eff582c59 R11: 0000000000000246 R12: 0000000000000001
[ 245.993402] R13: 000000000000000a R14: 0000000000000000 R15: 0000000000000000
[ 246.001700] </TASK>
```

This appears consistent with the bug fixed by mainline commit 248401cb2c4612d83eb0c352ee8103b78b8eb365 (commit 87b9ac7bd301f53b122224fc8eddb1f4045e3f2c in the 5.15.y stable tree). The 5.15.0-67 kernel does not exhibit the problem. The 5.15.0-68 kernel apparently included the "RDMA/irdma: Report the correct link speed" patch, which is listed in one of the "Fixes" tags of the above commit, so I suspect that patch is the culprit and that importing the above commit should resolve the problem.

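
For anyone triaging a similar hang, the following is a minimal sketch (not part of the original report) of how the two hung-task reports above could be summarized from a saved console log. The helper names `blocked_tasks` and `summarize` are hypothetical, and the classification is a rough heuristic: it only checks whether `rtnl_lock` or `flush_workqueue` appears among a task's reliable frames. The sample lines are an abbreviated excerpt of the traces quoted above.

```python
import re

# One symbolized stack frame, e.g. "[ 245.057714] rtnl_lock+0x15/0x20".
# Frames prefixed with "?" are unreliable stack leftovers and are skipped.
FRAME = re.compile(r"\]\s+(?!\?)([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")
# Header line of a hung-task report; captures the task name and the pid.
TASK = re.compile(r"INFO: task (\S+):(\d+) blocked for more than")


def blocked_tasks(log):
    """Map each blocked task name to its reliable frames, innermost first."""
    tasks = {}
    current = None
    for line in log.splitlines():
        header = TASK.search(line)
        if header:
            current = header.group(1)
            tasks[current] = []
            continue
        frame = FRAME.search(line)
        if frame and current is not None:
            tasks[current].append(frame.group(1))
    return tasks


def summarize(tasks):
    """Heuristic, assumption-laden summary of what each task is waiting on."""
    lines = []
    for name, frames in tasks.items():
        if "rtnl_lock" in frames:
            lines.append(f"{name}: waiting to acquire rtnl")
        elif "flush_workqueue" in frames:
            lines.append(f"{name}: waiting for a workqueue flush")
    return lines


# Abbreviated excerpt of the traces quoted in this report.
SAMPLE = """\
[ 244.980100] INFO: task kworker/6:1:182 blocked for more than 120 seconds.
[ 245.042066] __mutex_lock.constprop.0+0x267/0x490
[ 245.057714] rtnl_lock+0x15/0x20
[ 245.061901] linkwatch_event+0xe/0x30
[ 245.076260] ? process_one_work+0x3d0/0x3d0
[ 245.530629] INFO: task ifenslave:849 blocked for more than 121 seconds.
[ 245.614743] wait_for_completion+0x24/0x30
[ 245.621903] flush_workqueue+0x133/0x3e0
[ 245.653541] irdma_ib_unregister_device+0x4b/0x70 [irdma]
[ 245.807860] bond_opt_tryset_rtnl+0x56/0xa0 [bonding]
"""

if __name__ == "__main__":
    for line in summarize(blocked_tasks(SAMPLE)):
        print(line)
```

Seeing one task waiting for rtnl while another sits in a workqueue flush underneath a bonding sysfs write is only a hint; confirming the lock dependency still requires reading the full traces.
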
ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-5.15.0-67-generic 5.15.0-67.74
ProcVersionSignature: Ubuntu 5.15.0-67.74-generic 5.15.85
Uname: Linux 5.15.0-67-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Apr 5 22:47 seq
 crw-rw---- 1 root audio 116, 33 Apr 5 22:47 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu82.3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
CasperMD5CheckResult: unknown
Date: Wed Apr 5 22:48:03 2023
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 004: ID 0b1f:03ee Insyde Software Corp. RNDIS/Ethernet Gadget
 Bus 001 Device 003: ID 0557:9241 ATEN International Co., Ltd SMCI HID KM
 Bus 001 Device 002: ID 1d6b:0107 Linux Foundation USB Virtual Hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Supermicro SYS-510T-MR-EI018
PciMultimedia:
ProcEnviron:
 TERM=vt220
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-67-generic root=UUID=0b21ae48-6315-4193-8c24-fc224a18170f ro console=tty0 console=ttyS1,115200n8 modprobe.blacklist=igb modprobe.blacklist=rndis_host
RelatedPackageVersions:
 linux-restricted-modules-5.15.0-67-generic N/A
 linux-backports-modules-5.15.0-67-generic N/A
 linux-firmware 20220329.git681281e4-0ubuntu3.9
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/23/2022
dmi.bios.release: 5.22
dmi.bios.vendor: American Megatrends International, LLC.
dmi.bios.version: 1.2
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: X12STH-SYS
dmi.board.vendor: Supermicro
dmi.board.version: 1.01
dmi.chassis.asset.tag: To be filled by O.E.M.
dmi.chassis.type: 1
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 0123456789
dmi.modalias: dmi:bvnAmericanMegatrendsInternational,LLC.:bvr1.2:bd06/23/2022:br5.22:svnSupermicro:pnSYS-510T-MR-EI018:pvr0123456789:rvnSupermicro:rnX12STH-SYS:rvr1.01:cvnSupermicro:ct1:cvr0123456789:skuTobefilledbyO.E.M.:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: SYS-510T-MR-EI018
dmi.product.sku: To be filled by O.E.M.
dmi.product.version: 0123456789
dmi.sys.vendor: Supermicro

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2015414/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp