On Thu, 18 Jul 2024, Kevin Oberman wrote:
I attempted to update my development system to today's head. After
installing the kernel, etcupdate -p, reboot, installworld, etcupdate,
check-old, delete-old, reboot, the system panicked when it tried
to start the network.
System is a T16-Gen1 with the Alder Lake wifi. When starting the network,
it panics with:
Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 12
fault virtual address = 0xc
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8359afd3
stack pointer = 0x28:0xfffffe00f1341c80
frame pointer = 0x28:0xfffffe00f1341d00
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (linuxkpi_short_wq_1)
rdi: fffffe016695c4f8 rsi: fffffe00f1341c48 rdx: ffffffff8118971b
rcx: 0000000000000000 r8: 0000000000000001 r9: ffffffffffffffff
rax: 0000000000000000 rbx: fffffe0167886e80 rbp: fffffe00f1341d00
r10: 0000000000010000 r11: 0000000000000001 r12: fffffe0167887478
r13: 0000000000000000 r14: fffffe016695c4c8 r15: fffff80003bec540
I can supply the full core.txt file. The backtrace shows the following
items:
iwl_mvm_bt_notif_iterator() at iwl_mvm_bt_notif_iterator+0xf3/frame 0xfffffe00f1341d00
linuxkpi_ieee80211_iterate_interfaces() at linuxkpi_ieee80211_iterate_interfaces+0x84/frame 0xfffffe00f1341d40
iwl_mvm_bt_coex_notif_handle() at iwl_mvm_bt_coex_notif_handle+0x7c/frame 0xfffffe00f1341da0
iwl_mvm_async_handlers_wk() at iwl_mvm_async_handlers_wk+0x110/frame 0xfffffe00f1341df0
Should I open a ticket or add to an existing one? I didn't see one with a
quick look.
No, I am not aware of an existing one either; have you hit this more than once?
This looks like a NULL dereference somewhere in iwl_mvm_bt_notif_per_link(),
if my lldb is to be believed:
   280      link_conf = rcu_dereference(vif->link_conf[link_id]);
   281      /* This can happen due to races: if we receive the notification
   282       * and have the mutex held, while mac80211 is stuck on our mutex
   283       * in the middle of removing the link.
   284       */
   285      if (!link_conf)
   286              return;
   287
   288      chanctx_conf = rcu_dereference(link_conf->chanctx_conf);
   289
   290      /* If channel context is invalid or not on 2.4GHz .. */
   291      if ((!chanctx_conf ||
   292           chanctx_conf->def.chan->band != NL80211_BAND_2GHZ)) {
...
It seems chanctx_conf->def.chan was NULL: the fault address 0xc would be the
offset of ->band within the channel structure.
That means this likely happened right before the first SCAN->AUTH happened.
It seems we need to initialize the def.chan on vif creation as well.
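
To illustrate the reasoning above, here is a minimal userland sketch. It
assumes a mock channel layout in which three 32-bit fields precede band; all
mock_* names are hypothetical stand-ins, not the real LinuxKPI or iwlwifi
definitions. The extra NULL check is only a defensive illustration and is not
the fix suggested above (initializing def.chan on vif creation).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the real LinuxKPI/mac80211 definitions. */
enum mock_band { MOCK_BAND_2GHZ = 0, MOCK_BAND_5GHZ = 1 };

struct mock_channel {
	uint32_t	hw_value;
	uint32_t	center_freq;
	uint32_t	flags;
	enum mock_band	band;	/* assumed to follow three 32-bit fields */
};

struct mock_chandef {
	struct mock_channel *chan;
};

struct mock_chanctx_conf {
	struct mock_chandef def;
};

/* Address a read of ->band would fault at if the chan pointer were NULL. */
size_t
mock_band_offset(void)
{
	return (offsetof(struct mock_channel, band));
}

/*
 * Returns true when BT coex handling should be skipped: the channel
 * context is missing, has no channel assigned yet, or is not on 2.4GHz.
 */
bool
mock_skip_bt_coex(const struct mock_chanctx_conf *conf)
{
	if (conf == NULL || conf->def.chan == NULL)
		return (true);
	return (conf->def.chan->band != MOCK_BAND_2GHZ);
}
```

With this assumed layout mock_band_offset() evaluates to 0xc, which would
match the reported fault virtual address for a NULL chan pointer.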
For tracking purposes, yes, please file a PR; for simplicity feel free
to simply link to this mail in the archives.
/bz
--
Bjoern A. Zeeb r15:7