On Thu, 18 Jul 2024, Kevin Oberman wrote:

I attempted to update my development system to today's head. After
installing the kernel, etcupdate -p, reboot, installworld, etcupdate,
check-old, delete-old, reboot, the system panicked when it tried
to start the network.

System is a T16-Gen1 with the Alder Lake wifi. When starting the network,
it panics with:
Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 12
fault virtual address   = 0xc
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8359afd3
stack pointer           = 0x28:0xfffffe00f1341c80
frame pointer           = 0x28:0xfffffe00f1341d00
code segment            = base 0x0, limit 0xfffff, type 0x1b
                       = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (linuxkpi_short_wq_1)
rdi: fffffe016695c4f8 rsi: fffffe00f1341c48 rdx: ffffffff8118971b
rcx: 0000000000000000  r8: 0000000000000001  r9: ffffffffffffffff
rax: 0000000000000000 rbx: fffffe0167886e80 rbp: fffffe00f1341d00
r10: 0000000000010000 r11: 0000000000000001 r12: fffffe0167887478
r13: 0000000000000000 r14: fffffe016695c4c8 r15: fffff80003bec540

I can supply the full core.txt file. The backtrace shows the following
items:
iwl_mvm_bt_notif_iterator() at iwl_mvm_bt_notif_iterator+0xf3/frame 0xfffffe00f1341d00
linuxkpi_ieee80211_iterate_interfaces() at linuxkpi_ieee80211_iterate_interfaces+0x84/frame 0xfffffe00f1341d40
iwl_mvm_bt_coex_notif_handle() at iwl_mvm_bt_coex_notif_handle+0x7c/frame 0xfffffe00f1341da0
iwl_mvm_async_handlers_wk() at iwl_mvm_async_handlers_wk+0x110/frame 0xfffffe00f1341df0

Should I open a ticket or add to an existing one? I didn't see one with a
quick look.

No, I am not aware of one either;  have you hit this more than once?

This looks like a NULL dereference somewhere in iwl_mvm_bt_notif_per_link(),
if my lldb resolves the address the same way yours would...


        link_conf = rcu_dereference(vif->link_conf[link_id]);
        /* This can happen due to races: if we receive the notification
         * and have the mutex held, while mac80211 is stuck on our mutex
         * in the middle of removing the link.
         */
        if (!link_conf)
                return;

        chanctx_conf = rcu_dereference(link_conf->chanctx_conf);

        /* If channel context is invalid or not on 2.4GHz .. */
        if ((!chanctx_conf ||
             chanctx_conf->def.chan->band != NL80211_BAND_2GHZ)) {
...

It seems chanctx_conf->def.chan was NULL: the fault virtual address is 0xc,
which matches reading the band field at offset 0xc through a NULL chan pointer.

That means this likely happened right before the first SCAN->AUTH happened.
It seems we need to initialize the def.chan on vif creation as well.
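
To illustrate why a NULL def.chan shows up as a fault at 0xc, here is a
minimal stand-alone sketch. The struct layouts and every name ending in
_sketch are hypothetical and chosen only so that band sits at offset 0xc;
they are not the real LinuxKPI or iwlwifi definitions. The guarded check is
likewise only an illustration, not the fix suggested above (initializing
def.chan on vif creation).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the real LinuxKPI definitions. */
enum band_sketch { BAND_2GHZ_SKETCH, BAND_5GHZ_SKETCH };

struct channel_sketch {
	uint32_t center_freq;		/* offset 0x0 */
	uint32_t hw_value;		/* offset 0x4 */
	uint32_t flags;			/* offset 0x8 */
	enum band_sketch band;		/* offset 0xc (assumed 4-byte enum) */
};

struct chandef_sketch { struct channel_sketch *chan; };
struct chanctx_sketch { struct chandef_sketch def; };

/*
 * Address the CPU would touch when reading chan->band.
 * With chan == NULL this is just the field offset, which is what
 * shows up as "fault virtual address" in the trap output.
 */
static uintptr_t
band_read_address(const struct channel_sketch *chan)
{
	return ((uintptr_t)chan + offsetof(struct channel_sketch, band));
}

/*
 * Illustrative guarded form of the check: treat a missing channel
 * the same way as a missing channel context.
 */
static int
not_on_2ghz(const struct chanctx_sketch *ctx)
{
	return (ctx == NULL || ctx->def.chan == NULL ||
	    ctx->def.chan->band != BAND_2GHZ_SKETCH);
}
```

Under these assumed layouts, band_read_address(NULL) evaluates to 0xc,
matching the fault address in the report, and not_on_2ghz() returns true
instead of faulting when def.chan has not been set yet.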

For tracking purposes, yes, please file a PR;  for simplicity feel free
to simply link to this mail in the archives.

/bz

--
Bjoern A. Zeeb                                                     r15:7
