> On Jun 27, 2025, at 11:02 PM, Bjoern A. Zeeb <bzeeb-li...@lists.zabbadoz.net> 
> wrote:
> 
> On Wed, 25 Jun 2025, Zhenlei Huang wrote:
> 
> Hi,
> 
> I applied olce's change from the review but it didn't make a difference
> on my arm64 and now on a tree with local changes (wifi bits, user space
> bits, etc).
> 
> Now I netbooted that tree on X86 hardware (an old Lenovo Laptop) and ran
> into something else (the same tree boots in a bhyve instance on a
> different machine from a local disk image).
> 
> At the end of if_addgroup() I had added the following for local
> debugging (really crude sorry):
> 
> ...
> 
> +       atomic_thread_fence_seq_cst();
>        IF_ADDR_WLOCK(ifp);
>        CK_STAILQ_INSERT_TAIL(&ifg->ifg_members, ifgm, ifgm_next);
>        CK_STAILQ_INSERT_TAIL(&ifp->if_groups, ifgl, ifgl_next);
>        IF_ADDR_WUNLOCK(ifp);
> 
>        IFNET_WUNLOCK();       // excl unlock
> 
>        if (new)
>                EVENTHANDLER_INVOKE(group_attach_event, ifg);
>        EVENTHANDLER_INVOKE(group_change_event, groupname);
> 
> +       IFNET_RLOCK();  // shared, panic
> +       CK_STAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) {
> +               if (bz_debug_groups)
> +                       if_printf(ifp, "XXXXXXXXXXXXXXXXXXXXXXXXXXX-BZ %s:%d: "
> +                           "ifgl %p, ifgl_group %p, ifg_group %p\n",
> +                           __func__, __LINE__, ifgl,
> +                           (ifgl != NULL) ? ifgl->ifgl_group : NULL,
> +                           (ifgl != NULL && ifgl->ifgl_group != NULL) ?
> +                           ifgl->ifgl_group->ifg_group : NULL);
> +       }
> +       IFNET_RUNLOCK();
> +
>        return (0);
> }
> 
> 
> 
> You see the annotation // shared?
> 
> I got a panic: excl->share with that.

Well, I applied the same patch as you and can reproduce that panic, but my
screen freezes, and the topmost stack frames are:
```
_sx_slock_int() at _sx_slock_int+0x64/frame 0xff....
if_addgroup() at .....
....
device_attach() at ...
...
root_bus_configure() at ...
configure() at ...
mi_startup() at ..
```

I've no idea what's wrong. From the disassembly, it appears the panic happens
just after the call to witness_checkorder().
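
For what it's worth, this is how I read the recursion check in
witness_checkorder() (sys/kern/subr_witness.c), reduced to a userland sketch.
The struct and function names below are made up for illustration, the lock
name "ifnet" is a placeholder, and the real code tracks far more state. If my
reading is right, the "excl->share" string is only reached when WITNESS finds
the lock already recorded as exclusively held by the current thread at the
moment of the shared acquire:

```c
/*
 * Userland model of the recursion check, NOT kernel code.
 * All identifiers here are hypothetical stand-ins.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HELD 8

struct held_lock {
	const char *name;	/* lock identity (placeholder string) */
	int exclusive;		/* 1 if recorded as exclusively held */
};

/* Stand-in for the per-thread list of held locks. */
struct lock_list {
	struct held_lock held[MAX_HELD];
	int count;
};

static struct held_lock *
find_held(struct lock_list *ll, const char *name)
{
	for (int i = 0; i < ll->count; i++)
		if (strcmp(ll->held[i].name, name) == 0)
			return (&ll->held[i]);
	return (NULL);
}

/* Record a successful acquire in the per-thread list. */
static void
record_acquire(struct lock_list *ll, const char *name, int exclusive)
{
	assert(ll->count < MAX_HELD);
	ll->held[ll->count].name = name;
	ll->held[ll->count].exclusive = exclusive;
	ll->count++;
}

/* Drop the record on release by overwriting it with the last entry. */
static void
record_release(struct lock_list *ll, const char *name)
{
	struct held_lock *hl = find_held(ll, name);

	if (hl != NULL)
		*hl = ll->held[--ll->count];
}

/*
 * Return NULL if the acquire is consistent with the recorded state,
 * otherwise the mismatch string the check would complain about.
 */
static const char *
check_acquire(struct lock_list *ll, const char *name, int exclusive)
{
	struct held_lock *hl = find_held(ll, name);

	if (hl == NULL)
		return (NULL);		/* not held: nothing to compare */
	if (hl->exclusive && !exclusive)
		return ("excl->share");
	if (!hl->exclusive && exclusive)
		return ("share->excl");
	return (NULL);			/* same-mode recursion */
}
```

In this model, a write lock that was released before the shared acquire makes
check_acquire() return NULL, while one still on the list returns
"excl->share". So the panic suggests the exclusive entry is still recorded for
the thread when IFNET_RLOCK() runs, which is the part I cannot explain.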

> 
> The excl. is the
>        IFNET_WLOCK();          // excl
> at the top of the function after the groupname check.
> But that gets unlocked before the event handler above
> so how can this happen?

I checked the event handlers and I don't think they are relevant.

> 
> Sadly I cannot even dump or anything as the keyboard is as dead
> as the rest of the laptop.  Have to power cycle it hard.
> 
> Apart from the debugging I added I have no local changes in sys/net
> in that tree.  sys/kern seems to have no relevant changes either
> (added a bus func, toggle link_elf_leak_locals default, and a printf
> got an extra argument to print %d error when modules fail to load).
> 
> 
> I'll try a plain main (hopefully tonight) on that machine too but I am
> really at a loss here now that it's also happening on X86 and only for me
> and always around the same code there...
> 
> I'll also try to boot this tree from a USB pen drive or something;  not
> that my problem comes in from netbooting...
> 

For debugging the ifgroup code, I think you can omit the IFNET_RLOCK(): at the
point where the group is being added to the interface, no other thread has an
opportunity to write to the interface concurrently.

> I'll keep you posted...
> /bz
> 
> -- 
> Bjoern A. Zeeb                                                     r15:7

Best regards,
Zhenlei

