Re: panic: ieee80211_set_link_state() calls rtm_80211info() from timeout context

Stefan Sperling Sun, 05 Dec 2021 09:23:08 -0800

On Sun, Dec 05, 2021 at 11:05:32AM -0600, Scott Cheloha wrote:
> > diff 0b61c8235787960f0010ef627ea5b2c6309a81f0 
> > de98c050ea709bdb8e26be40ab0cc82ef9afed80
> > blob - 7bb68194dd78417b06c59f81d1ebbff4165203d8
> > blob + 5b9a969258074fde29e21a33ac035cf170ec3b03
> > --- sys/net80211/ieee80211.c
> > +++ sys/net80211/ieee80211.c
> > @@ -193,6 +193,7 @@ ieee80211_ifattach(struct ifnet *ifp)
> >     if_addgroup(ifp, "wlan");
> >     ifp->if_priority = IF_WIRELESS_DEFAULT_PRIORITY;
> >  
> > +   task_set(&ic->ic_rtm_80211info_task, ieee80211_rtm_80211info_task, ic);
> >     ieee80211_set_link_state(ic, LINK_STATE_DOWN);
> >  
> >     timeout_set(&ic->ic_bgscan_timeout, ieee80211_bgscan_timeout, ifp);
> > @@ -203,6 +204,7 @@ ieee80211_ifdetach(struct ifnet *ifp)
> >  {
> >     struct ieee80211com *ic = (void *)ifp;
> >  
> > +   task_del(systq, &ic->ic_rtm_80211info_task);
> >     timeout_del(&ic->ic_bgscan_timeout);
> 
> Suppose the ic_bgscan_timeout timeout is running at the moment we're
> running ieee80211_ifdetach().  Ignore the kernel lock for the moment,
> think about the future.


There are many more places in the wireless stack that do such things.
But I am not interested in doing MP work on wireless myself. That is
simply asking too much on top of everything else I do.

> If we delete the task before we delete the timeout and the timeout
> then adds the task back onto the task queue, what happens?
> 
> My guess is you need to ensure the timeout is no longer running
> *before* you delete the task.  Can you do timeout_del_barrier()
> here?  See the attached patch.

Yes, sure we can. There are dozens of other timeouts in the wireless
stack and drivers, so your patch is a small step on a very long road.

> >     /*
> > blob - 447a2676bfb250b7f917206549679d6ae68de1f6
> > blob + 7e10fc1336067542c13d5607602e658ce2b3926b
> > --- sys/net80211/ieee80211_proto.c
> > +++ sys/net80211/ieee80211_proto.c
> > @@ -1288,6 +1288,31 @@ justcleanup:
> >  }
> >  
> >  void
> > +ieee80211_rtm_80211info_task(void *arg)
> > +{
> > +   struct ieee80211com *ic = arg;
> > +   struct ifnet *ifp = &ic->ic_if;
> > +   struct if_ieee80211_data ifie;
> > +   int s = splnet();
> > +
> > +   if (LINK_STATE_IS_UP(ifp->if_link_state)) {
> 
> Does this mean userspace can "miss" state transitions if the task runs
> and the state has already changed back to not-up?
> 
> I don't know whether this would matter in practice, but it would be a
> behavior change.

If this task doesn't find link state up for some reason then it is
running in an unexpected situation. What else should it do?

I think it makes sense for a scheduled task to check that its precondition
still applies. We have had several bugs in iwm(4) where tasks did their
thing even though their reason for being scheduled no longer applied.
In that case we ended up sending firmware commands that appeared our
of order from the device's point of view. I don't know if this really
matters here, it will depend on what userland expects.
 
> Unsure how `route monitor` exercises this path, but I've left it
> running, too.

That was just to see whether the routing message is still being generated.

Re: panic: ieee80211_set_link_state() calls rtm_80211info() from timeout context

Reply via email to