Hi Jakub, Alexander, Quoting Alexander Duyck (2020-12-19 02:41:08) > On Fri, Dec 18, 2020 at 4:30 PM Jakub Kicinski <k...@kernel.org> wrote: > > > > Two things: (a) is the datapath not exposed to a similar problem? > > __get_xps_queue_idx() uses dev->tc_num in a very similar fashion. > > I think we are shielded from this by the fact that if you change the > number of tc the Tx path has to be torn down and rebuilt since you are > normally changing the qdisc configuration anyway.
That's right. But there's nothing preventing users to call functions using the xps maps in between. There are a few functions being exposed. One (similar) example of that is another bug I reproduced, were the old and the new map in __netif_set_xps_queue do not have the same size, because num_tc was updated in between two calls to this function. The root cause is the same: the size of the map is not embedded in it and whenever we access it we can make an out of bound access. > > Should we perhaps make the "num_tcs" part of the XPS maps which is > > under RCU protection rather than accessing the netdev copy? Yes, I have a local patch (untested, still WIP) doing exactly that. The idea is we can't make sure a num_tc update will trigger an xps reallocation / reconfiguration of the map; but at least we can make sure the map won't be accessed out of bounds. It's a different issue though: not being able to access a map out of bound once it has been allocated whereas this patch wants to prevent an update of num_tc while the xps map allocation/setup is in progress. > So it looks like the issue is the fact that we really need to > synchronize netdev_reset_tc, netdev_set_tc_queue, and > netdev_set_num_tc with __netif_set_xps_queue. > > > (b) if we always take rtnl_lock, why have xps_map_mutex? Can we > > rearrange things so that xps_map_mutex is sufficient? > > It seems like the quick and dirty way would be to look at updating the > 3 functions I called out so that they were holding the xps_map_mutex > while they were updating things, and for __netif_set_xps_queue to > expand out the mutex to include the code starting at "if (dev->num_tc) > {". That should do the trick. The only downside is xps_map_mutex is only defined with CONFIG_XPS while netdev_set_num_tc is not, adding more ifdef to it. But that's probably a better compromise than taking the rtnl lock. Thanks for the review and suggestions! Antoine