On Mon, Dec 21, 2020 at 11:36 AM Antoine Tenart <aten...@kernel.org> wrote:
>
> Two race conditions can be triggered in xps, resulting in various oops
> and invalid memory accesses:
>
> 1. Calling netdev_set_num_tc while netif_set_xps_queue:
>
>    - netdev_set_num_tc sets dev->tc_num.
>
>    - netif_set_xps_queue uses dev->tc_num as one of the parameters to
>      compute the size of new_dev_maps when allocating it. dev->tc_num is
>      also used to access the map, and the compiler may generate code to
>      retrieve this field multiple times in the function.
>
>    If new_dev_maps is allocated using dev->tc_num and then dev->tc_num
>    is set to a higher value through netdev_set_num_tc, later accesses to
>    new_dev_maps in netif_set_xps_queue could lead to accessing memory
>    outside of new_dev_maps; triggering an oops.
>
>    One way of triggering this is to set an iface up (for which the
>    driver uses netdev_set_num_tc in the open path, such as bnx2x) and
>    writing to xps_cpus or xps_rxqs in a concurrent thread. With the
>    right timing an oops is triggered.
>
> 2. Calling netif_set_xps_queue while netdev_set_num_tc is running:
>
>    2.1. netdev_set_num_tc starts by resetting the xps queues,
>         dev->tc_num isn't updated yet.
>
>    2.2. netif_set_xps_queue is called, setting up the maps with the
>         *old* dev->num_tc.
>
>    2.3. dev->tc_num is updated.
>
>    2.3. Later accesses to the map leads to out of bound accesses and
>         oops.
>
>    A similar issue can be found with netdev_reset_tc.
>
>    The fix can't be to only link the size of the maps to them, as
>    invalid configuration could still occur. The reset then set logic in
>    both netdev_set_num_tc and netdev_reset_tc must be protected by a
>    lock.
>
> Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
> and netdev_reset_tc should be mutually exclusive.
>
> This patch fixes those races by:
>
> - Reworking netif_set_xps_queue by moving the xps_map_mutex up so the
>   access of dev->num_tc is done under the lock.
>
> - Using xps_map_mutex in both netdev_set_num_tc and netdev_reset_tc for
>   the reset and set logic:
>
>   + As xps_map_mutex was taken in the reset path, netif_reset_xps_queues
>     had to be reworked to offer an unlocked version (as well as
>     netdev_unbind_all_sb_channels which calls it).
>
>   + cpus_read_lock was taken in the reset path as well, and is always
>     taken before xps_map_mutex. It had to be moved out of the unlocked
>     version as well.
>
>   This is why the patch is a little bit longer, and moves
>   netdev_unbind_sb_channel up in the file.
>
> Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
> Signed-off-by: Antoine Tenart <aten...@kernel.org>
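
For reference, the out-of-bounds access described in case 1 comes down to
simple index arithmetic. The following is only a userspace sketch, not
kernel code: the flat "id * num_tc + tc" layout and all model_* names are
assumptions made for illustration.

```c
#include <stddef.h>

/* Assumed layout: one entry per (id, tc) pair, stored at id * num_tc + tc. */
static size_t model_map_len(int nr_ids, int num_tc)
{
	return (size_t)nr_ids * num_tc;
}

static size_t model_map_index(int id, int num_tc, int tc)
{
	return (size_t)id * num_tc + tc;
}

/*
 * The map is sized with old_num_tc, then num_tc changes to new_num_tc
 * before the map is walked. Returns 1 if the last index walked would
 * fall outside the allocation, 0 otherwise.
 */
static int model_race_overflows(int old_num_tc, int new_num_tc, int nr_ids)
{
	size_t len = model_map_len(nr_ids, old_num_tc);
	size_t last = model_map_index(nr_ids - 1, new_num_tc, new_num_tc - 1);

	return last >= len;
}
```

With 4 ids, a map sized for 2 classes holds 8 entries, while walking it
with 4 classes reaches index 15. In this model the walk overflows exactly
when num_tc grows between the allocation and the access.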

Looking over this patch, it seems kind of obvious that extending the
xps_map_mutex is making things far more complex than they need to be.
Applying the rtnl_mutex would probably be much simpler. Although, as I
think you have already discovered, we need to apply it to both the store
and show handlers for this interface. In addition, we probably need to
perform similar locking around traffic_class_show to prevent it from
generating a similar error.
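
To make the suggestion concrete, here is a rough userspace model of the
idea: one lock covers the num_tc update, the store path and the show path,
so the map size and the map indices always come from the same num_tc. It
is a sketch under stated assumptions, not the actual fix: a pthread mutex
stands in for rtnl_mutex, and struct model_dev and the model_* functions
are hypothetical simplifications of the real structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Userspace stand-in for rtnl_mutex; not the kernel's lock. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical, simplified device: one flat map of nr_ids * num_tc ints. */
struct model_dev {
	int num_tc;
	int nr_ids;
	int *map;
	size_t map_len;
};

/* Models netdev_set_num_tc(): reset the map and update num_tc as one step. */
static void model_set_num_tc(struct model_dev *dev, int num_tc)
{
	pthread_mutex_lock(&model_rtnl);
	free(dev->map);
	dev->map = NULL;
	dev->map_len = 0;
	dev->num_tc = num_tc;
	pthread_mutex_unlock(&model_rtnl);
}

/* Models the store path: size and index both use the locked num_tc. */
static int model_xps_store(struct model_dev *dev, int id, int value)
{
	int ret = -1;
	int tc;

	pthread_mutex_lock(&model_rtnl);
	if (id < 0 || id >= dev->nr_ids || dev->num_tc < 1)
		goto out;
	if (!dev->map) {
		dev->map_len = (size_t)dev->nr_ids * dev->num_tc;
		dev->map = calloc(dev->map_len, sizeof(*dev->map));
		if (!dev->map) {
			dev->map_len = 0;
			goto out;
		}
	}
	for (tc = 0; tc < dev->num_tc; tc++) {
		size_t idx = (size_t)id * dev->num_tc + tc;

		assert(idx < dev->map_len);
		dev->map[idx] = value;
	}
	ret = 0;
out:
	pthread_mutex_unlock(&model_rtnl);
	return ret;
}

/* Models a show path: count the classes with a non-zero entry for id. */
static int model_xps_show(struct model_dev *dev, int id)
{
	int count = 0;
	int tc;

	pthread_mutex_lock(&model_rtnl);
	if (dev->map && id >= 0 && id < dev->nr_ids)
		for (tc = 0; tc < dev->num_tc; tc++)
			count += dev->map[(size_t)id * dev->num_tc + tc] != 0;
	pthread_mutex_unlock(&model_rtnl);
	return count;
}
```

Because every path takes the same lock, a store can never observe the
window between the map reset and the num_tc update, and a show can never
walk a map with a num_tc it was not sized for. Whether this maps cleanly
onto the real sysfs handlers depends on lock ordering details that the
model does not cover.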