PF reset can be triggered asynchronously, by tx_timeout or by a user. With unfortunate timing, both ice_vsi_rebuild() and .ndo_bpf() will try to access and modify XDP rings at the same time, causing a system crash.
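
For illustration only, the race can be modelled in user space. This is a simplified sketch with pthreads, not driver code: every name in it (fake_vsi, xdp_state_lock, fake_rebuild, fake_ndo_bpf, run_demo) is hypothetical, and the ring handling is reduced to an array allocation. One thread stands in for the reset/rebuild path, the other for .ndo_bpf(), and a single mutex serializes their access to the shared ring state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the VSI state that both paths touch. */
struct fake_vsi {
	pthread_mutex_t xdp_state_lock; /* assumed name, illustration only */
	int *xdp_rings;                 /* NULL when XDP is not configured */
	int num_xdp_rings;
};

/* Invariant: ring array and ring count are set or cleared together. */
static void fake_check(const struct fake_vsi *vsi)
{
	assert((vsi->xdp_rings != NULL) == (vsi->num_xdp_rings != 0));
}

/* Caller must hold xdp_state_lock. */
static int fake_prepare_xdp_rings(struct fake_vsi *vsi, int n)
{
	vsi->xdp_rings = calloc(n, sizeof(*vsi->xdp_rings));
	if (!vsi->xdp_rings)
		return -1;
	vsi->num_xdp_rings = n;
	return 0;
}

/* Caller must hold xdp_state_lock. */
static void fake_destroy_xdp_rings(struct fake_vsi *vsi)
{
	free(vsi->xdp_rings);
	vsi->xdp_rings = NULL;
	vsi->num_xdp_rings = 0;
}

/* Stand-in for the reset path: tear down and recreate existing rings. */
static void *fake_rebuild(void *arg)
{
	struct fake_vsi *vsi = arg;

	for (int i = 0; i < 1000; i++) {
		pthread_mutex_lock(&vsi->xdp_state_lock);
		fake_check(vsi);
		if (vsi->xdp_rings) {
			int n = vsi->num_xdp_rings;

			fake_destroy_xdp_rings(vsi);
			fake_prepare_xdp_rings(vsi, n);
		}
		pthread_mutex_unlock(&vsi->xdp_state_lock);
	}
	return NULL;
}

/* Stand-in for .ndo_bpf(): toggles the XDP configuration on and off. */
static void *fake_ndo_bpf(void *arg)
{
	struct fake_vsi *vsi = arg;

	for (int i = 0; i < 1000; i++) {
		pthread_mutex_lock(&vsi->xdp_state_lock);
		fake_check(vsi);
		if (vsi->xdp_rings)
			fake_destroy_xdp_rings(vsi);
		else
			fake_prepare_xdp_rings(vsi, 4);
		pthread_mutex_unlock(&vsi->xdp_state_lock);
	}
	return NULL;
}

/* Runs both paths concurrently; returns 0 if the final state is consistent. */
int run_demo(void)
{
	struct fake_vsi vsi = { .xdp_rings = NULL, .num_xdp_rings = 0 };
	pthread_t rebuild, bpf;
	int ok;

	pthread_mutex_init(&vsi.xdp_state_lock, NULL);
	if (pthread_create(&rebuild, NULL, fake_rebuild, &vsi))
		return -1;
	if (pthread_create(&bpf, NULL, fake_ndo_bpf, &vsi)) {
		pthread_join(rebuild, NULL);
		return -1;
	}
	pthread_join(rebuild, NULL);
	pthread_join(bpf, NULL);

	ok = (vsi.xdp_rings != NULL) == (vsi.num_xdp_rings != 0);
	fake_destroy_xdp_rings(&vsi);
	pthread_mutex_destroy(&vsi.xdp_state_lock);
	return ok ? 0 : -1;
}
```

Calling run_demo() from a small main() and building with -pthread is enough to try it. Without the lock/unlock pairs, the two threads can free and reallocate the same array concurrently, which is the user-space analogue of the crash described above.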

The first patch factors out rtnl-locked code from VSI rebuild code to avoid deadlock. The following changes lock rebuild and .ndo_bpf() critical sections with an internal mutex as well and provide complementary fixes.

v1: https://lore.kernel.org/netdev/20240610153716.31493-1-larysa.zare...@intel.com/
v1->v2:
* use mutex for locking
* redefine critical sections
* account for short time between rebuild and VSI being open
* add netif_queue_set_napi() patch, so ICE_RTNL_WAITS_FOR_RESET strategy
  can be dropped, no more rtnl-locked code in ice_vsi_rebuild()
* change the test case from waiting for tx_timeout to happen to actively
  firing resets through sysfs, this adds more minor fixes on top

Larysa Zaremba (6):
  ice: move netif_queue_set_napi to rtnl-protected sections
  ice: protect XDP configuration with a mutex
  ice: check for XDP rings instead of bpf program when unconfiguring
  ice: check ICE_VSI_DOWN under rtnl_lock when preparing for reset
  ice: remove ICE_CFG_BUSY locking from AF_XDP code
  ice: do not bring the VSI up, if it was down before the XDP setup

 drivers/net/ethernet/intel/ice/ice.h      |   2 +
 drivers/net/ethernet/intel/ice/ice_base.c |  11 +-
 drivers/net/ethernet/intel/ice/ice_lib.c  | 171 +++++++---------------
 drivers/net/ethernet/intel/ice/ice_lib.h  |  10 +-
 drivers/net/ethernet/intel/ice/ice_main.c |  47 ++++--
 drivers/net/ethernet/intel/ice/ice_xsk.c  |  18 +--
 6 files changed, 102 insertions(+), 157 deletions(-)

-- 
2.43.0