The shared IB device (sh) has per-port data with a field for the interrupt handler port_id. It is used by the shared interrupt handler to find the corresponding rte_eth device by IB port index. If the value is equal to or greater than RTE_MAX_ETHPORTS, it means there is no subhandler installed for the specified IB port index.
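
The convention above can be sketched as follows. This is only an illustration, not the driver code: the sketch_* names and both limits are hypothetical stand-ins, and only the field name nl_ih_port_id and the 1-based IB port indexing are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for illustration only; the real limits come from the DPDK build. */
#define SKETCH_MAX_ETHPORTS 32u /* plays the role of RTE_MAX_ETHPORTS */
#define SKETCH_MAX_IB_PORTS 4u  /* hypothetical number of IB ports per sh */

/* Hypothetical per-port data kept in the shared context. */
struct sketch_port_info {
	uint32_t nl_ih_port_id; /* ethdev port id, or >= SKETCH_MAX_ETHPORTS if none */
};

struct sketch_shared_ctx {
	struct sketch_port_info port[SKETCH_MAX_IB_PORTS];
};

/* On creation of the shared context, no port has a subhandler yet. */
static void
sketch_sh_init(struct sketch_shared_ctx *sh)
{
	assert(sh != NULL);
	for (size_t i = 0; i < SKETCH_MAX_IB_PORTS; i++)
		sh->port[i].nl_ih_port_id = SKETCH_MAX_ETHPORTS;
}

/*
 * Map a 1-based IB port index to an ethdev port id.
 * Returns -1 when the index is out of range or no subhandler is installed.
 */
static int
sketch_lookup_eth_port(const struct sketch_shared_ctx *sh, uint32_t ib_port)
{
	uint32_t id;

	if (ib_port == 0 || ib_port > SKETCH_MAX_IB_PORTS)
		return -1;
	id = sh->port[ib_port - 1].nl_ih_port_id;
	return id >= SKETCH_MAX_ETHPORTS ? -1 : (int)id;
}
```

In this model a handler would call sketch_lookup_eth_port() first and touch per-port private data only when the result is not -1, so writing the sentinel back on close is what stops later lookups from reaching a stale port.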

When several ports are created under the same sh, the sh is created with the first port and the interrupt handler port_id is initialized to RTE_MAX_ETHPORTS for each port. During port creation, the interrupt handler port_id is updated with the correct value. After this update, the mlx5_dev_interrupt_nl_cb function uses this port and its priv structure. However, when the ports are closed, this field isn't updated and the interrupt handler continues working until it is uninstalled in SH destruction. If mlx5_dev_interrupt_nl_cb is called between port closing and SH destruction, it uses an invalid port, causing a crash.

This patch adds interrupt handler port_id updating to the close function and adds a memory barrier to make sure it is done before priv reset.

Fixes: 655c3c26c11e ("net/mlx5: fix initial link status detection")
Cc: dkozl...@nvidia.com
Cc: sta...@dpdk.org

Signed-off-by: Michael Baum <michae...@nvidia.com>
Acked-by: Matan Azrad <ma...@nvidia.com>
---
v2: fix typo in commit message.

 drivers/net/mlx5/linux/mlx5_os.c | 3 +++
 drivers/net/mlx5/mlx5.c          | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 2b6741396d..a71474c90a 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1676,6 +1676,9 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	return eth_dev;
 error:
 	if (priv) {
+		priv->sh->port[priv->dev_port - 1].nl_ih_port_id =
+							RTE_MAX_ETHPORTS;
+		rte_io_wmb();
 #ifdef HAVE_MLX5_HWS_SUPPORT
 		if (eth_dev &&
 		    priv->sh &&
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1cf6df6049..95b0151fbc 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -2137,6 +2137,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		if (!c)
 			claim_zero(rte_eth_switch_domain_free(priv->domain_id));
 	}
+	priv->sh->port[priv->dev_port - 1].nl_ih_port_id = RTE_MAX_ETHPORTS;
+	/*
+	 * The interrupt handler port id must be reset before priv is reset
+	 * since 'mlx5_dev_interrupt_nl_cb' uses priv.
+	 */
+	rte_io_wmb();
 	memset(priv, 0, sizeof(*priv));
 	priv->domain_id = RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID;
 	/*
-- 
2.25.1