The LIST_REMOVE macro only removes the entry from the list and updates list itself. The pointers of this entry are not reset to NULL to prevent the accessing for the 2nd time.
In the previous fix for the memory accessing, the "rxq_ctrl" was removed from the list in a device private data when the "refcnt" was decreased to 0. Under only shared or non-shared queues scenarios, this was safe since all the "rxq_ctrl" entries were freed or kept. There is one case that shared and non-shared Rx queues are configured simultaneously, for example, a hairpin Rx queue cannot be shared. When closing the port that allocated the shared Rx queues' "rxq_ctrl", if the next entry is hairpin "rxq_ctrl", the hairpin "rxq_ctrl" will be freed directly with other resources. When trying to close the another port sharing the "rxq_ctrl", the LIST_REMOVE will be called again and cause some UFA issue. If the memory is no longer mapped, there will be a SIGSEGV. Adding a flag in the Rx queue private structure to remove the "rxq_ctrl" from the list only on the port/queue that allocated it. Fixes: bcc220cb57d7 ("net/mlx5: fix shared Rx queue list management") Signed-off-by: Bing Zhao <bi...@nvidia.com> Acked-by: Viacheslav Ovsiienko <viachesl...@nvidia.com> --- drivers/net/mlx5/mlx5_rx.h | 1 + drivers/net/mlx5/mlx5_rxq.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h index 7d144921ab..9bcb43b007 100644 --- a/drivers/net/mlx5/mlx5_rx.h +++ b/drivers/net/mlx5/mlx5_rx.h @@ -173,6 +173,7 @@ struct mlx5_rxq_ctrl { /* RX queue private data. */ struct mlx5_rxq_priv { uint16_t idx; /* Queue index. */ + bool possessor; /* Shared rxq_ctrl allocated for the 1st time. */ RTE_ATOMIC(uint32_t) refcnt; /* Reference counter. */ struct mlx5_rxq_ctrl *ctrl; /* Shared Rx Queue. */ LIST_ENTRY(mlx5_rxq_priv) owner_entry; /* Entry in shared rxq_ctrl. */ diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index f13fc3b353..b9240deb97 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -938,6 +938,7 @@ mlx5_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc, rte_errno = ENOMEM; return -rte_errno; } + rxq->possessor = true; } rxq->priv = priv; rxq->idx = idx; @@ -2015,6 +2016,7 @@ mlx5_rxq_hairpin_new(struct rte_eth_dev *dev, struct mlx5_rxq_priv *rxq, tmpl->rxq.mr_ctrl.cache_bh = (struct mlx5_mr_btree) { 0 }; tmpl->rxq.idx = idx; rxq->hairpin_conf = *hairpin_conf; + rxq->possessor = true; mlx5_rxq_ref(dev, idx); LIST_INSERT_HEAD(&priv->rxqsctrl, tmpl, next); return tmpl; @@ -2282,7 +2284,8 @@ mlx5_rxq_release(struct rte_eth_dev *dev, uint16_t idx) RTE_ETH_QUEUE_STATE_STOPPED; } } else { /* Refcnt zero, closing device. */ - LIST_REMOVE(rxq_ctrl, next); + if (rxq->possessor == true) + LIST_REMOVE(rxq_ctrl, next); LIST_REMOVE(rxq, owner_entry); if (LIST_EMPTY(&rxq_ctrl->owners)) { if (!rxq_ctrl->is_hairpin) -- 2.34.1