Let's take an application with the following configuration: - It uses 2 ports. - Each port has 3 Rx queues and 3 Tx queues. - On each port, Rx queues have a following purposes: - Rx queue 0 - SW queue, - Rx queue 1 - hairpin queue, bound to Tx queue on the same port, - Rx queue 2 - hairpin queue, bound to Tx queue on another port. - On each port, Tx queues have a following purposes: - Tx queue 0 - SW queue, - Tx queue 1 - hairpin queue, bound to Rx queue on the same port, - Tx queue 2 - hairpin queue, bound to Rx queue on another port. - Application configured all of the hairpin queues for manual binding.
After ports are configured and queues are set up, if the application does the following API call sequence: 1. rte_eth_dev_start(port_id=0) 2. rte_eth_hairpin_bind(tx_port=0, rx_port=0) 3. rte_eth_hairpin_bind(tx_port=0, rx_port=1) mlx5 PMD fails to modify SQ and logs this error: mlx5_common: mlx5_devx_cmds.c:2079: mlx5_devx_cmd_modify_sq(): Failed to modify SQ using DevX This error was caused by an incorrect unbind operation taken during error handling inside call (3). (3) fails, because port 1 (Rx side of the hairpin) was not started. As a result of this failure, PMD goes into error handling, where all previously bound hairpin queues are unbound. This is incorrect, since this error handling procedure in rte_eth_hairpin_bind() implementation assumes that all hairpin queues are bound to the same rx_port, which is not the case. The following sequence of function calls appears: - rte_eth_hairpin_queue_peer_unbind(rx_port=**1**, rx_queue=1, 0), - mlx5_hairpin_queue_peer_unbind(dev=**port 0**, tx_queue=1, 1). Which violates the hairpin queue destroy flow, by unbinding Tx queue 1 on port 0, before unbinding Rx queue 1 on port 1. This patch fixes that behavior, by filtering Tx queues on which error handling is done to only affect: - hairpin queues (it also reduces unnecessary debug log messages), - hairpin queues connected to the rx_port which is currently processed. Fixes: 37cd4501e873 ("net/mlx5: support two ports hairpin mode") Cc: bi...@nvidia.com Cc: sta...@dpdk.org Signed-off-by: Dariusz Sosnowski <dsosnow...@nvidia.com> Acked-by: Viacheslav Ovsiienko <viachesl...@nvidia.com> --- drivers/net/mlx5/mlx5_trigger.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c index 88dc271a21..329fa7da3e 100644 --- a/drivers/net/mlx5/mlx5_trigger.c +++ b/drivers/net/mlx5/mlx5_trigger.c @@ -845,6 +845,11 @@ mlx5_hairpin_bind_single_port(struct rte_eth_dev *dev, uint16_t rx_port) txq_ctrl = mlx5_txq_get(dev, i); if (txq_ctrl == NULL) continue; + if (!txq_ctrl->is_hairpin || + txq_ctrl->hairpin_conf.peers[0].port != rx_port) { + mlx5_txq_release(dev, i); + continue; + } rx_queue = txq_ctrl->hairpin_conf.peers[0].queue; rte_eth_hairpin_queue_peer_unbind(rx_port, rx_queue, 0); mlx5_hairpin_queue_peer_unbind(dev, i, 1); -- 2.25.1