> -----Original Message-----
> From: Bing Zhao <bi...@mellanox.com>
> Sent: Wednesday, February 19, 2020 10:29 AM
> To: Ori Kam <or...@mellanox.com>; Slava Ovsiienko
> <viachesl...@mellanox.com>
> Cc: Raslan Darawsheh <rasl...@mellanox.com>; Matan Azrad
> <ma...@mellanox.com>; dev@dpdk.org; sta...@dpdk.org
> Subject: [PATCH v2] net/mlx5: fix the hairpin queue capacity
>
> The hairpin TX/RX queue depth and packet size were hard-coded in the
> past. When the firmware gets a fix or improvement, the PMD cannot
> make full use of it. Also, 32 packets per queue does not guarantee
> good performance for hairpin flows. It makes the stride size larger,
> and for small packets that is a waste of memory.
> The recommended stride size is now 64B.
>
> The hairpin queue setup parameters need to be adjusted.
> 1. The buffer size should hold a standard 9KB jumbo frame, and more
> than one such frame for performance.
> 2. The number of packets in a single queue should be the maximum
> supported value (total buffer size / stride size).
>
> There is no need to use the maximum total buffer size the device
> supports, because memory consumption must also be taken into
> account.
>
> Fixes: e79c9be91515 ("net/mlx5: support Rx hairpin queues")
> Cc: or...@mellanox.com
> Cc: sta...@dpdk.org
>
> Signed-off-by: Bing Zhao <bi...@mellanox.com>
>
> ------------
>
Acked-by: Ori Kam <or...@mellanox.com>
Thanks,
Ori
> v2: change the capacity parameters and the commit details
>
> ---
> drivers/net/mlx5/mlx5_defs.h | 4 ++++
> drivers/net/mlx5/mlx5_rxq.c | 13 +++++++++----
> drivers/net/mlx5/mlx5_txq.c | 13 +++++++++----
> 3 files changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
> index 9b392ed..83ca367 100644
> --- a/drivers/net/mlx5/mlx5_defs.h
> +++ b/drivers/net/mlx5/mlx5_defs.h
> @@ -173,6 +173,10 @@
> #define MLX5_FLOW_MREG_HNAME "MARK_COPY_TABLE"
> #define MLX5_DEFAULT_COPY_ID UINT32_MAX
>
> +/* Hairpin TX/RX queue configuration parameters. */
> +#define MLX5_HAIRPIN_QUEUE_STRIDE 6
> +#define MLX5_HAIRPIN_JUMBO_LOG_SIZE (15 + 2)
> +
> /* Definition of static_assert found in /usr/include/assert.h */
> #ifndef HAVE_STATIC_ASSERT
> #define static_assert _Static_assert
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index dc0fd82..8a6b410 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -1268,6 +1268,7 @@
> struct mlx5_devx_create_rq_attr attr = { 0 };
> struct mlx5_rxq_obj *tmpl = NULL;
> int ret = 0;
> + uint32_t max_wq_data;
>
> MLX5_ASSERT(rxq_data);
> MLX5_ASSERT(!rxq_ctrl->obj);
> @@ -1283,11 +1284,15 @@
> tmpl->type = MLX5_RXQ_OBJ_TYPE_DEVX_HAIRPIN;
> tmpl->rxq_ctrl = rxq_ctrl;
> attr.hairpin = 1;
> - /* Workaround for hairpin startup */
> - attr.wq_attr.log_hairpin_num_packets = log2above(32);
> - /* Workaround for packets larger than 1KB */
> + max_wq_data = priv->config.hca_attr.log_max_hairpin_wq_data_sz;
> + /* Jumbo frames > 9KB should be supported, and more packets. */
> attr.wq_attr.log_hairpin_data_sz =
> - priv->config.hca_attr.log_max_hairpin_wq_data_sz;
> + (max_wq_data < MLX5_HAIRPIN_JUMBO_LOG_SIZE) ?
> + max_wq_data : MLX5_HAIRPIN_JUMBO_LOG_SIZE;
> + /* Set the packets number to the maximum value for performance. */
> + attr.wq_attr.log_hairpin_num_packets =
> + attr.wq_attr.log_hairpin_data_sz -
> + MLX5_HAIRPIN_QUEUE_STRIDE;
> tmpl->rq = mlx5_devx_cmd_create_rq(priv->sh->ctx, &attr,
> rxq_ctrl->socket);
> if (!tmpl->rq) {
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index bc13abf..2ad849a 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -493,6 +493,7 @@
> struct mlx5_devx_create_sq_attr attr = { 0 };
> struct mlx5_txq_obj *tmpl = NULL;
> int ret = 0;
> + uint32_t max_wq_data;
>
> MLX5_ASSERT(txq_data);
> MLX5_ASSERT(!txq_ctrl->obj);
> @@ -509,11 +510,15 @@
> tmpl->txq_ctrl = txq_ctrl;
> attr.hairpin = 1;
> attr.tis_lst_sz = 1;
> - /* Workaround for hairpin startup */
> - attr.wq_attr.log_hairpin_num_packets = log2above(32);
> - /* Workaround for packets larger than 1KB */
> + max_wq_data = priv->config.hca_attr.log_max_hairpin_wq_data_sz;
> + /* Jumbo frames > 9KB should be supported, and more packets. */
> attr.wq_attr.log_hairpin_data_sz =
> - priv->config.hca_attr.log_max_hairpin_wq_data_sz;
> + (max_wq_data < MLX5_HAIRPIN_JUMBO_LOG_SIZE) ?
> + max_wq_data : MLX5_HAIRPIN_JUMBO_LOG_SIZE;
> + /* Set the packets number to the maximum value for performance. */
> + attr.wq_attr.log_hairpin_num_packets =
> + attr.wq_attr.log_hairpin_data_sz -
> + MLX5_HAIRPIN_QUEUE_STRIDE;
> attr.tis_num = priv->sh->tis->id;
> tmpl->sq = mlx5_devx_cmd_create_sq(priv->sh->ctx, &attr);
> if (!tmpl->sq) {
> --
> 1.8.3.1
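
For readers following the sizing arithmetic, the calculation that the patch applies to both the RQ and the SQ can be sketched as a standalone helper. This is only an illustration: the struct and the function hairpin_wq_compute() are hypothetical names that do not exist in the driver, and the two macros are copied from the mlx5_defs.h hunk above. The assert documents an assumption the patch relies on without checking it, namely that the device-reported log size is at least the stride log size.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the mlx5_defs.h hunk of the patch. */
#define MLX5_HAIRPIN_QUEUE_STRIDE 6          /* log2 of the 64B stride */
#define MLX5_HAIRPIN_JUMBO_LOG_SIZE (15 + 2) /* log2 of the buffer cap */

/* Hypothetical container for the two WQ attributes the patch sets. */
struct hairpin_wq_sizes {
	uint32_t log_data_sz;     /* log2 of the total buffer size in bytes */
	uint32_t log_num_packets; /* log2 of the number of packets */
};

/*
 * Hypothetical helper mirroring the patch logic:
 * cap the device maximum at MLX5_HAIRPIN_JUMBO_LOG_SIZE, then derive
 * the packet count as buffer size / stride size (a subtraction in log2).
 */
static struct hairpin_wq_sizes
hairpin_wq_compute(uint32_t log_max_wq_data)
{
	struct hairpin_wq_sizes s;

	/* Assumption: the subtraction below must not underflow. */
	assert(log_max_wq_data >= MLX5_HAIRPIN_QUEUE_STRIDE);
	s.log_data_sz = (log_max_wq_data < MLX5_HAIRPIN_JUMBO_LOG_SIZE) ?
			log_max_wq_data : MLX5_HAIRPIN_JUMBO_LOG_SIZE;
	s.log_num_packets = s.log_data_sz - MLX5_HAIRPIN_QUEUE_STRIDE;
	return s;
}
```

With a device that reports a log maximum of 20, the sketch yields log_data_sz = 17 (131072 bytes) and log_num_packets = 11 (2048 strides of 64B). With a reported maximum of 12, the device limit wins and the result is 12 and 6.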