Hi Ferruh

From: Ferruh Yigit
> There is confusion about setting the max Rx packet length; this patch
> aims to clarify it.
> 
> The 'rte_eth_dev_configure()' API accepts the max Rx packet size via
> the 'uint32_t max_rx_pkt_len' field of the config struct
> 'struct rte_eth_conf'.
> 
> Also, the 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and
> the result is stored in '(struct rte_eth_dev)->data->mtu'.
> 
> These two APIs are related but work in a disconnected way; they store
> the set values in different variables, which makes it hard to figure
> out which one to use. Also, having two different methods for a related
> functionality is confusing for users.
> 
> Other issues causing confusion are:
> * The maximum transmission unit (MTU) is the payload of the Ethernet
>   frame, while 'max_rx_pkt_len' is the size of the whole Ethernet
>   frame. The difference is the Ethernet frame overhead, and this
>   overhead may differ from device to device based on what the device
>   supports, like VLAN and QinQ.
> * 'max_rx_pkt_len' is only valid when the application requests jumbo
>   frames, which adds additional confusion, and some APIs and PMDs
>   already discard this documented behavior.
> * For the jumbo frame enabled case, 'max_rx_pkt_len' is a mandatory
>   field, which adds configuration complexity for the application.
> 
> As a solution, both APIs get the MTU as a parameter, and both save the
> result in the same variable, '(struct rte_eth_dev)->data->mtu'. For
> this, 'max_rx_pkt_len' is renamed to 'mtu', and it is always valid,
> independent of jumbo frames.
> 
> For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is the
> user request; it should be used only within the configure function,
> and the result should be stored in '(struct rte_eth_dev)->data->mtu'.
> After that point, both the application and the PMD use the MTU from
> this variable.
> 
> When the application doesn't provide an MTU during
> 'rte_eth_dev_configure()', the default 'RTE_ETHER_MTU' value is used.
> 
> Additional clarification is done on the scattered Rx configuration, in
> relation to the MTU and the Rx buffer size.
> The MTU is used to configure the device for the physical Rx/Tx size
> limitation; the Rx buffer is where received packets are stored, and
> many PMDs use the mbuf data buffer size as the Rx buffer size.
> PMDs compare the MTU against the Rx buffer size to decide whether to
> enable scattered Rx. If scattered Rx is not supported by the device,
> an MTU bigger than the Rx buffer size should fail.

Should the comparison also be done against max_lro_pkt_size when the PMD decides 
whether to enable SCATTER?

What do you think about enabling SCATTER in the ethdev API instead of making the 
comparison in each PMD?

> Signed-off-by: Ferruh Yigit <ferruh.yi...@intel.com>

<snip>

Please see more below regarding SCATTER.
 
> diff --git a/drivers/net/mlx4/mlx4_rxq.c b/drivers/net/mlx4/mlx4_rxq.c
> index 978cbb8201ea..4a5cfd22aa71 100644
> --- a/drivers/net/mlx4/mlx4_rxq.c
> +++ b/drivers/net/mlx4/mlx4_rxq.c
> @@ -753,6 +753,7 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>         int ret;
>         uint32_t crc_present;
>         uint64_t offloads;
> +       uint32_t max_rx_pktlen;
> 
>         offloads = conf->offloads | dev->data->dev_conf.rxmode.offloads;
> 
> @@ -828,13 +829,11 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>         };
>         /* Enable scattered packets support for this queue if necessary. */
>         MLX4_ASSERT(mb_len >= RTE_PKTMBUF_HEADROOM);
> -       if (dev->data->dev_conf.rxmode.max_rx_pkt_len <=
> -           (mb_len - RTE_PKTMBUF_HEADROOM)) {
> +       max_rx_pktlen = dev->data->mtu + RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN;
> +       if (max_rx_pktlen <= (mb_len - RTE_PKTMBUF_HEADROOM)) {
>                 ;
>         } else if (offloads & DEV_RX_OFFLOAD_SCATTER) {
> -               uint32_t size =
> -                       RTE_PKTMBUF_HEADROOM +
> -                       dev->data->dev_conf.rxmode.max_rx_pkt_len;
> +               uint32_t size = RTE_PKTMBUF_HEADROOM + max_rx_pktlen;
>                 uint32_t sges_n;
> 
>                 /*
> @@ -846,21 +845,19 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>                 /* Make sure sges_n did not overflow. */
>                 size = mb_len * (1 << rxq->sges_n);
>                 size -= RTE_PKTMBUF_HEADROOM;
> -               if (size < dev->data->dev_conf.rxmode.max_rx_pkt_len) {
> +               if (size < max_rx_pktlen) {
>                         rte_errno = EOVERFLOW;
>                         ERROR("%p: too many SGEs (%u) needed to handle"
>                               " requested maximum packet size %u",
>                               (void *)dev,
> -                             1 << sges_n,
> -                             dev->data->dev_conf.rxmode.max_rx_pkt_len);
> +                             1 << sges_n, max_rx_pktlen);
>                         goto error;
>                 }
>         } else {
>                 WARN("%p: the requested maximum Rx packet size (%u) is"
>                      " larger than a single mbuf (%u) and scattered"
>                      " mode has not been requested",
> -                    (void *)dev,
> -                    dev->data->dev_conf.rxmode.max_rx_pkt_len,
> +                    (void *)dev, max_rx_pktlen,
>                      mb_len - RTE_PKTMBUF_HEADROOM);
>         }

If, by definition, SCATTER should be enabled implicitly by the PMD according to 
the comparison you wrote above, maybe this check for the SCATTER offload is not 
needed.

Also, the SCATTER offload documentation could state precisely which parameters 
are used for the comparison, and that the offload is a capability only, with no 
need for the application to configure it.

Also, for the multi Rx mempool configuration, the PMDs can implicitly understand 
that SCATTER must be enabled, so there is no need to check it in the PMD/API.

What do you think?

>         DEBUG("%p: maximum number of segments per packet: %u",
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index abd8ce798986..6f4f351222d3 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -1330,10 +1330,11 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>         uint64_t offloads = conf->offloads |
>                            dev->data->dev_conf.rxmode.offloads;
>         unsigned int lro_on_queue = !!(offloads & DEV_RX_OFFLOAD_TCP_LRO);
> -       unsigned int max_rx_pkt_len = lro_on_queue ?
> +       unsigned int max_rx_pktlen = lro_on_queue ?
>                         dev->data->dev_conf.rxmode.max_lro_pkt_size :
> -                       dev->data->dev_conf.rxmode.max_rx_pkt_len;
> -       unsigned int non_scatter_min_mbuf_size = max_rx_pkt_len +
> +                       dev->data->mtu + (unsigned int)RTE_ETHER_HDR_LEN +
> +                               RTE_ETHER_CRC_LEN;
> +       unsigned int non_scatter_min_mbuf_size = max_rx_pktlen +
>                                                         RTE_PKTMBUF_HEADROOM;
>         unsigned int max_lro_size = 0;
>         unsigned int first_mb_free_size = mb_len - RTE_PKTMBUF_HEADROOM;
> @@ -1372,7 +1373,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>          * needed to handle max size packets, replace zero length
>          * with the buffer length from the pool.
>          */
> -       tail_len = max_rx_pkt_len;
> +       tail_len = max_rx_pktlen;
>         do {
>                 struct mlx5_eth_rxseg *hw_seg =
>                                         &tmpl->rxq.rxseg[tmpl->rxq.rxseg_n];
> @@ -1410,7 +1411,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>                                 "port %u too many SGEs (%u) needed to handle"
>                                 " requested maximum packet size %u, the maximum"
>                                 " supported are %u", dev->data->port_id,
> -                               tmpl->rxq.rxseg_n, max_rx_pkt_len,
> +                               tmpl->rxq.rxseg_n, max_rx_pktlen,
>                                 MLX5_MAX_RXQ_NSEG);
>                         rte_errno = ENOTSUP;
>                         goto error;
> @@ -1435,7 +1436,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>                 DRV_LOG(ERR, "port %u Rx queue %u: Scatter offload is not"
>                         " configured and no enough mbuf space(%u) to contain "
>                         "the maximum RX packet length(%u) with head-room(%u)",
> -                       dev->data->port_id, idx, mb_len, max_rx_pkt_len,
> +                       dev->data->port_id, idx, mb_len, max_rx_pktlen,
>                         RTE_PKTMBUF_HEADROOM);
>                 rte_errno = ENOSPC;
>                 goto error;

The same comment applies to the SCATTER check here; in this case it is even 
reported as an error.

> @@ -1454,7 +1455,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>          * following conditions are met:
>          *  - MPRQ is enabled.
>          *  - The number of descs is more than the number of strides.
> -        *  - max_rx_pkt_len plus overhead is less than the max size
> +        *  - max_rx_pktlen plus overhead is less than the max size
>          *    of a stride or mprq_stride_size is specified by a user.
>          *    Need to make sure that there are enough strides to encap
>          *    the maximum packet size in case mprq_stride_size is set.
> @@ -1478,7 +1479,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>                                 !!(offloads & DEV_RX_OFFLOAD_SCATTER);
>                 tmpl->rxq.mprq_max_memcpy_len = RTE_MIN(first_mb_free_size,
>                                 config->mprq.max_memcpy_len);
> -               max_lro_size = RTE_MIN(max_rx_pkt_len,
> +               max_lro_size = RTE_MIN(max_rx_pktlen,
>                                        (1u << tmpl->rxq.strd_num_n) *
>                                        (1u << tmpl->rxq.strd_sz_n));
>                 DRV_LOG(DEBUG,
> @@ -1487,9 +1488,9 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>                         dev->data->port_id, idx,
>                         tmpl->rxq.strd_num_n, tmpl->rxq.strd_sz_n);
>         } else if (tmpl->rxq.rxseg_n == 1) {
> -               MLX5_ASSERT(max_rx_pkt_len <= first_mb_free_size);
> +               MLX5_ASSERT(max_rx_pktlen <= first_mb_free_size);
>                 tmpl->rxq.sges_n = 0;
> -               max_lro_size = max_rx_pkt_len;
> +               max_lro_size = max_rx_pktlen;
>         } else if (offloads & DEV_RX_OFFLOAD_SCATTER) {
>                 unsigned int sges_n;
> 
> @@ -1511,13 +1512,13 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>                                 "port %u too many SGEs (%u) needed to handle"
>                                 " requested maximum packet size %u, the maximum"
>                                 " supported are %u", dev->data->port_id,
> -                               1 << sges_n, max_rx_pkt_len,
> +                               1 << sges_n, max_rx_pktlen,
>                                 1u << MLX5_MAX_LOG_RQ_SEGS);
>                         rte_errno = ENOTSUP;
>                         goto error;
>                 }
>                 tmpl->rxq.sges_n = sges_n;
> -               max_lro_size = max_rx_pkt_len;
> +               max_lro_size = max_rx_pktlen;
>         }
>         if (config->mprq.enabled && !mlx5_rxq_mprq_enabled(&tmpl->rxq))
>                 DRV_LOG(WARNING,

<snip>
