> -----Original Message-----
> From: Ding, Xuan <xuan.d...@intel.com>
> Sent: Tuesday, March 22, 2022 11:56 AM
> To: tho...@monjalon.net; Yigit, Ferruh <ferruh.yi...@intel.com>;
> andrew.rybche...@oktetlabs.ru
> Cc: dev@dpdk.org; step...@networkplumber.org;
> m...@smartsharesystems.com; viachesl...@nvidia.com; Zhang, Qi Z
> <qi.z.zh...@intel.com>; Yu, Ping <ping...@intel.com>; Wu, WenxuanX
> <wenxuanx...@intel.com>; Ding, Xuan <xuan.d...@intel.com>; Wang,
> YuanX <yuanx.w...@intel.com>
> Subject: [RFC,v2 1/3] ethdev: introduce protocol type based header split
>
> From: Xuan Ding <xuan.d...@intel.com>
>
> Header split consists of splitting a received packet into two separate regions
> based on the packet content. The split happens after the packet header and
> before the packet payload. Splitting is usually between the packet header
> that can be posted to a dedicated buffer and the packet payload that can be
> posted to a different buffer.
>
> Currently, Rx buffer split supports length and offset based packet split.
> Although header split is a subset of buffer split, configuring buffer split
> based on length and offset is not suitable for NICs that split based on
> header protocol types. Moreover, tunneling makes the conversion from offset
> to protocol impossible.
>
> This patch extends the current buffer split to support protocol based header
> split. A new proto field is introduced in the rte_eth_rxseg_split structure
> reserved field to specify header protocol type. With Rx offload flag
> RTE_ETH_RX_OFFLOAD_HEADER_SPLIT enabled and protocol type
> configured, PMD will split the ingress packets into two separate regions.
> Currently, both inner and outer L2/L3/L4 level header split can be supported.
>
> For example, let's suppose we configured the Rx queue with the following
> segments:
> seg0 - pool0
> seg1 - pool1
>
> With header split type configured with RTE_ETH_RX_HEADER_SPLIT_UDP,
> a packet consisting of MAC_IP_UDP_PAYLOAD will be split as follows:
> seg0 - pool0, udp_header
> seg1 - pool1, payload
>
> The memory attributes for the split parts may differ as well - for example,
> mempool0 and mempool1 may belong to DPDK memory and external memory,
> respectively.
>
> Signed-off-by: Xuan Ding <xuan.d...@intel.com>
> Signed-off-by: Yuan Wang <yuanx.w...@intel.com>
> ---
> lib/ethdev/rte_ethdev.c | 24 +++++++++++++----------
> lib/ethdev/rte_ethdev.h | 43 +++++++++++++++++++++++++++++++++++++++--
> 2 files changed, 55 insertions(+), 12 deletions(-)
>
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index 70c850a2f1..49c8fef1c3 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -1661,6 +1661,7 @@ rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
> struct rte_mempool *mpl = rx_seg[seg_idx].mp;
> uint32_t length = rx_seg[seg_idx].length;
> uint32_t offset = rx_seg[seg_idx].offset;
> + uint16_t proto = rx_seg[seg_idx].proto;
>
> if (mpl == NULL) {
> RTE_ETHDEV_LOG(ERR, "null mempool pointer\n");
> @@ -1692,15 +1693,17 @@ rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg,
> (struct rte_pktmbuf_pool_private));
> return -ENOSPC;
> }
> - offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM;
> - *mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
> - length = length != 0 ? length : *mbp_buf_size;
> - if (*mbp_buf_size < length + offset) {
> - RTE_ETHDEV_LOG(ERR,
> - "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
> - mpl->name, *mbp_buf_size,
> - length + offset, length, offset);
> - return -EINVAL;
> + if (proto == 0) {
> + offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM;
> + *mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
> + length = length != 0 ? length : *mbp_buf_size;
> + if (*mbp_buf_size < length + offset) {
> + RTE_ETHDEV_LOG(ERR,
> + "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
> + mpl->name, *mbp_buf_size,
> + length + offset, length, offset);
> + return -EINVAL;
> + }
> }
Since length and proto are mutually exclusive, it would be better to also
check the length when proto != 0.
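
A minimal sketch of such a check, for illustration only. struct rxseg_sketch
and rxseg_check_exclusive() are hypothetical stand-ins, not ethdev API; the
struct mirrors only the fields discussed here, with field types chosen for
illustration, and the sketch assumes (as the patch does) that proto == 0
selects the length/offset based split.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the rte_eth_rxseg_split fields used below. */
struct rxseg_sketch {
	uint16_t length; /* segment data length, 0 = use the pool buffer size */
	uint16_t offset; /* data offset inside the segment buffer */
	uint16_t proto;  /* header protocol type, 0 = length/offset based split */
};

/*
 * Reject a segment that sets both proto and length, because the two
 * ways of describing the split point are treated as mutually exclusive.
 * Returns 0 when the segment is acceptable, -EINVAL otherwise.
 */
static int
rxseg_check_exclusive(const struct rxseg_sketch *seg)
{
	assert(seg != NULL);

	if (seg->proto != 0 && seg->length != 0)
		return -EINVAL;

	return 0;
}
```

In rte_eth_rx_queue_check_split() this could sit in an else branch of the
new "if (proto == 0)" block, next to an error log.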
.....
> @@ -1197,12 +1197,26 @@ struct rte_eth_txmode {
> * - pool from the last valid element
> * - the buffer size from this pool
> * - zero offset
> + *
> + * Header split is a subset of buffer split. The split happens after the
> + * packet header and before the packet payload. For PMDs that do not
> + * support header split configuration by length and offset, the location
> + * of the split needs to be specified by the header protocol type. While for
> + * buffer split, this field should not be configured.
> + *
> + * If RTE_ETH_RX_OFFLOAD_HEADER_SPLIT flag is set in offloads field,
> + * the PMD will split the received packets into two separate regions:
> + * - The header buffer will be allocated from the memory pool,
> + * specified in the first array element, the second buffer, from the
> + * pool in the second element.
> + * - The length and offset do not need to be configured in header split.
We do not necessarily need to ignore the offset configuration for header
split, since there is no conflict: a driver can still support copying a split
header to a specific mbuf offset.
And if we support offset with header split, an offset boundary check should
also be considered in rte_eth_rx_queue_check_split().
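
One possible shape for that boundary check, as a sketch under stated
assumptions. rxseg_check_offset() is a hypothetical helper, and
SKETCH_HEADROOM stands in for RTE_PKTMBUF_HEADROOM (assumed to be 128 here;
the real value is a build-time setting). Because the header length is not
known up front when proto is set, the sketch only verifies that the offset
still leaves room inside the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed stand-in for RTE_PKTMBUF_HEADROOM; the real value is configurable. */
#define SKETCH_HEADROOM 128u

/*
 * Check that the configured offset lands inside a buffer of buf_size
 * bytes. The first segment also reserves the mbuf headroom, mirroring
 * the existing length/offset check in the patch.
 * Returns 0 when the offset fits, -EINVAL otherwise.
 */
static int
rxseg_check_offset(uint32_t buf_size, uint32_t offset, unsigned int seg_idx)
{
	assert(buf_size != 0);

	if (seg_idx == 0)
		offset += SKETCH_HEADROOM;

	/* An offset at or past the end leaves no room for any header byte. */
	if (offset >= buf_size)
		return -EINVAL;

	return 0;
}
```

A driver that knows the maximum header size for a given proto could tighten
this by also adding that size before the comparison.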
Regards
Qi