07/10/2020 17:06, Viacheslav Ovsiienko:
> The DPDK datapath in the transmit direction is very flexible.
> An application can build multi-segment packets and manage
> almost all data aspects - the memory pools where segments
> are allocated from, the segment lengths, the memory attributes
> like external buffers registered for DMA, etc.
>
> In the receiving direction, the datapath is much less flexible:
> an application can only specify the memory pool to configure the
> receiving queue and nothing more. In order to extend the receiving
> datapath capabilities, it is proposed to add a way to provide
> extended information on how to split the packets being received.
>
> The following structure is introduced to specify the Rx packet
> segment:
>
> struct rte_eth_rxseg {
>     struct rte_mempool *mp; /* memory pool to allocate segment from */
>     uint16_t length; /* segment maximal data length */
The "length" parameter is configuring a split point.
Worth noting in the comment, I think.

>     uint16_t offset; /* data offset from beginning of mbuf data buffer */

Is it replacing RTE_PKTMBUF_HEADROOM?

>     uint32_t reserved; /* reserved field */
> };
>
> The new routine rte_eth_rx_queue_setup_ex() is introduced to
> set up the given Rx queue using the new extended Rx packet segment
> description:
>
> int
> rte_eth_rx_queue_setup_ex(uint16_t port_id, uint16_t rx_queue_id,
>                           uint16_t nb_rx_desc, unsigned int socket_id,
>                           const struct rte_eth_rxconf *rx_conf,
>                           const struct rte_eth_rxseg *rx_seg,
>                           uint16_t n_seg)

An alternative name for this function: rte_eth_rxseg_queue_setup

> This routine takes two new parameters:
>     rx_seg - pointer to the array of segment descriptions; each element
>              describes the memory pool, maximal data length, and initial
>              data offset from the beginning of the data buffer in the mbuf
>     n_seg  - number of elements in the array

Not clear why we need an array.
I suggest writing here that each segment of the same packet can have
different properties, the array representing the full packet.

> The new offload flag DEV_RX_OFFLOAD_BUFFER_SPLIT in device

The name should start with the RTE_ prefix.

> capabilities is introduced to provide a way for the PMD to report
> to the application that it supports splitting Rx packets into
> configurable segments. Prior to invoking the
> rte_eth_rx_queue_setup_ex() routine, the application should check
> the DEV_RX_OFFLOAD_BUFFER_SPLIT flag.
>
> If the Rx queue is configured with the new routine, the packets being
> received will be split into multiple segments pushed to mbufs with
> the specified attributes. The PMD will allocate the first mbuf
> from the pool specified in the first segment descriptor and put
> the data starting at the specified offset in the allocated mbuf data
> buffer. If the packet length exceeds the specified segment length,
> the next mbuf will be allocated according to the next segment
> descriptor (if any) and data will be put in its data buffer at the
> specified offset, not exceeding the specified length. If there is
> no next descriptor, the next mbuf will be allocated and filled in the
> same way (from the same pool and with the same buffer offset/length)
> as the current one.
>
> For example, let's suppose we configured the Rx queue with the
> following segments:
>     seg0 - pool0, len0=14B, off0=RTE_PKTMBUF_HEADROOM
>     seg1 - pool1, len1=20B, off1=0B
>     seg2 - pool2, len2=20B, off2=0B
>     seg3 - pool3, len3=512B, off3=0B
>
> A packet 46 bytes long will look like the following:
>     seg0 - 14B long @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
>     seg1 - 20B long @ 0 in mbuf from pool1
>     seg2 - 12B long @ 0 in mbuf from pool2
>
> A packet 1500 bytes long will look like the following:
>     seg0 - 14B @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
>     seg1 - 20B @ 0 in mbuf from pool1
>     seg2 - 20B @ 0 in mbuf from pool2
>     seg3 - 512B @ 0 in mbuf from pool3
>     seg4 - 512B @ 0 in mbuf from pool3
>     seg5 - 422B @ 0 in mbuf from pool3
>
> The offload DEV_RX_OFFLOAD_SCATTER must be present and configured
> to support the new buffer split feature (if n_seg is greater
> than one).
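Just to check my understanding of the intended usage, below is a rough
sketch of how an application could program the 4-segment example above.
It is only illustrative: the rte_eth_rxseg struct, the
rte_eth_rx_queue_setup_ex() function and the DEV_RX_OFFLOAD_BUFFER_SPLIT
flag are the names proposed in this RFC and may still change following
the comments in this thread.

#include <string.h>
#include <errno.h>
#include <rte_common.h>
#include <rte_mbuf.h>
#include <rte_ethdev.h>

static int
setup_split_rxq(uint16_t port_id, uint16_t queue_id, unsigned int socket_id,
		struct rte_mempool *pool0, struct rte_mempool *pool1,
		struct rte_mempool *pool2, struct rte_mempool *pool3)
{
	/* One entry per segment of the packet, as proposed in this RFC. */
	const struct rte_eth_rxseg segs[] = {
		{ .mp = pool0, .length = 14,  .offset = RTE_PKTMBUF_HEADROOM },
		{ .mp = pool1, .length = 20,  .offset = 0 },
		{ .mp = pool2, .length = 20,  .offset = 0 },
		{ .mp = pool3, .length = 512, .offset = 0 },
	};
	struct rte_eth_dev_info dev_info;
	struct rte_eth_rxconf rxconf;
	int ret;

	ret = rte_eth_dev_info_get(port_id, &dev_info);
	if (ret != 0)
		return ret;
	/* Both SCATTER and the new BUFFER_SPLIT capability are required. */
	if ((dev_info.rx_offload_capa & DEV_RX_OFFLOAD_SCATTER) == 0 ||
	    (dev_info.rx_offload_capa & DEV_RX_OFFLOAD_BUFFER_SPLIT) == 0)
		return -ENOTSUP;

	/*
	 * Assuming the offloads are requested per queue here; they could
	 * also be enabled at the port level in rte_eth_dev_configure().
	 */
	memset(&rxconf, 0, sizeof(rxconf));
	rxconf.offloads = DEV_RX_OFFLOAD_SCATTER | DEV_RX_OFFLOAD_BUFFER_SPLIT;

	return rte_eth_rx_queue_setup_ex(port_id, queue_id, 512, socket_id,
					 &rxconf, segs, RTE_DIM(segs));
}

If that matches your intent, a similar snippet could later go in the
buffer split section of the programmer's guide.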
> The new approach would allow splitting the ingress packets into
> multiple parts pushed to memory with different attributes.
> For example, the packet headers can be pushed to the embedded
> data buffers within mbufs and the application data into the
> external buffers attached to mbufs allocated from the different
> memory pools. The memory attributes for the split parts may
> differ as well - for example, the application data may be pushed
> into external memory located on a dedicated physical device,
> say a GPU or NVMe. This would improve the DPDK receiving datapath
> flexibility while preserving compatibility with the existing API.
>
> Also, the proposed segment description might be used to specify
> Rx packet split for some other features. For example, it could
> provide a way to specify the extra memory pool for the Header
> Split feature of some Intel PMDs.

I don't understand what you are referring to in this last paragraph.
I think the explanation above is enough to demonstrate the flexibility.

> Signed-off-by: Viacheslav Ovsiienko <viachesl...@nvidia.com>

Thank you, I like this feature.
More minor comments below.

[...]
> +* **Introduced extended buffer description for receiving.**

Rewording: Introduced extended setup of Rx queue

> +  * Added extended Rx queue setup routine
> +  * Added description for Rx segment sizes

Not only "sizes", but also offset and mempool.

> +  * Added capability to specify the memory pool for each segment

This one can be merged with the above, or offset should be added.

[...]

The doxygen comment is missing here.

> +int rte_eth_rx_queue_setup_ex(uint16_t port_id, uint16_t rx_queue_id,
> +		uint16_t nb_rx_desc, unsigned int socket_id,
> +		const struct rte_eth_rxconf *rx_conf,
> +		const struct rte_eth_rxseg *rx_seg, uint16_t n_seg);

This new function should be experimental, and it should be added to the .map file.
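To be explicit about these last points, I would expect roughly the
following in rte_ethdev.h (the doxygen wording below is only a
suggestion from my side):

/**
 * @warning
 * @b EXPERIMENTAL: this API may change, or be removed, without prior notice.
 *
 * Allocate and set up a receive queue for an Ethernet device,
 * splitting each received packet into the segments described by
 * the rx_seg array (memory pool, maximal data length and data
 * offset per segment). The other parameters have the same meaning
 * as in rte_eth_rx_queue_setup().
 *
 * @param rx_seg
 *   Array describing the segments the received packets are split into.
 *   The last entry is reused for the remaining part of longer packets.
 * @param n_seg
 *   Number of elements in the rx_seg array.
 *
 * @return
 *   0 on success, negative errno value otherwise.
 */
__rte_experimental
int rte_eth_rx_queue_setup_ex(uint16_t port_id, uint16_t rx_queue_id,
		uint16_t nb_rx_desc, unsigned int socket_id,
		const struct rte_eth_rxconf *rx_conf,
		const struct rte_eth_rxseg *rx_seg, uint16_t n_seg);

plus the rte_eth_rx_queue_setup_ex symbol listed in the EXPERIMENTAL
section of the ethdev .map file.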