On Wed, Oct 14, 2020 at 11:42 PM Viacheslav Ovsiienko <viachesl...@nvidia.com> wrote: > > The DPDK datapath in the transmit direction is very flexible. > An application can build the multi-segment packet and manages > almost all data aspects - the memory pools where segments > are allocated from, the segment lengths, the memory attributes > like external buffers, registered for DMA, etc. > > In the receiving direction, the datapath is much less flexible, > an application can only specify the memory pool to configure the > receiving queue and nothing more. In order to extend receiving > datapath capabilities it is proposed to add the way to provide > extended information how to split the packets being received. > > The following structure is introduced to specify the Rx packet > segment: > > struct rte_eth_rxseg { > struct rte_mempool *mp; /* memory pools to allocate segment from */ > uint16_t length; /* segment maximal data length, > configures "split point" */ > uint16_t offset; /* data offset from beginning > of mbuf data buffer */ > uint32_t reserved; /* reserved field */ > }; > > The segment descriptions are added to the rte_eth_rxconf structure: > rx_seg - pointer the array of segment descriptions, each element > describes the memory pool, maximal data length, initial > data offset from the beginning of data buffer in mbuf. > This array allows to specify the different settings for > each segment in individual fashion. > rx_nseg - number of elements in the array > > If the extended segment descriptions is provided with these new > fields the mp parameter of the rte_eth_rx_queue_setup must be > specified as NULL to avoid ambiguity. > > There are two options to specifiy Rx buffer configuration: > - mp is not NULL, rx_conf.rx_seg is NULL, rx_conf.rx_nseg is zero, > it is compatible configuraion, follows existing implementation, > provides single pool and no description for segment sizes > and offsets. > - mp is NULL, rx_conf.rx_seg is not NULL, rx_conf.rx_nseg is not > zero, it provides the extended configuration, individually for > each segment. > > The new offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT in device > capabilities is introduced to present the way for PMD to report to > application about supporting Rx packet split to configurable > segments. Prior invoking the rte_eth_rx_queue_setup() routine > application should check RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag. > > If the Rx queue is configured with new settings the packets being > received will be split into multiple segments pushed to the mbufs > with specified attributes. The PMD will split the received packets > into multiple segments according to the specification in the > description array: > > - the first network buffer will be allocated from the memory pool, > specified in the first segment description element, the second > network buffer - from the pool in the second segment description > element and so on. If there is no enough elements to describe > the buffer for entire packet of maximal length the pool from the > last valid element will be used to allocate the buffers from for the > rest of segments > > - the offsets from the segment description elements will provide > the data offset from the buffer beginning except the first mbuf - > for this one the offset is added to the RTE_PKTMBUF_HEADROOM to get > actual offset from the buffer beginning. If there is no enough > elements to describe the buffer for entire packet of maximal length > the offsets for the rest of segment will be supposed to be zero. > > - the data length being received to each segment is limited by the > length specified in the segment description element. The data > receiving starts with filling up the first mbuf data buffer, if the > specified maximal segment length is reached and there are data > remaining (packet is longer than buffer in the first mbuf) the > following data will be pushed to the next segment up to its own > maximal length. If the first two segments is not enough to store > all the packet remaining data the next (third) segment will > be engaged and so on. If the length in the segment description > element is zero the actual buffer size will be deduced from > the appropriate memory pool properties. If there is no enough > elements to describe the buffer for entire packet of maximal > length the buffer size will be deduced from the pool of the last > valid element for the remaining segments. > > For example, let's suppose we configured the Rx queue with the > following segments: > seg0 - pool0, len0=14B, off0=2 > seg1 - pool1, len1=20B, off1=128B > seg2 - pool2, len2=20B, off2=0B > seg3 - pool3, len3=512B, off3=0B
Sorry for chime in late. This API lookout looks good to me. But, I am wondering how the application can know the capability or "limits" of struct rte_eth_rxseg structure for the specific PMD. The other descriptor limit, it's being exposed with struct rte_eth_dev_info::rx_desc_lim; If PMD can support a specific pattern rather than returning the blanket error, the application should know the limit. IMO, it is better to add struct rte_eth_rxseg *rxsegs; unint16_t nb_max_rxsegs in rte_eth_dev_info structure to express the capablity. Where the en and offset can define the max offset. Thoughts?