Hi, Thomas

Thank you for the comments, please see my answers below.
> -----Original Message-----
> From: Thomas Monjalon <tho...@monjalon.net>
> Sent: Monday, October 12, 2020 1:18
> To: Slava Ovsiienko <viachesl...@nvidia.com>
> Cc: dev@dpdk.org; step...@networkplumber.org; ferruh.yi...@intel.com;
> olivier.m...@6wind.com; jerinjac...@gmail.com;
> maxime.coque...@redhat.com; david.march...@redhat.com;
> arybche...@solarflare.com
> Subject: Re: [dpdk-dev] [PATCH v2 1/9] ethdev: introduce Rx buffer split
>
> 07/10/2020 17:06, Viacheslav Ovsiienko:
> > The DPDK datapath in the transmit direction is very flexible.
> > An application can build the multi-segment packet and manages almost
> > all data aspects - the memory pools where segments are allocated from,
> > the segment lengths, the memory attributes like external buffers,
> > registered for DMA, etc.
> >
> > In the receiving direction, the datapath is much less flexible, an
> > application can only specify the memory pool to configure the
> > receiving queue and nothing more. In order to extend receiving
> > datapath capabilities it is proposed to add the way to provide
> > extended information how to split the packets being received.
> >
> > The following structure is introduced to specify the Rx packet
> > segment:
> >
> > struct rte_eth_rxseg {
> >     struct rte_mempool *mp; /* memory pools to allocate segment from */
> >     uint16_t length; /* segment maximal data length */
>
> The "length" parameter is configuring a split point.
> Worth to note in the comment I think.

OK, got it.

> >     uint16_t offset; /* data offset from beginning of mbuf data buffer */
>
> Is it replacing RTE_PKTMBUF_HEADROOM?

Actually adding to HEAD_ROOM. We should keep HEAD_ROOM intact, so the
actual data offset in the first mbuf must be the sum HEAD_ROOM + offset.
The mlx5 PMD implementation follows this approach, documentation will be
updated in v3.

> >     uint32_t reserved; /* reserved field */ };
> >
> > The new routine rte_eth_rx_queue_setup_ex() is introduced to setup the
> > given Rx queue using the new extended Rx packet segment
> > description:
> >
> > int
> > rte_eth_rx_queue_setup_ex(uint16_t port_id, uint16_t rx_queue_id,
> >                           uint16_t nb_rx_desc, unsigned int socket_id,
> >                           const struct rte_eth_rxconf *rx_conf,
> >                           const struct rte_eth_rxseg *rx_seg,
> >                           uint16_t n_seg)
>
> An alternative name for this function:
> rte_eth_rxseg_queue_setup

M-m-m... The routine name follows the object_verb pattern: rx_queue is an
object, setup is an action. rxseg_queue is not an object.
What about "rte_eth_rx_queue_setup_seg"?

> > This routine presents the two new parameters:
> >     rx_seg - pointer the array of segment descriptions, each element
> >              describes the memory pool, maximal data length, initial
> >              data offset from the beginning of data buffer in mbuf
> >     n_seg - number of elements in the array
>
> Not clear why we need an array.
> I suggest writing here that each segment of the same packet can have
> different properties, the array representing the full packet.

OK, will write.

> > The new offload flag DEV_RX_OFFLOAD_BUFFER_SPLIT in device
>
> The name should start with RTE_ prefix.

It is an existing pattern for DEV_RX_OFFLOAD_xxxx, no RTE_ prefix for this
case.

> > capabilities is introduced to present the way for PMD to report to
> > application about supporting Rx packet split to configurable segments.
> > Prior invoking the rte_eth_rx_queue_setup_ex() routine application
> > should check DEV_RX_OFFLOAD_BUFFER_SPLIT flag.
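Just to illustrate this check with a rough sketch (this is not part of the
patch; the names are as proposed in this version and may change per the
comments above, the flag is presumably reported via dev_info.rx_offload_capa,
and port_id is assumed to be a valid, already probed port):

    struct rte_eth_dev_info dev_info;

    rte_eth_dev_info_get(port_id, &dev_info);
    if (!(dev_info.rx_offload_capa & DEV_RX_OFFLOAD_BUFFER_SPLIT)) {
        /* PMD does not report Rx buffer split support -
         * fall back to the regular rte_eth_rx_queue_setup(). */
    }
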
> >
> > If the Rx queue is configured with new routine the packets being
> > received will be split into multiple segments pushed to the mbufs with
> > specified attributes. The PMD will allocate the first mbuf from the
> > pool specified in the first segment descriptor and puts the data
> > staring at specified offset in the allocated mbuf data buffer. If
> > packet length exceeds the specified segment length the next mbuf will
> > be allocated according to the next segment descriptor (if any) and
> > data will be put in its data buffer at specified offset and not
> > exceeding specified length. If there is no next descriptor the next
> > mbuf will be allocated and filled in the same way (from the same pool
> > and with the same buffer offset/length) as the current one.
> >
> > For example, let's suppose we configured the Rx queue with the
> > following segments:
> >     seg0 - pool0, len0=14B, off0=RTE_PKTMBUF_HEADROOM
> >     seg1 - pool1, len1=20B, off1=0B
> >     seg2 - pool2, len2=20B, off2=0B
> >     seg3 - pool3, len3=512B, off3=0B
> >
> > The packet 46 bytes long will look like the following:
> >     seg0 - 14B long @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
> >     seg1 - 20B long @ 0 in mbuf from pool1
> >     seg2 - 12B long @ 0 in mbuf from pool2
> >
> > The packet 1500 bytes long will look like the following:
> >     seg0 - 14B @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
> >     seg1 - 20B @ 0 in mbuf from pool1
> >     seg2 - 20B @ 0 in mbuf from pool2
> >     seg3 - 512B @ 0 in mbuf from pool3
> >     seg4 - 512B @ 0 in mbuf from pool3
> >     seg5 - 422B @ 0 in mbuf from pool3
> >
> > The offload DEV_RX_OFFLOAD_SCATTER must be present and configured to
> > support new buffer split feature (if n_seg is greater than one).
> >
> > The new approach would allow splitting the ingress packets into
> > multiple parts pushed to the memory with different attributes.
> > For example, the packet headers can be pushed to the embedded data
> > buffers within mbufs and the application data into the external
> > buffers attached to mbufs allocated from the different memory pools.
> > The memory attributes for the split parts may differ either - for
> > example the application data may be pushed into the external memory
> > located on the dedicated physical device, say GPU or NVMe. This would
> > improve the DPDK receiving datapath flexibility with preserving
> > compatibility with existing API.
> >
> > Also, the proposed segment description might be used to specify Rx
> > packet split for some other features. For example, provide the way to
> > specify the extra memory pool for the Header Split feature of some
> > Intel PMD.
>
> I don't understand what you are referring in this last paragraph.
> I think explanation above is enough to demonstrate the flexibility.

Just noted that the segment description is a common thing and could be
promoted to be used in some other features.

> > Signed-off-by: Viacheslav Ovsiienko <viachesl...@nvidia.com>
>
> Thank you, I like this feature.
> More minor comments below.
>
> [...]
> > +* **Introduced extended buffer description for receiving.**
>
> Rewording:
> Introduced extended setup of Rx queue

OK, sounds better.

> > +  * Added extended Rx queue setup routine
> > +  * Added description for Rx segment sizes
>
> not only "sizes", but also offset and mempool.
>
> > +  * Added capability to specify the memory pool for each segment
>
> This one can be merged with the above, or offset should be added.
>
> [...]
> The doxygen comment is missing here.

Yes, thank you. Also noted that, updating.
>
> > +int rte_eth_rx_queue_setup_ex(uint16_t port_id, uint16_t rx_queue_id,
> > +        uint16_t nb_rx_desc, unsigned int socket_id,
> > +        const struct rte_eth_rxconf *rx_conf,
> > +        const struct rte_eth_rxseg *rx_seg, uint16_t n_seg);
>
> This new function should be experimental and it should be added to the
> .map file.

OK.

With best regards,
Slava
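P.S. For reference, below is a rough application-side sketch of how the
proposed routine is supposed to be used with the segment layout from the
example above. It is not part of the patch: the pool pointers and the
port/queue/descriptor numbers are placeholders, dev_info is assumed to be
queried as in the check sketch earlier in this mail, and pool creation,
sizing and error handling are omitted. Per the offset discussion above, the
data in each mbuf lands at RTE_PKTMBUF_HEADROOM + offset, hence offset 0 is
used for the first segment here.

    struct rte_eth_rxseg rx_seg[] = {
        /* 14B Ethernet header into mbufs from pool0 */
        { .mp = pool0, .length = 14,  .offset = 0 },
        /* 20B IPv4 header into mbufs from pool1 */
        { .mp = pool1, .length = 20,  .offset = 0 },
        /* 20B TCP header into mbufs from pool2 */
        { .mp = pool2, .length = 20,  .offset = 0 },
        /* the rest of the packet in 512B chunks from pool3; the last
         * descriptor is reused for all remaining data */
        { .mp = pool3, .length = 512, .offset = 0 },
    };
    struct rte_eth_rxconf rxconf = dev_info.default_rxconf;
    int ret;

    /* SCATTER is required since n_seg is greater than one. */
    rxconf.offloads |= DEV_RX_OFFLOAD_SCATTER;
    ret = rte_eth_rx_queue_setup_ex(port_id, 0 /* queue id */,
                                    512 /* descriptors */,
                                    rte_eth_dev_socket_id(port_id),
                                    &rxconf, rx_seg, RTE_DIM(rx_seg));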