-----Original Message-----
From: Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>
Sent: Monday, October 12, 2020 11:45
To: Slava Ovsiienko <viachesl...@nvidia.com>; dev@dpdk.org
Cc: Thomas Monjalon <thom...@mellanox.com>;
step...@networkplumber.org; ferruh.yi...@intel.com; Shahaf Shuler
<shah...@nvidia.com>; olivier.m...@6wind.com; jerinjac...@gmail.com;
maxime.coque...@redhat.com; david.march...@redhat.com; Asaf Penso
<as...@nvidia.com>
Subject: Re: [dpdk-dev] [RFC] ethdev: introduce Rx buffer split
Hi Slava,
I'm sorry for the late reply. See my notes below.
On 10/1/20 11:54 AM, Slava Ovsiienko wrote:
Hi, Andrew
Thank you for the comments; please see my replies below.
-----Original Message-----
From: Andrew Rybchenko <arybche...@solarflare.com>
Sent: Thursday, September 17, 2020 19:55
To: Slava Ovsiienko <viachesl...@nvidia.com>; dev@dpdk.org
Cc: Thomas Monjalon <thom...@mellanox.com>;
step...@networkplumber.org; ferruh.yi...@intel.com; Shahaf Shuler
<shah...@nvidia.com>; olivier.m...@6wind.com;
jerinjac...@gmail.com;
maxime.coque...@redhat.com; david.march...@redhat.com; Asaf
Penso
<as...@nvidia.com>
Subject: Re: [dpdk-dev] [RFC] ethdev: introduce Rx buffer split
[snip]
For example, let's suppose we configured the Rx queue with the
following segments:
seg0 - pool0, len0=14B, off0=RTE_PKTMBUF_HEADROOM
seg1 - pool1, len1=20B, off1=0B
seg2 - pool2, len2=20B, off2=0B
seg3 - pool3, len3=512B, off3=0B
A packet 46 bytes long will look like the following:
seg0 - 14B long @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
seg1 - 20B long @ 0 in mbuf from pool1
seg2 - 12B long @ 0 in mbuf from pool2
A packet 1500 bytes long will look like the following:
seg0 - 14B @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
seg1 - 20B @ 0 in mbuf from pool1
seg2 - 20B @ 0 in mbuf from pool2
seg3 - 512B @ 0 in mbuf from pool3
seg4 - 512B @ 0 in mbuf from pool3
seg5 - 422B @ 0 in mbuf from pool3
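For illustration, such a queue configuration might be expressed in code
like this (a sketch only; the rte_eth_rxseg field names mp/length/offset
follow this RFC's draft and may change):

    /* Hypothetical sketch of the split configuration described above;
     * pool0..pool3 are mempools created by the application. */
    struct rte_eth_rxseg seg[] = {
            { .mp = pool0, .length = 14,  .offset = RTE_PKTMBUF_HEADROOM },
            { .mp = pool1, .length = 20,  .offset = 0 },
            { .mp = pool2, .length = 20,  .offset = 0 },
            { .mp = pool3, .length = 512, .offset = 0 },
    };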
The behaviour is logical, but what to do if the HW can't do it, i.e.
use the last segment many times? Should it reject the configuration if
the provided segments are insufficient to fit an MTU-sized packet? How
to report the limitation?
(I'm still trying to argue that SCATTER and BUFFER_SPLIT should be
independent).
BUFFER_SPLIT is rather a way to tune SCATTER. Currently, scattering
happens unconditionally on mbuf data buffer boundaries (headroom is
reserved in the first mbuf, which is then filled to the end of its
buffer; the following mbuf buffers may be filled completely).
BUFFER_SPLIT provides a way to specify the desired points at which to
split the packet, instead of blindly following buffer boundaries. There
is a check implemented in the common part that each split segment fits
the mbuf allocated from the appropriate pool.
The PMD should do an extra check internally whether it supports the
requested split settings; if not, the call will be rejected.
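The common-part check might look roughly like this (a sketch only, not
the actual patch code; it assumes rte_pktmbuf_data_room_size() is used
to query the data room of each segment's pool):

    /* Each requested segment must fit into the data room of an mbuf
     * allocated from its pool, taking the requested offset into account. */
    for (i = 0; i < n_seg; i++) {
            uint16_t data_room = rte_pktmbuf_data_room_size(seg[i].mp);

            if ((uint32_t)seg[i].offset + seg[i].length > data_room)
                    return -EINVAL; /* segment does not fit the mbuf */
    }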
@Thomas, @Ferruh: I'd like to hear what other ethdev maintainers think
about it.
[snip]
I dislike the idea of introducing a new device operation.
rte_eth_rxconf has reserved space, and the BUFFER_SPLIT offload would
mean that the PMD looks at the split configuration located there.
We considered the approach of pushing the split settings to the rxconf
structure:
[http://patches.dpdk.org/patch/75205/]
But it seems there are some issues:
- the split configuration description requires a variable-length array
(due to the variable number of segments), so the rte_eth_rxconf
structure would have variable length (not nice, IMO).
We could push a pointer to the array of rte_eth_rxseg, but we would
lose the single-structure (and contiguous memory) simplicity; this
approach has no advantage over specifying the split configuration as
parameters of setup_ex().
I think avoiding an extra device operation is a huge advantage.
- it would introduce an ambiguity: rte_eth_rx_queue_setup() specifies a
single mbuf pool as a parameter. What should we do with it? Set it to
NULL? Treat it as the first pool? I would prefer to specify all split
segments in a uniform fashion, i.e. as an array of rte_eth_rxseg
structures (which can easily be extended with extra segment attributes
if needed). So, in my opinion, we should remove/replace the pool
parameter in rx_queue_setup (by introducing a new function).
I'm trying to resolve the ambiguity as described above (see
BUFFER_SPLIT vs SCATTER): use the pool pointer for the tail segments,
with respect to the SCATTER capability.
- specifying a new extended setup routine has the advantage that we do
not need to update any PMD code in the existing implementations of
rte_eth_rx_queue_setup().
It is not required, since it is controlled by the new offload flags. If
the offload is not supported, the new field is invisible to the PMD (it
simply ignores it).
If a PMD supports BUFFER_SPLIT (or another related feature), it should
just provide rte_eth_rx_queue_setup_ex() and check the
DEV_RX_OFFLOAD_BUFFER_SPLIT (or HEADER_SPLIT, or whatever feature)
flags it supports. The common code does not check the feature flags -
that is the PMD's own responsibility. In order to configure a PMD to
perform the arbitrary desired Rx splitting, the application should
check for DEV_RX_OFFLOAD_BUFFER_SPLIT in the port capabilities; if
found, set DEV_RX_OFFLOAD_BUFFER_SPLIT in the configuration and call
rte_eth_rx_queue_setup_ex(), as sketched below.
And this approach can be followed for any other split related feature.
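For example (a sketch only: rte_eth_rx_queue_setup_ex() and struct
rte_eth_rxseg are the API proposed by this RFC, so the exact signature
may differ; seg[] is the split configuration array shown earlier):

    struct rte_eth_dev_info dev_info;
    struct rte_eth_conf port_conf = {0};
    struct rte_eth_rxconf rxconf = {0};

    rte_eth_dev_info_get(port_id, &dev_info);
    if (dev_info.rx_offload_capa & DEV_RX_OFFLOAD_BUFFER_SPLIT) {
            /* Request the offload, then pass the split description
             * to the extended queue setup routine. */
            port_conf.rxmode.offloads |= DEV_RX_OFFLOAD_BUFFER_SPLIT;
            rte_eth_dev_configure(port_id, 1, 1, &port_conf);
            rte_eth_rx_queue_setup_ex(port_id, 0, nb_rx_desc,
                                      rte_socket_id(), &rxconf,
                                      seg, RTE_DIM(seg));
    }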
With best regards, Slava