Sorry, I forgot to reply to the original thread and have resent the series there. Please ignore this series.
On Mon, 2021-10-18 at 20:08 +0800, Xueming Li wrote:
> In the current DPDK framework, all Rx queues are pre-loaded with mbufs
> for incoming packets. When the number of representors in a switch
> domain scales out, the memory consumption becomes significant.
> Furthermore, polling all ports leads to high cache miss rates, high
> latency and low throughput.
> 
> This patch introduces the shared Rx queue. A PF and representors in
> the same Rx domain and switch domain can share an Rx queue set by
> specifying a non-zero share group value in the Rx queue configuration.
> 
> All ports that share an Rx queue actually share the hardware
> descriptor queue and feed all Rx queues from one descriptor supply,
> so memory is saved.
> 
> Polling any queue that uses the same shared Rx queue receives packets
> from all member ports. The source port is identified by mbuf->port.
> 
> Multiple groups are supported via the group ID. The number of queues
> per port in a shared group should be identical, and queue indexes are
> 1:1 mapped within the group.
> An example with two share groups:
>   Group1, 4 shared Rx queues per member port: PF, repr0, repr1
>   Group2, 2 shared Rx queues per member port: repr2, repr3, ... repr127
>   Poll the first port of each group:
>     core  port  queue
>     0     0     0
>     1     0     1
>     2     0     2
>     3     0     3
>     4     2     0
>     5     2     1
> 
> A shared Rx queue must be polled by a single thread or core. If both
> PF0 and representor0 join the same share group, pf0rxq0 cannot be
> polled on core1 while rep0rxq0 is polled on core2. In fact, polling
> one port within a share group is sufficient, since polling any port in
> the group returns packets for all ports in the group.
> 
> There was some discussion about aggregating the member ports of a
> group into a dummy port, and there are several ways to achieve it.
> Since it is optional, more feedback and requirements should be
> collected from users before making a decision later.
> 
> v1:
> - initial version
> v2:
> - add testpmd patches
> v3:
> - change common forwarding api to macro for performance, thanks Jerin.
> - save global variable accessed in forwarding to flowstream to
>   minimize cache miss
> - combined patches for each forwarding engine
> - support multiple groups in testpmd "--share-rxq" parameter
> - new api to aggregate shared rxq group
> v4:
> - spelling fixes
> - remove shared-rxq support for all forwarding engines
> - add dedicated shared-rxq forwarding engine
> v5:
> - fix grammar
> - remove aggregate api and leave it for later discussion
> - add release notes
> - add deployment example
> v6:
> - replace RxQ offload flag with device offload capability flag
> - add Rx domain
> - RxQ is shared when share group > 0
> - update testpmd accordingly
> v7:
> - fix testpmd share group id allocation
> - change rx_domain to 16bits
> v8:
> - add new patch for testpmd to show device Rx domain ID and capability
> - new share_qid in RxQ configuration
> 
> Xueming Li (6):
>   ethdev: introduce shared Rx queue
>   app/testpmd: dump device capability and Rx domain info
>   app/testpmd: new parameter to enable shared Rx queue
>   app/testpmd: dump port info for shared Rx queue
>   app/testpmd: force shared Rx queue polled on same core
>   app/testpmd: add forwarding engine for shared Rx queue
> 
>  app/test-pmd/config.c                       | 114 +++++++++++++-
>  app/test-pmd/meson.build                    |   1 +
>  app/test-pmd/parameters.c                   |  13 ++
>  app/test-pmd/shared_rxq_fwd.c               | 148 ++++++++++++++++++
>  app/test-pmd/testpmd.c                      |  25 ++-
>  app/test-pmd/testpmd.h                      |   5 +
>  app/test-pmd/util.c                         |   3 +
>  doc/guides/nics/features.rst                |  13 ++
>  doc/guides/nics/features/default.ini        |   1 +
>  .../prog_guide/switch_representation.rst    |  11 ++
>  doc/guides/rel_notes/release_21_11.rst      |   6 +
>  doc/guides/testpmd_app_ug/run_app.rst       |   8 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |   5 +-
>  lib/ethdev/rte_ethdev.c                     |   8 +
>  lib/ethdev/rte_ethdev.h                     |  24 +++
>  15 files changed, 379 insertions(+), 6 deletions(-)
>  create mode 100644 app/test-pmd/shared_rxq_fwd.c
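
For readers following this thread, below is a minimal, untested sketch of how an
application would use the feature described in the quoted cover letter: check
the device capability, request a non-zero share group (plus the shared queue
index) in the Rx queue configuration of every member port, then poll a single
member port and use mbuf->port to tell the packets apart. The capability bit
and rte_eth_rxconf field names (RTE_ETH_DEV_CAPA_RXQ_SHARE, share_group,
share_qid) follow v6/v8 of the series and may still change; the ring size,
group number and per-port counters are illustrative only.

/*
 * Illustrative sketch only, not part of the series: set up queue
 * "queue_id" of one member port as a shared Rx queue in share group 1,
 * then poll the group through a single member port.
 */
#include <errno.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define NB_RXD   512	/* example ring size */
#define BURST_SZ 32

static int
setup_shared_rxq(uint16_t port_id, uint16_t queue_id, struct rte_mempool *mp)
{
	struct rte_eth_dev_info info;
	struct rte_eth_rxconf rxconf;
	int ret;

	ret = rte_eth_dev_info_get(port_id, &info);
	if (ret != 0)
		return ret;
	/* Shared Rx queue is a device capability, not a per-queue offload. */
	if (!(info.dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE))
		return -ENOTSUP;

	rxconf = info.default_rxconf;
	rxconf.share_group = 1;		/* non-zero: queue joins share group 1 */
	rxconf.share_qid = queue_id;	/* 1:1 queue mapping inside the group */
	return rte_eth_rx_queue_setup(port_id, queue_id, NB_RXD,
				      rte_eth_dev_socket_id(port_id),
				      &rxconf, mp);
}

/*
 * Poll a single member port of the group; packets of every member port
 * arrive here and the real source is carried in mbuf->port.
 */
static void
poll_shared_rxq(uint16_t any_member_port, uint16_t queue_id)
{
	static uint64_t rx_pkts[RTE_MAX_ETHPORTS]; /* per-source-port counters */
	struct rte_mbuf *pkts[BURST_SZ];
	uint16_t nb, i;

	nb = rte_eth_rx_burst(any_member_port, queue_id, pkts, BURST_SZ);
	for (i = 0; i < nb; i++) {
		rx_pkts[pkts[i]->port]++;	/* demultiplex by source port */
		rte_pktmbuf_free(pkts[i]);
	}
}

In this sketch setup_shared_rxq() would be called with the same mempool and
queue index for the PF and every representor in the group, while
poll_shared_rxq() is called for only one of them, matching the single-core
polling constraint described above.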