In the current DPDK framework, every Rx queue is pre-loaded with mbufs to
receive incoming packets. When the number of representors in a switch
domain scales out, the memory consumption becomes significant. Furthermore,
polling all ports leads to high cache miss rates, high latency and low
throughput.

This patch set introduces shared Rx queues. The PF and representors in the
same Rx domain and switch domain can share an Rx queue set by specifying a
non-zero share group value in the Rx queue configuration.

All ports that share an Rx queue actually share one hardware descriptor
queue and feed all Rx queues from one descriptor supply, so memory is
saved.

Polling any queue that uses the same shared Rx queue receives packets from
all member ports. The source port is identified by mbuf->port.

Multiple groups are supported through the group ID. The number of queues
per port in a shared group should be identical. Queue indexes are 1:1
mapped within a shared group.

An example of two share groups:
 Group1, 4 shared Rx queues per member port: PF, repr0, repr1
 Group2, 2 shared Rx queues per member port: repr2, repr3, ... repr127
 Poll first port for each group:
  core	port	queue
  0	0	0
  1	0	1
  2	0	2
  3	0	3
  4	2	0
  5	2	1

A shared Rx queue must be polled on a single thread or core. If both PF0
and representor0 have joined the same share group, pf0rxq0 cannot be polled
on core1 while rep0rxq0 is polled on core2. In fact, polling one port
within a share group is sufficient, since polling any port in the group
returns packets for every port in the group.

There was some discussion about aggregating the member ports of a group
into a dummy port, and there are several ways to achieve it. Since this is
optional, more feedback and requirements need to be collected from users
before making a decision.

v1:
  - initial version
v2:
  - add testpmd patches
v3:
  - change common forwarding api to macro for performance, thanks Jerin.
  - save global variable accessed in forwarding to flowstream to minimize
    cache miss
  - combined patches for each forwarding engine
  - support multiple groups in testpmd "--share-rxq" parameter
  - new api to aggregate shared rxq group
v4:
  - spelling fixes
  - remove shared-rxq support for all forwarding engines
  - add dedicated shared-rxq forwarding engine
v5:
  - fix grammar
  - remove aggregate api and leave it for later discussion
  - add release notes
  - add deployment example
v6:
  - replace RxQ offload flag with device offload capability flag
  - add Rx domain
  - RxQ is shared when share group > 0
  - update testpmd accordingly
v7:
  - fix testpmd share group id allocation
  - change rx_domain to 16 bits
v8:
  - add new patch for testpmd to show device Rx domain ID and capability
  - new share_qid in RxQ configuration
v9:
  - fix some spelling
v10:
  - add device capability name api
v11:
  - remove macro from device capability name list
v12:
  - rephrase
  - in forwarding core check, add global flag and RxQ enabled check

Xueming Li (7):
  ethdev: introduce shared Rx queue
  ethdev: get device capability name as string
  app/testpmd: dump device capability and Rx domain info
  app/testpmd: new parameter to enable shared Rx queue
  app/testpmd: dump port info for shared Rx queue
  app/testpmd: force shared Rx queue polled on same core
  app/testpmd: add forwarding engine for shared Rx queue

 app/test-pmd/config.c                       | 141 +++++++++++++++++-
 app/test-pmd/meson.build                    |   1 +
 app/test-pmd/parameters.c                   |  13 ++
 app/test-pmd/shared_rxq_fwd.c               | 113 ++++++++++++++
 app/test-pmd/testpmd.c                      |  26 +++-
 app/test-pmd/testpmd.h                      |   9 ++
 app/test-pmd/util.c                         |   3 +
 doc/guides/nics/features.rst                |  13 ++
 doc/guides/nics/features/default.ini        |   1 +
 .../prog_guide/switch_representation.rst    |  11 ++
 doc/guides/rel_notes/release_21_11.rst      |   6 +
 doc/guides/testpmd_app_ug/run_app.rst       |   9 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |   5 +-
 lib/ethdev/rte_ethdev.c                     |  33 ++++
 lib/ethdev/rte_ethdev.h                     |  38 +++++
 lib/ethdev/version.map                      |   1 +
 16 files changed, 417 insertions(+), 6 deletions(-)
 create mode 100644 app/test-pmd/shared_rxq_fwd.c

-- 
2.33.0