On 9/30/21 5:55 PM, Xueming Li wrote:
> In current DPDK framework, each RX queue is pre-loaded with mbufs for

RX -> Rx

> incoming packets. When number of representors scale out in a switch
> domain, the memory consumption became significant. Most important,
> polling all ports leads to high cache miss, high latency and low
> throughput.

It should be highlighted that this is a problem of some PMDs only,
not all of them.

> 
> This patch introduces shared RX queue. Ports with same configuration in

"This patch introduces" -> "Introduce"

RX -> Rx

> a switch domain could share RX queue set by specifying sharing group.

RX -> Rx

> Polling any queue using same shared RX queue receives packets from all

RX -> Rx

> member ports. Source port is identified by mbuf->port.
> 
> Port queue number in a shared group should be identical. Queue index is
> 1:1 mapped in shared group.
> 
> Share RX queue must be polled on single thread or core.

RX -> Rx

> 
> Multiple groups is supported by group ID.

is -> are

> 
> Signed-off-by: Xueming Li <xuemi...@nvidia.com>
> Cc: Jerin Jacob <jerinjac...@gmail.com>

The patch should update release notes.

> ---
> Rx queue object could be used as shared Rx queue object, it's important
> to clear all queue control callback api that using queue object:
>   https://mails.dpdk.org/archives/dev/2021-July/215574.html
> ---
>  doc/guides/nics/features.rst                    | 11 +++++++++++
>  doc/guides/nics/features/default.ini            |  1 +
>  doc/guides/prog_guide/switch_representation.rst | 10 ++++++++++
>  lib/ethdev/rte_ethdev.c                         |  1 +
>  lib/ethdev/rte_ethdev.h                         |  7 +++++++
>  5 files changed, 30 insertions(+)
> 
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index 4fce8cd1c97..69bc1d5719c 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -626,6 +626,17 @@ Supports inner packet L4 checksum.
>    ``tx_offload_capa,tx_queue_offload_capa:DEV_TX_OFFLOAD_OUTER_UDP_CKSUM``.
>  
>  
> +.. _nic_features_shared_rx_queue:
> +
> +Shared Rx queue
> +---------------
> +
> +Supports shared Rx queue for ports in same switch domain.
> +
> +* **[uses]     rte_eth_rxconf,rte_eth_rxmode**: ``offloads:RTE_ETH_RX_OFFLOAD_SHARED_RXQ``.
> +* **[provides] mbuf**: ``mbuf.port``.
> +
> +
>  .. _nic_features_packet_type_parsing:
>  
>  Packet type parsing
> diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
> index 754184ddd4d..ebeb4c18512 100644
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -19,6 +19,7 @@ Free Tx mbuf on demand =
>  Queue start/stop     =
>  Runtime Rx queue setup =
>  Runtime Tx queue setup =
> +Shared Rx queue      =
>  Burst mode info      =
>  Power mgmt address monitor =
>  MTU update           =
> diff --git a/doc/guides/prog_guide/switch_representation.rst b/doc/guides/prog_guide/switch_representation.rst
> index ff6aa91c806..bc7ce65fa3d 100644
> --- a/doc/guides/prog_guide/switch_representation.rst
> +++ b/doc/guides/prog_guide/switch_representation.rst
> @@ -123,6 +123,16 @@ thought as a software "patch panel" front-end for applications.
>  .. [1] `Ethernet switch device driver model (switchdev)
>         <https://www.kernel.org/doc/Documentation/networking/switchdev.txt>`_
>  
> +- Memory usage of representors is huge when number of representor grows,
> +  because PMD always allocate mbuf for each descriptor of Rx queue.

This is a problem of some PMDs only, so the text must be rewritten
to say so.

> +  Polling the large number of ports brings more CPU load, cache miss and
> +  latency. Shared Rx queue can be used to share Rx queue between PF and
> +  representors in same switch. ``RTE_ETH_RX_OFFLOAD_SHARED_RXQ`` is
> +  present in Rx offloading capability of device info. Setting the
> +  offloading flag in device Rx mode or Rx queue configuration to enable
> +  shared Rx queue. Polling any member port of the shared Rx queue can return
> +  packets of all ports in the group, port ID is saved in ``mbuf.port``.
> +
>  Basic SR-IOV
>  ------------
>  
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index 61aa49efec6..73270c10492 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -127,6 +127,7 @@ static const struct {
>       RTE_RX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
>       RTE_RX_OFFLOAD_BIT2STR(RSS_HASH),
>       RTE_ETH_RX_OFFLOAD_BIT2STR(BUFFER_SPLIT),
> +     RTE_ETH_RX_OFFLOAD_BIT2STR(SHARED_RXQ),
>  };
>  
>  #undef RTE_RX_OFFLOAD_BIT2STR
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index afdc53b674c..d7ac625ee74 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1077,6 +1077,7 @@ struct rte_eth_rxconf {
>       uint8_t rx_drop_en; /**< Drop packets if no descriptors are available. */
>       uint8_t rx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
>       uint16_t rx_nseg; /**< Number of descriptions in rx_seg array. */
> +     uint32_t shared_group; /**< Shared port group index in switch domain. */
>       /**
>        * Per-queue Rx offloads to be set using DEV_RX_OFFLOAD_* flags.
>        * Only offloads set on rx_queue_offload_capa or rx_offload_capa
> @@ -1403,6 +1404,12 @@ struct rte_eth_conf {
>  #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
>  #define DEV_RX_OFFLOAD_RSS_HASH              0x00080000
>  #define RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT 0x00100000
> +/**
> + * Rx queue is shared among ports in same switch domain to save memory,
> + * avoid polling each port. Any port in the group can be used to receive
> + * packets. Real source port number saved in mbuf->port field.
> + */
> +#define RTE_ETH_RX_OFFLOAD_SHARED_RXQ   0x00200000
>  
>  #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
>                                DEV_RX_OFFLOAD_UDP_CKSUM | \
> 

IMHO it should be squashed with the second patch to make it
easier to review. Otherwise it is hard to understand what
shared_group and the offload are for, since both are dead code
in this patch.
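
To make the described semantics easier to reason about while reviewing,
here is a minimal sketch of what an application would presumably do: check
the capability bit, then split a burst polled from any member port by its
source port. Only the RTE_ETH_RX_OFFLOAD_SHARED_RXQ value is taken from the
quoted patch. struct sketch_mbuf, SKETCH_MAX_PORTS and both sketch_*
functions are hypothetical stand-ins, not ethdev API, and a real
application would use struct rte_mbuf and rte_eth_rx_burst() instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value copied from the quoted patch; everything else here is hypothetical. */
#define RTE_ETH_RX_OFFLOAD_SHARED_RXQ 0x00200000

/* Assumed upper bound on port IDs, chosen only for this sketch. */
#define SKETCH_MAX_PORTS 8

/* Stand-in for struct rte_mbuf: only the field the patch relies on. */
struct sketch_mbuf {
	uint16_t port; /* source port, as the patch says the PMD reports it */
};

/* Return 1 if the Rx offload capability mask advertises shared Rx queue. */
static int
sketch_shared_rxq_supported(uint64_t rx_offload_capa)
{
	return (rx_offload_capa & RTE_ETH_RX_OFFLOAD_SHARED_RXQ) != 0;
}

/*
 * Account one burst, polled from any member port of the shared queue,
 * to the per-port counters using the port stored in each mbuf.
 * Returns the number of packets whose port is out of range.
 */
static unsigned int
sketch_demux_burst(struct sketch_mbuf *const *pkts, size_t nb_pkts,
		   unsigned int counts[SKETCH_MAX_PORTS])
{
	unsigned int unknown = 0;
	size_t i;

	assert(pkts != NULL && counts != NULL);

	for (i = 0; i < nb_pkts; i++) {
		uint16_t port = pkts[i]->port;

		if (port < SKETCH_MAX_PORTS)
			counts[port]++;
		else
			unknown++;
	}
	return unknown;
}
```

The point of the sketch is that the polled port no longer identifies the
sender, so any per-port accounting has to be keyed on the port stored in
the mbuf.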
