Hi,

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yi...@amd.com>
> Sent: Friday, February 10, 2023 3:45 AM
> To: Jiawei(Jonny) Wang <jiaw...@nvidia.com>; Slava Ovsiienko
> <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>;
> andrew.rybche...@oktetlabs.ru; Aman Singh <aman.deep.si...@intel.com>;
> Yuying Zhang <yuying.zh...@intel.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasl...@nvidia.com>
> Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx 
> queue
> API
> 
> On 2/3/2023 1:33 PM, Jiawei Wang wrote:
> > When multiple physical ports are connected to a single DPDK port,
> > (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to
> > know which physical port is used for Rx and Tx.
> >
> 
> I assume "kernel bonding" is out of context, but this patch concerns DPDK
> bonding, failsafe or softnic. (I will refer them as virtual bonding
> device.)
> 
> To use specific queues of the virtual bonding device may interfere with the
> logic of these devices, like bonding modes or RSS of the underlying devices. I
> can see feature focuses on a very specific use case, but not sure if all 
> possible
> side effects taken into consideration.
> 
> 
> And although the feature is only relavent to virtual bondiong device, core
> ethdev structures are updated for this. Most use cases won't need these, so is
> there a way to reduce the scope of the changes to virtual bonding devices?
> 
> 
> There are a few very core ethdev APIs, like:
> rte_eth_dev_configure()
> rte_eth_tx_queue_setup()
> rte_eth_rx_queue_setup()
> rte_eth_dev_start()
> rte_eth_dev_info_get()
> 
> Almost every user of ehtdev uses these APIs, since these are so fundemental I
> am for being a little more conservative on these APIs.
> 
> Every eccentric features are targetting these APIs first because they are
> common and extending them gives an easy solution, but in long run making
> these APIs more complex, harder to maintain and harder for PMDs to support
> them correctly. So I am for not updating them unless it is a generic use case.
> 
> 
> Also as we talked about PMDs supporting them, I assume your coming PMD
> patch will be implementing 'tx_phy_affinity' config option only for mlx 
> drivers.
> What will happen for other NICs? Will they silently ignore the config option
> from user? So this is a problem for the DPDK application portabiltiy.
> 
> 
> 
> As far as I understand target is application controlling which sub-device is 
> used
> under the virtual bonding device, can you pleaes give more information why
> this is required, perhaps it can help to provide a better/different solution.
> Like adding the ability to use both bonding device and sub-device for data 
> path,
> this way application can use whichever it wants. (this is just first solution 
> I
> come with, I am not suggesting as replacement solution, but if you can 
> describe
> the problem more I am sure other people can come with better solutions.)
> 
> And isn't this against the applicatio transparent to underneath device being
> bonding device or actual device?
> 
> 

OK, I will send the new version with separate functions in ethdev layer, 
to support the Map a Tx queue to port and get the number of ports.
And these functions work with device ops callback, other NICs will reported
The unsupported the ops callback is NULL.

> > This patch maps a DPDK Tx queue with a physical port, by adding
> > tx_phy_affinity setting in Tx queue.
> > The affinity number is the physical port ID where packets will be
> > sent.
> > Value 0 means no affinity and traffic could be routed to any connected
> > physical ports, this is the default current behavior.
> >
> > The number of physical ports is reported with rte_eth_dev_info_get().
> >
> > The new tx_phy_affinity field is added into the padding hole of
> > rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
> > An ABI check rule needs to be added to avoid false warning.
> >
> > Add the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> >
> > For example, there're two physical ports connected to a single DPDK
> > port (port id 0), and phy_affinity 1 stood for the first physical port
> > and phy_affinity 2 stood for the second physical port.
> > Use the below commands to config tx phy affinity for per Tx Queue:
> >         port config 0 txq 0 phy_affinity 1
> >         port config 0 txq 1 phy_affinity 1
> >         port config 0 txq 2 phy_affinity 2
> >         port config 0 txq 3 phy_affinity 2
> >
> > These commands config the Tx Queue index 0 and Tx Queue index 1 with
> > phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these
> > packets will be sent from the first physical port, and similar with
> > the second physical port if sending packets with Tx Queue 2 or Tx
> > Queue 3.
> >
> > Signed-off-by: Jiawei Wang <jiaw...@nvidia.com>
> > ---
> >  app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
> >  app/test-pmd/config.c                       |   1 +
> >  devtools/libabigail.abignore                |   5 +
> >  doc/guides/rel_notes/release_23_03.rst      |   4 +
> >  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  13 +++
> >  lib/ethdev/rte_ethdev.h                     |  10 ++
> >  6 files changed, 133 insertions(+)
> >
> > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > cb8c174020..f771fcf8ac 100644
> > --- a/app/test-pmd/cmdline.c
> > +++ b/app/test-pmd/cmdline.c
> > @@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void
> > *parsed_result,
> >
> >                     "port cleanup (port_id) txq (queue_id) (free_cnt)\n"
> >                     "    Cleanup txq mbufs for a specific Tx queue\n\n"
> > +
> > +                   "port config (port_id) txq (queue_id) phy_affinity
> (value)\n"
> > +                   "    Set the physical affinity value "
> > +                   "on a specific Tx queue\n\n"
> >             );
> >     }
> >
> > @@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t
> cmd_show_port_flow_transfer_proxy = {
> >     }
> >  };
> >
> > +/* *** configure port txq phy_affinity value *** */ struct
> > +cmd_config_tx_phy_affinity {
> > +   cmdline_fixed_string_t port;
> > +   cmdline_fixed_string_t config;
> > +   portid_t portid;
> > +   cmdline_fixed_string_t txq;
> > +   uint16_t qid;
> > +   cmdline_fixed_string_t phy_affinity;
> > +   uint8_t value;
> > +};
> > +
> > +static void
> > +cmd_config_tx_phy_affinity_parsed(void *parsed_result,
> > +                             __rte_unused struct cmdline *cl,
> > +                             __rte_unused void *data)
> > +{
> > +   struct cmd_config_tx_phy_affinity *res = parsed_result;
> > +   struct rte_eth_dev_info dev_info;
> > +   struct rte_port *port;
> > +   int ret;
> > +
> > +   if (port_id_is_invalid(res->portid, ENABLED_WARN))
> > +           return;
> > +
> > +   if (res->portid == (portid_t)RTE_PORT_ALL) {
> > +           printf("Invalid port id\n");
> > +           return;
> > +   }
> > +
> > +   port = &ports[res->portid];
> > +
> > +   if (strcmp(res->txq, "txq")) {
> > +           printf("Unknown parameter\n");
> > +           return;
> > +   }
> > +   if (tx_queue_id_is_invalid(res->qid))
> > +           return;
> > +
> > +   ret = eth_dev_info_get_print_err(res->portid, &dev_info);
> > +   if (ret != 0)
> > +           return;
> > +
> > +   if (dev_info.nb_phy_ports == 0) {
> > +           printf("Number of physical ports is 0 which is invalid for PHY
> Affinity\n");
> > +           return;
> > +   }
> > +   printf("The number of physical ports is %u\n", dev_info.nb_phy_ports);
> > +   if (dev_info.nb_phy_ports < res->value) {
> > +           printf("The PHY affinity value %u is Invalid, exceeds the "
> > +                  "number of physical ports\n", res->value);
> > +           return;
> > +   }
> > +   port->txq[res->qid].conf.tx_phy_affinity = res->value;
> > +
> > +   cmd_reconfig_device_queue(res->portid, 0, 1); }
> > +
> > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port =
> > +   TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +                            port, "port");
> > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config =
> > +   TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +                            config, "config");
> > +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid =
> > +   TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +                            portid, RTE_UINT16);
> > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq =
> > +   TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +                            txq, "txq");
> > +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid =
> > +   TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +                         qid, RTE_UINT16);
> > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport =
> > +   TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +                            phy_affinity, "phy_affinity");
> > +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value =
> > +   TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +                         value, RTE_UINT8);
> > +
> > +static cmdline_parse_inst_t cmd_config_tx_phy_affinity = {
> > +   .f = cmd_config_tx_phy_affinity_parsed,
> > +   .data = (void *)0,
> > +   .help_str = "port config <port_id> txq <queue_id> phy_affinity <value>",
> > +   .tokens = {
> > +           (void *)&cmd_config_tx_phy_affinity_port,
> > +           (void *)&cmd_config_tx_phy_affinity_config,
> > +           (void *)&cmd_config_tx_phy_affinity_portid,
> > +           (void *)&cmd_config_tx_phy_affinity_txq,
> > +           (void *)&cmd_config_tx_phy_affinity_qid,
> > +           (void *)&cmd_config_tx_phy_affinity_hwport,
> > +           (void *)&cmd_config_tx_phy_affinity_value,
> > +           NULL,
> > +   },
> > +};
> > +
> >  /*
> >
> ****************************************************************
> ******
> > ********** */
> >
> >  /* list of instructions */
> > @@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
> >     (cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
> >     (cmdline_parse_inst_t *)&cmd_show_port_cman_config,
> >     (cmdline_parse_inst_t *)&cmd_set_port_cman_config,
> > +   (cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity,
> >     NULL,
> >  };
> >
> > diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index
> > acccb6b035..b83fb17cfa 100644
> > --- a/app/test-pmd/config.c
> > +++ b/app/test-pmd/config.c
> > @@ -936,6 +936,7 @@ port_infos_display(portid_t port_id)
> >             printf("unknown\n");
> >             break;
> >     }
> > +   printf("Current number of physical ports: %u\n",
> > +dev_info.nb_phy_ports);
> >  }
> >
> >  void
> > diff --git a/devtools/libabigail.abignore
> > b/devtools/libabigail.abignore index 7a93de3ba1..ac7d3fb2da 100644
> > --- a/devtools/libabigail.abignore
> > +++ b/devtools/libabigail.abignore
> > @@ -34,3 +34,8 @@
> >  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> >  ; Temporary exceptions till next major ABI version ;
> > ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> > +
> > +; Ignore fields inserted in padding hole of rte_eth_txconf
> > +[suppress_type]
> > +        name = rte_eth_txconf
> > +        has_data_member_inserted_between =
> > +{offset_of(tx_deferred_start), offset_of(offloads)}
> > diff --git a/doc/guides/rel_notes/release_23_03.rst
> > b/doc/guides/rel_notes/release_23_03.rst
> > index 73f5d94e14..e99bd2dcb6 100644
> > --- a/doc/guides/rel_notes/release_23_03.rst
> > +++ b/doc/guides/rel_notes/release_23_03.rst
> > @@ -55,6 +55,10 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =======================================================
> >
> > +* **Added affinity for multiple physical ports connected to a single
> > +DPDK port.**
> > +
> > +  * Added Tx affinity in queue setup to map a physical port.
> > +
> >  * **Updated AMD axgbe driver.**
> >
> >    * Added multi-process support.
> > diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > index 79a1fa9cb7..5c716f7679 100644
> > --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > @@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only
> on a specific Tx queue::
> >
> >  This command should be run when the port is stopped, or else it will fail.
> >
> > +config per queue Tx physical affinity
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Configure a per queue physical affinity value only on a specific Tx queue::
> > +
> > +   testpmd> port (port_id) txq (queue_id) phy_affinity (value)
> > +
> > +* ``phy_affinity``: physical port to use for sending,
> > +                    when multiple physical ports are connected to
> > +                    a single DPDK port.
> > +
> > +This command should be run when the port is stopped, otherwise it fails.
> > +
> >  Config VXLAN Encap outer layers
> >  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> > c129ca1eaf..2fd971b7b5 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -1138,6 +1138,14 @@ struct rte_eth_txconf {
> >                                   less free descriptors than this value. */
> >
> >     uint8_t tx_deferred_start; /**< Do not start queue with
> > rte_eth_dev_start(). */
> > +   /**
> > +    * Affinity with one of the multiple physical ports connected to the
> DPDK port.
> > +    * Value 0 means no affinity and traffic could be routed to any
> connected
> > +    * physical port.
> > +    * The first physical port is number 1 and so on.
> > +    * Number of physical ports is reported by nb_phy_ports in
> rte_eth_dev_info.
> > +    */
> > +   uint8_t tx_phy_affinity;
> >     /**
> >      * Per-queue Tx offloads to be set  using RTE_ETH_TX_OFFLOAD_*
> flags.
> >      * Only offloads set on tx_queue_offload_capa or tx_offload_capa @@
> > -1744,6 +1752,8 @@ struct rte_eth_dev_info {
> >     /** Device redirection table size, the total number of entries. */
> >     uint16_t reta_size;
> >     uint8_t hash_key_size; /**< Hash key size in bytes */
> > +   /** Number of physical ports connected with DPDK port. */
> > +   uint8_t nb_phy_ports;
> >     /** Bit mask of RSS offloads, the bit offset also means flow type */
> >     uint64_t flow_type_rss_offloads;
> >     struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration
> > */

Reply via email to