Hi, > -----Original Message----- > From: Ferruh Yigit <ferruh.yi...@amd.com> > Sent: Friday, February 10, 2023 3:45 AM > To: Jiawei(Jonny) Wang <jiaw...@nvidia.com>; Slava Ovsiienko > <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact- > Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; > andrew.rybche...@oktetlabs.ru; Aman Singh <aman.deep.si...@intel.com>; > Yuying Zhang <yuying.zh...@intel.com> > Cc: dev@dpdk.org; Raslan Darawsheh <rasl...@nvidia.com> > Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx > queue > API > > On 2/3/2023 1:33 PM, Jiawei Wang wrote: > > When multiple physical ports are connected to a single DPDK port, > > (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to > > know which physical port is used for Rx and Tx. > > > > I assume "kernel bonding" is out of context, but this patch concerns DPDK > bonding, failsafe or softnic. (I will refer them as virtual bonding > device.) > > To use specific queues of the virtual bonding device may interfere with the > logic of these devices, like bonding modes or RSS of the underlying devices. I > can see feature focuses on a very specific use case, but not sure if all > possible > side effects taken into consideration. > > > And although the feature is only relavent to virtual bondiong device, core > ethdev structures are updated for this. Most use cases won't need these, so is > there a way to reduce the scope of the changes to virtual bonding devices? > > > There are a few very core ethdev APIs, like: > rte_eth_dev_configure() > rte_eth_tx_queue_setup() > rte_eth_rx_queue_setup() > rte_eth_dev_start() > rte_eth_dev_info_get() > > Almost every user of ehtdev uses these APIs, since these are so fundemental I > am for being a little more conservative on these APIs. > > Every eccentric features are targetting these APIs first because they are > common and extending them gives an easy solution, but in long run making > these APIs more complex, harder to maintain and harder for PMDs to support > them correctly. So I am for not updating them unless it is a generic use case. > > > Also as we talked about PMDs supporting them, I assume your coming PMD > patch will be implementing 'tx_phy_affinity' config option only for mlx > drivers. > What will happen for other NICs? Will they silently ignore the config option > from user? So this is a problem for the DPDK application portabiltiy. > > > > As far as I understand target is application controlling which sub-device is > used > under the virtual bonding device, can you pleaes give more information why > this is required, perhaps it can help to provide a better/different solution. > Like adding the ability to use both bonding device and sub-device for data > path, > this way application can use whichever it wants. (this is just first solution > I > come with, I am not suggesting as replacement solution, but if you can > describe > the problem more I am sure other people can come with better solutions.) > > And isn't this against the applicatio transparent to underneath device being > bonding device or actual device? > >
OK, I will send the new version with separate functions in ethdev layer, to support the Map a Tx queue to port and get the number of ports. And these functions work with device ops callback, other NICs will reported The unsupported the ops callback is NULL. > > This patch maps a DPDK Tx queue with a physical port, by adding > > tx_phy_affinity setting in Tx queue. > > The affinity number is the physical port ID where packets will be > > sent. > > Value 0 means no affinity and traffic could be routed to any connected > > physical ports, this is the default current behavior. > > > > The number of physical ports is reported with rte_eth_dev_info_get(). > > > > The new tx_phy_affinity field is added into the padding hole of > > rte_eth_txconf structure, the size of rte_eth_txconf keeps the same. > > An ABI check rule needs to be added to avoid false warning. > > > > Add the testpmd command line: > > testpmd> port config (port_id) txq (queue_id) phy_affinity (value) > > > > For example, there're two physical ports connected to a single DPDK > > port (port id 0), and phy_affinity 1 stood for the first physical port > > and phy_affinity 2 stood for the second physical port. > > Use the below commands to config tx phy affinity for per Tx Queue: > > port config 0 txq 0 phy_affinity 1 > > port config 0 txq 1 phy_affinity 1 > > port config 0 txq 2 phy_affinity 2 > > port config 0 txq 3 phy_affinity 2 > > > > These commands config the Tx Queue index 0 and Tx Queue index 1 with > > phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these > > packets will be sent from the first physical port, and similar with > > the second physical port if sending packets with Tx Queue 2 or Tx > > Queue 3. > > > > Signed-off-by: Jiawei Wang <jiaw...@nvidia.com> > > --- > > app/test-pmd/cmdline.c | 100 ++++++++++++++++++++ > > app/test-pmd/config.c | 1 + > > devtools/libabigail.abignore | 5 + > > doc/guides/rel_notes/release_23_03.rst | 4 + > > doc/guides/testpmd_app_ug/testpmd_funcs.rst | 13 +++ > > lib/ethdev/rte_ethdev.h | 10 ++ > > 6 files changed, 133 insertions(+) > > > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index > > cb8c174020..f771fcf8ac 100644 > > --- a/app/test-pmd/cmdline.c > > +++ b/app/test-pmd/cmdline.c > > @@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void > > *parsed_result, > > > > "port cleanup (port_id) txq (queue_id) (free_cnt)\n" > > " Cleanup txq mbufs for a specific Tx queue\n\n" > > + > > + "port config (port_id) txq (queue_id) phy_affinity > (value)\n" > > + " Set the physical affinity value " > > + "on a specific Tx queue\n\n" > > ); > > } > > > > @@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t > cmd_show_port_flow_transfer_proxy = { > > } > > }; > > > > +/* *** configure port txq phy_affinity value *** */ struct > > +cmd_config_tx_phy_affinity { > > + cmdline_fixed_string_t port; > > + cmdline_fixed_string_t config; > > + portid_t portid; > > + cmdline_fixed_string_t txq; > > + uint16_t qid; > > + cmdline_fixed_string_t phy_affinity; > > + uint8_t value; > > +}; > > + > > +static void > > +cmd_config_tx_phy_affinity_parsed(void *parsed_result, > > + __rte_unused struct cmdline *cl, > > + __rte_unused void *data) > > +{ > > + struct cmd_config_tx_phy_affinity *res = parsed_result; > > + struct rte_eth_dev_info dev_info; > > + struct rte_port *port; > > + int ret; > > + > > + if (port_id_is_invalid(res->portid, ENABLED_WARN)) > > + return; > > + > > + if (res->portid == (portid_t)RTE_PORT_ALL) { > > + printf("Invalid port id\n"); > > + return; > > + } > > + > > + port = &ports[res->portid]; > > + > > + if (strcmp(res->txq, "txq")) { > > + printf("Unknown parameter\n"); > > + return; > > + } > > + if (tx_queue_id_is_invalid(res->qid)) > > + return; > > + > > + ret = eth_dev_info_get_print_err(res->portid, &dev_info); > > + if (ret != 0) > > + return; > > + > > + if (dev_info.nb_phy_ports == 0) { > > + printf("Number of physical ports is 0 which is invalid for PHY > Affinity\n"); > > + return; > > + } > > + printf("The number of physical ports is %u\n", dev_info.nb_phy_ports); > > + if (dev_info.nb_phy_ports < res->value) { > > + printf("The PHY affinity value %u is Invalid, exceeds the " > > + "number of physical ports\n", res->value); > > + return; > > + } > > + port->txq[res->qid].conf.tx_phy_affinity = res->value; > > + > > + cmd_reconfig_device_queue(res->portid, 0, 1); } > > + > > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port = > > + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity, > > + port, "port"); > > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config = > > + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity, > > + config, "config"); > > +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid = > > + TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity, > > + portid, RTE_UINT16); > > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq = > > + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity, > > + txq, "txq"); > > +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid = > > + TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity, > > + qid, RTE_UINT16); > > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport = > > + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity, > > + phy_affinity, "phy_affinity"); > > +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value = > > + TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity, > > + value, RTE_UINT8); > > + > > +static cmdline_parse_inst_t cmd_config_tx_phy_affinity = { > > + .f = cmd_config_tx_phy_affinity_parsed, > > + .data = (void *)0, > > + .help_str = "port config <port_id> txq <queue_id> phy_affinity <value>", > > + .tokens = { > > + (void *)&cmd_config_tx_phy_affinity_port, > > + (void *)&cmd_config_tx_phy_affinity_config, > > + (void *)&cmd_config_tx_phy_affinity_portid, > > + (void *)&cmd_config_tx_phy_affinity_txq, > > + (void *)&cmd_config_tx_phy_affinity_qid, > > + (void *)&cmd_config_tx_phy_affinity_hwport, > > + (void *)&cmd_config_tx_phy_affinity_value, > > + NULL, > > + }, > > +}; > > + > > /* > > > **************************************************************** > ****** > > ********** */ > > > > /* list of instructions */ > > @@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = { > > (cmdline_parse_inst_t *)&cmd_show_port_cman_capa, > > (cmdline_parse_inst_t *)&cmd_show_port_cman_config, > > (cmdline_parse_inst_t *)&cmd_set_port_cman_config, > > + (cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity, > > NULL, > > }; > > > > diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index > > acccb6b035..b83fb17cfa 100644 > > --- a/app/test-pmd/config.c > > +++ b/app/test-pmd/config.c > > @@ -936,6 +936,7 @@ port_infos_display(portid_t port_id) > > printf("unknown\n"); > > break; > > } > > + printf("Current number of physical ports: %u\n", > > +dev_info.nb_phy_ports); > > } > > > > void > > diff --git a/devtools/libabigail.abignore > > b/devtools/libabigail.abignore index 7a93de3ba1..ac7d3fb2da 100644 > > --- a/devtools/libabigail.abignore > > +++ b/devtools/libabigail.abignore > > @@ -34,3 +34,8 @@ > > ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; > > ; Temporary exceptions till next major ABI version ; > > ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; > > + > > +; Ignore fields inserted in padding hole of rte_eth_txconf > > +[suppress_type] > > + name = rte_eth_txconf > > + has_data_member_inserted_between = > > +{offset_of(tx_deferred_start), offset_of(offloads)} > > diff --git a/doc/guides/rel_notes/release_23_03.rst > > b/doc/guides/rel_notes/release_23_03.rst > > index 73f5d94e14..e99bd2dcb6 100644 > > --- a/doc/guides/rel_notes/release_23_03.rst > > +++ b/doc/guides/rel_notes/release_23_03.rst > > @@ -55,6 +55,10 @@ New Features > > Also, make sure to start the actual text at the margin. > > ======================================================= > > > > +* **Added affinity for multiple physical ports connected to a single > > +DPDK port.** > > + > > + * Added Tx affinity in queue setup to map a physical port. > > + > > * **Updated AMD axgbe driver.** > > > > * Added multi-process support. > > diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst > > b/doc/guides/testpmd_app_ug/testpmd_funcs.rst > > index 79a1fa9cb7..5c716f7679 100644 > > --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst > > +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst > > @@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only > on a specific Tx queue:: > > > > This command should be run when the port is stopped, or else it will fail. > > > > +config per queue Tx physical affinity > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > + > > +Configure a per queue physical affinity value only on a specific Tx queue:: > > + > > + testpmd> port (port_id) txq (queue_id) phy_affinity (value) > > + > > +* ``phy_affinity``: physical port to use for sending, > > + when multiple physical ports are connected to > > + a single DPDK port. > > + > > +This command should be run when the port is stopped, otherwise it fails. > > + > > Config VXLAN Encap outer layers > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index > > c129ca1eaf..2fd971b7b5 100644 > > --- a/lib/ethdev/rte_ethdev.h > > +++ b/lib/ethdev/rte_ethdev.h > > @@ -1138,6 +1138,14 @@ struct rte_eth_txconf { > > less free descriptors than this value. */ > > > > uint8_t tx_deferred_start; /**< Do not start queue with > > rte_eth_dev_start(). */ > > + /** > > + * Affinity with one of the multiple physical ports connected to the > DPDK port. > > + * Value 0 means no affinity and traffic could be routed to any > connected > > + * physical port. > > + * The first physical port is number 1 and so on. > > + * Number of physical ports is reported by nb_phy_ports in > rte_eth_dev_info. > > + */ > > + uint8_t tx_phy_affinity; > > /** > > * Per-queue Tx offloads to be set using RTE_ETH_TX_OFFLOAD_* > flags. > > * Only offloads set on tx_queue_offload_capa or tx_offload_capa @@ > > -1744,6 +1752,8 @@ struct rte_eth_dev_info { > > /** Device redirection table size, the total number of entries. */ > > uint16_t reta_size; > > uint8_t hash_key_size; /**< Hash key size in bytes */ > > + /** Number of physical ports connected with DPDK port. */ > > + uint8_t nb_phy_ports; > > /** Bit mask of RSS offloads, the bit offset also means flow type */ > > uint64_t flow_type_rss_offloads; > > struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration > > */