On 2/14/2023 9:38 AM, Jiawei(Jonny) Wang wrote: > Hi, > >> -----Original Message----- >> From: Ferruh Yigit <ferruh.yi...@amd.com> >> Sent: Friday, February 10, 2023 3:45 AM >> To: Jiawei(Jonny) Wang <jiaw...@nvidia.com>; Slava Ovsiienko >> <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact- >> Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; >> andrew.rybche...@oktetlabs.ru; Aman Singh <aman.deep.si...@intel.com>; >> Yuying Zhang <yuying.zh...@intel.com> >> Cc: dev@dpdk.org; Raslan Darawsheh <rasl...@nvidia.com> >> Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx >> queue >> API >> >> On 2/3/2023 1:33 PM, Jiawei Wang wrote: >>> When multiple physical ports are connected to a single DPDK port, >>> (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to >>> know which physical port is used for Rx and Tx. >>> >> >> I assume "kernel bonding" is out of context, but this patch concerns DPDK >> bonding, failsafe or softnic. (I will refer them as virtual bonding >> device.) >> >> To use specific queues of the virtual bonding device may interfere with the >> logic of these devices, like bonding modes or RSS of the underlying devices. >> I >> can see feature focuses on a very specific use case, but not sure if all >> possible >> side effects taken into consideration. >> >> >> And although the feature is only relavent to virtual bondiong device, core >> ethdev structures are updated for this. Most use cases won't need these, so >> is >> there a way to reduce the scope of the changes to virtual bonding devices? >> >> >> There are a few very core ethdev APIs, like: >> rte_eth_dev_configure() >> rte_eth_tx_queue_setup() >> rte_eth_rx_queue_setup() >> rte_eth_dev_start() >> rte_eth_dev_info_get() >> >> Almost every user of ehtdev uses these APIs, since these are so fundemental I >> am for being a little more conservative on these APIs. >> >> Every eccentric features are targetting these APIs first because they are >> common and extending them gives an easy solution, but in long run making >> these APIs more complex, harder to maintain and harder for PMDs to support >> them correctly. So I am for not updating them unless it is a generic use >> case. >> >> >> Also as we talked about PMDs supporting them, I assume your coming PMD >> patch will be implementing 'tx_phy_affinity' config option only for mlx >> drivers. >> What will happen for other NICs? Will they silently ignore the config option >> from user? So this is a problem for the DPDK application portabiltiy. >> >> >> >> As far as I understand target is application controlling which sub-device is >> used >> under the virtual bonding device, can you pleaes give more information why >> this is required, perhaps it can help to provide a better/different solution. >> Like adding the ability to use both bonding device and sub-device for data >> path, >> this way application can use whichever it wants. (this is just first >> solution I >> come with, I am not suggesting as replacement solution, but if you can >> describe >> the problem more I am sure other people can come with better solutions.) >> >> And isn't this against the applicatio transparent to underneath device being >> bonding device or actual device? >> >> > > OK, I will send the new version with separate functions in ethdev layer, > to support the Map a Tx queue to port and get the number of ports. > And these functions work with device ops callback, other NICs will reported > The unsupported the ops callback is NULL. >
OK, thanks Jonny, at least this separates the fetaure to its own APIs which reduces the impact for applications and drivers that are not using this feature. >>> This patch maps a DPDK Tx queue with a physical port, by adding >>> tx_phy_affinity setting in Tx queue. >>> The affinity number is the physical port ID where packets will be >>> sent. >>> Value 0 means no affinity and traffic could be routed to any connected >>> physical ports, this is the default current behavior. >>> >>> The number of physical ports is reported with rte_eth_dev_info_get(). >>> >>> The new tx_phy_affinity field is added into the padding hole of >>> rte_eth_txconf structure, the size of rte_eth_txconf keeps the same. >>> An ABI check rule needs to be added to avoid false warning. >>> >>> Add the testpmd command line: >>> testpmd> port config (port_id) txq (queue_id) phy_affinity (value) >>> >>> For example, there're two physical ports connected to a single DPDK >>> port (port id 0), and phy_affinity 1 stood for the first physical port >>> and phy_affinity 2 stood for the second physical port. >>> Use the below commands to config tx phy affinity for per Tx Queue: >>> port config 0 txq 0 phy_affinity 1 >>> port config 0 txq 1 phy_affinity 1 >>> port config 0 txq 2 phy_affinity 2 >>> port config 0 txq 3 phy_affinity 2 >>> >>> These commands config the Tx Queue index 0 and Tx Queue index 1 with >>> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these >>> packets will be sent from the first physical port, and similar with >>> the second physical port if sending packets with Tx Queue 2 or Tx >>> Queue 3. >>> >>> Signed-off-by: Jiawei Wang <jiaw...@nvidia.com> >>> --- >>> app/test-pmd/cmdline.c | 100 ++++++++++++++++++++ >>> app/test-pmd/config.c | 1 + >>> devtools/libabigail.abignore | 5 + >>> doc/guides/rel_notes/release_23_03.rst | 4 + >>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 13 +++ >>> lib/ethdev/rte_ethdev.h | 10 ++ >>> 6 files changed, 133 insertions(+) >>> >>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index >>> cb8c174020..f771fcf8ac 100644 >>> --- a/app/test-pmd/cmdline.c >>> +++ b/app/test-pmd/cmdline.c >>> @@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void >>> *parsed_result, >>> >>> "port cleanup (port_id) txq (queue_id) (free_cnt)\n" >>> " Cleanup txq mbufs for a specific Tx queue\n\n" >>> + >>> + "port config (port_id) txq (queue_id) phy_affinity >> (value)\n" >>> + " Set the physical affinity value " >>> + "on a specific Tx queue\n\n" >>> ); >>> } >>> >>> @@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t >> cmd_show_port_flow_transfer_proxy = { >>> } >>> }; >>> >>> +/* *** configure port txq phy_affinity value *** */ struct >>> +cmd_config_tx_phy_affinity { >>> + cmdline_fixed_string_t port; >>> + cmdline_fixed_string_t config; >>> + portid_t portid; >>> + cmdline_fixed_string_t txq; >>> + uint16_t qid; >>> + cmdline_fixed_string_t phy_affinity; >>> + uint8_t value; >>> +}; >>> + >>> +static void >>> +cmd_config_tx_phy_affinity_parsed(void *parsed_result, >>> + __rte_unused struct cmdline *cl, >>> + __rte_unused void *data) >>> +{ >>> + struct cmd_config_tx_phy_affinity *res = parsed_result; >>> + struct rte_eth_dev_info dev_info; >>> + struct rte_port *port; >>> + int ret; >>> + >>> + if (port_id_is_invalid(res->portid, ENABLED_WARN)) >>> + return; >>> + >>> + if (res->portid == (portid_t)RTE_PORT_ALL) { >>> + printf("Invalid port id\n"); >>> + return; >>> + } >>> + >>> + port = &ports[res->portid]; >>> + >>> + if (strcmp(res->txq, "txq")) { >>> + printf("Unknown parameter\n"); >>> + return; >>> + } >>> + if (tx_queue_id_is_invalid(res->qid)) >>> + return; >>> + >>> + ret = eth_dev_info_get_print_err(res->portid, &dev_info); >>> + if (ret != 0) >>> + return; >>> + >>> + if (dev_info.nb_phy_ports == 0) { >>> + printf("Number of physical ports is 0 which is invalid for PHY >> Affinity\n"); >>> + return; >>> + } >>> + printf("The number of physical ports is %u\n", dev_info.nb_phy_ports); >>> + if (dev_info.nb_phy_ports < res->value) { >>> + printf("The PHY affinity value %u is Invalid, exceeds the " >>> + "number of physical ports\n", res->value); >>> + return; >>> + } >>> + port->txq[res->qid].conf.tx_phy_affinity = res->value; >>> + >>> + cmd_reconfig_device_queue(res->portid, 0, 1); } >>> + >>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port = >>> + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity, >>> + port, "port"); >>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config = >>> + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity, >>> + config, "config"); >>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid = >>> + TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity, >>> + portid, RTE_UINT16); >>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq = >>> + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity, >>> + txq, "txq"); >>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid = >>> + TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity, >>> + qid, RTE_UINT16); >>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport = >>> + TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity, >>> + phy_affinity, "phy_affinity"); >>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value = >>> + TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity, >>> + value, RTE_UINT8); >>> + >>> +static cmdline_parse_inst_t cmd_config_tx_phy_affinity = { >>> + .f = cmd_config_tx_phy_affinity_parsed, >>> + .data = (void *)0, >>> + .help_str = "port config <port_id> txq <queue_id> phy_affinity <value>", >>> + .tokens = { >>> + (void *)&cmd_config_tx_phy_affinity_port, >>> + (void *)&cmd_config_tx_phy_affinity_config, >>> + (void *)&cmd_config_tx_phy_affinity_portid, >>> + (void *)&cmd_config_tx_phy_affinity_txq, >>> + (void *)&cmd_config_tx_phy_affinity_qid, >>> + (void *)&cmd_config_tx_phy_affinity_hwport, >>> + (void *)&cmd_config_tx_phy_affinity_value, >>> + NULL, >>> + }, >>> +}; >>> + >>> /* >>> >> **************************************************************** >> ****** >>> ********** */ >>> >>> /* list of instructions */ >>> @@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = { >>> (cmdline_parse_inst_t *)&cmd_show_port_cman_capa, >>> (cmdline_parse_inst_t *)&cmd_show_port_cman_config, >>> (cmdline_parse_inst_t *)&cmd_set_port_cman_config, >>> + (cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity, >>> NULL, >>> }; >>> >>> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index >>> acccb6b035..b83fb17cfa 100644 >>> --- a/app/test-pmd/config.c >>> +++ b/app/test-pmd/config.c >>> @@ -936,6 +936,7 @@ port_infos_display(portid_t port_id) >>> printf("unknown\n"); >>> break; >>> } >>> + printf("Current number of physical ports: %u\n", >>> +dev_info.nb_phy_ports); >>> } >>> >>> void >>> diff --git a/devtools/libabigail.abignore >>> b/devtools/libabigail.abignore index 7a93de3ba1..ac7d3fb2da 100644 >>> --- a/devtools/libabigail.abignore >>> +++ b/devtools/libabigail.abignore >>> @@ -34,3 +34,8 @@ >>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; >>> ; Temporary exceptions till next major ABI version ; >>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; >>> + >>> +; Ignore fields inserted in padding hole of rte_eth_txconf >>> +[suppress_type] >>> + name = rte_eth_txconf >>> + has_data_member_inserted_between = >>> +{offset_of(tx_deferred_start), offset_of(offloads)} >>> diff --git a/doc/guides/rel_notes/release_23_03.rst >>> b/doc/guides/rel_notes/release_23_03.rst >>> index 73f5d94e14..e99bd2dcb6 100644 >>> --- a/doc/guides/rel_notes/release_23_03.rst >>> +++ b/doc/guides/rel_notes/release_23_03.rst >>> @@ -55,6 +55,10 @@ New Features >>> Also, make sure to start the actual text at the margin. >>> ======================================================= >>> >>> +* **Added affinity for multiple physical ports connected to a single >>> +DPDK port.** >>> + >>> + * Added Tx affinity in queue setup to map a physical port. >>> + >>> * **Updated AMD axgbe driver.** >>> >>> * Added multi-process support. >>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst >>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst >>> index 79a1fa9cb7..5c716f7679 100644 >>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst >>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst >>> @@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only >> on a specific Tx queue:: >>> >>> This command should be run when the port is stopped, or else it will fail. >>> >>> +config per queue Tx physical affinity >>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> + >>> +Configure a per queue physical affinity value only on a specific Tx queue:: >>> + >>> + testpmd> port (port_id) txq (queue_id) phy_affinity (value) >>> + >>> +* ``phy_affinity``: physical port to use for sending, >>> + when multiple physical ports are connected to >>> + a single DPDK port. >>> + >>> +This command should be run when the port is stopped, otherwise it fails. >>> + >>> Config VXLAN Encap outer layers >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index >>> c129ca1eaf..2fd971b7b5 100644 >>> --- a/lib/ethdev/rte_ethdev.h >>> +++ b/lib/ethdev/rte_ethdev.h >>> @@ -1138,6 +1138,14 @@ struct rte_eth_txconf { >>> less free descriptors than this value. */ >>> >>> uint8_t tx_deferred_start; /**< Do not start queue with >>> rte_eth_dev_start(). */ >>> + /** >>> + * Affinity with one of the multiple physical ports connected to the >> DPDK port. >>> + * Value 0 means no affinity and traffic could be routed to any >> connected >>> + * physical port. >>> + * The first physical port is number 1 and so on. >>> + * Number of physical ports is reported by nb_phy_ports in >> rte_eth_dev_info. >>> + */ >>> + uint8_t tx_phy_affinity; >>> /** >>> * Per-queue Tx offloads to be set using RTE_ETH_TX_OFFLOAD_* >> flags. >>> * Only offloads set on tx_queue_offload_capa or tx_offload_capa @@ >>> -1744,6 +1752,8 @@ struct rte_eth_dev_info { >>> /** Device redirection table size, the total number of entries. */ >>> uint16_t reta_size; >>> uint8_t hash_key_size; /**< Hash key size in bytes */ >>> + /** Number of physical ports connected with DPDK port. */ >>> + uint8_t nb_phy_ports; >>> /** Bit mask of RSS offloads, the bit offset also means flow type */ >>> uint64_t flow_type_rss_offloads; >>> struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration >>> */ >