Hi Ivan, I agree with your reply.
Acked-by: Ori Kam <or...@nvidia.com> Thanks, Ori > -----Original Message----- > From: Ivan Malov <ivan.ma...@oktetlabs.ru> > Subject: Re: [PATCH] ethdev: let apps find transfer admin port for a given > ethdev > > Hi Ori, > > Thanks a lot for eagerly reviewing this and related patches. That's very > helpful > of yours. > > On 05/10/2021 12:22, Ori Kam wrote: > > Hi Ivan, > > > >> -----Original Message----- > >> From: Ivan Malov <ivan.ma...@oktetlabs.ru> > >> Sent: Tuesday, October 5, 2021 3:36 AM > >> Subject: [PATCH] ethdev: let apps find transfer admin port for a > >> given ethdev > >> > >> Introduce a helper API to let applications find transfer admin port > >> for a given ethdev to avoid failures when managing "transfer" flows > >> through unprivileged ports. > >> > >> Signed-off-by: Ivan Malov <ivan.ma...@oktetlabs.ru> > >> Reviewed-by: Andrew Rybchenko <andrew.rybche...@oktetlabs.ru> > >> --- > >> Patch series [1] has reworked support for "transfer" flows. > >> As a result, an application no longer needs to communicate such flows > >> through a particular ethdev where it receives the corresponding > >> packets in the first place. > >> > >> Hence, this patch is a legitimate follow-up to the series [1]. > >> At the same time, it's a follow-up to the early RFC [2]. > >> > >> net/sfc driver is going to support this method. The corresponding > >> patch is already in progress and will be provided in the course of this > >> release > cycle. > >> > >> [1] https://patches.dpdk.org/project/dpdk/list/?series=19326 > >> [2] https://patches.dpdk.org/project/dpdk/list/?series=18737 > >> --- > >> app/test-pmd/config.c | 106 ++++++++++++++++++++++++- > >> app/test-pmd/testpmd.c | 21 +++++ > >> app/test-pmd/testpmd.h | 4 + > >> app/test-pmd/util.c | 5 +- > >> doc/guides/rel_notes/release_21_11.rst | 3 + > >> lib/ethdev/rte_flow.c | 22 +++++ > >> lib/ethdev/rte_flow.h | 32 ++++++++ > >> lib/ethdev/rte_flow_driver.h | 5 ++ > >> lib/ethdev/version.map | 3 + > >> 9 files changed, 197 insertions(+), 4 deletions(-) > >> > >> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index > >> 9c66329e96..2772c83d0a 100644 > >> --- a/app/test-pmd/config.c > >> +++ b/app/test-pmd/config.c > >> @@ -1505,10 +1505,25 @@ port_action_handle_create(portid_t port_id, > >> uint32_t id, > >> struct port_indirect_action *pia; > >> int ret; > >> struct rte_flow_error error; > >> + struct rte_port *port; > >> + > >> + if (port_id_is_invalid(port_id, ENABLED_WARN) || > >> + port_id == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> > > > > Is this part of the patch or a general fix? > > I know it might seem unrelated at first glance. But it has to be added here > for > consistency with the rest of the code being added. It should be OK. > Just checking I'm O.K. with it. > > > >> ret = action_alloc(port_id, id, &pia); > >> if (ret) > >> return ret; > >> + > >> + port = &ports[port_id]; > >> + > >> + if (conf->transfer) > >> + port_id = port->flow_transfer_proxy; > >> + > >> + if (port_id_is_invalid(port_id, ENABLED_WARN) || > >> + port_id == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> if (action->type == RTE_FLOW_ACTION_TYPE_AGE) { > >> struct rte_flow_action_age *age = > >> (struct rte_flow_action_age *)(uintptr_t)(action->conf); > @@ > >> -1531,6 +1546,7 @@ port_action_handle_create(portid_t port_id, > >> uint32_t id, > >> return port_flow_complain(&error); > >> } > >> pia->type = action->type; > >> + pia->transfer = conf->transfer; > >> printf("Indirect action #%u created\n", pia->id); > >> return 0; > >> } > >> @@ -1557,9 +1573,18 @@ port_action_handle_destroy(portid_t port_id, > >> for (i = 0; i != n; ++i) { > >> struct rte_flow_error error; > >> struct port_indirect_action *pia = *tmp; > >> + portid_t port_id_eff = port_id; > >> > >> if (actions[i] != pia->id) > >> continue; > >> + > >> + if (pia->transfer) > >> + port_id_eff = port->flow_transfer_proxy; > >> + > >> + if (port_id_is_invalid(port_id_eff, ENABLED_WARN) || > >> + port_id_eff == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> /* > >> * Poisoning to make sure PMDs update it in case > >> * of error. > >> @@ -1567,7 +1592,7 @@ port_action_handle_destroy(portid_t port_id, > >> memset(&error, 0x33, sizeof(error)); > >> > >> if (pia->handle && rte_flow_action_handle_destroy( > >> - port_id, pia->handle, &error)) { > >> + port_id_eff, pia->handle, &error)) { > >> ret = port_flow_complain(&error); > >> continue; > >> } > >> @@ -1602,8 +1627,15 @@ port_action_handle_update(portid_t port_id, > >> uint32_t id, > >> struct rte_flow_error error; > >> struct rte_flow_action_handle *action_handle; > >> struct port_indirect_action *pia; > >> + struct rte_port *port; > >> const void *update; > >> > >> + if (port_id_is_invalid(port_id, ENABLED_WARN) || > >> + port_id == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> + port = &ports[port_id]; > >> + > >> action_handle = port_action_handle_get_by_id(port_id, id); > >> if (!action_handle) > >> return -EINVAL; > >> @@ -1618,6 +1650,14 @@ port_action_handle_update(portid_t port_id, > >> uint32_t id, > >> update = action; > >> break; > >> } > >> + > >> + if (pia->transfer) > >> + port_id = port->flow_transfer_proxy; > >> + > >> + if (port_id_is_invalid(port_id, ENABLED_WARN) || > >> + port_id == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> if (rte_flow_action_handle_update(port_id, action_handle, update, > >> &error)) { > >> return port_flow_complain(&error); @@ -1636,6 +1676,14 > @@ > >> port_action_handle_query(portid_t port_id, uint32_t id) > >> struct rte_flow_query_age age; > >> struct rte_flow_action_conntrack ct; > >> } query; > >> + portid_t port_id_eff = port_id; > >> + struct rte_port *port; > >> + > >> + if (port_id_is_invalid(port_id, ENABLED_WARN) || > >> + port_id == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> + port = &ports[port_id]; > >> > >> pia = action_get_by_id(port_id, id); > >> if (!pia) > >> @@ -1650,10 +1698,19 @@ port_action_handle_query(portid_t port_id, > >> uint32_t id) > >> id, pia->type, port_id); > >> return -ENOTSUP; > >> } > >> + > >> + if (pia->transfer) > >> + port_id_eff = port->flow_transfer_proxy; > >> + > >> + if (port_id_is_invalid(port_id_eff, ENABLED_WARN) || > >> + port_id_eff == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> /* Poisoning to make sure PMDs update it in case of error. */ > >> memset(&error, 0x55, sizeof(error)); > >> memset(&query, 0, sizeof(query)); > >> - if (rte_flow_action_handle_query(port_id, pia->handle, &query, > >> &error)) > >> + if (rte_flow_action_handle_query(port_id_eff, pia->handle, &query, > >> + &error)) > >> return port_flow_complain(&error); > >> switch (pia->type) { > >> case RTE_FLOW_ACTION_TYPE_AGE: > >> @@ -1872,6 +1929,20 @@ port_flow_validate(portid_t port_id, { > >> struct rte_flow_error error; > >> struct port_flow_tunnel *pft = NULL; > >> + struct rte_port *port; > >> + > >> + if (port_id_is_invalid(port_id, ENABLED_WARN) || > >> + port_id == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> + port = &ports[port_id]; > >> + > >> + if (attr->transfer) > >> + port_id = port->flow_transfer_proxy; > >> + > >> + if (port_id_is_invalid(port_id, ENABLED_WARN) || > >> + port_id == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> > >> /* Poisoning to make sure PMDs update it in case of error. */ > >> memset(&error, 0x11, sizeof(error)); @@ -1925,7 +1996,19 @@ > >> port_flow_create(portid_t port_id, > >> struct port_flow_tunnel *pft = NULL; > >> struct rte_flow_action_age *age = age_action_get(actions); > >> > >> + if (port_id_is_invalid(port_id, ENABLED_WARN) || > >> + port_id == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> port = &ports[port_id]; > >> + > >> + if (attr->transfer) > >> + port_id = port->flow_transfer_proxy; > >> + > >> + if (port_id_is_invalid(port_id, ENABLED_WARN) || > >> + port_id == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> if (port->flow_list) { > >> if (port->flow_list->id == UINT32_MAX) { > >> fprintf(stderr, > >> @@ -1989,6 +2072,7 @@ port_flow_destroy(portid_t port_id, uint32_t n, > >> const uint32_t *rule) > >> uint32_t i; > >> > >> for (i = 0; i != n; ++i) { > >> + portid_t port_id_eff = port_id; > >> struct rte_flow_error error; > >> struct port_flow *pf = *tmp; > >> > >> @@ -1999,7 +2083,15 @@ port_flow_destroy(portid_t port_id, uint32_t > >> n, const uint32_t *rule) > >> * of error. > >> */ > >> memset(&error, 0x33, sizeof(error)); > >> - if (rte_flow_destroy(port_id, pf->flow, &error)) { > >> + > >> + if (pf->rule.attr->transfer) > >> + port_id_eff = port->flow_transfer_proxy; > >> + > >> + if (port_id_is_invalid(port_id_eff, ENABLED_WARN) || > >> + port_id_eff == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> + if (rte_flow_destroy(port_id_eff, pf->flow, &error)) { > >> ret = port_flow_complain(&error); > >> continue; > >> } > >> @@ -2133,6 +2225,14 @@ port_flow_query(portid_t port_id, uint32_t rule, > >> fprintf(stderr, "Flow rule #%u not found\n", rule); > >> return -ENOENT; > >> } > >> + > >> + if (pf->rule.attr->transfer) > >> + port_id = port->flow_transfer_proxy; > >> + > >> + if (port_id_is_invalid(port_id, ENABLED_WARN) || > >> + port_id == (portid_t)RTE_PORT_ALL) > >> + return -EINVAL; > >> + > >> ret = rte_flow_conv(RTE_FLOW_CONV_OP_ACTION_NAME_PTR, > >> &name, sizeof(name), > >> (void *)(uintptr_t)action->type, &error); diff --git > >> a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index > >> 97ae52e17e..a88e920bd0 100644 > >> --- a/app/test-pmd/testpmd.c > >> +++ b/app/test-pmd/testpmd.c > >> @@ -533,6 +533,25 @@ int proc_id; > >> */ > >> unsigned int num_procs = 1; > >> > >> +static void > >> +flow_pick_transfer_proxy_mp(uint16_t port_id) { > >> + struct rte_port *port = &ports[port_id]; > >> + int ret; > >> + > >> + port->flow_transfer_proxy = port_id; > >> + > >> + if (!is_proc_primary()) > >> + return; > >> + > >> + ret = rte_flow_pick_transfer_proxy(port_id, &port- > >>> flow_transfer_proxy, > >> + NULL); > >> + if (ret != 0) { > >> + fprintf(stderr, "Error picking flow transfer proxy for port %u: > >> %s > >> - ignore\n", > >> + port_id, rte_strerror(-ret)); > >> + } > >> +} > >> + > >> static int > >> eth_dev_configure_mp(uint16_t port_id, uint16_t nb_rx_q, uint16_t > nb_tx_q, > >> const struct rte_eth_conf *dev_conf) @@ -1489,6 +1508,8 > @@ > >> init_config_port_offloads(portid_t pid, uint32_t socket_id) > >> int ret; > >> int i; > >> > >> + flow_pick_transfer_proxy_mp(pid); > >> + > >> port->dev_conf.txmode = tx_mode; > >> port->dev_conf.rxmode = rx_mode; > >> > >> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index > >> 5863b2f43f..b3dfd350e5 100644 > >> --- a/app/test-pmd/testpmd.h > >> +++ b/app/test-pmd/testpmd.h > >> @@ -173,6 +173,8 @@ struct port_indirect_action { > >> enum rte_flow_action_type type; /**< Action type. */ > >> struct rte_flow_action_handle *handle; /**< Indirect action handle. */ > >> enum age_action_context_type age_type; /**< Age action context > >> type. */ > >> + /** If true, the action applies to "transfer" flows, and vice versa */ > >> + bool transfer; > >> }; > >> > >> struct port_flow_tunnel { > >> @@ -234,6 +236,8 @@ struct rte_port { > >> /**< dynamic flags. */ > >> uint64_t mbuf_dynf; > >> const struct rte_eth_rxtx_callback > >> *tx_set_dynf_cb[RTE_MAX_QUEUES_PER_PORT+1]; > >> + /** Associated port which is supposed to handle "transfer" flows */ > >> + portid_t flow_transfer_proxy; > >> }; > >> > >> /** > >> diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c index > >> 14a9a251fb..d9edbbf9ee 100644 > >> --- a/app/test-pmd/util.c > >> +++ b/app/test-pmd/util.c > >> @@ -98,13 +98,16 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, > >> struct rte_mbuf *pkts[], > >> int ret; > >> struct rte_flow_error error; > >> struct rte_flow_restore_info info = { 0, }; > >> + struct rte_port *port = &ports[port_id]; > >> > >> mb = pkts[i]; > >> eth_hdr = rte_pktmbuf_read(mb, 0, sizeof(_eth_hdr), > &_eth_hdr); > >> eth_type = RTE_BE_TO_CPU_16(eth_hdr->ether_type); > >> packet_type = mb->packet_type; > >> is_encapsulation = RTE_ETH_IS_TUNNEL_PKT(packet_type); > >> - ret = rte_flow_get_restore_info(port_id, mb, &info, &error); > >> + > >> + ret = rte_flow_get_restore_info(port->flow_transfer_proxy, > >> + mb, &info, &error); > > > > I'm not sure this is correct, > > Since to restore the data you need to know the port that this traffic was > > sent > to. > > Even if the action was offloaded on the proxy port, it is important to > > know what was the destination port. > > even setting the tunnel must be done on the target port and not the proxy > port. > > Is tunnel offload a "transfer"-based feature? My answer is "yes". And, if we > all > agree that it's easier to communicate all "transfer"-related requests through > a > single ("transfer" admin) ethdev, then the first argument in the highlighted > invocation also should be this "proxy" > ethdev. A unified entry point. > > But please don't think me forward: I fully understand your concern about the > real DPDK port through which the packet was delivered to the application. But > that should not be a big problem. The mbuf which is passed to this API still > has > its metadata set (tunnel mark, for > instance) which the PMD can use to recall the port which this packet was sent > to. More to that, the mbuf even has a dedicated field to express exactly what > you may want to have - the "port" field. So, when your PMD receives the > "missed" packet and hands it over to the application through some ethdev / > port, it can set this mbuf fields to that port ID. > This way, such information is not lost because of migration to "transfer > proxy" > solution, is it? > Yes this answer my concerns (maybe there will some time penaltiy to set this value) but it looks legit. > > > >> if (!ret) { > >> MKDUMPSTR(print_buf, buf_size, cur_len, > >> "restore info:"); > >> diff --git a/doc/guides/rel_notes/release_21_11.rst > >> b/doc/guides/rel_notes/release_21_11.rst > >> index 37776c135e..cc0ea4695a 100644 > >> --- a/doc/guides/rel_notes/release_21_11.rst > >> +++ b/doc/guides/rel_notes/release_21_11.rst > >> @@ -67,6 +67,9 @@ New Features > >> Added macros ETH_RSS_IPV4_CHKSUM and ETH_RSS_L4_CHKSUM, now > IPv4 > >> and > >> TCP/UDP/SCTP header checksum field can be used as input set for RSS. > >> > >> +* **Added an API to pick flow transfer proxy port** > >> + A new API, ``rte_flow_pick_transfer_proxy()``, was added. > >> + > >> * **Updated Broadcom bnxt PMD.** > >> > >> * Added flow offload support for Thor. > >> diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c index > >> 647bbf91ce..15e978f7f7 100644 > >> --- a/lib/ethdev/rte_flow.c > >> +++ b/lib/ethdev/rte_flow.c > >> @@ -1270,3 +1270,25 @@ rte_flow_tunnel_item_release(uint16_t port_id, > >> RTE_FLOW_ERROR_TYPE_UNSPECIFIED, > >> NULL, rte_strerror(ENOTSUP)); > >> } > >> + > >> +int > >> +rte_flow_pick_transfer_proxy(uint16_t port_id, uint16_t *proxy_port_id, > >> + struct rte_flow_error *error) > > > > What about replaceing pick with get? > > There's a school of thought that terms like "get" imply the existence of > paired > actions. For example, "get - set", "probe - unprobe", "allocate - free", > "init - > fini". > > So, in order to avoid any wrong assumptions, I believe it's better to stick > with > term "pick" here as it doesn't seem to have a paired term. > I still think get is better, but lets leave it like this. > > > >> +{ > >> + struct rte_eth_dev *dev = &rte_eth_devices[port_id]; > >> + const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error); > >> + > >> + if (unlikely(ops == NULL)) > >> + return -rte_errno; > >> + > >> + if (likely(ops->pick_transfer_proxy != NULL)) { > >> + return flow_err(port_id, > >> + ops->pick_transfer_proxy(dev, proxy_port_id, > >> + error), > >> + error); > >> + } > >> + > >> + *proxy_port_id = port_id; > >> + > >> + return 0; > >> +} > >> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index > >> f195aa7224..d2cb476189 100644 > >> --- a/lib/ethdev/rte_flow.h > >> +++ b/lib/ethdev/rte_flow.h > >> @@ -122,6 +122,10 @@ struct rte_flow_attr { > >> * > >> * In order to match traffic originating from specific source(s), the > >> * application should use pattern items ETHDEV and ESWITCH_PORT. > >> + * > >> + * Communicating "transfer" flows via unprivileged ethdevs may not > >> + * be possible. In order to pick the ethdev suitable for that, the > >> + * application should use @p rte_flow_pick_transfer_proxy(). > >> */ > >> uint32_t transfer:1; > >> uint32_t reserved:29; /**< Reserved, must be zero. */ @@ -4427,6 > >> +4431,34 @@ rte_flow_tunnel_item_release(uint16_t port_id, > >> struct rte_flow_item *items, > >> uint32_t num_of_items, > >> struct rte_flow_error *error); > >> + > >> +/** > >> + * An application receiving packets on a given ethdev may want to > >> +have their > >> + * processing offloaded to the e-switch lying beneath this ethdev by > >> +means > >> + * of maintaining "transfer" flows. However, it need never use this > >> +exact > >> + * ethdev as an entry point for such flows to be managed through. > >> +More to > >> + * that, this particular ethdev may be short of privileges to > >> +control the > >> + * e-switch. Instead, the application should find an admin ethdev > >> +sitting > >> + * on top of the same e-switch to be used as the entry point (a "proxy"). > >> + * > > > > This explanation is not clear, can you rephrase it? > > Could you please be so kind to outline the parts which are not clear to you. > Maybe your understanding is right in fact but you are just unsure. > If you expressed your interpretation of this text, I could review it and > either > confirm its correctness or debunk any misunderstanding. In the latter case, it > would be easier to me to rephrase the comment in the patch. > After a few more reads I can't point my finger to it. Lets leave it. > > > >> + * This API is a helper to find such "transfer proxy" for a given ethdev. > >> + * > >> + * @note > >> + * If the PMD serving @p port_id doesn't have the corresponding method > >> + * implemented, the API will return @p port_id via @p proxy_port_id. > > > > +1 > > > >> + * > >> + * @param port_id > >> + * ID of the ethdev in question > >> + * @param[out] proxy_port_id > >> + * ID of the "transfer proxy" > >> + * > >> + * @return > >> + * 0 on success, a negative error code otherwise > >> + */ > >> +__rte_experimental > >> +int > >> +rte_flow_pick_transfer_proxy(uint16_t port_id, uint16_t *proxy_port_id, > >> + struct rte_flow_error *error); > >> #ifdef __cplusplus > >> } > >> #endif > >> diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h > index > >> 46f62c2ec2..ed52e59a0a 100644 > >> --- a/lib/ethdev/rte_flow_driver.h > >> +++ b/lib/ethdev/rte_flow_driver.h > >> @@ -139,6 +139,11 @@ struct rte_flow_ops { > >> struct rte_flow_item *pmd_items, > >> uint32_t num_of_items, > >> struct rte_flow_error *err); > >> + /** See rte_flow_pick_transfer_proxy() */ > >> + int (*pick_transfer_proxy) > >> + (struct rte_eth_dev *dev, > >> + uint16_t *proxy_port_id, > >> + struct rte_flow_error *error); > >> }; > >> > >> /** > >> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index > >> 904bce6ea1..d4286dc8dd 100644 > >> --- a/lib/ethdev/version.map > >> +++ b/lib/ethdev/version.map > >> @@ -247,6 +247,9 @@ EXPERIMENTAL { > >> rte_mtr_meter_policy_delete; > >> rte_mtr_meter_policy_update; > >> rte_mtr_meter_policy_validate; > >> + > >> + # added in 21.11 > >> + rte_flow_pick_transfer_proxy; > >> }; > >> > >> INTERNAL { > >> -- > >> 2.20.1 > > > > Best, > > Ori > > > > -- > Ivan M