HI BR Rongwei
> -----Original Message----- > From: Ivan Malov <ivan.ma...@oktetlabs.ru> > Sent: Tuesday, September 13, 2022 22:33 > To: Rongwei Liu <rongw...@nvidia.com> > Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko > <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact- > Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; Aman Singh > <aman.deep.si...@intel.com>; Yuying Zhang <yuying.zh...@intel.com>; > Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>; dev@dpdk.org; Raslan > Darawsheh <rasl...@nvidia.com> > Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer > table > > External email: Use caution opening links or attachments > > > Hi Rongwei, > > PSB > > On Tue, 13 Sep 2022, Rongwei Liu wrote: > > > Hi > > > > BR > > Rongwei > > > >> -----Original Message----- > >> From: Ivan Malov <ivan.ma...@oktetlabs.ru> > >> Sent: Tuesday, September 13, 2022 00:57 > >> To: Rongwei Liu <rongw...@nvidia.com> > >> Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko > >> <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact- > >> Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; Aman Singh > >> <aman.deep.si...@intel.com>; Yuying Zhang <yuying.zh...@intel.com>; > >> Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>; dev@dpdk.org; > >> Raslan Darawsheh <rasl...@nvidia.com> > >> Subject: Re: [PATCH v1] ethdev: add direction info when creating the > >> transfer table > >> > >> External email: Use caution opening links or attachments > >> > >> > >> Hi, > >> > >> On Wed, 7 Sep 2022, Rongwei Liu wrote: > >> > >>> The transfer domain rule is able to match traffic wire/vf origin and > >>> it means two directions' underlayer resource. > >> > >> The point of fact is that matching traffic coming from some entity > >> like wire / VF has been long generalised in the form of representors. > >> So, a flow rule with attribute "transfer" is able to match traffic > >> coming from either a REPRESENTED_PORT or from a PORT_REPRESENTOR > (please find these items). > >> > >>> > >>> In customer deployments, they usually match only one direction > >>> traffic in single flow table: either from wire or from vf. > >> > >> Which customer deployments? Could you please provide detailed examples? > >> > >>> > > > > We saw a lot of customers' deployment like: > > 1. Match overlay traffic from wire and do decap, then send to specific > > vport. > > 2. Match specific 5-tuples and do encap, then send to wire. > > The matching criteria has obvious direction preference. > > Thank you. My questions are as follows: > > In (1), when you say "from wire", do you mean the need to match packets > arriving via whatever physical ports rather then matching packets arriving > from some specific phys. port? > > If, however, matching traffic "from wire" in fact means matching packets > arriving from a *specific* physical port, then for sure item > REPRESENTED_PORT should perfectly do the job, and the proposed attribute is > unneeded. > > (BTW, in DPDK, it is customary to use term "physical port", not "wire") > > In (1), what are "vport"s? Please explain. Once again, I should remind that, > in > DPDK, folks prefer terms "represented entity" / "representor" > over vendor-specific terms like "vport", etc. > Vport is virtual port for short such as VF. > As for (2), imagine matching 5-tuple traffic emitted by a VF / guest. > Could you please explain, why not just add a match item REPRESENTED_PORT > pointing to that VF via its representor? Doing so should perfectly define the > exact direction / traffic source. Isn't that sufficient? > Per my view, there is matching field and matching value difference. Like IPv4 src_addr 1.1.1.1, 1.1.1.2. 1.1.1.3, will you treat it as same or different matching criteria? I would like to call them same since it can be summarized like 1.1.1.0/30 REPRESENTED_PORT is just another matching item, no essential differences and it can't stand for direction info. Port id depends on the attach sequence. > Also please mind that, although I appreciate your explanations here, on the > mailing list, they should finally be added to the commit message, so that > readers do not have to look for them elsewhere. > We have explained the high possibility of single-direction matching, right? It' hard to list all the possibilities of traffic matching preferences. The underlay is the one we have met for now. > > > >>> Introduce one new member transfer_mode into rte_flow_attr to > >>> indicate the flow table direction property: from wire, from vf or > >>> bi-direction(default). > >> > >> AFAIK, 'rte_flow_attr' serves both traditional flow rule insertion > >> and asynchronous (table) approach. The patch adds the attributes to > >> generic 'rte_flow_attr' but, for some reason, ignores non-table rules. > >> > >>> > > Sync API uses one rule to contain everything. It' hard for PMD to determine > if this rule has direction preference or not. > > Image a situation, just for an example: > > 1. Vport 1 VxLAN do decap send to vport 2. 1 million scale > > 2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale. > > 1 and 2 share the same matching conditions (eth / ipv4 / udp / vxlan /...), > > so > sync API consider them share matching determination logic. > > It means "2" have 1M scale capability too. Obviously, it wastes a lot of > resources. > > Strictly speaking, they do not share the same match pattern. > Your example clearly shows that, in (1), the pattern should request packets > coming from "vport 1" and, in (2), packets coming from "vport 0". > > My point is simple: the "vport" from which packets enter the embedded switch > is ALSO a match criterion. If you accept this, you'll see: the matching > conditions differ. > See above. In this case, I think the matching fields are both "port_id + ipv4_vxlan". They are same. Only differs with values like vni 100 or 200 vice versa. > > > > In async API, there is pattern_template introduced. We can mark "1" to use > pattern_tempate id 1 and "2" to use pattern_template 2. > > They will be separated from each other, don't share anymore. > > Consider an example. "Wire" is a physical port represented by PF0 which, in > turn, is attached to DPDK via ethdev 0. "VF" (vport?) is attached to guest > and is > represented by a representor ethdev 1 in DPDK. > > So, some rules (template 1) are needed to deliver packets from "wire" > to "VF" and also decapsulate them. And some rules (template 2) are needed to > deliver packets in the opposite direction, from "VF" > to "wire" and also encapsulate them. > > My question is, what prevents you from adding match item > REPRESENTED_PORT[ethdev_id=0] to the pattern template 1 and > REPRESENTED_PORT[ethdev_id=1] to the pattern template 2? > > As I said previously, if you insert such item before eth / ipv4 / etc to your > match pattern, doing so defines an *exact* direction / source. > Could you check the async API guidance? I think pattern template focusing on the matching field (mask). "REPRESENTED_PORT[ethdev_id=0] " and "REPRESENTED_PORT[ethdev_id=1] "are the same. 1. pattern template: REPRESENTED_PORT mask 0xffff ... 2. action template: action1 / actions2. / 3. table create with pattern_template plus action template.. REPRESENTED_PORT[ethdev_id=0] will be rule1: rule create REPRESENTED_PORT port_id is 0 / actions .... REPRESENTED_PORT[ethdev_id=1] will be rule2: rule create REPRESENTED_PORT port_id is 1 / actions .... > > > >> For example, the diff below adds the attributes to "table" commands > >> in testpmd but does not add them to regular (non-table) commands like > >> "flow create". Why? > >> > >>> > > > > "table" command limits pattern_template to single direction or bidirection > per user specified attribute. > > As I say above, the same effect can be achieved by adding item > REPRESENTED_PORT to the corresponding pattern template. See above. > > > "rule" command must tight with one "table_id", so the rule will inherit the > "table" direction property, no need to specify again. > > You migh've misunderstood. I do not talk about "rule" command coupled with > some "table". What I talk about is regular, NON-async flow insertion > commands. > > Please take a look at section "/* Validate/create attributes. */" in file > "app/test-pmd/cmdline_flow.c". When one adds a new flow attribute, they > should reflect it the same way as VC_INGRESS, VC_TRANSFER, etc. > > That's it. We don't intend to pass this to sync API. The above code example is for sync API. > > But, as I say, I still believe that the new attributes aren't needed. I think we are not at the same page for now. Can we reach agreement on the same matching criteria first? > > > >>> It helps to save underlayer memory also on insertion rate. > >> > >> Which memory? Host memory? NIC memory? Term "underlayer" is vague. > >> I suggest that the commit message be revised to first explain how > >> such memory is spent currently, then explain why this is not optimal > >> and, finally, which way the patch is supposed to improve that. I.e. be more > specific. > >> > >>> > > > > For large scalable rules, HW (depends on implementation) always needs > memory to hold the rules' patterns and actions, either from NIC or from host. > > The memory footprint highly depends on "user rules' complexity", also diff > between NICs. > > ~50% memory saving is expected if one-direction is cut. > > Regardless of this talk, this explanation should probably be present in the > commit description. > This number may differ with different NICs or implementation. We can't say it for sure. > > > >>> By default, the transfer domain is bi-direction, and no behavior changes. > >>> > >>> 1. Match wire origin only > >>> flow template_table 0 create group 0 priority 0 transfer wire_orig... > >>> 2. Match vf origin only > >>> flow template_table 0 create group 0 priority 0 transfer vf_orig... > >>> > >>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com> > >>> --- > >>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++ > >>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++- > >>> lib/ethdev/rte_flow.h | 9 ++++++- > >>> 3 files changed, 36 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/app/test-pmd/cmdline_flow.c > >>> b/app/test-pmd/cmdline_flow.c index 7f50028eb7..b25b595e82 100644 > >>> --- a/app/test-pmd/cmdline_flow.c > >>> +++ b/app/test-pmd/cmdline_flow.c > >>> @@ -177,6 +177,8 @@ enum index { > >>> TABLE_INGRESS, > >>> TABLE_EGRESS, > >>> TABLE_TRANSFER, > >>> + TABLE_TRANSFER_WIRE_ORIG, > >>> + TABLE_TRANSFER_VF_ORIG, > >>> TABLE_RULES_NUMBER, > >>> TABLE_PATTERN_TEMPLATE, > >>> TABLE_ACTIONS_TEMPLATE, > >>> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = { > >>> TABLE_INGRESS, > >>> TABLE_EGRESS, > >>> TABLE_TRANSFER, > >>> + TABLE_TRANSFER_WIRE_ORIG, > >>> + TABLE_TRANSFER_VF_ORIG, > >>> TABLE_RULES_NUMBER, > >>> TABLE_PATTERN_TEMPLATE, > >>> TABLE_ACTIONS_TEMPLATE, > >>> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = { > >>> .next = NEXT(next_table_attr), > >>> .call = parse_table, > >>> }, > >>> + [TABLE_TRANSFER_WIRE_ORIG] = { > >>> + .name = "wire_orig", > >>> + .help = "affect rule direction to transfer", > >> > >> This does not explain the "wire" aspect. It's too broad. > >> > >>> + .next = NEXT(next_table_attr), > >>> + .call = parse_table, > >>> + }, > >>> + [TABLE_TRANSFER_VF_ORIG] = { > >>> + .name = "vf_orig", > >>> + .help = "affect rule direction to transfer", > >> > >> This explanation simply duplicates such of the "wire_orig". > >> It does not explain the "vf" part. Should be more specific. > >> > >>> + .next = NEXT(next_table_attr), > >>> + .call = parse_table, > >>> + }, > >>> [TABLE_RULES_NUMBER] = { > >>> .name = "rules_number", > >>> .help = "number of rules in table", @@ -8894,6 > >>> +8910,16 @@ parse_table(struct context *ctx, const struct token *token, > >>> case TABLE_TRANSFER: > >>> out->args.table.attr.flow_attr.transfer = 1; > >>> return len; > >>> + case TABLE_TRANSFER_WIRE_ORIG: > >>> + if (!out->args.table.attr.flow_attr.transfer) > >>> + return -1; > >>> + out->args.table.attr.flow_attr.transfer_mode = 1; > >>> + return len; > >>> + case TABLE_TRANSFER_VF_ORIG: > >>> + if (!out->args.table.attr.flow_attr.transfer) > >>> + return -1; > >>> + out->args.table.attr.flow_attr.transfer_mode = 2; > >>> + return len; > >>> default: > >>> return -1; > >>> } > >>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst > >>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst > >>> index 330e34427d..603b7988dd 100644 > >>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst > >>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst > >>> @@ -3332,7 +3332,8 @@ It is bound to > >> ``rte_flow_template_table_create()``:: > >>> > >>> flow template_table {port_id} create > >>> [table_id {id}] [group {group_id}] > >>> - [priority {level}] [ingress] [egress] [transfer] > >>> + [priority {level}] [ingress] [egress] > >>> + [transfer [vf_orig] [wire_orig]] > >> > >> Is it correct? Shouldn't it rather be [transfer] [vf_orig] > >> [wire_orig] ? > >> > >>> rules_number {number} > >>> pattern_template {pattern_template_id} > >>> actions_template {actions_template_id} diff --git > >>> a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index > >>> a79f1e7ef0..512b08d817 100644 > >>> --- a/lib/ethdev/rte_flow.h > >>> +++ b/lib/ethdev/rte_flow.h > >>> @@ -130,7 +130,14 @@ struct rte_flow_attr { > >>> * through a suitable port. @see rte_flow_pick_transfer_proxy(). > >>> */ > >>> uint32_t transfer:1; > >>> - uint32_t reserved:29; /**< Reserved, must be zero. */ > >>> + /** > >>> + * 0 means bidirection, > >>> + * 0x1 origin uplink, > >> > >> What does "uplink" mean? It's too vague. Hardly a good term. > >> > >>> + * 0x2 origin vport, > >> > >> What does "origin vport" mean? Hardly a good term as well. > >> > >>> + * N/A both set. > >> > >> What's this? > >> > >>> + */ > >>> + uint32_t transfer_mode:2; > >>> + uint32_t reserved:27; /**< Reserved, must be zero. */ > >>> }; > >>> > >>> /** > >>> -- > >>> 2.27.0 > >>> > >> > >> Since the attributes are added to generic 'struct rte_flow_attr', > >> non-table > >> (synchronous) flow rules are supposed to support them, too. If that > >> is indeed the case, then I'm afraid such proposal does not agree with > >> the existing items PORT_REPRESENTOR and REPRESENTED_PORT. They do > >> exactly the same thing, but they are designed to be way more generic. Why > not use them? > > The question stands. > > >> > >> Ivan > > > > Ivan