Hi Ivan, BR Rongwei
> -----Original Message----- > From: Ivan Malov <ivan.ma...@arknetworks.am> > Sent: Monday, January 30, 2023 08:00 > To: Rongwei Liu <rongw...@nvidia.com> > Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko > <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact- > Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; Aman Singh > <aman.deep.si...@intel.com>; Yuying Zhang <yuying.zh...@intel.com>; > Ferruh Yigit <ferruh.yi...@amd.com>; Andrew Rybchenko > <andrew.rybche...@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh > <rasl...@nvidia.com> > Subject: Re: [PATCH v7] ethdev: add special flags when creating async transfer > table > > External email: Use caution opening links or attachments > > > Hi Rongwei, > > Thanks for persevering. I have no strong opinion, but, at least, the fact > that the > new flags are no longer meant for use in rte_flow_attr, which is clearly not > the right place for such, is an improvement. > Thanks for the suggestion, move it to rte_flow_table_attr now and it' dedicated to async API. > However, let's take a closer look at the current patch, shall we? > > But, before we get to that, I'd like to kindly request that you provide a more > concrete example of how this feature is supposed to be used. Are there some > real-life application examples? > Sure. > Also, to me, it's still unclear how an application can obtain the knowledge of > this hint in the first instance. For example, can Open vSwitch somehow tell > ethdevs representing physical ports from ones representing "vports" (host > endpoints)? > How does it know which attribute to specify? > Hint should be initiated by application and application knows it' traffic pattern which highly relates to deployment. Let' use VxLAN encap/decap as an example: 1. Traffic from wire should be VxLAN pattern and do the decap, then send to different vports. flow pattern_template 0 create transfer relaxed no pattern_template_id 4 template represented_port ethdev_port_id is 0 / eth / ipv4 / udp / vxlan / tag index is 0 data is 0x33 / end flow actions_template 0 create transfer actions_template_id 4 template raw_decap index 0 / represented_port ethdev_port_id 1 / end mask raw_decap index 0 / represented_port ethdev_port_id 1 / end flow template_table 0 create group 1 priority 0 transfer wire_orig table_id 4 rules_number 128 pattern_template 4 actions_template 4 2. Traffic from vports should be encap with different VxLAN header and send to wire. flow actions_template 1 create transfer actions_template_id 5 template raw_encap index 0 / represented_port ethdev_port_id 0 / end mask raw_encap index 0 / represented_port ethdev_port_id 0 / end flow template_table 0 create group 1 priority 0 transfer vport_orig table_id 5 rules_number 128 pattern_template 4 actions_template 5 > For the rest of my notes, PSB. > > On Mon, 14 Nov 2022, Rongwei Liu wrote: > > > In case flow rules match only one kind of traffic in a flow table, > > then optimization can be done via allocation of this table. > > This wording might confuse readers. Consider rephrasing it, please: > If multiple flow rules share a common set of match masks, then they might > belong in a flow table which can be pre-allocated. > > > Such optimization is possible only if the application gives a hint > > about its usage of the table during initial configuration. > > > > The transfer domain rules may process traffic from wire or vport, > > which may correspond to two kinds of underlayer resources. > > Why name it a "vport"? Why not "host"? > > host = packets generated by any of the host's "vport"s wire = packets arriving > at the NIC from the network Vport is "virtual port" for short and contains "VF/SF" for now. Per my thoughts, it' clearer and maps to DPDK port probing/management. > > > That's why the first two hints introduced in this patch are about wire > > and vport traffic specialization. > > Wire means traffic arrives from the uplink port while vport means > > traffic initiated from VF/SF. > > By the sound of it, the meaning is confined to just VFs/SFs. > What if the user wants to match packets coming from PFs? > It should be "wire_orig". > > > > There are two possible approaches for providing the hints. > > Using IPv4 as an example: > > 1. Use pattern item in both template table and flow rules. > > > > pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end > > async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end > > > > "ANY_VPORT" needs to be present in each flow rule even if it's just > > a hint. No value to match because matching is already done by > > IPv4 item. > > Why no value to match on? How does it prevent rogue tenants from spoofing > network headers? If the application receives a packet on a particular vport's > representor, then it may strictly specify item represented_port pointing to > that > vport so that only packets from that vport match. > > Why isn't security a consideration? > There is some misunderstanding here. "ANY_VPORT" is the approach (new matching item without value) suggested by you. I was explaining we need to apply it to each flow rule even if it's only a flag and no value. > > > > 2. Add special flags into table_attr. > > > > template_table 0 create table_id 0 group 1 transfer vport_orig > > > > Approach 1 needs to specify the pattern in each flow rule which wastes > > memory and is not user friendly. > > What if the user has to insert a group of rules which not only have the same > set of match masks but also share exactly the same match spec values for a > limited subset of network items (for example, those of an encap. header)? This > way, a subset of network item specs can remain fixed across many rules. Does > that count as wasting memory? > Per my understanding, you are talking "multiple spec and mask mixing". We provide a hint in this patch and no assumption on the matching patterns. I think matching pattern is totally controlled by application layer. "wasting memory " because your approach needs to scatter in each rule while this patch only needs to set table_attr once. No relation with matching patter totally. > If yes, then the problem does not concern just a single pair of attributes, > but > rather deserves a more versatile solution like some sort of indirect grouping > of > constant item specs. > Have you considered such options? See above. > > > This patch takes the 2nd approach and introduces one new member > > "specialize" into rte_flow_table_attr to indicate possible flow table > > optimization. > > The name "specialize" might have some drawbacks: > - spelling difference (specialise/specialize) > - in grep output, will mix with flows' "spec" > - quite long > - not a noun > > Why not "scope"? Or something like that? > It means special optimization to PMD. "scope" is more rogue. > > > > By default, there is no hint, so the behavior of the transfer domain > > doesn't change. > > There is no guarantee that the hint will be used by the PMD. > > > > Signed-off-by: Rongwei Liu <rongweil at nvidia.com> > > Acked-by: Ori Kam <orika at nvidia.com> > > > > v2: Move the new field to template table attribute. > > v4: Mark it as optional and clear the concept. > > v5: Change specialize type to uint32_t. > > v6: Change the flags to macros and re-construct the commit log. > > v7: Fix build failure. > > --- > > app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++ > > doc/guides/prog_guide/rte_flow.rst | 15 +++++++++++ > > doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++- > > lib/ethdev/rte_flow.h | 28 +++++++++++++++++++++ > > 4 files changed, 71 insertions(+), 1 deletion(-) > > > > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c > > index 88108498e0..62197f2618 100644 > > --- a/app/test-pmd/cmdline_flow.c > > +++ b/app/test-pmd/cmdline_flow.c > > @@ -184,6 +184,8 @@ enum index { > > TABLE_INGRESS, > > TABLE_EGRESS, > > TABLE_TRANSFER, > > + TABLE_TRANSFER_WIRE_ORIG, > > + TABLE_TRANSFER_VPORT_ORIG, > > TABLE_RULES_NUMBER, > > TABLE_PATTERN_TEMPLATE, > > TABLE_ACTIONS_TEMPLATE, > > @@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = { > > TABLE_INGRESS, > > TABLE_EGRESS, > > TABLE_TRANSFER, > > + TABLE_TRANSFER_WIRE_ORIG, > > + TABLE_TRANSFER_VPORT_ORIG, > > TABLE_RULES_NUMBER, > > TABLE_PATTERN_TEMPLATE, > > TABLE_ACTIONS_TEMPLATE, > > @@ -2933,6 +2937,18 @@ static const struct token token_list[] = { > > .next = NEXT(next_table_attr), > > .call = parse_table, > > }, > > + [TABLE_TRANSFER_WIRE_ORIG] = { > > + .name = "wire_orig", > > + .help = "affect rule direction to transfer", > > + .next = NEXT(next_table_attr), > > + .call = parse_table, > > + }, > > + [TABLE_TRANSFER_VPORT_ORIG] = { > > + .name = "vport_orig", > > + .help = "affect rule direction to transfer", > > + .next = NEXT(next_table_attr), > > + .call = parse_table, > > + }, > > [TABLE_RULES_NUMBER] = { > > .name = "rules_number", > > .help = "number of rules in table", @@ -8993,6 +9009,16 > > @@ parse_table(struct context *ctx, const struct token *token, > > case TABLE_TRANSFER: > > out->args.table.attr.flow_attr.transfer = 1; > > return len; > > + case TABLE_TRANSFER_WIRE_ORIG: > > + if (!out->args.table.attr.flow_attr.transfer) > > + return -1; > > + out->args.table.attr.specialize = > > RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG; > > + return len; > > + case TABLE_TRANSFER_VPORT_ORIG: > > + if (!out->args.table.attr.flow_attr.transfer) > > + return -1; > > + out->args.table.attr.specialize = > > RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG; > > + return len; > > default: > > return -1; > > } > > diff --git a/doc/guides/prog_guide/rte_flow.rst > > b/doc/guides/prog_guide/rte_flow.rst > > index 3e6242803d..d9ca041ae4 100644 > > --- a/doc/guides/prog_guide/rte_flow.rst > > +++ b/doc/guides/prog_guide/rte_flow.rst > > @@ -3605,6 +3605,21 @@ and pattern and actions templates are created. > > &actions_templates, nb_actions_templ, > > &error); > > > > +Table Attribute: Specialize > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > + > > +Application can help optimizing underlayer resources and insertion > > +rate by specializing template table. > > +Specialization is done by providing hints in the template table > > +attribute ``specialize``. > > + > > +This attribute is not mandatory for each PMD to implement. > > +If a hint is not supported, it will be silently ignored, and no > > +special optimization is done. > > Silently ignoring the field does not sit well with the application's possible > intent > to drop represented_port match from the patterns. From my point of view, if > the application sets this attribute, it believes it can rely on it, that is, > packets > coming from host won't match if the attribute asks to match network only, for > instance. Has this been considered? > > > + > > +If a table is specialized, the application should make sure the rules > > +comply with the table attribute. > > How does the application enforce that? I would appreciate you explain it. > > > + > > Asynchronous operations > > ----------------------- > > > > diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst > > b/doc/guides/testpmd_app_ug/testpmd_funcs.rst > > index 96c5ae0fe4..b3238415f4 100644 > > --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst > > +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst > > @@ -3145,7 +3145,8 @@ It is bound to > ``rte_flow_template_table_create()``:: > > > > flow template_table {port_id} create > > [table_id {id}] [group {group_id}] > > - [priority {level}] [ingress] [egress] [transfer] > > + [priority {level}] [ingress] [egress] > > + [transfer [vport_orig] [wire_orig]] > > rules_number {number} > > pattern_template {pattern_template_id} > > actions_template {actions_template_id} diff --git > > a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index > > 8858b56428..c27b48c5c1 100644 > > --- a/lib/ethdev/rte_flow.h > > +++ b/lib/ethdev/rte_flow.h > > @@ -5186,6 +5186,29 @@ rte_flow_actions_template_destroy(uint16_t > > port_id, */ struct rte_flow_template_table; > > > > +/**@{@name Special optional flags for template table attribute > > + * Each bit is a hint for table specialization, > > + * offering a potential optimization at PMD layer. > > + * PMD can ignore the unsupported bits silently. > > + */ > > +/** > > + * Specialize table for transfer flows which come only from wire. > > + * It allows PMD not to allocate resources for non-wire originated traffic. > > + * This bit is not a matching criteria, just an optimization hint. > > You intended to spell "criterion", I take it. And still, it *is* a match > criterion. > I'm not denying the possible need to have this criterion at the earliest > processing stage. That might be OK, but I still have a hunch that this is too > specific. > Please see my comment above about wasting memory. > I guess this type of criterion is not the only one that may need to be > provided > as a "hint". > > > + * Flow rules which match non-wire originated traffic will be missed > > + * if the hint is supported. > > And what if it's unsupported? Is it indeed OK to silently ignore it? > > > + */ > > +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG > RTE_BIT32(0) > > Why not RTE_FLOW_TABLE_SCOPE_FROM_WIRE ? > > To me, TRANSFER looks redundant as this bit is already supposed to be ticked > in the "struct rte_flow_attr flow_attr" field of the "struct > rte_flow_template_table_attr". > > > +/** > > + * Specialize table for transfer flows which come only from vport > > +(e.g. VF, > > SF). > > And PF? > > > + * It allows PMD not to allocate resources for non-vport originated > > traffic. > > + * This bit is not a matching criteria, just an optimization hint. > > + * Flow rules which match non-vport originated traffic will be missed > > + * if the hint is supported. > > + */ > > +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG > RTE_BIT32(1) > > Why not RTE_FLOW_TABLE_SCOPE_FROM_HOST ? > > > +/**@}*/ > > + > > /** > > * @warning > > * @b EXPERIMENTAL: this API may change without prior notice. > > @@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr { > > * Maximum number of flow rules that this table holds. > > */ > > uint32_t nb_flows; > > + /** > > + * Optional hint flags for PMD optimization. > > + * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*. > > + */ > > + uint32_t specialize; > > Why not "scope" or something? > > > }; > > > > /** > > -- > > 2.27.0 > > > > Thank you.