Hi Rongwei,

For my responses, PSB.

By the way, now you mention things like wasting memory and insertion
optimisastions, are there any comparative figures to see the effect
of this hint on insertion performance / memory footprint?
Some "before" / "after" examples would really be helpful.

After all, I'm not objecting this patch. But I believe that other
reviewers' concerns should nevertheless be addressed anyway.

On Mon, 30 Jan 2023, Rongwei Liu wrote:

Hi Ivan,

BR
Rongwei

-----Original Message-----
From: Ivan Malov <ivan.ma...@arknetworks.am>
Sent: Monday, January 30, 2023 08:00
To: Rongwei Liu <rongw...@nvidia.com>
Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko
<viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact-
Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; Aman Singh
<aman.deep.si...@intel.com>; Yuying Zhang <yuying.zh...@intel.com>;
Ferruh Yigit <ferruh.yi...@amd.com>; Andrew Rybchenko
<andrew.rybche...@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
<rasl...@nvidia.com>
Subject: Re: [PATCH v7] ethdev: add special flags when creating async transfer
table

External email: Use caution opening links or attachments


Hi Rongwei,

Thanks for persevering. I have no strong opinion, but, at least, the fact that 
the
new flags are no longer meant for use in rte_flow_attr, which is clearly not
the right place for such, is an improvement.

Thanks for the suggestion, move it to rte_flow_table_attr now and it' dedicated 
to async API.
However, let's take a closer look at the current patch, shall we?

But, before we get to that, I'd like to kindly request that you provide a more
concrete example of how this feature is supposed to be used. Are there some
real-life application examples?

Sure.
Also, to me, it's still unclear how an application can obtain the knowledge of
this hint in the first instance. For example, can Open vSwitch somehow tell
ethdevs representing physical ports from ones representing "vports" (host
endpoints)?
How does it know which attribute to specify?

Hint should be initiated by application and application knows it' traffic 
pattern which highly relates to deployment.
Let' use VxLAN encap/decap as an example:
1. Traffic from wire should be VxLAN pattern and do the decap, then send to 
different vports.
flow pattern_template 0 create transfer relaxed no pattern_template_id 4 
template represented_port ethdev_port_id is 0 / eth / ipv4 / udp / vxlan / tag 
index is 0 data is 0x33 / end
flow actions_template 0 create transfer actions_template_id 4 template 
raw_decap index 0 / represented_port ethdev_port_id 1 / end mask raw_decap 
index 0 / represented_port ethdev_port_id 1 / end
flow template_table 0 create group 1 priority 0 transfer wire_orig table_id 4 
rules_number 128 pattern_template 4 actions_template 4

2. Traffic from vports should be encap with different VxLAN header and send to 
wire.
flow actions_template 1 create transfer actions_template_id 5 template 
raw_encap index 0 / represented_port ethdev_port_id 0 / end mask raw_encap 
index 0 / represented_port ethdev_port_id 0 / end
flow template_table 0 create group 1 priority 0 transfer vport_orig table_id 5 
rules_number 128 pattern_template 4 actions_template 5

For the rest of my notes, PSB.

On Mon, 14 Nov 2022, Rongwei Liu wrote:

In case flow rules match only one kind of traffic in a flow table,
then optimization can be done via allocation of this table.

This wording might confuse readers. Consider rephrasing it, please:
If multiple flow rules share a common set of match masks, then they might
belong in a flow table which can be pre-allocated.

Such optimization is possible only if the application gives a hint
about its usage of the table during initial configuration.

The transfer domain rules may process traffic from wire or vport,
which may correspond to two kinds of underlayer resources.

Why name it a "vport"? Why not "host"?

host = packets generated by any of the host's "vport"s wire = packets arriving
at the NIC from the network
Vport is "virtual port" for short and contains "VF/SF" for now.
Per my thoughts, it' clearer and maps to DPDK port probing/management.

I understand that "host" might not be a brilliant name.

If "vport" stands for every port of the NIC that is not a network port,
then this name might be OK to me, but why doesn't it cover PFs? A PF is
clearly not a network / physical port. Why just VF/SF then? Where does
that "for now" decision come from? Just wondering.


That's why the first two hints introduced in this patch are about wire
and vport traffic specialization.
Wire means traffic arrives from the uplink port while vport means
traffic initiated from VF/SF.

By the sound of it, the meaning is confined to just VFs/SFs.
What if the user wants to match packets coming from PFs?

It should be "wire_orig".

Forgive me, but that does not sound correct. Say, there's an application
and it has a PF plugged into it: ethdev index 0. And the application
transmits packets using rte_eth_tx_burst() from that port.
You say that these packets can be matched via "wire_orig".
But they do not come from the wire. They come from PF...


There are two possible approaches for providing the hints.
Using IPv4 as an example:
1. Use pattern item in both template table and flow rules.

 pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end

 "ANY_VPORT" needs to be present in each flow rule even if it's  just
a hint. No value to match because matching is already done by
 IPv4 item.

Why no value to match on? How does it prevent rogue tenants from spoofing
network headers? If the application receives a packet on a particular vport's
representor, then it may strictly specify item represented_port pointing to that
vport so that only packets from that vport match.

Why isn't security a consideration?

There is some misunderstanding here.  "ANY_VPORT" is the approach (new matching 
item without value)  suggested by you.
I'm not talking about ANY_VPORT in this particular paragraph.

There's item "represented_port" mentioned over there. I'm just asking
about this "already done by IPv4 item" bit. Yes, it matches on the
header but not on the true origin of the packet (the logical port
of the NIC). If the app knows which logical port the packet
ingresses the NIC, why not match on it for security?

I was explaining we need to apply it to each flow rule even if it's only a flag 
and no value.

That's clear. But PSB.


2. Add special flags into table_attr.

 template_table 0 create table_id 0 group 1 transfer vport_orig

Approach 1 needs to specify the pattern in each flow rule which wastes
memory and is not user friendly.

What if the user has to insert a group of rules which not only have the same
set of match masks but also share exactly the same match spec values for a
limited subset of network items (for example, those of an encap. header)? This
way, a subset of network item specs can remain fixed across many rules. Does
that count as wasting memory?

Per my understanding, you are talking "multiple spec and mask mixing".

Say, there's a group of rules, and each of them matches on exactly
the same encap. header (the same in all rules), but different
internal match field values. So, why don't these "fixed"
encap. header items deserve being "optimised" somehow,
the same way as this "wire_orig" does?

If the application knows for sure that there will be packets with exactly
the same encap. header, - that forms this special knowledge that can be
used during init times to help the PMD optimise resource allocation.
Isn't that so? Don't these items deserve some form of a "hint"?

We provide a hint in this patch and no assumption on the matching patterns.

So I understand. My point is, certain portions of matching patterns
may be "fixed" = entirely the same (masks and specs) in all rules
of a table. Why not give PMD a "hint" about them, too?

I think matching pattern is totally controlled by application layer.

So is the "direction" spec: the app layer has item represented_port
to control that. But, still, we're here to discuss a hint for that.
Why does the new hint aim exclusively at optimising out this
specific meta item? Why isn't it possible to care about a
generic portion of "know in advance" all-the-same items?

"wasting memory " because your approach needs to scatter in each rule while 
this patch only needs to set table_attr once.
No relation with matching patter totally.

The slight problem with your proposal is that for some reason only
one type of a match criterion deserves a hint moved to the attrs.
Whilst in reality the applicaction may know in advance that
certain subsets of items will not only have the same masks
in all rules but also totally the same specs. If that is
a valid use case, why doesn't it deserve the same (more
generic) optimisation / a hint? Just wondering...
Or has that been addressed already somehow?

If yes, then the problem does not concern just a single pair of attributes, but
rather deserves a more versatile solution like some sort of indirect grouping of
constant item specs.
Have you considered such options?
See above.

This patch takes the 2nd approach and introduces one new member
"specialize" into rte_flow_table_attr to indicate possible flow table
optimization.

The name "specialize" might have some drawbacks:
- spelling difference (specialise/specialize)
- in grep output, will mix with flows' "spec"
- quite long
- not a noun

Why not "scope"? Or something like that?

It means special optimization to PMD. "scope" is more rogue.

Why is it "rogue"? Scope is something limiting the point of view.
So are the suggested flags. Flag "wire_origin" (or whatever it
can be named eventually) limits the scope of matching. No?


By default, there is no hint, so the behavior of the transfer domain
doesn't change.
There is no guarantee that the hint will be used by the PMD.

Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
Acked-by: Ori Kam <orika at nvidia.com>

v2: Move the new field to template table attribute.
v4: Mark it as optional and clear the concept.
v5: Change specialize type to uint32_t.
v6: Change the flags to macros and re-construct the commit log.
v7: Fix build failure.
---
app/test-pmd/cmdline_flow.c                 | 26 +++++++++++++++++++
doc/guides/prog_guide/rte_flow.rst          | 15 +++++++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst |  3 ++-
lib/ethdev/rte_flow.h                       | 28 +++++++++++++++++++++
4 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 88108498e0..62197f2618 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -184,6 +184,8 @@ enum index {
      TABLE_INGRESS,
      TABLE_EGRESS,
      TABLE_TRANSFER,
+     TABLE_TRANSFER_WIRE_ORIG,
+     TABLE_TRANSFER_VPORT_ORIG,
      TABLE_RULES_NUMBER,
      TABLE_PATTERN_TEMPLATE,
      TABLE_ACTIONS_TEMPLATE,
@@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
      TABLE_INGRESS,
      TABLE_EGRESS,
      TABLE_TRANSFER,
+     TABLE_TRANSFER_WIRE_ORIG,
+     TABLE_TRANSFER_VPORT_ORIG,
      TABLE_RULES_NUMBER,
      TABLE_PATTERN_TEMPLATE,
      TABLE_ACTIONS_TEMPLATE,
@@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
              .next = NEXT(next_table_attr),
              .call = parse_table,
      },
+     [TABLE_TRANSFER_WIRE_ORIG] = {
+             .name = "wire_orig",
+             .help = "affect rule direction to transfer",
+             .next = NEXT(next_table_attr),
+             .call = parse_table,
+     },
+     [TABLE_TRANSFER_VPORT_ORIG] = {
+             .name = "vport_orig",
+             .help = "affect rule direction to transfer",
+             .next = NEXT(next_table_attr),
+             .call = parse_table,
+     },
      [TABLE_RULES_NUMBER] = {
              .name = "rules_number",
              .help = "number of rules in table", @@ -8993,6 +9009,16
@@ parse_table(struct context *ctx, const struct token *token,
      case TABLE_TRANSFER:
              out->args.table.attr.flow_attr.transfer = 1;
              return len;
+     case TABLE_TRANSFER_WIRE_ORIG:
+             if (!out->args.table.attr.flow_attr.transfer)
+                     return -1;
+             out->args.table.attr.specialize =
RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG;
+             return len;
+     case TABLE_TRANSFER_VPORT_ORIG:
+             if (!out->args.table.attr.flow_attr.transfer)
+                     return -1;
+             out->args.table.attr.specialize =
RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG;
+             return len;
      default:
              return -1;
      }
diff --git a/doc/guides/prog_guide/rte_flow.rst
b/doc/guides/prog_guide/rte_flow.rst
index 3e6242803d..d9ca041ae4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
                  &actions_templates, nb_actions_templ,
                  &error);

+Table Attribute: Specialize
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Application can help optimizing underlayer resources and insertion
+rate by specializing template table.
+Specialization is done by providing hints in the template table
+attribute ``specialize``.
+
+This attribute is not mandatory for each PMD to implement.
+If a hint is not supported, it will be silently ignored, and no
+special optimization is done.

Silently ignoring the field does not sit well with the application's possible 
intent
to drop represented_port match from the patterns. From my point of view, if
the application sets this attribute, it believes it can rely on it, that is, 
packets
coming from host won't match if the attribute asks to match network only, for
instance. Has this been considered?

+
+If a table is specialized, the application should make sure the rules
+comply with the table attribute.

How does the application enforce that? I would appreciate you explain it.

+
Asynchronous operations
-----------------------

diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 96c5ae0fe4..b3238415f4 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3145,7 +3145,8 @@ It is bound to
``rte_flow_template_table_create()``::

  flow template_table {port_id} create
      [table_id {id}] [group {group_id}]
-       [priority {level}] [ingress] [egress] [transfer]
+       [priority {level}] [ingress] [egress]
+       [transfer [vport_orig] [wire_orig]]
      rules_number {number}
      pattern_template {pattern_template_id}
      actions_template {actions_template_id} diff --git
a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
8858b56428..c27b48c5c1 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -5186,6 +5186,29 @@ rte_flow_actions_template_destroy(uint16_t
port_id, */ struct rte_flow_template_table;

+/**@{@name Special optional flags for template table attribute
+ * Each bit is a hint for table specialization,
+ * offering a potential optimization at PMD layer.
+ * PMD can ignore the unsupported bits silently.
+ */
+/**
+ * Specialize table for transfer flows which come only from wire.
+ * It allows PMD not to allocate resources for non-wire originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.

You intended to spell "criterion", I take it. And still, it *is* a match 
criterion.
I'm not denying the possible need to have this criterion at the earliest
processing stage. That might be OK, but I still have a hunch that this is too
specific.
Please see my comment above about wasting memory.
I guess this type of criterion is not the only one that may need to be provided
as a "hint".

+ * Flow rules which match non-wire originated traffic will be missed
+ * if the hint is supported.

And what if it's unsupported? Is it indeed OK to silently ignore it?

+ */
+#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG
RTE_BIT32(0)

Why not RTE_FLOW_TABLE_SCOPE_FROM_WIRE ?

To me, TRANSFER looks redundant as this bit is already supposed to be ticked
in the "struct rte_flow_attr flow_attr" field of the "struct
rte_flow_template_table_attr".

+/**
+ * Specialize table for transfer flows which come only from vport
+(e.g. VF,
SF).

And PF?

+ * It allows PMD not to allocate resources for non-vport originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-vport originated traffic will be missed
+ * if the hint is supported.
+ */
+#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG
RTE_BIT32(1)

Why not RTE_FLOW_TABLE_SCOPE_FROM_HOST ?

+/**@}*/
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice.
@@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr {
       * Maximum number of flow rules that this table holds.
       */
      uint32_t nb_flows;
+     /**
+      * Optional hint flags for PMD optimization.
+      * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*.
+      */
+     uint32_t specialize;

Why not "scope" or something?

};

/**
--
2.27.0


Thank you.

Reply via email to