RE: [PATCH v1] ethdev: add direction info when creating the transfer table

Ivan Malov Wed, 14 Sep 2022 00:32:18 -0700

Hi,

On Wed, 14 Sep 2022, Rongwei Liu wrote:

HI

BR
Rongwei

-----Original Message-----
From: Ivan Malov <ivan.ma...@oktetlabs.ru>
Sent: Tuesday, September 13, 2022 22:33
To: Rongwei Liu <rongw...@nvidia.com>
Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko
<viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact-
Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; Aman Singh
<aman.deep.si...@intel.com>; Yuying Zhang <yuying.zh...@intel.com>;
Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>; dev@dpdk.org; Raslan
Darawsheh <rasl...@nvidia.com>
Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer
table

External email: Use caution opening links or attachments


Hi Rongwei,

PSB

On Tue, 13 Sep 2022, Rongwei Liu wrote:

Hi

BR
Rongwei

-----Original Message-----
From: Ivan Malov <ivan.ma...@oktetlabs.ru>
Sent: Tuesday, September 13, 2022 00:57
To: Rongwei Liu <rongw...@nvidia.com>
Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko
<viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact-
Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; Aman Singh
<aman.deep.si...@intel.com>; Yuying Zhang <yuying.zh...@intel.com>;
Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>; dev@dpdk.org;
Raslan Darawsheh <rasl...@nvidia.com>
Subject: Re: [PATCH v1] ethdev: add direction info when creating the
transfer table

External email: Use caution opening links or attachments


Hi,

On Wed, 7 Sep 2022, Rongwei Liu wrote:

The transfer domain rule is able to match traffic wire/vf origin and
it means two directions' underlayer resource.


The point of fact is that matching traffic coming from some entity
like wire / VF has been long generalised in the form of representors.
So, a flow rule with attribute "transfer" is able to match traffic
coming from either a REPRESENTED_PORT or from a PORT_REPRESENTOR

(please find these items).


In customer deployments, they usually match only one direction
traffic in single flow table: either from wire or from vf.


Which customer deployments? Could you please provide detailed examples?


We saw a lot of customers' deployment like:
1. Match overlay traffic from wire and do decap, then send to specific vport.
2. Match specific 5-tuples and do encap, then send to wire.
The matching criteria has obvious direction preference.


Thank you. My questions are as follows:

In (1), when you say "from wire", do you mean the need to match packets
arriving via whatever physical ports rather then matching packets arriving
from some specific phys. port?


^^

Could you please find my question above? Based on your understanding
of templates in async flow approach, an answer to this question may
help us find the common ground.

--


If, however, matching traffic "from wire" in fact means matching packets
arriving from a *specific* physical port, then for sure item
REPRESENTED_PORT should perfectly do the job, and the proposed attribute is
unneeded.

(BTW, in DPDK, it is customary to use term "physical port", not "wire")

In (1), what are "vport"s? Please explain. Once again, I should remind that, in
DPDK, folks prefer terms "represented entity" / "representor"
over vendor-specific terms like "vport", etc.

Vport is virtual port for short such as VF.


Thanks. As I say, term "vport" might be confusing to some readers,
so it'd be better to provide this explanation (about VF)
in the commit description next time.

As for (2), imagine matching 5-tuple traffic emitted by a VF / guest.
Could you please explain, why not just add a match item REPRESENTED_PORT
pointing to that VF via its representor? Doing so should perfectly define the
exact direction / traffic source. Isn't that sufficient?

Per my view, there is matching field and matching value difference.
Like IPv4 src_addr 1.1.1.1, 1.1.1.2. 1.1.1.3, will you treat it as same or 
different matching criteria?
I would like to call them same since it can be summarized like 1.1.1.0/30
REPRESENTED_PORT is just another matching item, no essential differences and it 
can't stand for direction info.


It looks like we're starting to run into disagreement here.
There's no "direction" at all. There's an embedded switch
inside the NIC, and there're (logical) switch ports that
packets enter the switch from.

When the user submits a "transfer" rule and does not provide
neither REPRESENTED_PORT nor PORT_REPRESENTOR in the pattern,
the embedded switch is supposed to match packets coming from
ANY ports, be it VFs or physical (wire) ports.

But when the user provides, in example, item REPRESENTED_PORT
to point to the physical (wire) port, the embedded switch
knows exactly which port the packets should enter it from.
In this case, it is supposed to match only packets coming
from that physical port. And this should be sufficient.
This in fact replaces the need to know a "direction".
It's just an exact specification of packet's origin.

Port id depends on the attach sequence.


Unfortunately, this is hardly a good argument because flow rules
are supposed to be inserted based on the run-time packet
learning. Attach sequence is a don't care here.

Also please mind that, although I appreciate your explanations here, on the
mailing list, they should finally be added to the commit message, so that
readers do not have to look for them elsewhere.

We have explained the high possibility of single-direction matching, right?


Not quite. As I said, it is not correct to assume any "direction", like in
geographical sense ("north", "south", etc.). Application has ethdevs, and
they are representors of some "virtual ports" (in your terminology)
belonging to the switch, for example, VFs, SFs or physical ports.

The user adds an appropriate item to the pattern (REPRESENTED_PORT),
and doing so specifies the packet path which it enters the switch.

It' hard to list all the possibilities of traffic matching preferences.


And let's say more: one need never do this. That's exactly the reason
why DPDK has abandoned the concept of "direction" in *transfer* rules
and switched to the use of precise criteria (REPRESENTED_PORT, etc.).

The underlay is the one we have met for now.

Introduce one new member transfer_mode into rte_flow_attr to
indicate the flow table direction property: from wire, from vf or
bi-direction(default).


AFAIK, 'rte_flow_attr' serves both traditional flow rule insertion
and asynchronous (table) approach. The patch adds the attributes to
generic 'rte_flow_attr' but, for some reason, ignores non-table rules.

Sync API uses one rule to contain everything. It' hard for PMD to determine

if this rule has direction preference or not.

Image a situation, just for an example:
1. Vport 1 VxLAN do decap send to vport 2.     1 million scale
2. Vport 0 (wire) VxLAN do decap send to vport 3.   1 hundred scale.
1 and 2 share the same matching conditions (eth / ipv4 / udp / vxlan /...), so

sync API consider them share matching determination logic.

It means "2" have 1M scale capability too. Obviously, it wastes a lot of

resources.

Strictly speaking, they do not share the same match pattern.
Your example clearly shows that, in (1), the pattern should request packets
coming from "vport 1" and, in (2), packets coming from "vport 0".

My point is simple: the "vport" from which packets enter the embedded switch
is ALSO a match criterion. If you accept this, you'll see: the matching
conditions differ.

See above.
In this case, I think the matching fields are both "port_id + ipv4_vxlan". They 
are same.
Only differs with values like vni 100 or 200 vice versa.


Not quite. Look closer: you use *different* port IDs for (1) and (2).
The value of "ethdev_id" field in item REPRESENTED_PORT differs.


In async API, there is pattern_template introduced. We can mark "1" to use

pattern_tempate id 1 and "2" to use pattern_template 2.

They will be separated from each other, don't share anymore.


Consider an example. "Wire" is a physical port represented by PF0 which, in
turn, is attached to DPDK via ethdev 0. "VF" (vport?) is attached to guest and 
is
represented by a representor ethdev 1 in DPDK.

So, some rules (template 1) are needed to deliver packets from "wire"
to "VF" and also decapsulate them. And some rules (template 2) are needed to
deliver packets in the opposite direction, from "VF"
to "wire" and also encapsulate them.

My question is, what prevents you from adding match item
REPRESENTED_PORT[ethdev_id=0] to the pattern template 1 and
REPRESENTED_PORT[ethdev_id=1] to the pattern template 2?

As I said previously, if you insert such item before eth / ipv4 / etc to your
match pattern, doing so defines an *exact* direction / source.

Could you check the async API guidance? I think pattern template focusing on 
the matching field (mask).
"REPRESENTED_PORT[ethdev_id=0] " and "REPRESENTED_PORT[ethdev_id=1] "are the 
same.
1. pattern  template:  REPRESENTED_PORT mask 0xffff ...
2. action template: action1 / actions2. /
3. table create with pattern_template plus action template..
REPRESENTED_PORT[ethdev_id=0]  will be rule1:  rule create REPRESENTED_PORT 
port_id is 0 / actions ....
REPRESENTED_PORT[ethdev_id=1]  will be rule2:  rule create REPRESENTED_PORT 
port_id is 1 / actions ....


OK, so, based on this explanation, it appears that
you might be looking to refer to:
a) a *set* of any physical (wire) ports
b) a *set* of any guest ports (VFs)

You chose to achieve this using an attribute, but:

1) as I explained above, the use of term "direction" is wrong;
   please hear me out: I'm not saying that your use case and
   your optimisation is wrong: I'm saying that naming for it
   is wrong: it has nothing to do with "direction";

2) while naming a *set* of wire ports as "wire_orig" might be OK,
   sticking with term "vf_orig" for a *set* of guest ports is
   clearly not, simply because the user may pass another PF
   to a guest instead of passing a VF; in other words,
   a better term is needed here;

3) since it is possible to plug multiple NICs to a DPDK application,
   even from different vendors, the user may end up having multiple
   physical ports belonging to different physical NICs attached to
   the application; if this is the case, then referring to a *set*
   of wire ports using the new attribute is ambiguous in the
   sense that it's unclear whether this applies only to
   wire ports of some specific physical NIC or to the
   physical ports of *all* NICs managed by the app;

4) adding an attribute instead of yet another pattern item type
   is not quite good because PMDs need to be updated separately
   to detect this attribute and throw an error if it's not
   supported, whilst with a new item type, the PMDs do not
   need to be updated = if a PMD sees an unsupported item
   while traversing the item with switch () { case }, it
   will anyway throw an error;

5) as in (4), a new attribute is not good from documentation
   standpoint; plase search for "represented_port = Y" in
   documentation = this way, all supported items are
   easily defined for various NIC vendors, but the
   same isn't true for attributes = there is no
   way to indicate supported attributes in doc.

If points (1 - 5) make sense to you, then, if I may be so bold,
I'd like to suggest that the idea of adding a new attribute be
abandoned. Instead, I'd like to suggest adding new items:

(the names are just sketch, for sure, it should be discussed)

ANY_PHY_PORTS { switch_domain_id }
 = match packets entering the embedded switch from *whatever*
   physical ports belonging to the given switch domain

ANY_GUEST_PORTS { switch_domain_id }
 = match packets entering the embedded switch from *whatever*
   guest ports (VFs, PFs, etc.) belonging to the given
   switch domain

The field "switch_domain_id" is required to tell one physical
board / vendor from another (as I explained in point (3)).
The application can query this parameter from ethdev's
switch info: please see "struct rte_eth_switch_info".

What's your opinion?

For example, the diff below adds the attributes to "table" commands
in testpmd but does not add them to regular (non-table) commands like
"flow create". Why?


"table" command limits pattern_template to single direction or bidirection

per user specified attribute.

As I say above, the same effect can be achieved by adding item
REPRESENTED_PORT to the corresponding pattern template.

See above.

"rule" command must tight with one "table_id", so the rule will inherit the

"table" direction property, no need to specify again.

You migh've misunderstood. I do not talk about "rule" command coupled with
some "table". What I talk about is regular, NON-async flow insertion
commands.

Please take a look at section "/* Validate/create attributes. */" in file
"app/test-pmd/cmdline_flow.c". When one adds a new flow attribute, they
should reflect it the same way as VC_INGRESS, VC_TRANSFER, etc.

That's it.

We don't intend to pass this to sync API. The above code example is for sync 
API.


So I understand. But there's one slight problem: in your patch, you add
the new attributes to the structure which is *shared* between sync and
async use case scenarios. If one adds an attribute to this structure,
they have to provide accessors for it in all sync-related commands
in testpmd, but your patch does not do that.

In other words, it is wrong to assume that "struct rte_flow_attr" only
applies to async approach. It had been introduced long before the
async flow design was added to DPDK. That's it.


But, as I say, I still believe that the new attributes aren't needed.

I think we are not at the same page for now. Can we reach agreement on the same
matching criteria first?

It helps to save underlayer memory also on insertion rate.


Which memory? Host memory? NIC memory? Term "underlayer" is vague.
I suggest that the commit message be revised to first explain how
such memory is spent currently, then explain why this is not optimal
and, finally, which way the patch is supposed to improve that. I.e. be more

specific.


For large scalable rules, HW (depends on implementation) always needs

memory to hold the rules' patterns and actions, either from NIC or from host.

The memory footprint highly depends on "user rules' complexity", also diff

between NICs.

~50% memory saving is expected if one-direction is cut.


Regardless of this talk, this explanation should probably be present in the
commit description.

This number may differ with different NICs or implementation. We can't say it 
for sure.


Not an exact number, of course, but a brief explanation of:
a) what is wrong / not optimal in the current design;
b) how it is observed in customer deployments;
c) why the proposed patch is a good solution.

By default, the transfer domain is bi-direction, and no behavior changes.

1. Match wire origin only
 flow template_table 0 create group 0 priority 0 transfer wire_orig...
2. Match vf origin only
 flow template_table 0 create group 0 priority 0 transfer vf_orig...

Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
---
app/test-pmd/cmdline_flow.c                 | 26 +++++++++++++++++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst |  3 ++-
lib/ethdev/rte_flow.h                       |  9 ++++++-
3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c
b/app/test-pmd/cmdline_flow.c index 7f50028eb7..b25b595e82 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -177,6 +177,8 @@ enum index {
      TABLE_INGRESS,
      TABLE_EGRESS,
      TABLE_TRANSFER,
+     TABLE_TRANSFER_WIRE_ORIG,
+     TABLE_TRANSFER_VF_ORIG,
      TABLE_RULES_NUMBER,
      TABLE_PATTERN_TEMPLATE,
      TABLE_ACTIONS_TEMPLATE,
@@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
      TABLE_INGRESS,
      TABLE_EGRESS,
      TABLE_TRANSFER,
+     TABLE_TRANSFER_WIRE_ORIG,
+     TABLE_TRANSFER_VF_ORIG,
      TABLE_RULES_NUMBER,
      TABLE_PATTERN_TEMPLATE,
      TABLE_ACTIONS_TEMPLATE,
@@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
              .next = NEXT(next_table_attr),
              .call = parse_table,
      },
+     [TABLE_TRANSFER_WIRE_ORIG] = {
+             .name = "wire_orig",
+             .help = "affect rule direction to transfer",


This does not explain the "wire" aspect. It's too broad.

+             .next = NEXT(next_table_attr),
+             .call = parse_table,
+     },
+     [TABLE_TRANSFER_VF_ORIG] = {
+             .name = "vf_orig",
+             .help = "affect rule direction to transfer",


This explanation simply duplicates such of the "wire_orig".
It does not explain the "vf" part. Should be more specific.

+             .next = NEXT(next_table_attr),
+             .call = parse_table,
+     },
      [TABLE_RULES_NUMBER] = {
              .name = "rules_number",
              .help = "number of rules in table", @@ -8894,6
+8910,16 @@ parse_table(struct context *ctx, const struct token *token,
      case TABLE_TRANSFER:
              out->args.table.attr.flow_attr.transfer = 1;
              return len;
+     case TABLE_TRANSFER_WIRE_ORIG:
+             if (!out->args.table.attr.flow_attr.transfer)
+                     return -1;
+             out->args.table.attr.flow_attr.transfer_mode = 1;
+             return len;
+     case TABLE_TRANSFER_VF_ORIG:
+             if (!out->args.table.attr.flow_attr.transfer)
+                     return -1;
+             out->args.table.attr.flow_attr.transfer_mode = 2;
+             return len;
      default:
              return -1;
      }
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 330e34427d..603b7988dd 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3332,7 +3332,8 @@ It is bound to

``rte_flow_template_table_create()``::


  flow template_table {port_id} create
      [table_id {id}] [group {group_id}]
-       [priority {level}] [ingress] [egress] [transfer]
+       [priority {level}] [ingress] [egress]
+       [transfer [vf_orig] [wire_orig]]


Is it correct? Shouldn't it rather be [transfer] [vf_orig]
[wire_orig] ?

      rules_number {number}
      pattern_template {pattern_template_id}
      actions_template {actions_template_id} diff --git
a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
a79f1e7ef0..512b08d817 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -130,7 +130,14 @@ struct rte_flow_attr {
       * through a suitable port. @see rte_flow_pick_transfer_proxy().
       */
      uint32_t transfer:1;
-     uint32_t reserved:29; /**< Reserved, must be zero. */
+     /**
+      * 0 means bidirection,
+      * 0x1 origin uplink,


What does "uplink" mean? It's too vague. Hardly a good term.


I believe this comment should be reworked, in case
the idea of having an extra attribute persists.

+      * 0x2 origin vport,


What does "origin vport" mean? Hardly a good term as well.


I still believe this explanation is way too brief and needs
to be reworked to provide more details, to define the
use case for the attribute more specifically.

+      * N/A both set.


What's this?


The question stands.

+      */
+     uint32_t transfer_mode:2;
+     uint32_t reserved:27; /**< Reserved, must be zero. */
};

/**
--
2.27.0


Since the attributes are added to generic 'struct rte_flow_attr',
non-table
(synchronous) flow rules are supposed to support them, too. If that
is indeed the case, then I'm afraid such proposal does not agree with
the existing items PORT_REPRESENTOR and REPRESENTED_PORT. They do
exactly the same thing, but they are designed to be way more generic. Why

not use them?

The question stands.


Ivan


Ivan

RE: [PATCH v1] ethdev: add direction info when creating the transfer table

Reply via email to