Insertion rate has very very few improvements.
After all, I'm not objecting this patch. But I believe that other reviewers'
concerns should nevertheless be addressed anyway.
Let me try to show the hint is useful.
On Mon, 30 Jan 2023, Rongwei Liu wrote:
Hi Ivan,
BR
Rongwei
-----Original Message-----
From: Ivan Malov <ivan.ma...@arknetworks.am>
Sent: Monday, January 30, 2023 08:00
To: Rongwei Liu <rongw...@nvidia.com>
Cc: Matan Azrad <ma...@nvidia.com>; Slava Ovsiienko
<viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; NBU-Contact-
Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>; Aman Singh
<aman.deep.si...@intel.com>; Yuying Zhang <yuying.zh...@intel.com>;
Ferruh Yigit <ferruh.yi...@amd.com>; Andrew Rybchenko
<andrew.rybche...@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
<rasl...@nvidia.com>
Subject: Re: [PATCH v7] ethdev: add special flags when creating
async transfer table
External email: Use caution opening links or attachments
Hi Rongwei,
Thanks for persevering. I have no strong opinion, but, at least,
the fact that the new flags are no longer meant for use in
rte_flow_attr, which is clearly not the right place for such, is an
improvement.
Thanks for the suggestion, move it to rte_flow_table_attr now and it'
dedicated to async API.
However, let's take a closer look at the current patch, shall we?
But, before we get to that, I'd like to kindly request that you
provide a more concrete example of how this feature is supposed to
be used. Are there some real-life application examples?
Sure.
Also, to me, it's still unclear how an application can obtain the
knowledge of this hint in the first instance. For example, can Open
vSwitch somehow tell ethdevs representing physical ports from ones
representing "vports" (host endpoints)?
How does it know which attribute to specify?
Hint should be initiated by application and application knows it'
traffic
pattern which highly relates to deployment.
Let' use VxLAN encap/decap as an example:
1. Traffic from wire should be VxLAN pattern and do the decap, then
send to
different vports.
flow pattern_template 0 create transfer relaxed no
pattern_template_id
4 template represented_port ethdev_port_id is 0 / eth / ipv4 / udp /
vxlan / tag index is 0 data is 0x33 / end flow actions_template 0
create transfer actions_template_id 4 template raw_decap index 0 /
represented_port ethdev_port_id 1 / end mask raw_decap index 0 /
represented_port ethdev_port_id 1 / end flow template_table 0 create
group 1 priority 0 transfer wire_orig table_id 4 rules_number 128
pattern_template 4 actions_template 4
2. Traffic from vports should be encap with different VxLAN header
and send
to wire.
flow actions_template 1 create transfer actions_template_id 5
template raw_encap index 0 / represented_port ethdev_port_id 0 / end
mask raw_encap index 0 / represented_port ethdev_port_id 0 / end
flow template_table 0 create group 1 priority 0 transfer vport_orig
table_id 5 rules_number 128 pattern_template 4 actions_template 5
For the rest of my notes, PSB.
On Mon, 14 Nov 2022, Rongwei Liu wrote:
In case flow rules match only one kind of traffic in a flow table,
then optimization can be done via allocation of this table.
This wording might confuse readers. Consider rephrasing it, please:
If multiple flow rules share a common set of match masks, then they
might belong in a flow table which can be pre-allocated.
Such optimization is possible only if the application gives a hint
about its usage of the table during initial configuration.
The transfer domain rules may process traffic from wire or vport,
which may correspond to two kinds of underlayer resources.
Why name it a "vport"? Why not "host"?
host = packets generated by any of the host's "vport"s wire =
packets arriving at the NIC from the network
Vport is "virtual port" for short and contains "VF/SF" for now.
Per my thoughts, it' clearer and maps to DPDK port probing/management.
I understand that "host" might not be a brilliant name.
If "vport" stands for every port of the NIC that is not a network
port, then this name might be OK to me, but why doesn't it cover PFs?
A PF is clearly not a network / physical port. Why just VF/SF then? Where
does that "for now"
decision come from? Just wondering.
"For now" stands for my understanding. DPDK is always in evolution, right?
You are right, PF should be included in 'vport" concept.
That's why the first two hints introduced in this patch are about
wire and vport traffic specialization.
Wire means traffic arrives from the uplink port while vport means
traffic initiated from VF/SF.
By the sound of it, the meaning is confined to just VFs/SFs.
What if the user wants to match packets coming from PFs?
It should be "wire_orig".
Forgive me, but that does not sound correct. Say, there's an
application and it has a PF plugged into it: ethdev index 0. And the
application transmits packets using rte_eth_tx_burst() from that port.
You say that these packets can be matched via "wire_orig".
But they do not come from the wire. They come from PF...
Hmm. My mistake.
This may highly depend on PMD implementation. Basically, PFs' traffic
may contain "from wire"/"wire_orig" and '"from local"/"vport_orig".
That' why we emphasize it' optional for PMD.
There are two possible approaches for providing the hints.
Using IPv4 as an example:
1. Use pattern item in both template table and flow rules.
pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
"ANY_VPORT" needs to be present in each flow rule even if it's
just a hint. No value to match because matching is already done by
IPv4 item.
Why no value to match on? How does it prevent rogue tenants from
spoofing network headers? If the application receives a packet on a
particular vport's representor, then it may strictly specify item
represented_port pointing to that vport so that only packets from
that vport
match.
Why isn't security a consideration?
There is some misunderstanding here. "ANY_VPORT" is the approach
(new
matching item without value) suggested by you.
I'm not talking about ANY_VPORT in this particular paragraph.
There's item "represented_port" mentioned over there. I'm just asking
about this "already done by IPv4 item" bit. Yes, it matches on the
header but not on the true origin of the packet (the logical port of
the NIC). If the app knows which logical port the packet ingresses
the NIC, why not match on it for security?
Hint is not a matching and it implies how to manage underlayer steering
resource.
If "vport_orig" is present, PMD will only apply the steering logic to vport
traffic.
The resource is allocated in the async table before each rule. Already cover
security considerations.
Matching on "represented_port" needs to program each rule, considering a
port range like index "5-10".
Hint tells PMD only to take care of traffic from vport regardless the port
index.
I was explaining we need to apply it to each flow rule even if it's
only a flag
and no value.
That's clear. But PSB.
2. Add special flags into table_attr.
template_table 0 create table_id 0 group 1 transfer vport_orig
Approach 1 needs to specify the pattern in each flow rule which
wastes memory and is not user friendly.
What if the user has to insert a group of rules which not only have
the same set of match masks but also share exactly the same match
spec values for a limited subset of network items (for example,
those of an encap. header)? This way, a subset of network item
specs can remain fixed across many rules. Does that count as wasting
memory?
Per my understanding, you are talking "multiple spec and mask mixing".
Say, there's a group of rules, and each of them matches on exactly
the same encap. header (the same in all rules), but different internal match
field values.
So, why don't these "fixed"
encap. header items deserve being "optimised" somehow, the same way
as this "wire_orig" does?
We are back to original point. Async approach is trying to pre-allocate
resources and speed up the insertion.
Resource is allocated in async table stage and we only have mask
information.
In each rule, the matching value passes in. I guess you are saying to optimize
per different matching values, right?
This needs dynamic calculations per each rule and wastes the resource in
async table(table allocates resource for all possible values).
If the application knows for sure that there will be packets with
exactly the same encap. header, - that forms this special knowledge
that can be used during init times to help the PMD optimise resource
allocation.
Isn't that so? Don't these items deserve some form of a "hint"?
It can deserve some kinds of "hint". But see above, these hints are per rule
and resource allocation happens before rules.
That's not per rule. Perhaps I should've worded it differently.
Suppose, an application has to insert many flow rules, each of which has
match items A and B. Item A not only has the same mask in *all* rule
instances, but also the same spec.
On the other hand, item B only has the same mask in all the rules, but its spec
is different for each rule.
In this example, the application can allocate a template with items A and B,
but that only provides a fixed mask for them. And the application will HAVE to
provide item A with exactly the same spec in all rule instances. The PMD, in
turn, will HAVE to process this item every time, being unable to see it's in
fact
the same at all times.
To me, this sounds very similar to how you described the need to always
provide item ANY_VPORT in each rule thus facing some waste of memory and
parsing difficulties.
If the application knows that a certain item (or a certain fraction of items) is
going to be entirely the same (mask +
spec) across all the rules, why shouldn't it be able to express this as a hint
to
the PMD? Why shouldn't it be able to avoid providing such items in every new
flow rule instance? The same way the "vport_orig" works.
I'm not demanding that you re-implement or re-design this.
Just trying to find out whether such a problem can indeed be acknowledged.
Or has it been solved already? If not, then perhaps it pays to just discuss
whether solving it can be combined with this "vport_orig" solution.
What do you think? What do others think?
We provide a hint in this patch and no assumption on the matching
patterns.
So I understand. My point is, certain portions of matching patterns
may be "fixed" = entirely the same (masks and specs) in all rules of
a table. Why not give PMD a "hint" about them, too?
I think matching pattern is totally controlled by application layer.
So is the "direction" spec: the app layer has item represented_port
to control that. But, still, we're here to discuss a hint for that.
Why does the new hint aim exclusively at optimising out this specific
meta item? Why isn't it possible to care about a generic portion of
"know in advance" all-the-same items?
" generic portion of know in advance" is some still kind of dynamic approach,
right?
Imagine a situation. DPDK has 10 VFs, each VF may have different VxLAN
encap headers.
This hint approach can work for 10 VFs once.
In public cloud deployments, each VF/SF may map to different users, but
underlay is almost same(GRE/VxLAN... differ in filed values).
"wasting memory " because your approach needs to scatter in each
rule
while this patch only needs to set table_attr once.
No relation with matching patter totally.
The slight problem with your proposal is that for some reason only
one type of a match criterion deserves a hint moved to the attrs.
Whilst in reality the applicaction may know in advance that certain
subsets of items will not only have the same masks in all rules but
also totally the same specs. If that is a valid use case, why doesn't
it deserve the same (more
generic) optimisation / a hint? Just wondering...
Or has that been addressed already somehow?
Believe me, the hint helps us to save significant resources already.
I'm not arguing it can be helpful. You're working round the clock to offer a
solution, - that's fine and is greatly appreciated.
But what I'm trying to say is that it looks like the problem might manifest
itself
for other type of knowledge that also may deserve a hint. Hence the questions.
Hence the offer to think of covering more match criteria, not just net/vport.
Per my view, your proposal is totally valid in sync approach, but
please check my response, Async is trying to allocate resources in advance
and speed up insertion ASAP.
So if it's valid in sync approach, then why can't it be valid in the async one?
And I guess it can reflect positively on the insertion rate, too. Why limit this
"hint" approach to just one aspect then?
I'm sure we're close to understanding each other here.
Yes, "orig_vport" is just a one-bit knowledge, and seems innocent to add as a
hint, but why not make it possible to have a hint for an arbitrary set of
always-
the-same match criteria?
In this case, nobody will ever argue of whether a hint is a match criterion or
if
it's not. It will be quite a generic instrument, potentially useful to vendors.
I'm afraid I can't think of an immediate example of such usefulness, but at
least it will appear as generic as possible from the API perspective.
If yes, then the problem does not concern just a single pair of
attributes, but rather deserves a more versatile solution like some
sort of indirect grouping of constant item specs.
Have you considered such options?
See above.
This patch takes the 2nd approach and introduces one new member
"specialize" into rte_flow_table_attr to indicate possible flow
table optimization.
The name "specialize" might have some drawbacks:
- spelling difference (specialise/specialize)
- in grep output, will mix with flows' "spec"
- quite long
- not a noun
Why not "scope"? Or something like that?
It means special optimization to PMD. "scope" is more rogue.
Why is it "rogue"? Scope is something limiting the point of view.
So are the suggested flags. Flag "wire_origin" (or whatever it can be
named
eventually) limits the scope of matching. No?
Hint won't interfere with matching. It has no knowledge of matching.
Does specifying "orig_vport" actually provide a *choice* for the packet origin?
Does it filter out everything else? If yes, then, alas, it *is* matching.
Because
matching is choosing something of interest. Let's face it.
As I said above, I do acknowledge the fact that, for some vendors, this match
criterion, internally goes to a different HW aspect that is separate from
matching on, for example, IPv4 addresses.
That's OK. But for some vendors, this might be just a regular match criterion
internally. So let's describe it with care.
Instead, it only controls matching resources. "wire_orig" tells PMD to
allocate HW resource for traffic from wire only.
If it controls "matching resources", it's indeed affiliated with matching then.
Look. When the application creates a template, it tells the PMD that it is going
to match on this, this and this.
Masks... No exact values; they will come at a later stage. But, with this
"wire_orig", the application tells the PMD that not only it will match on
*some* "direction", but it actually provides a SPEC for that. If it indicates
bit
"wire_orig", that equals to setting a "mask" for the "direction enum"
AND a "spec" (WIRE). Isn't that the case?
If it is, then please see my above concerns about possibly having similar need
to provide exact-spec hints for other items as well.
Then traffic from vport is sliently ignored. Hint doesn't know what are
matched and how many fields are involves.
By default, there is no hint, so the behavior of the transfer
domain doesn't change.
There is no guarantee that the hint will be used by the PMD.
Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
Acked-by: Ori Kam <orika at nvidia.com>
v2: Move the new field to template table attribute.
v4: Mark it as optional and clear the concept.
v5: Change specialize type to uint32_t.
v6: Change the flags to macros and re-construct the commit log.
v7: Fix build failure.
---
app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++
doc/guides/prog_guide/rte_flow.rst | 15 +++++++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
lib/ethdev/rte_flow.h | 28 +++++++++++++++++++++
4 files changed, 71 insertions(+), 1 deletion(-)
diff --git a/app/test-pmd/cmdline_flow.c
b/app/test-pmd/cmdline_flow.c index 88108498e0..62197f2618 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -184,6 +184,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+ [TABLE_TRANSFER_WIRE_ORIG] = {
+ .name = "wire_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
+ [TABLE_TRANSFER_VPORT_ORIG] = {
+ .name = "vport_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table", @@ -8993,6
+9009,16 @@ parse_table(struct context *ctx, const struct token
+*token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+ case TABLE_TRANSFER_WIRE_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize =
RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG;
+ return len;
+ case TABLE_TRANSFER_VPORT_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize =
RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG;
+ return len;
default:
return -1;
}
diff --git a/doc/guides/prog_guide/rte_flow.rst
b/doc/guides/prog_guide/rte_flow.rst
index 3e6242803d..d9ca041ae4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3605,6 +3605,21 @@ and pattern and actions templates are
created.
&actions_templates, nb_actions_templ,
&error);
+Table Attribute: Specialize
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Application can help optimizing underlayer resources and
+insertion rate by specializing template table.
+Specialization is done by providing hints in the template table
+attribute ``specialize``.
+
+This attribute is not mandatory for each PMD to implement.
+If a hint is not supported, it will be silently ignored, and no
+special optimization is done.
Silently ignoring the field does not sit well with the
application's possible intent to drop represented_port match from the
patterns.
From my point of view, if the application sets this attribute, it
believes it can rely on it, that is, packets coming from host won't
match if the attribute asks to match network only, for instance.
Has this
been considered?
+
+If a table is specialized, the application should make sure the
+rules comply with the table attribute.
How does the application enforce that? I would appreciate you explain it.
+
Asynchronous operations
-----------------------
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 96c5ae0fe4..b3238415f4 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3145,7 +3145,8 @@ It is bound to
``rte_flow_template_table_create()``::
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
- [priority {level}] [ingress] [egress] [transfer]
+ [priority {level}] [ingress] [egress]
+ [transfer [vport_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id} diff --git
a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
8858b56428..c27b48c5c1 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -5186,6 +5186,29 @@
rte_flow_actions_template_destroy(uint16_t
port_id, */ struct rte_flow_template_table;
+/**@{@name Special optional flags for template table attribute
+ * Each bit is a hint for table specialization,
+ * offering a potential optimization at PMD layer.
+ * PMD can ignore the unsupported bits silently.
+ */
+/**
+ * Specialize table for transfer flows which come only from wire.
+ * It allows PMD not to allocate resources for non-wire originated
traffic.
+ * This bit is not a matching criteria, just an optimization hint.
You intended to spell "criterion", I take it. And still, it *is* a match
criterion.
I'm not denying the possible need to have this criterion at the
earliest processing stage. That might be OK, but I still have a
hunch that this is too specific.
Please see my comment above about wasting memory.
I guess this type of criterion is not the only one that may need to
be provided as a "hint".
+ * Flow rules which match non-wire originated traffic will be
+ missed
+ * if the hint is supported.
And what if it's unsupported? Is it indeed OK to silently ignore it?
+ */
+#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG
RTE_BIT32(0)
Why not RTE_FLOW_TABLE_SCOPE_FROM_WIRE ?
To me, TRANSFER looks redundant as this bit is already supposed to
be ticked in the "struct rte_flow_attr flow_attr" field of the
"struct rte_flow_template_table_attr".
+/**
+ * Specialize table for transfer flows which come only from vport
+(e.g. VF,
SF).
And PF?
+ * It allows PMD not to allocate resources for non-vport originated
traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-vport originated traffic will be
+missed
+ * if the hint is supported.
+ */
+#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG
RTE_BIT32(1)
Why not RTE_FLOW_TABLE_SCOPE_FROM_HOST ?
+/**@}*/
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice.
@@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr {
* Maximum number of flow rules that this table holds.
*/
uint32_t nb_flows;
+ /**
+ * Optional hint flags for PMD optimization.
+ * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*.
+ */
+ uint32_t specialize;
Why not "scope" or something?
};
/**
--
2.27.0
Thank you.
Thank you.