On 10/30/19 11:59 AM, Slava Ovsiienko wrote:
-----Original Message-----
From: Andrew Rybchenko <arybche...@solarflare.com>
Sent: Wednesday, October 30, 2019 9:35
To: Slava Ovsiienko <viachesl...@mellanox.com>; Thomas Monjalon
<tho...@monjalon.net>; olivier.m...@6wind.com
Cc: dev@dpdk.org; Matan Azrad <ma...@mellanox.com>; Ori Kam
<or...@mellanox.com>; Yongseok Koh <ys...@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata

@Olivier, please, take a look at the end of the mail.

On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
Hi, Andrew

Thank you for the review.

-----Original Message-----
From: Andrew Rybchenko <arybche...@solarflare.com>
Sent: Tuesday, October 29, 2019 18:22
To: Slava Ovsiienko <viachesl...@mellanox.com>; dev@dpdk.org
Cc: Thomas Monjalon <tho...@monjalon.net>; Matan Azrad
<ma...@mellanox.com>; olivier.m...@6wind.com; Ori Kam
<or...@mellanox.com>; Yongseok Koh <ys...@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata

On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
Currently, metadata can be set on egress path via mbuf tx_metadata
field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
matches metadata.
This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a
rule and matched by another rule. This new action allows metadata to
be set as a result of flow match.

2) Metadata on ingress

There's also need to support metadata on ingress. Metadata can be
set by SET_META action and matched by META item like Tx. The final
value set by the action will be delivered to application via
metadata dynamic field of mbuf which can be accessed by
RTE_FLOW_DYNF_METADATA().
PKT_RX_DYNF_METADATA flag will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to use SET_META action.

The availability of dynamic mbuf metadata field can be checked with
rte_flow_dynf_metadata_avail() routine.

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.

Signed-off-by: Yongseok Koh <ys...@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viachesl...@mellanox.com>
Above explanations lack information about "meta" vs "mark" which may
be set on Rx as well and delivered in other mbuf field.
It should be explained by one more field is required and rules defined.
There is some story about metadata features.
Initially, there were proposed two metadata related actions:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

These actions set the special flag in the packet metadata, MARK action
stores some specified value in the metadata storage, and, on the
packet receiving PMD puts the flag and value to the mbuf and
applications can see the packet was threated inside flow engine
according to the appropriate RTE flow(s). MARK and FLAG are like some
kind of gateway to transfer some per-packet information from the flow
engine to the application via receiving datapath.
  From the datapath point of view, the MARK and FLAG are related to the
receiving side only.
It would useful to have the same gateway on the transmitting side and
there was the feature of type RTE_FLOW_ITEM_TYPE_META was
proposed.
The application can fill the field in mbuf and this value will be transferred to
some field in the packet metadata inside the flow engine.
It did not matter whether these metadata fields are shared because of
MARK and META items belonged to different domains (receiving and
transmitting) and could be vendor-specific.
So far, so good, DPDK proposes some entities to control metadata
inside the flow engine and gateways to exchange these values on a per-
packet basis via datapaths.
As we can see, the MARK and META means are not symmetric, there is
absent action which would allow us to set META value on the transmitting
path. So, the action of type:
- RTE_FLOW_ACTION_TYPE_SET_META is proposed.

The next, applications raise the new requirements for packet metadata.
The flow engines are getting more complex, internal switches are
introduced, multiple ports might be supported within the same flow
engine namespace. From the DPDK points of view, it means the packets
might be sent on one eth_dev port and received on the other one, and the
packet path inside the flow engine entirely belongs to the same hardware
device. The simplest example is SR-IOV with PF, VFs and the representors.
And there is a brilliant opportunity to provide some out-of-band channel to
transfer some extra data
   from one port to another one, besides the packet data itself.


Above explanations lack information about "meta" vs "mark" which may
be set on Rx as well and delivered in other mbuf field.
It should be explained by one more field is required and rules defined.
Otherwise we can endup in half PMDs supporting mark only, half PMDs
supporting meta only and applications in an interesting situation to
make a choice which one to use.
There is no "mark" vs "meta". MARK and META means are kept for
compatibility issues and legacy part works exactly as before. The
trials (with flow_validate)  is supposed to check whether PMD supports
MARK or META feature on appropriate domain. It depends on PMD
implementation, configuration and underlaying HW/FW/kernel capabilities
and should be resolved in runtime.

The trials a way, but very tricky way. My imagination draws me pictures how
an application code could look like in attempt to use either mark or meta for
Rx only and these pictures are not nice.
Agree, trials is not the best way.
For now there is no application trying to choose mark or meta, because these 
ones
belonged to other domains, and extension is newly introduced.
So, the applications using mark will continue use mark, the same is regarding 
meta.
The new application definitely will ask for both mark and of them,
we have the requirements from customers  - "give us as many through bits as you 
can".
This new application just may refuse to work if metadata features are not 
detected,
because relays on it strongly.
BTW, trials are not so complex: rte_flow_validate(mark), 
rte_flow_validate_metadata()
and that's it.

In assumption that pattern is supported and fate action used together with mark
is supported as well. Not that easy, but OK.

May be it will look acceptable when mark becomes a dynamic since usage of
either one or another dynamic field is definitely easier than usage of either
fixed or dynamic field.
At least in PMD datapath it does not look very ugly.
May be dynamic field for mark at fixed offset should be introduced in the
release or the nearest future? It will allow to preserve ABI up to 20.11 and
provide future proof API.
The trick is to register dynamic meta field at fixed offset at start of a day to
be sure that it is guaranteed to succeed.
It sounds like it is a transition mechanism from fixed to dynamic fields.

[snip]

diff --git a/lib/librte_ethdev/rte_flow.h
b/lib/librte_ethdev/rte_flow.h index 4fee105..b821557 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -28,6 +28,8 @@
    #include <rte_byteorder.h>
    #include <rte_esp.h>
    #include <rte_higig.h>
+#include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>

    #ifdef __cplusplus
    extern "C" {
@@ -418,7 +420,8 @@ enum rte_flow_item_type {
        /**
         * [META]
         *
-        * Matches a metadata value specified in mbuf metadata field.
+        * Matches a metadata value.
+        *
         * See struct rte_flow_item_meta.
         */
        RTE_FLOW_ITEM_TYPE_META,
@@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth
{
    #endif

    /**
- * RTE_FLOW_ITEM_TYPE_META.
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
Is it allowed to make experimental back?
I think we should remove EXPERIMENTAL here. We do not introduce new
feature, but just extend the apply area.
Agreed.

     *
- * Matches a specified metadata value.
+ * RTE_FLOW_ITEM_TYPE_META
+ *
+ * Matches a specified metadata value. On egress, metadata can be
+ set either by
+ * mbuf tx_metadata field with PKT_TX_METADATA flag or
+ * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
+ RTE_FLOW_ACTION_TYPE_SET_META sets
+ * metadata for a packet and the metadata will be reported via mbuf
+ metadata
+ * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
mbuf
+ field must be
+ * registered in advance by rte_flow_dynf_metadata_register().
     */
    struct rte_flow_item_meta {
        rte_be32_t data;
[snip]

@@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
        uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
    };

+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_META
+ *
+ * Set metadata. Metadata set by mbuf tx_metadata field with
+ * PKT_TX_METADATA flag on egress will be overridden by this action.
+On
+ * ingress, the metadata will be carried by mbuf metadata dynamic
+field
+ * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
+must be
+ * registered in advance by rte_flow_dynf_metadata_register().
+ *
+ * Altering partial bits is supported with mask. For bits which
+have never
+ * been set, unpredictable value will be seen depending on driver
+ * implementation. For loopback/hairpin packet, metadata set on
+Rx/Tx may
+ * or may not be propagated to the other path depending on HW
capability.
+ *
+ * RTE_FLOW_ITEM_TYPE_META matches metadata.
+ */
+struct rte_flow_action_set_meta {
+       rte_be32_t data;
+       rte_be32_t mask;
As I understand tx_metadata is host endian. Just double-checking.
Is a new dynamic field host endian or big endian?
I definitely would like to see motivation in comments why data/mask
are big- endian here.
metadata is opaque value, endianness does not matter, there are no
some special motivations for choosing endiannes. rte_flow_item_meta()
structure provides data with rte_be32_t type, so meta related action does
the same.

Endianness of meta in mbuf and flow API should match and it must be
Yes, and they match (for meta), both are rte_be32_t. OK, will emphasize it in 
docs.

documented. Endianness is important if a HW supports less bits since it
makes a hit for application to use LSB first if the bit space is sufficient.
mark is defined as host-endian (uint32_t) and I think meta should be the
same. Otherwise it complicates even more either mark or meta usage as
discussed above .
mark is mark, meta is meta, these ones are not interrelated (despite they
are becoming similar). And there is no choice between mark and meta.
It is not obvious we should have the same endianness for both.
There are some applications using this legacy features, so there might be 
compatibility
issues either.

There are few reasons above to make meta host endian:
- match mark endianness (explained above, I still think that my reasons vaild)
- make it easier for application to use it without endianness conversion
  if a sequence number is used to allocate metas (similar to mark in OVS)
  and bit space is less than 32-bits

I see no single reason to keep it big-endian except a reason to keep it.

Since tx_metadata is going away and metadata was used for Tx only
before it is a good moment to fix it.

Yes, I think that rte_flow_item_meta should be fixed since both mark and
tx_metadata are host-endian.

(it says nothing about HW interface which is vendor specific and vendor
PMDs should care about it)
Handling this in PMD might introduce the extra endianness conversion in 
datapath,
impacting performance. Not nice, IMO.

These concerns should not affect external interface since
it could be different for different HW vendors.

Reply via email to