> -----Original Message-----
> From: Andrew Rybchenko <arybche...@solarflare.com>
> Sent: Wednesday, October 30, 2019 11:20
> To: Slava Ovsiienko <viachesl...@mellanox.com>; Thomas Monjalon
> <tho...@monjalon.net>; olivier.m...@6wind.com
> Cc: dev@dpdk.org; Matan Azrad <ma...@mellanox.com>; Ori Kam
> <or...@mellanox.com>; Yongseok Koh <ys...@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> 
> On 10/30/19 11:59 AM, Slava Ovsiienko wrote:
> >> -----Original Message-----
> >> From: Andrew Rybchenko <arybche...@solarflare.com>
> >> Sent: Wednesday, October 30, 2019 9:35
> >> To: Slava Ovsiienko <viachesl...@mellanox.com>; Thomas Monjalon
> >> <tho...@monjalon.net>; olivier.m...@6wind.com
> >> Cc: dev@dpdk.org; Matan Azrad <ma...@mellanox.com>; Ori Kam
> >> <or...@mellanox.com>; Yongseok Koh <ys...@mellanox.com>
> >> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> >>
> >> @Olivier, please, take a look at the end of the mail.
> >>
> >> On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
> >>> Hi, Andrew
> >>>
> >>> Thank you for the review.
> >>>
> >>>> -----Original Message-----
> >>>> From: Andrew Rybchenko <arybche...@solarflare.com>
> >>>> Sent: Tuesday, October 29, 2019 18:22
> >>>> To: Slava Ovsiienko <viachesl...@mellanox.com>; dev@dpdk.org
> >>>> Cc: Thomas Monjalon <tho...@monjalon.net>; Matan Azrad
> >>>> <ma...@mellanox.com>; olivier.m...@6wind.com; Ori Kam
> >>>> <or...@mellanox.com>; Yongseok Koh <ys...@mellanox.com>
> >>>> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> >>>>
> >>>> On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
> >>>>> Currently, metadata can be set on the egress path via the mbuf
> >>>>> tx_metadata field with the PKT_TX_METADATA flag, and
> >>>>> RTE_FLOW_ITEM_TYPE_META matches that metadata.
> >>>>> This patch extends the usability of the metadata feature.
> >>>>>
> >>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
> >>>>>
> >>>>> When supporting multiple tables, Tx metadata can also be set by a
> >>>>> rule and matched by another rule. This new action allows metadata
> >>>>> to be set as a result of a flow match.
> >>>>>
> >>>>> 2) Metadata on ingress
> >>>>>
> >>>>> There is also a need to support metadata on ingress. Metadata can
> >>>>> be set by the SET_META action and matched by the META item, as on
> >>>>> Tx. The final value set by the action will be delivered to the
> >>>>> application via the metadata dynamic field of the mbuf, which can
> >>>>> be accessed by RTE_FLOW_DYNF_METADATA().
> >>>>> The PKT_RX_DYNF_METADATA flag will be set along with the data.
> >>>>>
> >>>>> The mbuf dynamic field must be registered by calling
> >>>>> rte_flow_dynf_metadata_register() prior to using the SET_META
> >>>>> action.
> >>>>>
> >>>>> The availability of the dynamic mbuf metadata field can be checked
> >>>>> with the rte_flow_dynf_metadata_avail() routine.
> >>>>>
> >>>>> For loopback/hairpin packets, metadata set on Rx/Tx may or may not
> >>>>> be propagated to the other path depending on hardware capability.
> >>>>>
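> >>>>> The registration and Rx handling described above could look roughly
> >>>>> as follows (a hedged sketch, not part of the patch; setup_metadata()
> >>>>> and handle_rx() are illustrative names):

```c
/* Hedged sketch, not part of the patch: how an application might register
 * the dynamic metadata field and read it on Rx. setup_metadata() and
 * handle_rx() are illustrative names; the rte_flow_dynf_* APIs are the
 * ones introduced by this patch. */
#include <stdio.h>

#include <rte_flow.h>
#include <rte_mbuf.h>

static int
setup_metadata(void)
{
	/* Must be called before creating any SET_META flow rule. */
	if (rte_flow_dynf_metadata_register() < 0)
		return -1; /* no space left in the mbuf for the field */
	/* Double-check the field is available. */
	return rte_flow_dynf_metadata_avail() ? 0 : -1;
}

static void
handle_rx(struct rte_mbuf *m)
{
	/* The flag tells whether the dynamic field carries a valid value. */
	if (m->ol_flags & PKT_RX_DYNF_METADATA)
		printf("meta=0x%08x\n", *RTE_FLOW_DYNF_METADATA(m));
}
```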
> >>>>> Signed-off-by: Yongseok Koh <ys...@mellanox.com>
> >>>>> Signed-off-by: Viacheslav Ovsiienko <viachesl...@mellanox.com>
> >>>> Above explanations lack information about "meta" vs "mark" which
> >>>> may be set on Rx as well and delivered in another mbuf field.
> >>>> It should be explained why one more field is required and the rules
> >>>> defined.
> >>> There is some story about metadata features.
> >>> Initially, there were proposed two metadata related actions:
> >>>
> >>> - RTE_FLOW_ACTION_TYPE_FLAG
> >>> - RTE_FLOW_ACTION_TYPE_MARK
> >>>
> >>> These actions set a special flag in the packet metadata: the MARK
> >>> action stores a specified value in the metadata storage and, on
> >>> packet reception, the PMD puts the flag and value into the mbuf, so
> >>> applications can see that the packet was treated inside the flow
> >>> engine according to the appropriate rte_flow rule(s). MARK and FLAG
> >>> are a kind of gateway to transfer per-packet information from the
> >>> flow engine to the application via the receiving datapath.
> >>> From the datapath point of view, MARK and FLAG are related to the
> >>> receiving side only. It would be useful to have the same gateway on
> >>> the transmitting side, and the RTE_FLOW_ITEM_TYPE_META item was
> >>> proposed for that.
> >>> The application can fill the field in the mbuf and this value will
> >>> be transferred to some field in the packet metadata inside the flow
> >>> engine. It did not matter whether these metadata fields were shared,
> >>> because the MARK and META items belonged to different domains
> >>> (receiving and transmitting) and could be vendor-specific.
> >>> So far, so good: DPDK proposes entities to control metadata inside
> >>> the flow engine and gateways to exchange these values on a
> >>> per-packet basis via the datapaths. As we can see, the MARK and META
> >>> means are not symmetric: there is no action which would allow us to
> >>> set the META value on the transmitting path. So, the action of type:
> >>> - RTE_FLOW_ACTION_TYPE_SET_META is proposed.
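> >>> A minimal sketch of how the proposed action might be used (the rule,
> >>> the values and the helper name are illustrative only):

```c
/* Hypothetical sketch (not part of the patch): an egress rule that stamps
 * metadata on matched packets via the proposed SET_META action. The helper
 * name and values are illustrative only. */
#include <rte_byteorder.h>
#include <rte_flow.h>

static struct rte_flow *
make_set_meta_rule(uint16_t port_id, struct rte_flow_error *err)
{
	struct rte_flow_attr attr = { .egress = 1 };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	/* data/mask are rte_be32_t in v4 of the patch; the endianness is
	 * still under discussion in this thread. */
	struct rte_flow_action_set_meta meta = {
		.data = RTE_BE32(0x1234),
		.mask = RTE_BE32(0xffff),
	};
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_SET_META, .conf = &meta },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, err);
}
```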
> >>>
> >>> Next, applications raised new requirements for packet metadata. The
> >>> flow engines are getting more complex: internal switches are
> >>> introduced, and multiple ports might be supported within the same
> >>> flow engine namespace. From the DPDK point of view, it means that
> >>> packets might be sent on one eth_dev port and received on another
> >>> one, while the packet path inside the flow engine entirely belongs
> >>> to the same hardware device. The simplest example is SR-IOV with PF,
> >>> VFs and the representors.
> >>> And there is a brilliant opportunity to provide an out-of-band
> >>> channel to transfer some extra data from one port to another,
> >>> besides the packet data itself.
> >>>
> >>>
> >>>> Above explanations lack information about "meta" vs "mark" which
> >>>> may be set on Rx as well and delivered in another mbuf field.
> >>>> It should be explained why one more field is required and the rules
> >>>> defined. Otherwise we can end up with half of the PMDs supporting
> >>>> mark only, half supporting meta only, and applications in an
> >>>> interesting situation, having to choose which one to use.
> >>> There is no "mark" vs "meta". The MARK and META means are kept for
> >>> compatibility reasons and the legacy part works exactly as before.
> >>> The trials (with rte_flow_validate) are supposed to check whether
> >>> the PMD supports the MARK or META feature in the appropriate domain.
> >>> It depends on the PMD implementation, configuration and underlying
> >>> HW/FW/kernel capabilities, and should be resolved at runtime.
> >>
> >> Trials are a way, but a very tricky one. My imagination draws
> >> pictures of how application code could look in an attempt to use
> >> either mark or meta for Rx only, and these pictures are not nice.
> > Agreed, trials are not the best way.
> > For now there is no application trying to choose between mark and
> > meta, because these belonged to different domains, and the extension
> > is newly introduced. So, the applications using mark will continue to
> > use mark; the same goes for meta. A new application will definitely
> > ask for both mark and meta; we have requirements from customers -
> > "give us as many pass-through bits as you can". Such a new application
> > may simply refuse to work if the metadata features are not detected,
> > because it relies on them strongly.
> > BTW, the trials are not so complex: rte_flow_validate(mark),
> > rte_flow_validate_metadata() and that's it.
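> > Such a trial could be sketched like this (illustrative only; a real
> > probe must use a pattern and fate action the application actually
> > intends to use):

```c
/* Hedged sketch: probing MARK support at runtime with rte_flow_validate().
 * Illustrative only - a real probe must use a pattern and a fate action
 * (here QUEUE) matching what the application will actually install. */
#include <stdbool.h>

#include <rte_flow.h>

static bool
mark_supported(uint16_t port_id)
{
	struct rte_flow_attr attr = { .ingress = 1 };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_mark mark = { .id = 1 };
	struct rte_flow_action_queue queue = { .index = 0 };
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark },
		{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
	struct rte_flow_error err;

	/* Zero means the PMD would accept such a rule. */
	return rte_flow_validate(port_id, &attr, pattern, actions,
				 &err) == 0;
}
```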
> 
> Assuming the pattern is supported and the fate action used together
> with mark is supported as well. Not that easy, but OK.
> 
> >> Maybe it will look acceptable when mark becomes a dynamic field,
> >> since usage of either one or another dynamic field is definitely
> >> easier than usage of either a fixed or a dynamic field.
> > At least in the PMD datapath it does not look very ugly.
> >> Maybe a dynamic field for mark at a fixed offset should be
> >> introduced in this release or the nearest future? It would allow
> >> preserving the ABI up to 20.11 and provide a future-proof API.
> >> The trick is to register the dynamic meta field at a fixed offset at
> >> the start of the day, to be sure that it is guaranteed to succeed.
> >> It sounds like a transition mechanism from fixed to dynamic fields.
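> >> A sketch of that registration trick (the field name and the offset
> >> handling are illustrative only):

```c
/* Hedged sketch of the transition trick described above: register the
 * dynamic mark field at a fixed, pre-reserved offset early at startup so
 * registration is guaranteed to succeed. The field name and offset
 * handling are illustrative only. */
#include <stdint.h>

#include <rte_mbuf_dyn.h>

static const struct rte_mbuf_dynfield mark_dynf_desc = {
	.name = "example_dynfield_mark",   /* illustrative name */
	.size = sizeof(uint32_t),
	.align = __alignof__(uint32_t),
};

static int
register_mark_dynfield(size_t fixed_offset)
{
	/* Fails if the requested offset is already taken - hence the
	 * "start of the day" requirement. */
	return rte_mbuf_dynfield_register_offset(&mark_dynf_desc,
						 fixed_offset);
}
```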
> >>
> >> [snip]
> >>
> >>>>> diff --git a/lib/librte_ethdev/rte_flow.h
> >>>>> b/lib/librte_ethdev/rte_flow.h index 4fee105..b821557 100644
> >>>>> --- a/lib/librte_ethdev/rte_flow.h
> >>>>> +++ b/lib/librte_ethdev/rte_flow.h
> >>>>> @@ -28,6 +28,8 @@
> >>>>>     #include <rte_byteorder.h>
> >>>>>     #include <rte_esp.h>
> >>>>>     #include <rte_higig.h>
> >>>>> +#include <rte_mbuf.h>
> >>>>> +#include <rte_mbuf_dyn.h>
> >>>>>
> >>>>>     #ifdef __cplusplus
> >>>>>     extern "C" {
> >>>>> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
> >>>>>         /**
> >>>>>          * [META]
> >>>>>          *
> >>>>> -        * Matches a metadata value specified in mbuf metadata field.
> >>>>> +        * Matches a metadata value.
> >>>>> +        *
> >>>>>          * See struct rte_flow_item_meta.
> >>>>>          */
> >>>>>         RTE_FLOW_ITEM_TYPE_META,
> >>>>> @@ -1263,9 +1266,17 @@ struct
> rte_flow_item_icmp6_nd_opt_tla_eth
> >> {
> >>>>>     #endif
> >>>>>
> >>>>>     /**
> >>>>> - * RTE_FLOW_ITEM_TYPE_META.
> >>>>> + * @warning
> >>>>> + * @b EXPERIMENTAL: this structure may change without prior
> >>>>> + notice
> >>>> Is it allowed to make it experimental again?
> >>> I think we should remove EXPERIMENTAL here. We do not introduce a
> >>> new feature, but just extend the area where it applies.
> >> Agreed.
> >>
> >>>>>      *
> >>>>> - * Matches a specified metadata value.
> >>>>> + * RTE_FLOW_ITEM_TYPE_META
> >>>>> + *
> >>>>> + * Matches a specified metadata value. On egress, metadata can be
> >>>>> + set either by
> >>>>> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> >>>>> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> >>>>> + RTE_FLOW_ACTION_TYPE_SET_META sets
> >>>>> + * metadata for a packet and the metadata will be reported via
> >>>>> + mbuf metadata
> >>>>> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
> >> mbuf
> >>>>> + field must be
> >>>>> + * registered in advance by rte_flow_dynf_metadata_register().
> >>>>>      */
> >>>>>     struct rte_flow_item_meta {
> >>>>>         rte_be32_t data;
> >>>> [snip]
> >>>>
> >>>>> @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
> >>>>>         uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
> >>>>>     };
> >>>>>
> >>>>> +/**
> >>>>> + * @warning
> >>>>> + * @b EXPERIMENTAL: this structure may change without prior
> >>>>> +notice
> >>>>> + *
> >>>>> + * RTE_FLOW_ACTION_TYPE_SET_META
> >>>>> + *
> >>>>> + * Set metadata. Metadata set by mbuf tx_metadata field with
> >>>>> + * PKT_TX_METADATA flag on egress will be overridden by this
> action.
> >>>>> +On
> >>>>> + * ingress, the metadata will be carried by mbuf metadata dynamic
> >>>>> +field
> >>>>> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> >>>>> +must be
> >>>>> + * registered in advance by rte_flow_dynf_metadata_register().
> >>>>> + *
> >>>>> + * Altering partial bits is supported with mask. For bits which
> >>>>> +have never
> >>>>> + * been set, unpredictable value will be seen depending on driver
> >>>>> + * implementation. For loopback/hairpin packet, metadata set on
> >>>>> +Rx/Tx may
> >>>>> + * or may not be propagated to the other path depending on HW
> >>>> capability.
> >>>>> + *
> >>>>> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> >>>>> + */
> >>>>> +struct rte_flow_action_set_meta {
> >>>>> +       rte_be32_t data;
> >>>>> +       rte_be32_t mask;
> >>>> As I understand tx_metadata is host endian. Just double-checking.
> >>>> Is the new dynamic field host-endian or big-endian?
> >>>> I definitely would like to see the motivation in comments for why
> >>>> data/mask are big-endian here.
> >>> Metadata is an opaque value; endianness does not matter and there
> >>> is no special motivation for choosing it. The rte_flow_item_meta
> >>> structure provides data with the rte_be32_t type, so the
> >>> meta-related action does the same.
> >>
> >> Endianness of meta in mbuf and flow API should match and it must be
> > Yes, and they match (for meta): both are rte_be32_t. OK, I will
> > emphasize it in the docs.
> >
> >> documented. Endianness is important if HW supports fewer bits, since
> >> it hints the application to use the LSBs first if the reduced bit
> >> space is sufficient. mark is defined as host-endian (uint32_t) and I
> >> think meta should be the same. Otherwise it complicates either mark
> >> or meta usage even more, as discussed above.
> > mark is mark, meta is meta; they are not interrelated (despite
> > becoming similar), and there is no choice between mark and meta.
> > It is not obvious that we should have the same endianness for both.
> > There are applications using these legacy features, so there might be
> > compatibility issues as well.
> 
> There are a few reasons above to make meta host-endian:
> - to match the mark endianness (explained above; I still think my
>   reasons are valid)
> - to make it easier for an application to use it without endianness
>   conversion, if a sequence number is used to allocate metas (similar
>   to mark in OVS) and the bit space is less than 32 bits
> 
> I see no single reason to keep it big-endian except keeping it as-is.
> 
> Since tx_metadata is going away and metadata was used for Tx only
> before, it is a good moment to fix it.
> 
> >> Yes, I think that rte_flow_item_meta should be fixed since both mark
> >> and tx_metadata are host-endian.
> >>
> >> (this says nothing about the HW interface, which is vendor-specific;
> >> vendor PMDs should take care of it)
> > Handling this in the PMD might introduce an extra endianness
> > conversion in the datapath, impacting performance. Not nice, IMO.
> 
> These concerns should not affect the external interface, since the HW
> interface can differ between vendors.

OK, I will alter the metadata endianness to host-endian. There is an
undefeatable argument: the other [META] items/actions are all in host
endian.

With best regards, Slava
