> -----Original Message-----
> From: Andrew Rybchenko <arybche...@solarflare.com>
> Sent: Wednesday, October 30, 2019 11:20
> To: Slava Ovsiienko <viachesl...@mellanox.com>; Thomas Monjalon
> <tho...@monjalon.net>; olivier.m...@6wind.com
> Cc: dev@dpdk.org; Matan Azrad <ma...@mellanox.com>; Ori Kam
> <or...@mellanox.com>; Yongseok Koh <ys...@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
>
> On 10/30/19 11:59 AM, Slava Ovsiienko wrote:
> >> -----Original Message-----
> >> From: Andrew Rybchenko <arybche...@solarflare.com>
> >> Sent: Wednesday, October 30, 2019 9:35
> >> To: Slava Ovsiienko <viachesl...@mellanox.com>; Thomas Monjalon
> >> <tho...@monjalon.net>; olivier.m...@6wind.com
> >> Cc: dev@dpdk.org; Matan Azrad <ma...@mellanox.com>; Ori Kam
> >> <or...@mellanox.com>; Yongseok Koh <ys...@mellanox.com>
> >> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> >>
> >> @Olivier, please, take a look at the end of the mail.
> >>
> >> On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
> >>> Hi, Andrew
> >>>
> >>> Thank you for the review.
> >>>
> >>>> -----Original Message-----
> >>>> From: Andrew Rybchenko <arybche...@solarflare.com>
> >>>> Sent: Tuesday, October 29, 2019 18:22
> >>>> To: Slava Ovsiienko <viachesl...@mellanox.com>; dev@dpdk.org
> >>>> Cc: Thomas Monjalon <tho...@monjalon.net>; Matan Azrad
> >>>> <ma...@mellanox.com>; olivier.m...@6wind.com; Ori Kam
> >>>> <or...@mellanox.com>; Yongseok Koh <ys...@mellanox.com>
> >>>> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> >>>>
> >>>> On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
> >>>>> Currently, metadata can be set on the egress path via the mbuf
> >>>>> tx_metadata field with the PKT_TX_METADATA flag, and
> >>>>> RTE_FLOW_ITEM_TYPE_META matches that metadata.
> >>>>> This patch extends the usability of the metadata feature.
> >>>>>
> >>>>> 1) RTE_FLOW_ACTION_TYPE_SET_META
> >>>>>
> >>>>> When multiple tables are supported, Tx metadata can also be set by
> >>>>> a rule and matched by another rule. This new action allows metadata
> >>>>> to be set as a result of a flow match.
> >>>>>
> >>>>> 2) Metadata on ingress
> >>>>>
> >>>>> There is also a need to support metadata on ingress. Metadata can
> >>>>> be set by the SET_META action and matched by the META item, as on
> >>>>> Tx. The final value set by the action will be delivered to the
> >>>>> application via the metadata dynamic field of the mbuf, which can
> >>>>> be accessed by RTE_FLOW_DYNF_METADATA(). The PKT_RX_DYNF_METADATA
> >>>>> flag will be set along with the data.
> >>>>>
> >>>>> The mbuf dynamic field must be registered by calling
> >>>>> rte_flow_dynf_metadata_register() prior to using the SET_META
> >>>>> action.
> >>>>>
> >>>>> The availability of the dynamic mbuf metadata field can be checked
> >>>>> with the rte_flow_dynf_metadata_avail() routine.
> >>>>>
> >>>>> For loopback/hairpin packets, metadata set on Rx/Tx may or may not
> >>>>> be propagated to the other path depending on hardware capability.
> >>>>>
> >>>>> Signed-off-by: Yongseok Koh <ys...@mellanox.com>
> >>>>> Signed-off-by: Viacheslav Ovsiienko <viachesl...@mellanox.com>
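
Just to make the usage described in the commit message concrete - a minimal
sketch based only on what the commit message states; the helper names here
are made up and error handling is trimmed.

#include <rte_ethdev.h>
#include <rte_flow.h>
#include <rte_mbuf.h>
#include <rte_mbuf_dyn.h>

/* Register the dynamic metadata mbuf field once, before creating any
 * flow that uses the SET_META action (as the commit message requires). */
static int
setup_flow_metadata(void)
{
        if (rte_flow_dynf_metadata_register() < 0)
                return -1; /* dynamic field could not be registered */
        return rte_flow_dynf_metadata_avail() ? 0 : -1;
}

/* On the Rx path, pick up the metadata delivered by the flow engine. */
static void
handle_rx_metadata(struct rte_mbuf *m)
{
        if (m->ol_flags & PKT_RX_DYNF_METADATA) {
                uint32_t meta = *RTE_FLOW_DYNF_METADATA(m);
                /* ... dispatch the packet according to meta ... */
                (void)meta;
        }
}
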
> >>>> The explanations above lack information about "meta" vs "mark",
> >>>> which may be set on Rx as well and delivered in another mbuf field.
> >>>> It should be explained why one more field is required and the rules
> >>>> defined.
> >>> There is some story about the metadata features.
> >>> Initially, two metadata-related actions were proposed:
> >>>
> >>> - RTE_FLOW_ACTION_TYPE_FLAG
> >>> - RTE_FLOW_ACTION_TYPE_MARK
> >>>
> >>> The FLAG action sets a special flag in the packet metadata, the MARK
> >>> action stores a specified value in the metadata storage, and, on
> >>> packet reception, the PMD puts the flag and the value into the mbuf,
> >>> so applications can see that the packet was treated inside the flow
> >>> engine according to the appropriate RTE flow(s). MARK and FLAG are a
> >>> kind of gateway to transfer per-packet information from the flow
> >>> engine to the application via the receiving datapath.
> >>> From the datapath point of view, MARK and FLAG are related to the
> >>> receiving side only.
> >>> It would be useful to have the same gateway on the transmitting side,
> >>> and the item of type RTE_FLOW_ITEM_TYPE_META was proposed for that.
> >>> The application can fill the field in the mbuf and this value will be
> >>> transferred to some field in the packet metadata inside the flow
> >>> engine.
> >>> It did not matter whether these metadata fields were shared, because
> >>> the MARK and META items belonged to different domains (receiving and
> >>> transmitting) and could be vendor-specific.
> >>> So far, so good: DPDK provides some entities to control metadata
> >>> inside the flow engine and gateways to exchange these values on a
> >>> per-packet basis via the datapaths.
> >>> As we can see, the MARK and META mechanisms are not symmetric: the
> >>> action which would allow us to set the META value on the transmitting
> >>> path is absent. So, the action of type
> >>> - RTE_FLOW_ACTION_TYPE_SET_META is proposed.
> >>>
> >>> Next, applications raised new requirements for packet metadata.
> >>> The flow engines are getting more complex, internal switches are
> >>> introduced, and multiple ports might be supported within the same
> >>> flow engine namespace. From the DPDK point of view, it means the
> >>> packets might be sent on one eth_dev port and received on another
> >>> one, while the packet path inside the flow engine entirely belongs to
> >>> the same hardware device. The simplest example is SR-IOV with PF, VFs
> >>> and the representors. And there is a brilliant opportunity to provide
> >>> an out-of-band channel to transfer some extra data from one port to
> >>> another one, besides the packet data itself.
> >>>
> >>>
> >>>> The explanations above lack information about "meta" vs "mark",
> >>>> which may be set on Rx as well and delivered in another mbuf field.
> >>>> It should be explained why one more field is required and the rules
> >>>> defined.
> >>>> Otherwise we can end up with half of the PMDs supporting mark only,
> >>>> half of the PMDs supporting meta only, and applications in an
> >>>> interesting situation having to choose which one to use.
> >>> There is no "mark" vs "meta". The MARK and META mechanisms are kept
> >>> for compatibility reasons and the legacy part works exactly as
> >>> before. The trials (with rte_flow_validate) are supposed to check
> >>> whether the PMD supports the MARK or META feature on the appropriate
> >>> domain. It depends on the PMD implementation, configuration and
> >>> underlying HW/FW/kernel capabilities and should be resolved at
> >>> runtime.
> >>
> >> The trials are a way, but a very tricky way. My imagination draws me
> >> pictures of how application code could look in an attempt to use
> >> either mark or meta for Rx only, and these pictures are not nice.
> > Agreed, trials are not the best way.
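
To show what such a trial could look like in practice, here is a rough
sketch. The pattern, the fate action and the attributes are assumptions; a
real probe should mirror the rules the application actually intends to
create.

#include <stdint.h>
#include <rte_flow.h>

/* Probe whether the PMD accepts matching META on ingress by validating
 * a throw-away rule without creating it. Returns 0 if the PMD claims
 * support for this particular rule. */
static int
probe_rx_meta_support(uint16_t port_id)
{
        const struct rte_flow_item_meta spec = { .data = 0 };
        const struct rte_flow_item_meta mask = { .data = UINT32_MAX };
        const struct rte_flow_attr attr = { .ingress = 1 };
        const struct rte_flow_item pattern[] = {
                { .type = RTE_FLOW_ITEM_TYPE_META,
                  .spec = &spec, .mask = &mask },
                { .type = RTE_FLOW_ITEM_TYPE_END },
        };
        const struct rte_flow_action_queue queue = { .index = 0 };
        const struct rte_flow_action actions[] = {
                { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
                { .type = RTE_FLOW_ACTION_TYPE_END },
        };
        struct rte_flow_error err;

        return rte_flow_validate(port_id, &attr, pattern, actions, &err);
}

An analogous probe with the MARK action instead of the META item gives the
"mark" half of the answer; as noted above, both answers only hold for the
exact pattern/fate-action combination that was probed.
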
> > For now there is no application trying to choose mark or meta, because
> > these ones belonged to different domains, and the extension is newly
> > introduced. So, the applications using mark will continue to use mark,
> > and the same holds for meta.
> > A new application definitely will ask for both mark and meta; we have
> > requirements from customers - "give us as many pass-through bits as
> > you can". Such an application may just refuse to work if the metadata
> > features are not detected, because it relies on them strongly.
> > BTW, trials are not so complex: rte_flow_validate(mark),
> > rte_flow_validate_metadata() and that's it.
>
> In assumption that the pattern is supported and the fate action used
> together with mark is supported as well. Not that easy, but OK.
>
> >> Maybe it will look acceptable when mark becomes dynamic, since usage
> >> of either one or another dynamic field is definitely easier than
> >> usage of either a fixed or a dynamic field.
> > At least in the PMD datapath it does not look very ugly.
> >> Maybe a dynamic field for mark at a fixed offset should be introduced
> >> in this release or in the nearest future? It would allow us to
> >> preserve the ABI up to 20.11 and provide a future-proof API.
> >> The trick is to register the dynamic meta field at a fixed offset
> >> from day one, to be sure that it is guaranteed to succeed.
> >> It sounds like a transition mechanism from fixed to dynamic fields.
> >>
> >> [snip]
> >>
> >>>>> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> >>>>> index 4fee105..b821557 100644
> >>>>> --- a/lib/librte_ethdev/rte_flow.h
> >>>>> +++ b/lib/librte_ethdev/rte_flow.h
> >>>>> @@ -28,6 +28,8 @@
> >>>>>  #include <rte_byteorder.h>
> >>>>>  #include <rte_esp.h>
> >>>>>  #include <rte_higig.h>
> >>>>> +#include <rte_mbuf.h>
> >>>>> +#include <rte_mbuf_dyn.h>
> >>>>>
> >>>>>  #ifdef __cplusplus
> >>>>>  extern "C" {
> >>>>> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
> >>>>>  	/**
> >>>>>  	 * [META]
> >>>>>  	 *
> >>>>> -	 * Matches a metadata value specified in mbuf metadata field.
> >>>>> +	 * Matches a metadata value.
> >>>>> +	 *
> >>>>>  	 * See struct rte_flow_item_meta.
> >>>>>  	 */
> >>>>>  	RTE_FLOW_ITEM_TYPE_META,
> >>>>> @@ -1263,9 +1266,17 @@ struct rte_flow_item_icmp6_nd_opt_tla_eth {
> >>>>>  #endif
> >>>>>
> >>>>>  /**
> >>>>> - * RTE_FLOW_ITEM_TYPE_META.
> >>>>> + * @warning
> >>>>> + * @b EXPERIMENTAL: this structure may change without prior notice
> >>>> Is it allowed to make experimental back?
> >>> I think we should remove EXPERIMENTAL here. We do not introduce a new
> >>> feature, but just extend the area where the existing one applies.
> >> Agreed.
> >>
> >>>>>   *
> >>>>> - * Matches a specified metadata value.
> >>>>> + * RTE_FLOW_ITEM_TYPE_META
> >>>>> + *
> >>>>> + * Matches a specified metadata value. On egress, metadata can be set either by
> >>>>> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> >>>>> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress, RTE_FLOW_ACTION_TYPE_SET_META sets
> >>>>> + * metadata for a packet and the metadata will be reported via mbuf metadata
> >>>>> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic mbuf field must be
> >>>>> + * registered in advance by rte_flow_dynf_metadata_register().
> >>>>>  */
> >>>>>  struct rte_flow_item_meta {
> >>>>>  	rte_be32_t data;
> >>>> [snip]
> >>>>
> >>>>> @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
> >>>>>  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
> >>>>>  };
> >>>>>
> >>>>> +/**
> >>>>> + * @warning
> >>>>> + * @b EXPERIMENTAL: this structure may change without prior notice
> >>>>> + *
> >>>>> + * RTE_FLOW_ACTION_TYPE_SET_META
> >>>>> + *
> >>>>> + * Set metadata. Metadata set by mbuf tx_metadata field with
> >>>>> + * PKT_TX_METADATA flag on egress will be overridden by this action. On
> >>>>> + * ingress, the metadata will be carried by mbuf metadata dynamic field
> >>>>> + * with PKT_RX_DYNF_METADATA flag if set. The dynamic mbuf field must be
> >>>>> + * registered in advance by rte_flow_dynf_metadata_register().
> >>>>> + *
> >>>>> + * Altering partial bits is supported with mask. For bits which have never
> >>>>> + * been set, unpredictable value will be seen depending on driver
> >>>>> + * implementation. For loopback/hairpin packet, metadata set on Rx/Tx may
> >>>>> + * or may not be propagated to the other path depending on HW capability.
> >>>>> + *
> >>>>> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> >>>>> + */
> >>>>> +struct rte_flow_action_set_meta {
> >>>>> +	rte_be32_t data;
> >>>>> +	rte_be32_t mask;
> >>>> As I understand, tx_metadata is host-endian. Just double-checking.
> >>>> Is the new dynamic field host-endian or big-endian?
> >>>> I definitely would like to see the motivation in comments why
> >>>> data/mask are big-endian here.
> >>> Metadata is an opaque value, endianness does not matter, and there
> >>> was no special motivation for choosing the endianness.
> >>> The rte_flow_item_meta structure provides data with the rte_be32_t
> >>> type, so the meta-related action does the same.
> >>
> >> Endianness of meta in mbuf and flow API should match and it must be
> > Yes, and they match (for meta), both are rte_be32_t. OK, will
> > emphasize it in the docs.
> >
> >> documented. Endianness is important if a HW supports fewer bits,
> >> since it gives a hint for applications to use the LSBs first if the
> >> bit space is sufficient.
> >> mark is defined as host-endian (uint32_t) and I think meta should be
> >> the same. Otherwise it complicates even more either mark or meta
> >> usage as discussed above.
> > mark is mark, meta is meta, these ones are not interrelated (despite
> > they are becoming similar). And there is no choice between mark and
> > meta. It is not obvious that we should have the same endianness for
> > both. There are some applications using these legacy features, so
> > there might be compatibility issues as well.
>
> There are a few reasons above to make meta host-endian:
>  - match mark endianness (explained above, I still think that my
>    reasons are valid)
>  - make it easier for applications to use it without endianness
>    conversion if a sequence number is used to allocate metas (similar
>    to mark in OVS) and the bit space is less than 32 bits
>
> I see no single reason to keep it big-endian except a reason to keep it.
>
> Since tx_metadata is going away and metadata was used for Tx only
> before, it is a good moment to fix it.
>
> >> Yes, I think that rte_flow_item_meta should be fixed since both mark
> >> and tx_metadata are host-endian.
> >>
> >> (it says nothing about the HW interface which is vendor-specific and
> >> vendor PMDs should care about it)
> > Handling this in the PMD might introduce an extra endianness
> > conversion in the datapath, impacting performance. Not nice, IMO.
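
To make the sequence-number argument above concrete, here is a small
sketch. The 24-bit limit is an invented example of a HW that honours fewer
than 32 metadata bits.

#include <stdint.h>
#include <rte_byteorder.h>

/* Assumption for the example: the HW only honours the 24 LSBs of meta. */
#define META_HW_BITS 24
#define META_HW_MASK ((UINT32_C(1) << META_HW_BITS) - 1)

/* Host-endian meta: the low bits of a sequence counter are already the
 * low bits of the field, no conversion is needed on either side. */
static uint32_t
next_meta_host(uint32_t seq)
{
        return seq & META_HW_MASK;
}

/* Big-endian meta: the same counter must be byte-swapped on every
 * assignment, and swapped back when it is read on the Rx side. */
static rte_be32_t
next_meta_be(uint32_t seq)
{
        return rte_cpu_to_be_32(seq & META_HW_MASK);
}
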
>
> These concerns should not affect the external interface, since it could
> be different for different HW vendors.
OK, I will alter the metadata endianness to the host one. There is an
undefeatable argument: the other [META] items/actions are all in HE (host
endian).

With best regards,
Slava
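
P.S. For completeness, roughly what a rule could look like once the field
is host-endian - a sketch only, using the agreed uint32_t data/mask layout;
the ETH pattern and the QUEUE fate action are placeholders, and whether a
PMD accepts this exact combination is up to the PMD.

#include <stdint.h>
#include <rte_flow.h>

/* Set host-endian metadata on matched ingress packets; a later rule (or
 * the application, via the dynamic mbuf field) can then match/read it. */
static struct rte_flow *
create_set_meta_rule(uint16_t port_id, uint32_t meta,
                     struct rte_flow_error *err)
{
        const struct rte_flow_attr attr = { .ingress = 1 };
        const struct rte_flow_item pattern[] = {
                { .type = RTE_FLOW_ITEM_TYPE_ETH },
                { .type = RTE_FLOW_ITEM_TYPE_END },
        };
        const struct rte_flow_action_set_meta set_meta = {
                .data = meta,       /* host-endian value, no conversion */
                .mask = UINT32_MAX, /* update all bits */
        };
        const struct rte_flow_action_queue queue = { .index = 0 };
        const struct rte_flow_action actions[] = {
                { .type = RTE_FLOW_ACTION_TYPE_SET_META, .conf = &set_meta },
                { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
                { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        return rte_flow_create(port_id, &attr, pattern, actions, err);
}
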