Hi Slava,

> -----Original Message-----
> From: dev <dev-boun...@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Monday, October 11, 2021 9:15 PM
> Subject: [dpdk-dev] [PATCH v3 1/5] ethdev: introduce configurable flexible 
> item
> 
> 1. Introduction and Retrospective
> 
> Networks are evolving quickly: network structures are getting more
> complicated and new application areas are emerging. To address these
> challenges, new network protocols are continuously being developed,
> considered by technical communities, adopted by industry and, eventually,
> implemented in hardware and software. The DPDK framework follows these
> trends: a glance at the RTE Flow API header shows that multiple new items
> have been introduced since the initial release.
> 
> Adopting and implementing a new protocol is not straightforward and takes
> time: the protocol passes through development, consideration, adoption, and
> implementation phases. The industry tries to anticipate forthcoming network
> protocols; for example, many hardware vendors are implementing flexible and
> configurable network protocol parsers. As DPDK developers, could we
> anticipate the near future in the same fashion and introduce similar
> flexibility in the RTE Flow API?
> 
> Looking at what is already merged in the project, we find the raw item
> (rte_flow_item_raw). At first glance it looks suitable, and we can try to
> implement flow matching on the header of some relatively new tunnel
> protocol, say the GENEVE header with variable length options. On further
> consideration, we run into the raw item limitations:
> 
> - only fixed size network header can be represented
> - the entire network header pattern of fixed format
>   (header field offsets are fixed) must be provided
> - the search for patterns is not robust (the wrong matches
>   might be triggered), and actually is not supported
>   by existing PMDs
> - no explicitly specified relations with preceding
>   and following items
> - no tunnel hint support
> 
> As a result, supporting tunnel protocols such as the aforementioned GENEVE
> with variable length options using the raw item becomes very complicated
> and would require multiple flows and multiple raw items chained in the same
> flow (and no support for chained raw items was found in the existing
> drivers).
> 
> This RFC introduces the dedicated flex item (rte_flow_item_flex) to handle 
> matches with existing and new
> network protocol headers in a unified fashion.
> 
> 2. Flex Item Life Cycle
> 
> Let's assume there is a requirement to support a new network protocol with
> RTE Flows. The protocol specification provides:
> 
>   - header format
>   - header length (can be variable, depending on options)
>   - potential presence of extra options following or included
>     in the header
>   - the relations with preceding protocols. For example,
>     the GENEVE follows UDP, eCPRI can follow either UDP
>     or L2 header
>   - the relations with following protocols. For example,
>     the next layer after tunnel header can be L2 or L3
>   - whether the new protocol is a tunnel and the header
>     is a splitting point between outer and inner layers
> 
> The intended way to operate a flex item:
> 
>   - application defines the header structures according to
>     protocol specification
> 
>   - application calls rte_flow_flex_item_create() with the desired
>     configuration according to the protocol specification. The call
>     creates the flex item object on the specified ethernet device
>     and prepares the PMD and underlying hardware to handle the flex
>     item. On item creation, the PMD backing the specified
>     ethernet device returns an opaque handle identifying
>     the created object
> 
>   - application uses the rte_flow_item_flex with obtained handle
>     in the flows, the values/masks to match with fields in the
>     header are specified in the flex item per flow as for regular
>     items (except that pattern buffer combines all fields)
> 
>   - flows with flex items match with packets in a regular fashion,
>     the values and masks for the new protocol header match are
>     taken from the flex items in the flows
> 
>   - application destroys flows with flex items
> 
>   - application calls rte_flow_flex_item_release(), part of the
>     ethernet device API, which destroys the flex item object in
>     the PMD and releases the engaged hardware resources
> 
> 3. Flex Item Structure
> 
> The flex item structure is intended to be used as part of the flow pattern,
> like regular RTE flow items, and provides the mask and value to match
> against the fields of the protocol the item was configured for.
> 
>   struct rte_flow_item_flex {
>     void *handle;
>     uint32_t length;
>     const uint8_t* pattern;
>   };
> 
> The handle is an opaque object maintained on a per-device basis by the
> underlying driver.
> 
> The protocol header fields are treated as bit fields; all offsets and
> widths are expressed in bits. The pattern is the buffer containing the bit
> concatenation of all the fields presented at item configuration time, in
> the same order and number. If byte boundary alignment is needed, the
> application can use a dummy type field as a gap filler.
> 
> The length field specifies the pattern buffer length in bytes and is needed
> to allow rte_flow_copy() operations. The approach of multiple pattern
> pointers and lengths (per field) was considered and found clumsy: it seems
> much more suitable for the application to maintain a single structure
> within a single pattern buffer.
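
To illustrate the bit concatenation described above, here is a minimal sketch
of filling a pattern buffer field by field. The helper name
pattern_put_bits() is hypothetical and not part of the proposed API, and the
MSB-first bit order is an assumption, since the text does not specify it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: append the low `width` bits of `value` to `buf`,
 * starting at bit position `*bit_pos`, most significant bit first
 * (assumed bit order). The caller must zero `buf` beforehand and make
 * sure it is large enough for all fields.
 */
static void pattern_put_bits(uint8_t *buf, size_t *bit_pos,
                             uint64_t value, unsigned int width)
{
    assert(width <= 64);

    for (unsigned int i = 0; i < width; i++) {
        /* Take bits of `value` from the top of the field downwards. */
        unsigned int bit = (unsigned int)((value >> (width - 1 - i)) & 1u);
        size_t pos = *bit_pos + i;

        if (bit)
            buf[pos / 8] |= (uint8_t)(0x80u >> (pos % 8));
    }
    /* The next field starts right after this one, with no padding. */
    *bit_pos += width;
}
```

A dummy field would simply be written as zero bits of the configured width,
so that the fields following it land on the intended byte boundary.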
> 
> 4. Flex Item Configuration
> 
> The flex item configuration consists of the following parts:
> 
>   - header field descriptors:
>     - next header
>     - next protocol
>     - sample to match
>   - input link descriptors
>   - output link descriptors
> 
> The field descriptors tell the driver and hardware what data should be
> extracted from the packet and then control the packet handling in the flow
> engine. Besides this, sample fields can be presented to match against
> patterns in the flows. Each field is a bit pattern with a width, an offset
> from the header beginning, a mode of offset calculation, and offset related
> parameters.
> 
> The next header field is special, no data are actually taken from the packet, 
> but its offset is used as a
> pointer to the next header in the packet, in other words the next header 
> offset specifies the size of the
> header being parsed by flex item.
> 
> There is one more special field - next protocol, it specifies where the next 
> protocol identifier is contained
> and packet data sampled from this field will be used to determine the next 
> protocol header type to
> continue packet parsing. The next protocol field is like eth_type field in 
> MAC2, or proto field in IPv4/v6
> headers.
> 
> The sample fields represent the data to be sampled from the packet and then
> matched against established flows.
> 
> Several methods are provided to calculate the field offset at runtime,
> depending on configuration and packet content:
> 
>   - FIELD_MODE_FIXED - fixed offset. The bit offset from
>     header beginning is permanent and defined by field_base
>     configuration parameter.
> 
>   - FIELD_MODE_OFFSET - the field bit offset is extracted
>     from another header field (indirect offset field). The
>     resulting field offset to match is calculated as:
> 
>   field_base + ((*offset_base & offset_mask) << offset_shift)
> 
>     This mode is useful to sample some extra options following
>     the main header with field containing main header length.
>     Also, this mode can be used to calculate offset to the
>     next protocol header, for example - IPv4 header contains
>     the 4-bit field with IPv4 header length expressed in dwords.
>     One more example - this mode would allow us to skip GENEVE
>     header variable length options.
> 
>   - FIELD_MODE_BITMASK - the field bit offset is extracted
>     from other header field (indirect offset field), the latter
>     is considered as bitmask containing some number of one bits,
>     the resulting field offset to match is calculated as:
> 
>   field_base + (bitcount(*offset_base & offset_mask) << offset_shift)
> 
>     This mode would be useful to skip the GTP header and its
>     extra options with specified flags.
> 
>   - FIELD_MODE_DUMMY - dummy field, optionally used for byte
>     boundary alignment in pattern. Pattern mask and data are
>     ignored in the match. All configuration parameters besides
>     field size and offset are ignored.
> 
>   Note:  "*" - means the indirect field offset is calculated
>   and actual data are extracted from the packet by this
>   offset (like data are fetched by pointer *p from memory).
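
The offset modes listed above can be sketched in plain C as follows. This is
only an illustration: struct field_desc and flex_field_offset() are
hypothetical names, the indirect value is assumed to be already fetched from
the packet at offset_base, and the shift is assumed to apply to the extracted
value before field_base is added.

```c
#include <assert.h>
#include <stdint.h>

/* Offset calculation modes named in the text above. */
enum field_mode {
    FIELD_MODE_FIXED,
    FIELD_MODE_OFFSET,
    FIELD_MODE_BITMASK,
};

/* Hypothetical descriptor, reduced to the parameters used in the formulas. */
struct field_desc {
    enum field_mode mode;
    uint32_t field_base;
    uint32_t offset_mask;
    uint32_t offset_shift;
};

/* Count the one bits in v. */
static uint32_t bitcount32(uint32_t v)
{
    uint32_t n = 0;

    while (v) {
        v &= v - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Compute the field bit offset. `indirect` is the value of the indirect
 * offset field, i.e. the data found in the packet at offset_base.
 */
static uint32_t flex_field_offset(const struct field_desc *f, uint32_t indirect)
{
    assert(f->offset_shift < 32);

    switch (f->mode) {
    case FIELD_MODE_FIXED:
        return f->field_base;
    case FIELD_MODE_OFFSET:
        return f->field_base +
               ((indirect & f->offset_mask) << f->offset_shift);
    case FIELD_MODE_BITMASK:
        return f->field_base +
               (bitcount32(indirect & f->offset_mask) << f->offset_shift);
    }
    return f->field_base;
}
```

For an IPv4-like header whose length nibble holds 5 (dwords), a mask of 0xF
and a shift of 5 (dwords to bits) would give an offset of 160 bits, that is,
20 bytes.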
> 
> The offset mode list can be extended by vendors according to hardware 
> supported options.
> 
> The input link configuration section tells the driver after which protocols
> and under what conditions the flex item can follow.
> An input link specifies the preceding header pattern; for example, for
> GENEVE it can be a UDP item specifying a match on destination port 6081.
> The flex item can follow multiple header types, in which case multiple
> input links should be specified. At flow creation time, an item with one of
> the input link types should precede the flex item, and the driver will
> select the correct flex item settings depending on the actual flow pattern.
> 
> The output link configuration section tells the driver how to continue
> packet parsing after the flex item protocol.
> If multiple protocols can follow the flex item header, the flex item should
> contain a field with the next protocol identifier, and parsing will
> continue depending on the data contained in this field in the actual
> packet.
> 
> The flex item fields can participate in RSS hash calculation: a dedicated
> flag in the field description specifies which fields should be provided for
> hashing.
> 
> 5. Flex Item Chaining
> 
> If multiple protocols are to be supported with flex items in a chained
> fashion (two or more flex items within the same flow, possibly adjacent in
> the pattern), the flex items reference each other. In this case, the item
> that occurs first should be created with an empty output link list or with
> a list including existing items, and then the second flex item should be
> created referencing the first flex item as an input arc; drivers should
> adjust the item configuration accordingly.
> 
> Also, the hardware resources used by flex items to handle the packet can be
> limited. If multiple flex items are supposed to be used within the same
> flow, it would be useful to give the driver a hint that these items are
> intended for simultaneous usage. The fields of the items should be assigned
> hint indices, and these indices should be the same across the two or more
> flex items intended to be used within the same flow. In other words, the
> field hint index specifies the group of fields that can be matched
> simultaneously within a single flow. If hint indices are specified, the
> driver will try to engage non-overlapping hardware resources and provide
> independent handling of the field groups with unique indices. If the hint
> index is zero, the driver assigns resources on its own.
> 
> 6. Example of New Protocol Handling
> 
> Let's suppose we are required to handle a new tunnel protocol that follows
> a UDP header with destination port 0xFADE and is followed by a MAC header.
> Let the new protocol header format be like this:
> 
>   struct new_protocol_header {
>     rte_be32_t header_length; /* length in dwords, including options */
>     rte_be32_t specific0;     /* some protocol data, not matched in flows */
>     rte_be32_t specific1;     /* some protocol data, not matched in flows */
>     rte_be32_t crucial;       /* data of interest, match is needed */
>     rte_be32_t options[];     /* optional protocol data, variable length */
>   };
> 
> The supposed flex item configuration:
> 
>   struct rte_flow_item_flex_field field0 = {
>     .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
>     .field_size = 96,                /* three dwords from the beginning */
>   };
>   struct rte_flow_item_flex_field field1 = {
>     .field_mode = FIELD_MODE_FIXED,
>     .field_size = 32,       /* Field size is one dword */
>     .field_base = 96,       /* Skip three dwords from the beginning */
>   };
>   struct rte_flow_item_udp spec0 = {
>     .hdr = {
>       .dst_port = RTE_BE16(0xFADE),
>     }
>   };
>   struct rte_flow_item_udp mask0 = {
>     .hdr = {
>       .dst_port = RTE_BE16(0xFFFF),
>     }
>   };
>   struct rte_flow_item_flex_link link0 = {
>     .item = {
>        .type = RTE_FLOW_ITEM_TYPE_UDP,
>        .spec = &spec0,
>        .mask = &mask0,
>     },
>   };
> 
>   struct rte_flow_item_flex_conf conf = {
>     .next_header = {
>       .tunnel = FLEX_TUNNEL_MODE_SINGLE,
>       .field_mode = FIELD_MODE_OFFSET,
>       .field_base = 0,
>       .offset_base = 0,
>       .offset_mask = 0xFFFFFFFF,
>       .offset_shift = 2          /* Expressed in dwords, shift left by 2 */
>     },
>     .sample = {
>        &field0,
>        &field1,
>     },
>     .nb_samples = 2,
>     .input_link[0] = &link0,
>     .nb_inputs = 1
>   };
> 
> Let's suppose we have created the flex item successfully, and PMD returned 
> the handle 0x123456789A.
> We can use the following item pattern to match the crucial field in the 
> packet with value 0x00112233:
> 
>   struct new_protocol_header spec_pattern =
>   {
>     .crucial = RTE_BE32(0x00112233),
>   };
>   struct new_protocol_header mask_pattern =
>   {
>     .crucial = RTE_BE32(0xFFFFFFFF),
>   };
>   struct rte_flow_item_flex spec_flex = {
>     .handle = (void *)0x123456789A,  /* handle returned by the PMD */
>     .length = sizeof(struct new_protocol_header),
>     .pattern = (const uint8_t *)&spec_pattern,
>   };
>   struct rte_flow_item_flex mask_flex = {
>     .length = sizeof(struct new_protocol_header),
>     .pattern = (const uint8_t *)&mask_pattern,
>   };
>   struct rte_flow_item item_to_match = {
>     .type = RTE_FLOW_ITEM_TYPE_FLEX,
>     .spec = &spec_flex,
>     .mask = &mask_flex,
>   };
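
As a quick sanity check of the example configuration, the sketch below
verifies that the two sample fields (96 dummy bits plus the 32-bit crucial
field) cover the fixed part of the header, and shows how the next_header
settings would turn header_length into an offset. Here uint32_t stands in for
the big-endian field type, the struct and helper names are hypothetical, and
the result is a byte count because the example shifts a dword count left by 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed part of the example header; uint32_t stands in for the BE32 type. */
struct new_protocol_fixed {
    uint32_t header_length;
    uint32_t specific0;
    uint32_t specific1;
    uint32_t crucial;
};

/* Bit offset of the field of interest from the header beginning. */
static size_t crucial_bit_offset(void)
{
    return offsetof(struct new_protocol_fixed, crucial) * 8;
}

/* Hypothetical helper: read a big-endian 32-bit value from packet bytes. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Apply the next_header settings from the example (field_base = 0,
 * offset_base = 0, offset_mask = 0xFFFFFFFF, offset_shift = 2) to the
 * first dword of the header.
 */
static uint32_t next_header_offset(const uint8_t *hdr)
{
    const uint32_t field_base = 0;
    const uint32_t offset_mask = 0xFFFFFFFFu;
    const uint32_t offset_shift = 2;

    assert(hdr != NULL);
    return field_base + ((load_be32(hdr) & offset_mask) << offset_shift);
}
```

With header_length set to 6 (four fixed dwords plus two option dwords), the
next header would start 24 bytes after the beginning of the new protocol
header.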
> 
> Signed-off-by: Viacheslav Ovsiienko <viachesl...@nvidia.com>
> ---

Acked-by: Ori Kam <or...@nvidia.com>
Thanks,
Ori
