Hi Andrew, PSB,
Best,
Ori

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>
>
> On 4/8/21 2:39 PM, Ori Kam wrote:
> > Hi Andrew,
> >
> > Thanks for your comments.
> >
> > PSB,
> >
> > Best,
> > Ori
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>
> >> Sent: Thursday, April 8, 2021 11:05 AM
> >> Subject: Re: [PATCH] ethdev: add packet integrity checks
> >>
> >> On 4/5/21 9:04 PM, Ori Kam wrote:
> >>> Currently, a DPDK application can offload the checksum check
> >>> and have the result reported in the mbuf.
> >>>
> >>> However, as more and more applications offload some or all of
> >>> their logic and actions to the HW, there is a need to check
> >>> packet integrity so that the right decision can be taken.
> >>>
> >>> The application logic can be positive, meaning if the packet is
> >>> valid jump / do actions, or negative, meaning if the packet is
> >>> not valid jump to SW / do actions (like drop), and add a default
> >>> flow (match all in low priority) that will direct the missed
> >>> packets to the miss path.
> >>>
> >>> Since rte_flow currently works in the positive way, the
> >>> assumption is that the positive way will be the common way in
> >>> this case as well.
> >>>
> >>> When thinking about the best API to implement such a feature, we
> >>> need to consider the following (in no specific order):
> >>> 1. API breakage.
> >>
> >> First of all I disagree that "API breakage" is put as a top
> >> priority. Design is a top priority, since it is long term.
> >> API breakage is just a short-term inconvenience. Of course,
> >> others may disagree, but that's my point of view.
> >>
> > I agree with you, and like I said the order of the list is not
> > according to priorities.
> > I truly believe that what I'm suggesting is the best design.
> >
> >>> 2. Simplicity.
> >>> 3. Performance.
> >>> 4. HW capabilities.
> >>> 5. rte_flow limitations.
> >>> 6. Flexibility.
> >>>
> >>> First option: add integrity flags to each of the items.
> >>> For example, add checksum_ok to the ipv4 item.
> >>>
> >>> Pros:
> >>> 1. No new rte_flow item.
> >>> 2. Simple in the way that on each item the app can see
> >>>    what checks are available.
> >>
> >> 3. Natively supports various tunnels without any extra
> >>    changes in a shared item for all layers.
> >>
> > Also, in the currently suggested approach we have the level
> > member, so tunnels are supported by default. If someone wants to
> > also check the tunnel, he just needs to add this item again with
> > the right level (just like with other items).
>
> Thanks, missed it. Is it OK to have just one item with
> level 1 or 2?
>
Yes, of course. If the application just wants to check the sanity of
the inner packet, it can use one integrity item with a level of 2.

> What happens if two items with level 0 and level 1 are
> specified, but the packet has no encapsulation?
>
Level zero is the default one (the default, just like in the RSS case,
is PMD dependent, but in any case, to my knowledge, if there is no
tunnel level 0 will point to the header), and level 1 is the outermost,
so in this case both of them point to the same checks. But if, for
example, we use level = 2, then the checks for level 2 should fail:
the packet doesn't hold such info, just like when you check the state
of L4 and there is no L4, it should fail.
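(To make the level semantics concrete, here is a rough, untested sketch
in C of how an application might fill the proposed item to check only
the inner packet. The struct and item type are the ones proposed in the
patch below; the variable names are mine:)

#include <rte_flow.h>

/* Untested sketch: match integrity of the inner (level 2) headers. */
struct rte_flow_item_packet_integrity_checks spec = {
	.level = 2,     /* check the second encapsulation level */
	.packet_ok = 1, /* all HW checks for that level passed */
};
struct rte_flow_item_packet_integrity_checks mask = {
	.packet_ok = 1, /* match only the packet_ok bit */
};
struct rte_flow_item item = {
	.type = RTE_FLOW_ITEM_TYPE_PACKET_INTEGRITY_CHECKS,
	.spec = &spec,
	.mask = &mask,
};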
> >>> Cons:
> >>> 1. API breakage.
> >>> 2. Increases the number of flows, since the app can't add a
> >>>    global rule and must have a dedicated flow for each of the
> >>>    flow combinations. For example, matching on icmp traffic or
> >>>    UDP/TCP traffic with IPv4 / IPv6 will result in 5 flows.
> >>
> >> Could you expand on it? Shouldn't HW offloaded flows with good
> >> checksums go into dedicated queues, whereas bad packets go
> >> via the default path (i.e. no extra rules)?
> >>
> > I'm not sure what you mean. In a lot of the cases the
> > application will use that to detect valid packets and then
> > forward only valid packets down the flow (check valid, jump
> > --> on next group decap ....).
> > In other cases the app may choose to drop the bad packets, or
> > count and then drop, or maybe sample them to check whether this
> > is part of an attack.
> >
> > This is what is great about this feature: we just give the app
> > the ability to offload the sanity checks, and that enables it
> > to offload the traffic itself.
>
> Please, when you say "increase number of flows... in 5 flows",
> just try to express it in flow rules in both cases, just for my
> understanding. Since you calculated flows you should have a
> real example.
>
Sure, you are right, I should have a better example.
Let's take the example where the application wants all valid traffic
to jump to group 2. The possibilities of valid traffic can be:
Eth / ipv4
Eth / ipv6
Eth / ipv4 / udp
Eth / ipv4 / tcp
Eth / ipv6 / udp
Eth / ipv6 / tcp

So if we use the existing items we will get the following 6 flows:
Flow create 0 ingress pattern eth / ipv4 valid = 1 / end action jump group 2
Flow create 0 ingress pattern eth / ipv6 valid = 1 / end action jump group 2
Flow create 0 ingress pattern eth / ipv4 valid = 1 / udp valid = 1 / end action jump group 2
Flow create 0 ingress pattern eth / ipv4 valid = 1 / tcp valid = 1 / end action jump group 2
Flow create 0 ingress pattern eth / ipv6 valid = 1 / udp valid = 1 / end action jump group 2
Flow create 0 ingress pattern eth / ipv6 valid = 1 / tcp valid = 1 / end action jump group 2

While if we use the new item approach it is a single flow:
Flow create 0 ingress pattern integrity_check packet_ok = 1 / end action jump group 2

If we take the case that we just want valid L4 packets, then the
flows with the existing items will be:
Flow create 0 ingress pattern eth / ipv4 valid = 1 / udp valid = 1 / end action jump group 2
Flow create 0 ingress pattern eth / ipv4 valid = 1 / tcp valid = 1 / end action jump group 2
Flow create 0 ingress pattern eth / ipv6 valid = 1 / udp valid = 1 / end action jump group 2
Flow create 0 ingress pattern eth / ipv6 valid = 1 / tcp valid = 1 / end action jump group 2

While with the new item:
Flow create 0 ingress pattern integrity_check l4_ok = 1 / end action jump group 2

Is this clearer?
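(For completeness, the single-rule approach expressed through the C API
might look roughly like the sketch below. It is untested; the helper
name offload_valid_traffic is mine, and only the integrity item itself
comes from this patch, the rest is the standard rte_flow API:)

#include <stdint.h>
#include <rte_flow.h>

/* Untested sketch: one rule that jumps all valid packets to group 2,
 * instead of the six per-protocol rules listed above.
 */
static struct rte_flow *
offload_valid_traffic(uint16_t port_id, struct rte_flow_error *error)
{
	struct rte_flow_attr attr = { .ingress = 1 };
	struct rte_flow_item_packet_integrity_checks spec = { .packet_ok = 1 };
	struct rte_flow_item_packet_integrity_checks mask = { .packet_ok = 1 };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_PACKET_INTEGRITY_CHECKS,
		  .spec = &spec, .mask = &mask },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_jump jump = { .group = 2 };
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, error);
}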
> >>> Second option: dedicated item
> >>>
> >>> Pros:
> >>> 1. No API breakage, and there will be none for some time due to
> >>>    having extra space (by using bits).
> >>> 2. Just one flow to support icmp or UDP/TCP traffic with
> >>>    IPv4 / IPv6.
> >>
> >> It depends on how bad (or good) packets are handled.
> >>
> > Not sure what you mean.
>
> Again, I hope we understand each other when we talk in terms
> of real examples and flow rules.
>
Please see the answer above. I hope it makes things clearer.

> >>> 3. Simplicity: the application can just look at one place to see
> >>>    all possible checks.
> >>
> >> It is a drawback from my point of view, since the IPv4 checksum
> >> check is out of the IPv4 match item. I.e. analyzing IPv4 you
> >> should take a look at 2 different flow items.
> >>
> > Are you talking from the application viewpoint, PMD or HW?
> > From the application, yes, it is true that it needs to add one
> > more item to the list (depending on its flows), since it can have
> > just one flow that checks the whole packet, like I said, and move
> > the good ones to a different group, and in that group it will
> > match the ipv4 item.
> > For example:
> > ... pattern integrity = valid action jump group 3
> > Group 3 pattern .... ipv4 ... actions .....
> > Group 3 pattern .... ipv6 .... actions ...
> >
> > In any case, at worst it is just adding one more item
> > to the flow.
> >
> > From the PMD/HW side, extra items don't mean extra actions in HW;
> > they can be combined, just like they would have been if the
> > condition was in the item itself.
> >
> >>> 4. Allows future support for more tests.
> >>
> >> It is the same for both solutions since the per-item solution
> >> can keep reserved bits which may be used in the future.
> >>
> > Yes, I agree.
> >
> >>>
> >>> Cons:
> >>> 1. New item, that holds a number of fields from different items.
> >>
> >> 2. Not that nice for tunnels.
> >
> > Please look at the above (not direct) response: since we have the
> > level member, tunnels are handled very nicely.
> >
> >>
> >>>
> >>> For starters, the following bits are suggested:
> >>> 1. packet_ok - means that all HW checks depending on packet
> >>>    layer have passed. This may mean that in some HW such a flow
> >>>    should be split into a number of flows, or fail.
> >>> 2. l2_ok - all checks for layer 2 have passed.
> >>> 3. l3_ok - all checks for layer 3 have passed. If the packet
> >>>    doesn't have an l3 layer, this check should fail.
> >>> 4. l4_ok - all checks for layer 4 have passed. If the packet
> >>>    doesn't have an l4 layer, this check should fail.
> >>> 5. l2_crc_ok - the layer 2 CRC is O.K. It is possible that the
> >>>    CRC will be O.K. but l3_ok will be 0. It is not possible that
> >>>    l2_crc_ok will be 0 and l3_ok will be 1.
> >>> 6. ipv4_csum_ok - IPv4 checksum is O.K.
> >>> 7. l4_csum_ok - layer 4 checksum is O.K.
> >>> 8. l3_len_ok - check that the reported layer 3 len is smaller
> >>>    than the packet len.
> >>>
> >>> Example of usage:
> >>> 1. Check packets from all possible layers for integrity:
> >>>    flow create integrity spec packet_ok = 1 mask packet_ok = 1 .....
> >>>
> >>> 2. Check only packets with layer 4 (UDP / TCP):
> >>>    flow create integrity spec l3_ok = 1, l4_ok = 1 mask l3_ok = 1 l4_ok = 1
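(In C, usage example 2 would be filled in roughly as follows; an
untested sketch relying on the bitfield layout proposed below:)

#include <rte_flow.h>

/* Untested sketch of usage example 2: match only packets whose L3 and
 * L4 layers passed all HW integrity checks.
 */
struct rte_flow_item_packet_integrity_checks spec = {
	.l3_ok = 1,
	.l4_ok = 1,
};
struct rte_flow_item_packet_integrity_checks mask = {
	.l3_ok = 1, /* a bit is only matched if it is set in the mask */
	.l4_ok = 1,
};
struct rte_flow_item item = {
	.type = RTE_FLOW_ITEM_TYPE_PACKET_INTEGRITY_CHECKS,
	.spec = &spec,
	.mask = &mask,
};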
> >>>
> >>> Signed-off-by: Ori Kam <or...@nvidia.com>
> >>> ---
> >>>  doc/guides/prog_guide/rte_flow.rst | 18 ++++++++++++++++
> >>>  lib/librte_ethdev/rte_flow.h       | 46 ++++++++++++++++++++++++++++++++++++++
> >>>  2 files changed, 64 insertions(+)
> >>>
> >>> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> >>> index aec2ba1..58b116e 100644
> >>> --- a/doc/guides/prog_guide/rte_flow.rst
> >>> +++ b/doc/guides/prog_guide/rte_flow.rst
> >>> @@ -1398,6 +1398,24 @@ Matches a eCPRI header.
> >>>  - ``hdr``: eCPRI header definition (``rte_ecpri.h``).
> >>>  - Default ``mask`` matches nothing, for all eCPRI messages.
> >>>
> >>> +Item: ``PACKET_INTEGRITY_CHECKS``
> >>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>> +
> >>> +Matches packet integrity.
> >>> +
> >>> +- ``level``: the encapsulation level that should be checked. Level 0 means the
> >>> +  default PMD mode (can be innermost / outermost). A value of 1 means outermost
> >>> +  and a higher value means inner header. See also RSS level.
> >>> +- ``packet_ok``: all HW packet integrity checks have passed based on the max
> >>> +  layer of the packet.
> >>> +- ``l2_ok``: all layer 2 HW integrity checks passed.
> >>> +- ``l3_ok``: all layer 3 HW integrity checks passed.
> >>> +- ``l4_ok``: all layer 4 HW integrity checks passed.
> >>> +- ``l2_crc_ok``: layer 2 CRC check passed.
> >>> +- ``ipv4_csum_ok``: IPv4 checksum check passed.
> >>> +- ``l4_csum_ok``: layer 4 checksum check passed.
> >>> +- ``l3_len_ok``: the layer 3 len is smaller than the packet len.
> >>> +
> >>>  Actions
> >>>  ~~~~~~~
> >>>
> >>> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> >>> index 6cc5713..f6888a1 100644
> >>> --- a/lib/librte_ethdev/rte_flow.h
> >>> +++ b/lib/librte_ethdev/rte_flow.h
> >>> @@ -551,6 +551,15 @@ enum rte_flow_item_type {
> >>>  	 * See struct rte_flow_item_geneve_opt
> >>>  	 */
> >>>  	RTE_FLOW_ITEM_TYPE_GENEVE_OPT,
> >>> +
> >>> +	/**
> >>> +	 * [META]
> >>> +	 *
> >>> +	 * Matches on packet integrity.
> >>> +	 *
> >>> +	 * See struct rte_flow_item_packet_integrity_checks.
> >>> +	 */
> >>> +	RTE_FLOW_ITEM_TYPE_PACKET_INTEGRITY_CHECKS,
> >>>  };
> >>>
> >>>  /**
> >>> @@ -1685,6 +1694,43 @@ struct rte_flow_item_geneve_opt {
> >>>  };
> >>>  #endif
> >>>
> >>> +struct rte_flow_item_packet_integrity_checks {
> >>> +	uint32_t level;
> >>> +	/**< Packet encapsulation level the item should apply to.
> >>> +	 * @see rte_flow_action_rss
> >>> +	 */
> >>> +	RTE_STD_C11
> >>> +	union {
> >>> +		struct {
> >>> +			uint64_t packet_ok:1;
> >>> +			/**< The packet is valid after passing all HW checks. */
> >>> +			uint64_t l2_ok:1;
> >>> +			/**< L2 layer is valid after passing all HW checks. */
> >>> +			uint64_t l3_ok:1;
> >>> +			/**< L3 layer is valid after passing all HW checks. */
> >>> +			uint64_t l4_ok:1;
> >>> +			/**< L4 layer is valid after passing all HW checks. */
> >>> +			uint64_t l2_crc_ok:1;
> >>> +			/**< L2 layer CRC is valid. */
> >>> +			uint64_t ipv4_csum_ok:1;
> >>> +			/**< L3 layer checksum is valid. */
> >>> +			uint64_t l4_csum_ok:1;
> >>> +			/**< L4 layer checksum is valid. */
> >>> +			uint64_t l3_len_ok:1;
> >>> +			/**< The L3 len is smaller than the packet len. */
> >>> +			uint64_t reserved:56;
> >>> +		};
> >>> +		uint64_t value;
> >>> +	};
> >>> +};
> >>> +
> >>> +#ifndef __cplusplus
> >>> +static const struct rte_flow_item_packet_integrity_checks
> >>> +	rte_flow_item_packet_integrity_checks_mask = {
> >>> +	.value = 0,
> >>> +};
> >>> +#endif
> >>> +
> >>>  /**
> >>>   * Matching pattern item definition.
> >>>   *
> >>>
> >
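(One closing note: since HW support for the individual bits will differ
between PMDs, an application would presumably probe support before
creating such rules. A rough, untested sketch; the helper name
integrity_check_supported is mine, everything else is the standard
rte_flow API plus the item proposed above:)

#include <stdint.h>
#include <rte_flow.h>

/* Untested sketch: check whether the PMD accepts an integrity-check
 * rule before actually creating it.
 */
static int
integrity_check_supported(uint16_t port_id)
{
	struct rte_flow_attr attr = { .ingress = 1 };
	struct rte_flow_item_packet_integrity_checks spec = { .packet_ok = 1 };
	struct rte_flow_item_packet_integrity_checks mask = { .packet_ok = 1 };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_PACKET_INTEGRITY_CHECKS,
		  .spec = &spec, .mask = &mask },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_DROP },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
	struct rte_flow_error error;

	return rte_flow_validate(port_id, &attr, pattern, actions, &error) == 0;
}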