Hi Cunming, On 07/06/2016 08:44 AM, Liang, Cunming wrote: > Hi Olivier, > > On 7/5/2016 11:41 PM, Olivier Matz wrote: >> Introduce the function rte_pktmbuf_get_ptype() that parses a >> mbuf and returns its packet type. For now, the following packet >> types are parsed: >> L2: Ether >> L3: IPv4, IPv6 >> L4: TCP, UDP, SCTP >> >> The goal here is to provide a reference implementation for packet type >> parsing. This function will be used by testpmd in next commits, allowing >> to compare its result with the value given by the hardware. >> >> This function will also be useful when implementing Rx offload support >> in virtio pmd. Indeed, the virtio protocol gives the csum start and >> offset, but it does not give the L4 protocol nor it tells if the >> checksum is relevant for inner or outer. This information has to be >> known to properly set the ol_flags in mbuf. >> >> Signed-off-by: Didier Pallard <didier.pallard at 6wind.com> >> Signed-off-by: Jean Dao <jean.dao at 6wind.com> >> Signed-off-by: Olivier Matz <olivier.matz at 6wind.com> >> --- >> doc/guides/rel_notes/release_16_11.rst | 5 + >> lib/librte_mbuf/Makefile | 5 +- >> lib/librte_mbuf/rte_mbuf_ptype.c | 234 >> +++++++++++++++++++++++++++++++++ >> lib/librte_mbuf/rte_mbuf_ptype.h | 43 ++++++ >> lib/librte_mbuf/rte_mbuf_version.map | 1 + >> 5 files changed, 286 insertions(+), 2 deletions(-) >> create mode 100644 lib/librte_mbuf/rte_mbuf_ptype.c >> >> [...] >> + >> +/* parse mbuf data to get packet type */ >> +uint32_t rte_pktmbuf_get_ptype(const struct rte_mbuf *m, >> + struct rte_mbuf_hdr_lens *hdr_lens) >> +{ >> + struct rte_mbuf_hdr_lens local_hdr_lens; >> + const struct ether_hdr *eh; >> + struct ether_hdr eh_copy; >> + uint32_t pkt_type = RTE_PTYPE_L2_ETHER; >> + uint32_t off = 0; >> + uint16_t proto; >> + >> + if (hdr_lens == NULL) >> + hdr_lens = &local_hdr_lens; >> + >> + eh = rte_pktmbuf_read(m, off, sizeof(*eh), &eh_copy); >> + if (unlikely(eh == NULL)) >> + return 0; >> + proto = eh->ether_type; >> + off = sizeof(*eh); >> + hdr_lens->l2_len = off; >> + >> + if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv4)) { >> + const struct ipv4_hdr *ip4h; >> + struct ipv4_hdr ip4h_copy; >> + >> + ip4h = rte_pktmbuf_read(m, off, sizeof(*ip4h), &ip4h_copy); >> + if (unlikely(ip4h == NULL)) >> + return pkt_type; >> + >> + pkt_type |= ptype_l3_ip(ip4h->version_ihl); >> + hdr_lens->l3_len = ip4_hlen(ip4h); >> + off += hdr_lens->l3_len; >> + if (ip4h->fragment_offset & >> + rte_cpu_to_be_16(IPV4_HDR_OFFSET_MASK | >> + IPV4_HDR_MF_FLAG)) { >> + pkt_type |= RTE_PTYPE_L4_FRAG; >> + hdr_lens->l4_len = 0; >> + return pkt_type; >> + } >> + proto = ip4h->next_proto_id; >> + pkt_type |= ptype_l4(proto); >> + } else if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv6)) { >> + const struct ipv6_hdr *ip6h; >> + struct ipv6_hdr ip6h_copy; >> + int frag = 0; >> + >> + ip6h = rte_pktmbuf_read(m, off, sizeof(*ip6h), &ip6h_copy); >> + if (unlikely(ip6h == NULL)) >> + return pkt_type; >> + >> + proto = ip6h->proto; >> + hdr_lens->l3_len = sizeof(*ip6h); >> + off += hdr_lens->l3_len; >> + pkt_type |= ptype_l3_ip6(proto); >> + if ((pkt_type & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV6_EXT) { >> + proto = skip_ip6_ext(proto, m, &off, &frag); >> + hdr_lens->l3_len = off - hdr_lens->l2_len; >> + } >> + if (proto == 0) >> + return pkt_type; >> + if (frag) { >> + pkt_type |= RTE_PTYPE_L4_FRAG; >> + hdr_lens->l4_len = 0; >> + return pkt_type; >> + } >> + pkt_type |= ptype_l4(proto); >> + } >> + >> + if ((pkt_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_UDP) { >> + hdr_lens->l4_len = sizeof(struct udp_hdr); >> + } else if ((pkt_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) { >> + const struct tcp_hdr *th; >> + struct tcp_hdr th_copy; >> + >> + th = rte_pktmbuf_read(m, off, sizeof(*th), &th_copy); >> + if (unlikely(th == NULL)) >> + return pkt_type & (RTE_PTYPE_L2_MASK | >> + RTE_PTYPE_L3_MASK); >> + hdr_lens->l4_len = (th->data_off & 0xf0) >> 2; >> + } else if ((pkt_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_SCTP) { >> + hdr_lens->l4_len = sizeof(struct sctp_hdr); >> + } else { >> + hdr_lens->l4_len = 0; >> + } >> + >> + return pkt_type; >> +} >> diff --git a/lib/librte_mbuf/rte_mbuf_ptype.h >> b/lib/librte_mbuf/rte_mbuf_ptype.h >> index 4a34678..f468520 100644 >> --- a/lib/librte_mbuf/rte_mbuf_ptype.h >> +++ b/lib/librte_mbuf/rte_mbuf_ptype.h >> @@ -545,6 +545,49 @@ extern "C" { >> RTE_PTYPE_INNER_L3_MASK | \ >> RTE_PTYPE_INNER_L4_MASK)) >> +struct rte_mbuf; >> + >> +/** >> + * Structure containing header lengths associated to a packet. >> + */ >> +struct rte_mbuf_hdr_lens { >> + uint8_t l2_len; >> + uint8_t l3_len; >> + uint8_t l4_len; >> + uint8_t tunnel_len; >> + uint8_t inner_l2_len; >> + uint8_t inner_l3_len; >> + uint8_t inner_l4_len; >> +}; > [LC] The header parsing graph usually is not unique. The definition > maybe nice for the basic IP and L4 tunnel. > However it can't scale out to other cases, e.g. qinq, mac-in-mac, mpls > l2/l3 tunnel. > The parsing logic of "rte_pktmbuf_get_ptype()" and the definition of > "struct rte_mbuf_hdr_lens" consist a pair for one specific parser scheme. > In this case, the fixed function is to support below. > > + * Supported packet types are: > + * L2: Ether > + * L3: IPv4, IPv6 > + * L4: TCP, UDP, SCTP > > Of course, it can add more packet type detection logic in future. But > the more support, the higher the cost. > > One of the alternative way is to allow registering parser pair. APP > decides to choose the predefined scheme(by DPDK LIB), or to self-define > the parsing logic. > In this way, the scheme can do some assumption for the specific case and > ignore some useless graph detection. > In addition, besides the SW parser, the HW parser(identified by > packet_type in mbuf) can be turn on/off by leveraging the same manner.
Sorry, I'm not sure I'm fully getting what you are saying. If I understand well, you would like to have something more flexible that supports the registration of protocol to be recognized? I'm not sure having a function with a dynamic registration method would really increase the performance compared to a static complete function. Actually, we will never support a tons of protocols since each layer packet type is 4 bits, and since it requires that at least one hw supports it. As described in the cover letter, the 2 main goals of this patchset are to provide a reference implementation for packet type recognition, and enable the support of virtio offloads (I'll send the patchset soon). This function is adapted to these 2 usages. Are you thinking of another use-case that would not be covered? Regards, Olivier