Hi Konstantin, On 01/20/2015 06:23 PM, Ananyev, Konstantin wrote: >> Sure, it does not make a big difference in terms of code. But >> in terms of API, the naming of the flag is coherent to what it is >> used for. And it's easier to find a simple definition, like: >> >> * Packet is IPv4. This flag must be set when using any offload feature >> * (TSO, L3 or L4 checksum) to tell the NIC that the packet is an IPv4 >> * packet. > > Ok, and what's wrong with: > "Packet is IPv4. This flag must be set when using any offload feature > (TSO, L3 or L4 checksum) to tell the NIC that the packet is an IPv4 > packet and no HW offload for IPv4 header checksum calculation is required" > ?
I honestly find the first one simpler. Again, I understand both are possible, but I think choosing the most trivial one is the right way for an API. >>> Ok, and why it should be our problem? >>> We have a lot of things done in a different manner then linux/freebsd >>> kernel drivers, >>> Why now it became a problem? >> >> If linux doesn't need an equivalent flag for doing the same thing, >> it probably means we don't need it either. > > Probably yes .... Or probably not. > Why do we need to guess what was the intention of guys who wrote that part of > linux driver? Because the dpdk looks very similar to that part of linux driver. > BTW, the macro for GRE is here: > find lib/librte_pmd_i40e/i40e -type f | xargs grep TUN | grep TXD > lib/librte_pmd_i40e/i40e/i40e_type.h:#define I40E_TXD_CTX_UDP_TUNNELING > (0x1ULL << I40E_TXD_CTX_QW0_NATT_SHIFT) > lib/librte_pmd_i40e/i40e/i40e_type.h:#define I40E_TXD_CTX_GRE_TUNNELING > (0x2ULL << I40E_TXD_CTX_QW0_NATT_SHIFT) > > Though it not used (yet?) by some reason. > >> >> In a performance-oriented software like dpdk, having a flag that we >> don't know what the hardware does with, that is not needed in other >> drivers of the same harware, that makes the API harder to understand >> could be a problem. > > Here is a HW spec, that says what values have to be setup for L4TUNT. > Yes, I am not sure why they need to distinguish between VXLAN/GRE tunnelling. > Though, I suppose that wouldn't eliminate the requirement. > But for same, there is no good explanation why FVL HW need to know that it is > IPv4 or IPv6 packet, > in the case when only L4 checksum offload is required (IIPT field). > Niantic, as I remember, is able to work ok without that requirement. > Though, we still have to set it up. > >> Another argument: if we can remove this flag, it would make the >> testpmd commands reworkd proposed by Jijiang much more easy to >> understand: only a new "csum parse-tunnel on|off" would be required, >> and it can be explained in a few words. > > Well, from my point - testpmd commands that Jijiang proposed are perfectly > clear and understandable. > Another thing, as I remember, our primary concern should be public API, no > testpmd. OK let's talk about testpmd later. >> We should avoid the need to specify the tunnel type in the OUTER >> checksum API if we can, else it would limit us to specific >> supported protocols. > > From the FVL spec it is required by HW, it is not what we introducing on our > own. > Spec stays explicitly that L4TUNT (L4 tunneling type) has to be setup for > tunnelling packets. > Again from the spec, there are 3 different values it can take. > If you have an idea how to pass that information to PMD without using flags, > sure we can consider it. > >> >>>>>> I think the following cases should be *forbidden by the API*: >>>>>> >>>>>> case 9) calculate checksum of in_ip and in_tcp (was case B.1 in [1]) >>>>>> >>>>>> mb->outer_l2_len = len(out_eth) >>>>>> mb->outer_l3_len = len(out_ip) >>>>>> mb->l2_len = len(out_udp + vxlan + in_eth) >>>>>> mb->l3_len = len(out_ip) >>>>>> mb->ol_flags |= PKT_TX_IPV4 | PKT_TX_UDP_TUNNEL_PKT | \ >>>>>> PKT_TX_IP_CSUM | PKT_TX_UDP_CKSUM; >>>>>> set out_ip checksum to 0 in the packet >>>>>> set out_udp checksum to pseudo header using rte_ipv4_phdr_cksum() >>>>>> >>>>>> If we remove the flag PKT_TX_UDP_TUNNEL_PKT, this cannot be >>>>>> supported, but there is no reason to support it as there is >>>>>> already one way to do the same. >>>>>> >>>>>> I think the driver should not even look at mb->outer_l2_len >>>>>> and mb->outer_l3_len if no flag PKT_TX_OUTER_* is set. >>>>> >>>>> Why it should be forbidden? >>>>> I admit it might be a bit slower than case 4), >>>>> but I think absolutely legal way to setup HW offloads for inner L3/L4. >>>>> As I said we need a PKT_TX_UDP_TUNNEL_PKT anyway, so I suppose >>>>> PKT_TX_*_TUNNEL_PKT should be an indication is it a tunnel packet or not. >>>>> PKT_TX_OUTER_* flags indicate does outer cksum offload is required or not. >>>> >>>> I don't understand. The result in terms of hardware is exactly the >>>> same than case 4). Why should we have 2 different ways for doing the >>>> same thing? >>> >>> If HW supports that capability, why should we forbid it? >>> Let user to choose himself what way to use. >>> FVL spec lists it as a valid approach. >> >> It is not a hardware feature. > > It is. > >> Case 4) and case 9) would fill the hardware registers exactly the same. > > No, they wouldn't. > Please read corresponding section of FVL spec and i40e_rxtx.c > For case 4) we only need to setup TDD (TX data descriptor) with the following > values: > IIPT, IPLEN, L4T, L4LEN > For case 9) we need to setup both TDD and TCD (TX context descriptor) with > the following values: > TDD: IIPT, IPLEN, L4T, L4LEN > TCD: EIPT, EIPLEN, L4TUNT, L4TUNLEN > >> To me, it's just an API question. > > No, it is not. > > I still don't understand why you are so eager to 'forbid' it. > Yes we support it for FVL, but no one forces you to use it. Well, how would you describe this 2 ways of doing the same thing in the offload API? Would you talk about the i40e registers? It's not because i40e has 2 ways to do the same operation that the DPDK should do the same. How will you explain to a user how to choose between these 2 cases? Having to support these 2 different cases for the same thing will complexify all future drivers that will not work the same way than i40e. >>> As one of possible use-cases: HW VLAN tags insertion for both inner and >>> outer packets. >>> FVL can do that, though as I know our PMD doesn't implement it yet. >>> For that, we'll need to specify at least: >>> outer_l2_len, outer_l3_len, l2_len. >>> While PKT_TX_OUTER_* might stay cleared. >> >> If a VLAN flag has to be inserted in outer header, a new flag >> PKT_TX_OUTER_INSERT_VLAN would be added. So my specification >> would still be correct: >> >> The driver should look at mb->outer_lX_len only if a >> PKT_TX_OUTER_* flag is present. >> > > Introducing PKT_TX_OUTER_INSERT_VLAN is ok. > Though I still think we'll need TX_*_TUNNEL flags and no need to 'forbid' > case 9). > BTW, as I can see linux i40e driver for tunnelling packets uses case 9), not > case 4), right? I need to check this. Regards, Olivier