On Tue, Jul 26, 2016 at 10:56 AM, pravin shelar <pshe...@ovn.org> wrote: > On Tue, Jul 26, 2016 at 10:06 AM, Jesse Gross <je...@kernel.org> wrote: >> On Mon, Jul 25, 2016 at 5:49 PM, Pravin B Shelar <pshe...@ovn.org> wrote: >>> OVS compat layer can handle tunnel GSO packets. but it does >>> keep skb encapsulation on for packet handled in GSO. This can >>> confuse some NIC drivers. I have seen this issue on intel devices: >>> >>> i40e 0000:42:00.0: TX driver issue detected, PF reset issued >>> >>> Following patch resets this bit in case compat layer handles the packet. >>> >>> VMware-BZ: 1698877 >>> Signed-off-by: Pravin B Shelar <pshe...@ovn.org> >> >> In upstream, this is done as part of the GSO code (for example, in >> __skb_udp_tunnel_segment()) so that probably makes more sense and is >> safer if this is GSO specific. There is already code in >> ovs_iptunnel_handle_offloads() that will clear the encapsulation bit >> in the case of checksum offload on the outer header. >> > ovs_iptunnel_handle_offloads() is not equivalent to > skb_udp_tunnel_segment(). If handle-offload clear the bit it would not > be possible to check the encapsulation bit after handle-offload call > in tunnel implementation. At this point this bit is not checked, so it > is not issue. But that would be different behavior compared to > upstream.
I was actually referring to existing behavior in ovs_iptunnel_handle_offlads(). Here is the code that I was talking about: #if LINUX_VERSION_CODE >= KERNEL_VERSION(3,8,0) /* If packet is not gso and we are resolving any partial checksum, * clear encapsulation flag. This allows setting CHECKSUM_PARTIAL * on the outer header without confusing devices that implement * NETIF_F_IP_CSUM with encapsulation. */ if (csum_help) skb->encapsulation = 0; #endif In the case of outer checksums being enabled and offloaded and no GSO, we will have cleared skb->encapsulation bit already, so there should be no confusion for the NIC driver. As a result, this seems like a GSO-only problem and we can just mirror what upstream does and clear the bit in the GSO code. I'm a little bit nervous about indiscriminately clearing it in all cases, since it seems like it is possible that we will accidentally do it in some case where we are trying to use more of the network stack. >> Something that I noticed while looking at this is it looks like the >> recent patch that moved the check for gso_type_mask into a GSO-only >> block in ovs_iptunnel_handle_offloads() might cause a bit of a >> performance regression. Even though that field is GSO-specific, it is >> also used to control whether we resolve partial checksums. Even in >> cases where we do need to use the OVS offload compat code, I suspect >> we could take better advantage of hardware offloads - not computing >> the checksum when !skb->encapsulation (since every kernel can do UDP >> checksum offload), using scatter/gather in GSO, checking for backport >> support when clearing type in rpl_udp_tunnel_handle_offloads(). > > I am not sure I understand it correctly. the recent change actually > using compat gso code only for GSO packet, earlier it was using it for > non GSO packets too. By not setting "OVS_GSO_CB(skb)->fix_segment" we > are using networking stack. That's true - it will default to NULL. In that case, I don't know if this is entirely safe though. When encapsulation offloads first went in, many drivers essentially assumed that this meant VXLAN. It wasn't until 3.18 that ndo_features_check was available and drivers started implementing it. While this affects TSO more than checksum, issues are still possible if we try to pass a different type of encapsulation. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev