On Wed, Mar 24, 2021 at 10:51 AM Paolo Abeni <pab...@redhat.com> wrote:
>
> On Tue, 2021-03-23 at 21:54 -0400, Willem de Bruijn wrote:
> > > I did not look at that before your suggestion. Thanks for pointing out.
> > >
> > > I think the problem is specific to UDP: when processing the outer UDP
> > > header that is potentially eligible for both NETIF_F_GSO_UDP_L4 and
> > > gro_receive aggregation and that is the root cause of the problem
> > > addressed here.
> >
> > Can you elaborate on the exact problem? The commit mentions "inner
> > protocol corruption, as no overaly network parameters is taken in
> > account at aggregation time."
> >
> > My understanding is that these are udp gro aggregated GSO_UDP_L4
> > packets forwarded to a udp tunnel device. They are not encapsulated
> > yet. Which overlay network parameters are not, but should have been,
> > taken account at aggregation time?
>
> The scenario is as follow:
>
> * a NIC has NETIF_F_GRO_UDP_FWD or NETIF_F_GRO_FRAGLIST enabled
> * an UDP tunnel is configured/enabled in the system
> * the above NIC receives some UDP-tunneled packets, targeting the
>   mentioned tunnel
> * the packets go through gro_receive and they reache
>   'udp_gro_receive()' while processing the outer UDP header.
>
> without this patch, udp_gro_receive_segment() will kick in and the
> outer UDP header will be aggregated according to SKB_GSO_FRAGLIST
> or SKB_GSO_UDP_L4, even if this is really e.g. a vxlan packet.
>
> Different vxlan ids will be ignored/aggregated to the same GSO packet.
> Inner headers will be ignored, too, so that e.g. TCP over vxlan push
> packets will be held in the GRO engine till the next flush, etc.
>
> Please let me know if the above is more clear.

Yes, thanks a lot! That's very concrete.

When processing the outer UDP tunnel header in the GRO completion path, it is incorrectly identified as an inner UDP transport layer, because NAPI_GRO_CB(skb) indicates that such a layer is present (is_flist).

The issue is that the UDP GRO layer distinguishes between tunnel and transport layer too late, in udp_gro_complete, while the offending assumption that UDP == transport layer was already made in its callers, udp4_gro_complete and udp6_gro_complete.
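
To check my understanding of the "different vxlan ids aggregated to the same GSO packet" case, here is a rough userspace model. It is not kernel code: struct model_pkt, both same_flow_* helpers and count_aggregates() are names made up for illustration, and the tunnel-aware variant compares the VNI directly, whereas I assume the real stack would instead hand the packet to the tunnel's own gro_receive callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a received packet; not a kernel structure. */
struct model_pkt {
	uint16_t outer_sport;
	uint16_t outer_dport;
	bool     is_tunnel;	/* outer dport belongs to a configured UDP tunnel */
	uint32_t vni;		/* only meaningful when is_tunnel is true */
};

/* Treats the outer UDP header as a plain transport header:
 * only the ports are compared, the overlay is never inspected. */
static bool same_flow_transport_only(const struct model_pkt *a,
				     const struct model_pkt *b)
{
	assert(a && b);
	return a->outer_sport == b->outer_sport &&
	       a->outer_dport == b->outer_dport;
}

/* Additionally requires matching overlay parameters for tunnel packets. */
static bool same_flow_tunnel_aware(const struct model_pkt *a,
				   const struct model_pkt *b)
{
	if (!same_flow_transport_only(a, b))
		return false;
	if (a->is_tunnel != b->is_tunnel)
		return false;
	return !a->is_tunnel || a->vni == b->vni;
}

/* Number of aggregates produced when each packet is merged into the
 * previous one whenever same_flow() says they belong together. */
static size_t count_aggregates(const struct model_pkt *pkts, size_t n,
			       bool (*same_flow)(const struct model_pkt *,
						 const struct model_pkt *))
{
	size_t groups = 0;

	for (size_t i = 0; i < n; i++)
		if (i == 0 || !same_flow(&pkts[i - 1], &pkts[i]))
			groups++;
	return groups;
}
```

With three packets that share the outer ports but carry VNIs 10, 20 and 20, the transport-only check folds everything into a single aggregate, while the tunnel-aware check keeps VNI 10 separate from VNI 20.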