> > This is a UDP GSO packet egress packet that was further encapsulated > > with a GSO_UDP_TUNNEL on egress, then looped to the ingress path? > > > > Then in the ingress path it has traversed the GRO layer. > > > > Is this veth with XDP? That seems unlikely for GSO packets. But there > > aren't many paths that will loop a packet through napi_gro_receive or > > napi_gro_frags. > > This patch addresses the following scenario: > > sk ->vxlan -> veth -> (xdp in use, TSO disabled, GRO on) -> veth -> vxlan -> > sk > > What I meant here is that the issue is not visible with: > > (baremetal, NETIF_F_GRO_UDP_FWD | NETIF_F_GRO_FRAGLIST enabled -> vxlan -> sk > > > > with the appropriate features bit set, will validate the checksum for > > > both the inner and the outer header - udp{4,6}_gro_receive will be > > > traversed twice, the fist one for the outer header, the 2nd for the > > > inner. > > > > GRO will validate multiple levels of checksums with CHECKSUM_COMPLETE. > > It can only validate the outer checksum with CHECKSUM_UNNECESSARY, I > > believe? > > I possibly miss some bits here ?!? > > AFAICS: > > udp4_gro_receive() -> skb_gro_checksum_validate_zero_check() -> > __skb_gro_checksum_validate -> (if not already valid) > __skb_gro_checksum_validate_complete() -> (if not CHECKSUM_COMPLETE) > __skb_gro_checksum_complete() > > the latter will validate the UDP checksum at the current nesting level > (and set the csum-related bits in the GRO skb cb to the same status > as CHECKSUM_COMPLETE) > > When processing UDP over UDP tunnel packet, the gro call chain will be: > > [l2/l3 GRO] -> udp4_gro_receive (validate outher header csum) -> > udp_sk(sk)->gro_receive -> [other gro layers] -> > udp4_gro_receive (validate inner header csum) -> ...
Agreed. But __skb_gro_checksum_validate on the first invocation will call skb_gro_incr_csum_unnecessary, so that on the second invocation csum_cnt == 0 and triggers a real checksum validation? At least, that is my understanding. Intuitively, CHECKSUM_UNNECESSARY only validates the first checksum, so says nothing about the validity of any subsequent ones. But it seems I'm mistaken? __skb_gro_checksum_validate has an obvious exception for locally generated packets by excluding CHECKSUM_PARTIAL. Looped packets usually have CHECKSUM_PARTIAL set. Unfortunately, this is similar to the issue that I looked at earlier this year with looped UDP packets with MSG_MORE: those are also changed to CHECKSUM_NONE (and exposed a checksum bug: 52cbd23a119c). Is there perhaps some other way that we can identify that these are local packets? They should trivially avoid all checksum checks. > > As for looped packets with CHECKSUM_PARTIAL: we definitely have found > > bugs in that path before. I think it's fine to set csum_valid on any > > packets that can unambiguously be identified as such. Hence the > > detailed questions above on which exact packets this code is > > targeting, so that there are not accidental false positives that look > > the same but have a different ip_summed. > > I see this change is controversial. I have no concerns with the fix. It is just a very narrow path (veth + xdp - tso + gro ..), and to minimize risk I would try to avoid updating state of unrelated packets. That was my initial comment: I don't understand the need for the else clause. > Since the addressed scenario is > really a corner case, a simpler alternative would be > replacing udp_post_segment_fix_csum with: > > static inline void udp_post_segment_fix_cb(struct sk_buff *skb, int level) > { > /* UDP-lite can't land here - no GRO */ > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov); > > /* UDP CB mirrors the GSO packet, we must re-init it */ > UDP_SKB_CB(skb)->cscov = skb->len; > } > > the downside will be an additional, later, unneeded csum validation for the > > sk ->vxlan -> veth -> (xdp in use, TSO disabled, GRO on) -> veth -> vxlan -> > sk > > scenario. WDYT? No, let's definitely avoid an unneeded checksum verification.