On 3/22/24 15:30, Felix Huettner wrote: >>> >>> Thanks for the report and analysis! >>> >>> I agree, this is indeed a bug in recirculation + tunnel offload logic. >>> >>> To recap: >>> >>> 1. TCP packet enters OVS. >>> 2. TCP packet gets encapsulated into UDP tunnel. >>> 3. Recirculation happens. >>> 4. Packet is re-parsed after recirculation with miniflow_extract() >>> or similar function. >>> 5. Packet is marked for UDP checksumming because we parse the outermost set >>> of headers. But since it is tunneled, it means inner UDP checksumming. >>> And that makes no sense, because the inner packet is TCP. >> >> Hi everyone, >> >> we are actually seeing a similar issue, allthough in our case we are >> crashing ovs with the following assertion when sending pings: >> >> util(pmd-c07/id:9)|EMER|lib/dp-packet.h:1425: assertion ip failed in >> dp_packet_ip_set_header_csum() >> >> or a segmentation fault when sending tcp traffic (because >> packet_tcp_complete_csum has not asserts) >> >>> >>> I'm assuming that this re-parsing is also messing up all the offsets in the >>> dp-packet. We likely need fix all the checksums before recirculation and >>> clear all the flags. For TSO however that would mean that we will need to >>> drop the packet as we don't have GSO yet and we can't actually segment >>> traffic in the middle of pipeline anyway... >> >> So miniflow_extract resets inner_l3_ofs and inner_l4_ofs quite at the >> start of the function and has no logic to add these values back (at >> least as far as i saw that). >> >>> >>> Mike, any thoughts on this? >>> >>> >>> A workaround for this particular situation hwoever would be enabling support >>> for lb-output action on the bond: >>> >>> ovs-vsctl set Port bond0 other_config:lb-output-action=true >> >> This helps at least for our issue. >> >>> >>> This should avoid the recirculation in this particular scenario. >>> But we still need to find a solution for the recirculation issue. >> >> I currently have no real idea how we could recover inner_l3_ofs for >> geneve/vxlan packets. Since for UDP traffic we do not know the upper >> protocol it is basically impossible to find out for sure if a given >> communication is geneve/vxlan. Therefor recovering inner_l3_ofs for a >> given packet sounds to me either like guessing or not really possible. >> >> An alternative i have though about is not resetting the offsets in >> miniflow_extract for packets that are recirculating. In this case we >> should have started with the original packet at some point and only >> modified it by actions we defined. If all of these actions modify the >> offsets as appropriate is there any reason at all to recalculate them? > > Hi everyone, > > i have built a first version of this here and it does indeed solve our > issues. Not sure though if there is more. I would like to send it to > ovs-dev sometime next week once i added some tests. > In the mean time you can take a look here: > > https://github.com/openvswitch/ovs/compare/master...felixhuettner:ovs:fix_it
Thanks, Felix! This is a possible route to take, however, there is still an issue with miniflow_extract() requesting UDP checksum offload not realizing that it requests it for an inner header. <rant> IMHO, the fact that flag can mean different things at different times is the root of many of the problems with offloading and we'll keep getting issues like this until API is fixed. </rant> I sent my version of the fix here: https://patchwork.ozlabs.org/project/openvswitch/patch/20240322144213.1073946-1-i.maxim...@ovn.org/ This should resolve the issue, but at a cost of dropping tunneled traffic when userspace-tso is enabled. Would be great if you can test it out. The other big issue is that these crashes are happening in the default configuration, i.e. do not actually require users to opt-in for userspace-tso. Best regards, Ilya Maximets. > > Thanks > Felix > >> >> The really ugly alternative that is probably also rather guesswork would >> be: >> <<< start of miniflow_extract >> tun_l3_ofs = inner_l3_ofs - l4_ofs; >> tun_l4_ofs = inner_l4_ofs - inner_l3_ofs; >> <<< main miniflow_extract code >> packet->inner_l3_ofs = packet->l4_ofs + tun_l3_ofs; >> packet->inner_l4_ofs = packet->inner_l3_ofs + tun_l4_ofs; >> <<< end of miniflow_extract >> >> But i do not think that is a good idea. >> >> Thanks >> Felix >> >> _______________________________________________ >> discuss mailing list >> disc...@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss _______________________________________________ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss