On Sat, Nov 7, 2020 at 11:46 AM Jakub Kicinski <k...@kernel.org> wrote: > > On Tue, 3 Nov 2020 16:11:34 -0800 Yi-Hung Wei wrote: > > TUNNEL_GENEVE_OPT is set on tun_flags in struct sw_flow_key when > > a packet is coming from a geneve tunnel no matter the size of geneve > > option is zero or not. On the other hand, TUNNEL_VXLAN_OPT or > > TUNNEL_ERSPAN_OPT is set when the VXLAN or ERSPAN option is available. > > Currently, ovs kernel module only generates > > OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS when the tun_opts_len is non-zero. > > As a result, for a geneve packet without tun_metadata, when the packet > > is reinserted from userspace after upcall, we miss the TUNNEL_GENEVE_OPT > > in the tun_flags on struct sw_flow_key, and that will further cause > > megaflow matching issue. > > > > This patch changes the way that we deal with the upcall netlink message > > generation to make sure the geneve tun_flags is set consistently > > as the packet is firstly received from the geneve tunnel in order to > > avoid megaflow matching issue demonstrated by the following flows. > > This issue is only observed on ovs kernel datapath. > > > > Consider the following two flows, and the two cases. > > * flow1: icmp traffic from gnv0 to gnv1, without any tun_metadata > > * flow2: icmp traffic form gnv0 to gnv1 with tun_metadata0 > > > > Case 1) > > Send flow2 first, and then send flow1. When both flows are running, > > both the following two flows are hit, which is expected. > > > > table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff > > action=output:gnv1 > > table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x1/0xff > > action=output:gnv1 > > > > Case 2) > > Send flow1 first, then send flow2. When both flows are running, > > only the following flow is hit. > > table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff > > action=output:gnv1 > > > > Example flows) > > > > table=0, arp, actions=NORMAL > > table=0, in_port=gnv1, icmp, action=ct(table=1) > > table=0, in_port=gnv0, icmp > > action=move:NXM_NX_TUN_METADATA0[0..7]->NXM_NX_REG9[0..7], resubmit(,1) > > table=1, in_port=gnv1, icmp, action=output:gnv0 > > table=1, in_port=gnv0, icmp action=ct(table=2) > > table=2, priority=300, in_port=gnv0, icmp, ct_state=+trk+new, > > action=ct(commit),output:gnv1 > > table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff > > action=output:gnv1 > > table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x1/0xff > > action=output:gnv1 > > > > Fixes: fc4099f17240 ("openvswitch: Fix egress tunnel info.") > > Signed-off-by: Yi-Hung Wei <yihung....@gmail.com> > > Wouldn't it make more sense to make GENEVE behave like the other > tunnels rather than hack this logic into consumer of the flag?
Thanks Jakub for the review. Yes, it makes sense to fix on the tunnel side. I submit another patch to fix it: https://lore.kernel.org/netdev/1605053800-74072-1-git-send-email-yihung....@gmail.com/T/#u > Please make sure that you CC authors of the patch you're fixing > and the maintainers of the code you're changing. > > ./scripts/get_maintainer.pl is your friend. Thanks for reminding me. Will do that from now on. -Yi-Hung