On 2/19/24 17:19, Ilya Maximets wrote:
> On 2/7/24 03:21, Lim, Derrick wrote:
>> Hi Ilya Maximets,
>>
>> From the tcpdump, with or without the rewrite, the link-local address was 
>> used.
>>
>> ===
>> $ ovs-tcpdump -nn -i exit_p0
>> 11:10:26.323938 IP6 fe80::dc03:37ff:fee2:1fef.51513 > 
>> 2403:400:31da:ffff::18:3.4789: VXLAN, flags [I] (0x08), vni 1
>> IP 100.87.18.60 > 192.168.1.33: ICMP echo request, id 70, seq 1, length 64
>> 11:10:27.326875 IP6 fe80::dc03:37ff:fee2:1fef.51513 > 
>> 2403:400:31da:ffff::18:3.4789: VXLAN, flags [I] (0x08), vni 1
>> IP 100.87.18.60 > 192.168.1.33: ICMP echo request, id 70, seq 2, length 64
>> ===
>>
>> Here is the output of the trace without the rewrite.
>> ===
>> $ ovs-appctl ofproto/trace --names br-int 'in_port=dpdk-vm101,
>> eth_src=52:54:00:3d:cd:0c,eth_dst=00:00:00:00:00:01,eth_type=0x0800,
>> nw_src=100.87.18.60,nw_dst=192.168.1.33,nw_proto=1,nw_ttl=64,nw_frag=no,
>> icmp_type=8,icmp_code=0'
>> Flow: 
>> icmp,in_port="dpdk-vm101",vlan_tci=0x0000,dl_src=52:54:00:3d:cd:0c,dl_dst=00:00:00:00:00:01,nw_src=100.87.18.60,nw_dst=192.168.1.33,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,icmp_type=8,icmp_code=0
>>
>> bridge("br-int")
>> ------------------
>> 0. in_port="dpdk-vm101", priority 32768
>>     output:vxlan0
>>      -> output to native tunnel
>>      -> tunneling to fe80::920a:84ff:fe9e:9570 via br-phy
>>      -> tunneling from de:03:37:e2:1f:ef fe80::dc03:37ff:fee2:1fef to 
>> 90:0a:84:9e:95:70 fe80::920a:84ff:fe9e:9570
>>
>> bridge("br-phy")
>> -------------------
>> 0. priority 10
>>     NORMAL
>>      -> forwarding to learned port
>>
>> Final flow: unchanged
>> Megaflow: recirc_id=0,eth,ip,in_port="dpdk-vm101",nw_ecn=0,nw_frag=no
>> Datapath actions: 
>> tnl_push(tnl_port(vxlan_sys_4789),header(size=70,type=4,eth(dst=90:0a:84:9e:95:70,src=de:03:37:e2:1f:ef,dl_type=0x86dd),ipv6(src=fe80::dc03:37ff:fee2:1fef,dst=2403:400:31da:ffff::18:3,label=0,proto=17,tclass=0x0,hlimit=64),udp(src=0,dst=4789,csum=0xffff),vxlan(flags=0x8000000,vni=0x1)),out_port(br-phy)),push_vlan(vid=304,pcp=0),exit_p0
>> ===
>>
>> The "tunneling to fe80::920a:84ff:fe9e:9570 via br-phy" looks a bit curious.
>> I'm not sure why this was picked instead of the `remote_ip` specified in the
>> tunnel configuration. But then the final datapath actions shows the correct
>> `dst` address.
> 
> Hi.  Sorry for the late reply, was caught up in the releases.
> 
> The 'tunneling to' message may be a little misleading: it prints out the
> result of a route lookup, and we only use the device name from it while
> building a tunnel header.  The correct remote ip will be taken from a tunnel
> configuration, not the IP from a route lookup.  Maybe the wording in the
> trace needs some adjustment.

On a second look, it does seem a little strange.  The likely cause of
having fe80::920a:84ff:fe9e:9570 instead of the configured remote_ip
is that OVS found a route to 2403:400:31da:ffff::18:3 via a gateway
fe80::920a:84ff:fe9e:9570.  In one of the previous route lookups you
provided, fe80::920a:84ff:fe9e:9570 was indeed a gateway IP, so it
checks out.  The correct remote_ip is used in the actions because,
although we're sending the packet via the gateway, we're not sending it
to the gateway.  The gateway IP is only needed to get the destination MAC.

'tunneling to fe80::920a:84ff:fe9e:9570 via br-phy' should probably be
'tunneling via fe80::920a:84ff:fe9e:9570 and br-phy' in this case.
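
To illustrate the distinction, here is a rough Python sketch (not OVS
code).  The routing table, the neighbor cache and the function names are
all made up for the example; only the addresses come from the trace
above.  The gateway is used to pick the destination MAC and the output
device, while the outer IPv6 destination stays the configured remote_ip.

```python
import ipaddress

# Hypothetical, simplified routing table: (prefix, gateway, device).
# The real route from the setup is not shown in this thread.
ROUTES = [
    ("::/0", "fe80::920a:84ff:fe9e:9570", "br-phy"),
]

# Hypothetical neighbor cache: next-hop IP -> MAC address.
NEIGHBORS = {
    "fe80::920a:84ff:fe9e:9570": "90:0a:84:9e:95:70",
}


def route_lookup(dst):
    """Longest-prefix match; returns (gateway or None, device)."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, gateway, dev in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, dev)
    if best is None:
        raise LookupError("no route to " + dst)
    return best[1], best[2]


def build_outer_header(remote_ip):
    """Build the outer header fields for a tunnel to remote_ip."""
    gateway, dev = route_lookup(remote_ip)
    # The gateway (if any) is only the next hop used for MAC resolution.
    next_hop = gateway or remote_ip
    return {
        "eth_dst": NEIGHBORS[next_hop],  # MAC of the gateway
        "ipv6_dst": remote_ip,           # still the tunnel's remote_ip
        "out_dev": dev,
    }


print(build_outer_header("2403:400:31da:ffff::18:3"))
```

This matches what the datapath actions show: eth(dst=90:0a:84:9e:95:70)
together with ipv6(dst=2403:400:31da:ffff::18:3).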

> 
>> Why is the `local_ip` specified in the VXLAN tunnel options
>> not considered?
> I see there is a bug in the tunnel lookup code that doesn't take into
> account IPv6 local ip.  It only checks for IPv4 one.  The following
> change should fix it:
> 
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 1cf4d5f7c..89f183182 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -3815,6 +3815,8 @@ native_tunnel_output(struct xlate_ctx *ctx, const struct xport *xport,
>  
>      if (flow->tunnel.ip_src) {
>          in6_addr_set_mapped_ipv4(&s_ip6, flow->tunnel.ip_src);
> +    } else if (ipv6_addr_is_set(&flow->tunnel.ipv6_src)) {
> +        s_ip6 = flow->tunnel.ipv6_src;
>      }
>  
>      err = tnl_route_lookup_flow(ctx, flow, &d_ip6, &s_ip6, &out_dev);
> ---
> 
> Could you try it in your setup?
> 
> Without this change the route lookup is performed without taking the
> local_ip into account and later the local_ip is not used for the packet
> header.
> 
> I'll work on a proper patch for this.

FWIW, I posted a fix here:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/20240220223547.2368878-4-i.maxim...@ovn.org/
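
For readers who don't want to dig through the C code, the selection
logic in the diff quoted above can be modeled roughly like this in
Python.  This is only a sketch of the idea: pick_tunnel_source() is a
made-up name, and treating "not set" as the all-zeros address is my
simplification of what the ipv6_addr_is_set() check in the diff does.

```python
import ipaddress


def pick_tunnel_source(ip_src, ipv6_src):
    """Return the source hint for the route lookup, or None if unset.

    ip_src:   configured IPv4 local_ip as a string, or None
    ipv6_src: configured IPv6 local_ip as a string, or None
    """
    if ip_src:
        # IPv4 local_ip: represent it as an IPv4-mapped IPv6 address.
        v4 = int(ipaddress.IPv4Address(ip_src))
        return ipaddress.IPv6Address((0xFFFF << 32) | v4)
    if ipv6_src is not None:
        v6 = ipaddress.IPv6Address(ipv6_src)
        # This is the branch the fix adds: honor an IPv6 local_ip too.
        if int(v6) != 0:
            return v6
    # Nothing configured: the route lookup chooses the source itself.
    return None


print(pick_tunnel_source(None, "2403:400:31da:ffff::18:6"))
```

Without the second branch the function would return None for an
IPv6-only configuration, which corresponds to the behaviour seen in the
traces: the lookup is done without local_ip and the link-local address
ends up in the header.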

> 
> Best regards, Ilya Maximets.
> 
>>
>> Here is the output of the trace with the rewrite. It seems the flow entry was
>> matched but the rewrite didn't happen.
>>
>> ===
>> $ ovs-appctl ofproto/trace --names br-int 'in_port=dpdk-vm101,
>> eth_src=52:54:00:3d:cd:0c,eth_dst=00:00:00:00:00:01,eth_type=0x0800,
>> nw_src=100.87.18.60,nw_dst=192.168.1.33,nw_proto=1,nw_ttl=64,nw_frag=no,
>> icmp_type=8,icmp_code=0'
>> Flow: 
>> icmp,in_port="dpdk-vm101",vlan_tci=0x0000,dl_src=52:54:00:3d:cd:0c,dl_dst=00:00:00:00:00:01,nw_src=100.87.18.60,nw_dst=192.168.1.33,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,icmp_type=8,icmp_code=0
>>
>> bridge("br-int")
>> ------------------
>> 0. in_port="dpdk-vm101", priority 32768
>>     output:vxlan0
>>      -> output to native tunnel
>>      -> tunneling to fe80::920a:84ff:fe9e:9570 via br-phy
>>      -> tunneling from de:03:37:e2:1f:ef fe80::dc03:37ff:fee2:1fef to 
>> 90:0a:84:9e:95:70 fe80::920a:84ff:fe9e:9570
>>
>> bridge("br-phy")
>> -------------------
>> 0. ipv6,ipv6_dst=2403:400:31da:ffff::18:3, priority 499
>>     load:0x180006->NXM_NX_IPV6_SRC[0..63]
>>     load:0x2403040031daffff->NXM_NX_IPV6_SRC[64..127]
>>     NORMAL
>>      -> forwarding to learned port
>>
>> Final flow: unchanged
>> Megaflow: recirc_id=0,eth,ip,in_port="dpdk-vm101",nw_ecn=0,nw_frag=no
>> Datapath actions: 
>> tnl_push(tnl_port(vxlan_sys_4789),header(size=70,type=4,eth(dst=90:0a:84:9e:95:70,src=de:03:37:e2:1f:ef,dl_type=0x86dd),ipv6(src=fe80::dc03:37ff:fee2:1fef,dst=2403:400:31da:ffff::18:3,label=0,proto=17,tclass=0x0,hlimit=64),udp(src=0,dst=4789,csum=0xffff),vxlan(flags=0x8000000,vni=0x1)),out_port(br-phy)),push_vlan(vid=304,pcp=0),exit_p0
>> ===
>>
>> # VXLAN interface on bridge configuration
>> ===
>>     Port vxlan0
>>         Interface vxlan0
>>             type: vxlan
>>             options: {dst_port="4789", key="1", 
>> local_ip="2403:400:31da:ffff::18:6", remote_ip="2403:400:31da:ffff::18:3"}
>> ===
>>
>> Thank you,
>> Derrick
>>
>>  
>>
>> *From: *Ilya Maximets <i.maxim...@ovn.org>
>> *Date: *Friday, February 2, 2024 at 22:27
>> *To: *Lim, Derrick | Derrick | CMD <derrick....@rakuten.com>, 
>> ovs-discuss@openvswitch.org <ovs-discuss@openvswitch.org>
>> *Cc: *i.maxim...@ovn.org <i.maxim...@ovn.org>
>> *Subject: *Re: [ovs-discuss] Encapsulate VXLAN and then process other flows
>>
>> On 2/2/24 08:58, Lim, Derrick via discuss wrote:
>>> Hi Ilya Maximets,
>>>
>>>>  The rules look mostly fine.  I think the main problem you have is 
>>>> priority.
>>>> Default priority for OF rules (if not specified) is 32768, so your new 
>>>> rules
>>>> with priority 50 are too low in a priority list and not getting hit.
>>>
>>> I tried this again with the default flow at priority 50 and mine at 499 but 
>>> I
>>> still couldn't get the flow to hit.
>>>
>>> However, I observed that if the source address is set to anything other than
>>> `2403:400:31da:ffff::18:6`, which is an address that exists on the phy 
>>> `br-phy`
>>> interface, the lookup is a hit and the action taken.
>>>
>>> Is there anything that prevents the address from being set to that of 
>>> something
>>> that is already configured on an interface?
>>>
>>> For example,
>>>
>>> $ ip addr
>>> 35: br-phy: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc 
>>> fq_codel state UNKNOWN group default qlen 1000
>>>      link/ether de:03:37:e2:1f:ef brd ff:ff:ff:ff:ff:ff
>>>      inet6 2403:400:31da:ffff::18:6/128 scope global
>>>         valid_lft forever preferred_lft forever
>>>      inet6 fe80::dc03:37ff:fee2:1fef/64 scope link
>>>         valid_lft forever preferred_lft forever
>>>
>>> Set the source address to `2403:400:31da:ffff::18:5`. In the flow entry,
>>> `set(ipv6(src=2403:400:31da:ffff::18:5))` is applied in actions.
>>>
>>> ===
>>>
>>> $ /usr/bin/ovs-ofctl add-flow br-phy \
>>>      
>>> "priority=499,ipv6,ipv6_dst=2403:400:31da:ffff::18:3,actions=set_field:2403:400:31da:ffff::18:5->ipv6_src,normal"
>>>
>>> $ /usr/bin/ovs-appctl dpctl/dump-flows -m netdev@ovs-netdev | grep 
>>> 192.168.1.33
>>>
>>> ufid:acc4b3bc-4958-412d-90c2-9bc4b3fbfac7,
>>>      
>>> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),
>>>      recirc_id(0),dp_hash(0/0),in_port(dpdk-vm101),packet_type(ns=0,id=0),
>>>      
>>> eth(src=52:54:00:3d:cd:0c/00:00:00:00:00:00,dst=00:00:00:00:00:01/00:00:00:00:00:00),eth_type(0x0800),
>>>      
>>> ipv4(src=100.87.18.60/0.0.0.0,dst=192.168.1.33/0.0.0.0,proto=1/0,tos=0/0x3,ttl=64/0,frag=no),
>>>      icmp(type=8/0,code=0/0), packets:407, bytes:39886, used:0.661s, dp:ovs,
>>> actions:
>>>      tnl_push(tnl_port(vxlan_sys_4789),header(size=70,type=4,
>>>               
>>> eth(dst=90:0a:84:9e:95:70,src=de:03:37:e2:1f:ef,dl_type=0x86dd),
>>>               
>>> ipv6(src=fe80::dc03:37ff:fee2:1fef,dst=2403:400:31da:ffff::18:3,label=0,proto=17,tclass=0x0,hlimit=64),
>>>               
>>> udp(src=0,dst=4789,csum=0xffff),vxlan(flags=0x8000000,vni=0x1)),out_port(br-phy)),
>>>      set(ipv6(src=2403:400:31da:ffff::18:5)),
>>>      push_vlan(vid=304,pcp=0),
>>>      exit_p0,
>>> dp-extra-info:miniflow_bits(4,1)
>>>
>>> ===
>>>
>>>
>>> Set the source address to `2403:400:31da:ffff::18:6`. This is a configured
>>> address on `br-phy`. The `set(ipv6(src=<addr>))` part is no longer applied.
>>>
>>> ===
>>>
>>> $ /usr/bin/ovs-ofctl add-flow br-phy \
>>>      
>>> "priority=499,ipv6,ipv6_dst=2403:400:31da:ffff::18:3,actions=set_field:2403:400:31da:ffff::18:6->ipv6_src,normal"
>>>
>>> $ /usr/bin/ovs-appctl dpctl/dump-flows -m netdev@ovs-netdev | grep 
>>> 192.168.1.33
>>>
>>> ufid:acc4b3bc-4958-412d-90c2-9bc4b3fbfac7,
>>>      
>>> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),
>>>      recirc_id(0),dp_hash(0/0),in_port(dpdk-vm101),packet_type(ns=0,id=0),
>>>      
>>> eth(src=52:54:00:3d:cd:0c/00:00:00:00:00:00,dst=00:00:00:00:00:01/00:00:00:00:00:00),eth_type(0x0800),
>>>      
>>> ipv4(src=100.87.18.60/0.0.0.0,dst=192.168.1.33/0.0.0.0,proto=1/0,tos=0/0x3,ttl=64/0,frag=no),
>>>      icmp(type=8/0,code=0/0), packets:423, bytes:41454, used:0.803s, dp:ovs,
>>> actions:
>>>      tnl_push(tnl_port(vxlan_sys_4789),header(size=70,type=4,
>>>               
>>> eth(dst=90:0a:84:9e:95:70,src=de:03:37:e2:1f:ef,dl_type=0x86dd),
>>>               
>>> ipv6(src=fe80::dc03:37ff:fee2:1fef,dst=2403:400:31da:ffff::18:3,label=0,proto=17,tclass=0x0,hlimit=64),
>>>               
>>> udp(src=0,dst=4789,csum=0xffff),vxlan(flags=0x8000000,vni=0x1)),out_port(br-phy)),
>>>      push_vlan(vid=304,pcp=0),
>>>      exit_p0,
>>> dp-extra-info:miniflow_bits(4,1)
>>>
>>> $ /usr/bin/ovs-ofctl dump-flows br-phy
>>>
>>> cookie=0x0, duration=170.787s, table=0, n_packets=251, n_bytes=40328,
>>>      priority=499,ipv6,ipv6_dst=2403:400:31da:ffff::18:3
>>>      
>>> actions=load:0x180006->NXM_NX_IPV6_SRC[0..63],load:0x2403040031daffff->NXM_NX_IPV6_SRC[64..127],NORMAL
>>>
>>> cookie=0x0, duration=1136.132s, table=0, n_packets=10175, n_bytes=1116852, 
>>> priority=10 actions=NORMAL
>>>
>>> ===
>>
>>
>> Hmm.  This is interesting.  We can see that some packets do actually hit the 
>> OpenFlow
>> rule (n_packets=251).  The decision to not include the set(ipv6(src=<addr>)) 
>> action
>> is likely made because it is the same as one already in the packet.  But we 
>> can see
>> from the datapath flow dump that it is supposed to be different:
>>
>>   tnl_push( ... ipv6(src=fe80::dc03:37ff:fee2:1fef, ... )
>>
>> This is a link-local IP of that interface.
>>
>> I suspect that a mishap happened somewhere and 2403:400:31da:ffff::18:6 was 
>> used for
>> the actual tunnel header, or it was used to update the local flow structure 
>> during
>> the flow translation instead of a link-local address and hence OVS thinks 
>> that the
>> address is already in the packet and doesn't set a new one.
>>
>> Can you capture this packet on the exit_p0 interface?  e.g. with ovs-tcpdump.
>> So we can see what is the actual IP of the outgoing packet.
>>
>> Also, it might be useful if you can run ofproto/trace command to check how 
>> OVS makes
>> the decision during the flow translation.  It should look something like 
>> this:
>>
>> $ ovs-appctl ofproto/trace --names br-int \
>> 'in_port=dpdk-vm101,
>>  eth_src=52:54:00:3d:cd:0c,eth_dst=00:00:00:00:00:01,eth_type=0x0800,
>>  nw_src=100.87.18.60,nw_dst=192.168.1.33,nw_proto=1,nw_ttl=64,nw_frag=no,
>>  icmp_type=8,icmp_code=0'
>>
>> This command doesn't have any side effects by default, i.e. it doesn't 
>> generate any
>> actual traffic.  It only shows the logic of how the datapath actions are 
>> generated.
>>
>> Best regards, Ilya Maximets.
>>
> 

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
