On 5/14/24 12:06, Brendan Doyle via discuss wrote:
> 
> 
> On 14/05/2024 09:50, Dumitru Ceara wrote:
>> On 5/7/24 12:38, Brendan Doyle via discuss wrote:
>>> Hi,
>>>
>>> Seems there is a regression with the latest LTS release in terms of Port
>>> Group ACLs when ports are in multiple Port Groups. As an example, I have
>>> 3 ports in a Port Group, and two of them in another Port Group that has
>>> an ACL to allow IP protocol 112. This used to work, but now I am seeing
>>> lots of:
>>>
>>> 2024-05-07T05:24:06.692Z|01357|acl_log(ovn_pinctrl0)|INFO|name="def-10",
>>> verdict=drop, severity=info, direction=to-lport:
>>> ip,vlan_tci=0x0000,dl_src=00:13:97:ec:6a:3d,dl_dst=01:00:5e:00:00:12,nw_src=10.1.2.21,nw_dst=224.0.0.18,nw_proto=112,nw_tos=192,nw_ecn=0,nw_ttl=255,nw_frag=no
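
To see which ACL names account for the drops, the acl_log entries can be tallied. This is a minimal sketch: the function name count_acl_drops is hypothetical, and it assumes each log entry sits on a single line in the format shown above and that ACL names contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: reads ovn-controller log lines on stdin and prints
# "<acl-name> <count>" for every ACL name logged with verdict=drop.
count_acl_drops() {
    grep 'acl_log' |
        grep -o 'name="[^"]*", verdict=drop' |
        sed 's/^name="\([^"]*\)".*/\1/' |
        sort | uniq -c |
        awk '{ print $2, $1 }'
}
```

Usage would be something like `count_acl_drops < ovn-controller.log`, where the log file path depends on the installation.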
>>>
>>> If I place the ACL in the PG that has all ports then the 112 pkts are
>>> allowed.
>>>
>> Hi Brendan,
>>
>> Thanks for reporting this!  I tried in a simplified setup and I can't
>> really reproduce the problem.  We probably need some more info to debug
>> this, please see below.
> 
> Hi Dumitru,
> 
> I'll need to reproduce it myself. I've noticed that it seems to be
> intermittent: sometimes it works and other times it does not.  But yes,
> once I reproduce

Hmm, that's interesting information though.  Could it be an incremental
processing bug in ovn-northd or ovn-controller?

If you reproduce the issue, the following additional info might help us
debug:

# Enable northd jsonrpc and i-p logs.
ovn-appctl -t ovn-northd vlog/disable-rate-limit
ovn-appctl -t ovn-northd vlog/set jsonrpc:DBG
ovn-appctl -t ovn-northd vlog/set inc_proc_eng:DBG
ovn-appctl -t ovn-northd vlog/set inc_proc_northd:DBG

# Enable ovn-controller jsonrpc and i-p logs.
ovn-appctl vlog/disable-rate-limit
ovn-appctl vlog/set jsonrpc:DBG
ovn-appctl vlog/set inc_proc_eng:DBG

# Trigger an ovn-controller recompute.
ovn-appctl inc-engine/recompute

# Check if traffic is allowed properly.
# If not, collect northd and ovn-controller logs and share them.

# Trigger an ovn-northd recompute.
ovn-appctl -t ovn-northd inc-engine/recompute

# Check if traffic is allowed properly.
# If not, collect northd and ovn-controller logs and share them.
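
The steps above could be wrapped in a small script so they are run the same way each time. This is only a sketch: the function names and the APPCTL variable are hypothetical, and the ovn-appctl commands are exactly the ones listed above.

```shell
#!/bin/sh
# Sketch only. APPCTL can be overridden (e.g. APPCTL=echo) to print the
# commands instead of running them.
APPCTL="${APPCTL:-ovn-appctl}"

# Enable northd jsonrpc and incremental-processing debug logs.
enable_northd_debug() {
    $APPCTL -t ovn-northd vlog/disable-rate-limit
    for module in jsonrpc inc_proc_eng inc_proc_northd; do
        $APPCTL -t ovn-northd vlog/set "$module:DBG"
    done
}

# Enable ovn-controller jsonrpc and incremental-processing debug logs.
enable_controller_debug() {
    $APPCTL vlog/disable-rate-limit
    for module in jsonrpc inc_proc_eng; do
        $APPCTL vlog/set "$module:DBG"
    done
}

# Force a full recompute in each daemon; check traffic after each one.
recompute_controller() { $APPCTL inc-engine/recompute; }
recompute_northd()     { $APPCTL -t ovn-northd inc-engine/recompute; }
```

The idea is to call enable_northd_debug and enable_controller_debug once, then recompute_controller, check traffic, then recompute_northd, and check traffic again.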


> I'll get an
> ovn-trace and full NB DB dump.
> 
> Thanks
> 

Thanks!

>>> Here are the details (an example of just one)
>>>
>>> The PG with all ports:
>>>
>>> ovn-nbctl list Port_Group pg_vcn4958117_net72295_sl42074
>>> _uuid               : a928b8f5-3fce-4d32-afaa-8a9cc46ac902
>>> acls                : [29cd825f-73c5-461d-8603-c78a1e89e799,
>>>                         30971674-3471-49aa-ab55-8df17528b250,
>>>                         4f311ad6-0eae-4e23-aea8-a48ac8d503af,
>>>                         6d142bf8-30fc-46df-adde-2318c55a62f2,
>>>                         6d448692-ac65-4d40-aa30-c87fa478ed96,
>>>                         a647d19a-eaf6-4240-a323-0065d3fc8b4e,
>>>                         c323482c-b748-41f3-85a9-de67aa541ff5,
>>>                         d3d923d1-0e89-4165-a747-05a5908e5f08]
>>> external_ids        : {}
>>> name                : pg_vcn4958117_net72295_sl42074
>>>
>>> ports               : [192c7921-8091-4008-878a-5b2f6ae7aadf,    lbv4958117L650B
>>>                         a3e67f89-a989-469d-b7f2-a01631ad4f46,    766e9eda-0943-11ef-9ba6-0010e0daa67a
>>>                         ee13c9d9-576b-4fa7-9c5b-c2accbc0a811]    lbv4958117L650A
>>>
>>> The Port Group that has just two Ports (which are also in the other PG)
>>>
>>> ovn-nbctl list Port_Group lb_pg_vcn4958117_L650
>>>
>>> _uuid               : 4b04b824-7d27-4a9d-addc-818919544b0f
>>> acls                : [444c209a-8ef6-41c7-97d7-aa66a9c38d66,
>>> 6b76680a-395c-4be1-80df-dcde36426acd]
>>> external_ids        : {}
>>> name                : lb_pg_vcn4958117_L650
>>> ports               : [192c7921-8091-4008-878a-5b2f6ae7aadf,    lbv4958117L650B
>>>                         ee13c9d9-576b-4fa7-9c5b-c2accbc0a811]    lbv4958117L650A
>>>
>>>
>>> The ACL associated with the PG:
>>>
>>> ovn-nbctl acl-list lb_pg_vcn4958117_L650
>>> from-lport 32000 (inport == @lb_pg_vcn4958117_L650 && (ip4.dst ==
>>> 224.0.0.18 && ip.proto == 112)) allow-related
>>>    to-lport 32000 (outport == @lb_pg_vcn4958117_L650 && (ip4.dst ==
>>> 224.0.0.18 && ip.proto == 112)) allow-related
>>>
>>> That should allow IP 112, but it does not (anymore; like I said, it used
>>> to work)!
>>>
>>> The ACL associated with pg_vcn4958117_net72295_sl42074:
>>>
>>> from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (arp ||
>>> udp.dst == 67 || udp.dst == 68)) allow-related
>>> from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst
>>> == 169.254.0.2 && tcp.dst == 3260)) reject
>>> log(name=pg_vcn4958117_net72295_sl42074_reject,severity=info)
>>> from-lport 32766 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.src
>>> == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
>>> from-lport 16000 (inport == @pg_vcn4958117_net72295_sl42074)
>>> allow-related
>>> from-lport     0 (inport == @pg_vcn4958117_net72295_sl42074) drop
>>> log(name=def-4,severity=info)
>>>    to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 && (arp ||
>>> udp.dst == 67 || udp.dst == 68)) allow-related
>>>    to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 &&
>>> (ip4.src
>>> == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
>>>    to-lport     0 (outport == @pg_vcn4958117_net72295_sl42074) drop
>>> log(name=def-10,severity=info)
>>>
>>> So even though lb_pg_vcn4958117_L650 allows 112, we are hitting the
>>> drop in the ACL above!
>>>
>>>
>>> I looked at the sbflows and it seems to have flows for the 112 ACL
>>> entry:
>>>
>>> ovn-sbctl lflow-list
>>> --------------------
>>> Datapath: "ls_vcn4958117_net72295"
>>> (7bba2e8d-6612-487b-8f1e-f2e98d281a85)  Pipeline: ingress
>>>
>>>   table=9 (ls_in_acl          ), priority=33000, match=(reg0[7] == 1 &&
>>> (inport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto
>>> == 112))), action=(reg0[1] = 1; next;)
>>>   table=9 (ls_in_acl          ), priority=33000, match=(reg0[8] == 1 &&
>>> (inport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto
>>> == 112))), action=(next;)
>>>
>>> Datapath: "ls_vcn4958117_net72295"
>>> (7bba2e8d-6612-487b-8f1e-f2e98d281a85)  Pipeline: egress
>>>
>>>    table=4 (ls_out_acl         ), priority=33000, match=(reg0[7] == 1 &&
>>> (outport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto
>>> == 112))), action=(reg0[1] = 1; next;)
>>>    table=4 (ls_out_acl         ), priority=33000, match=(reg0[8] == 1 &&
>>> (outport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto
>>> == 112))), action=(next;)
>>>
>> Would it be possible to try an ovn-trace for the packets?  The output
>> should be quite accurate: VRRP (IP multicast) packets will always be
>> considered as having ct_state=+trk+new, if I'm not wrong.
>>
>> An example of ovn-trace invocation (you need to replace macs, IPs and
>> ports to match your setup):
>>
>> ovn-trace 'inport=="vm1" && eth.src == 00:00:00:00:00:01 && eth.dst ==
>> 01:00:5e:00:00:12 && ip4.src == 42.42.42.2 && ip4.dst == 224.0.0.18 &&
>> ip.proto == 112'
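
To avoid retyping the expression for each port, the microflow above could be generated by a small helper. This is a sketch: the function name vrrp_microflow is hypothetical, and the arguments (logical port name, source MAC, source IP) must be replaced with values from the actual setup.

```shell
#!/bin/sh
# Hypothetical helper: prints an ovn-trace microflow for a VRRP
# advertisement (IP protocol 112 sent to 224.0.0.18).
# $1 = logical inport name, $2 = source MAC, $3 = source IPv4 address.
vrrp_microflow() {
    printf 'inport == "%s" && eth.src == %s && eth.dst == 01:00:5e:00:00:12 && ip4.src == %s && ip4.dst == 224.0.0.18 && ip.proto == 112\n' \
        "$1" "$2" "$3"
}

# Example (values taken from the invocation above):
#   ovn-trace "$(vrrp_microflow vm1 00:00:00:00:00:01 42.42.42.2)"
```

Running it once per port in lb_pg_vcn4958117_L650, in both the failing and the working configuration, should make the traces easy to compare.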
>>
>>> And OVS flows also seem to have entries:
>>> ovs-ofctl dump-flows br-int
>>>
>>> cookie=0x48b6ec2c, table=17,
>>> priority=33000,ip,reg0=0x100/0x100,reg14=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,18)
>>>   cookie=0xbb05b3e, table=17,
>>> priority=33000,ip,reg0=0x100/0x100,reg14=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,18)
>>>   cookie=0x45ade6ea, table=17,
>>> priority=33000,ip,reg0=0x100/0x100,reg14=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,18)
>>>   cookie=0x9ea8bc7e, table=17,
>>> priority=33000,ip,reg0=0x100/0x100,reg14=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,18)
>>>   cookie=0xc3b6660c, table=17,
>>> priority=33000,ip,reg0=0x100/0x100,reg14=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,18)
>>>   cookie=0x1574390e, table=17,
>>> priority=33000,ip,reg0=0x100/0x100,reg14=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,18)
>>>   cookie=0xba0820b4, table=17,
>>> priority=33000,ip,reg0=0x100/0x100,reg14=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,18)
>>>   cookie=0x3ab5481d, table=17,
>>> priority=33000,ip,reg0=0x80/0x80,reg14=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>>   cookie=0x246464a0, table=17,
>>> priority=33000,ip,reg0=0x80/0x80,reg14=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>>   cookie=0xb7597534, table=17,
>>> priority=33000,ip,reg0=0x80/0x80,reg14=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>>   cookie=0xa39db4b2, table=17,
>>> priority=33000,ip,reg0=0x80/0x80,reg14=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>>   cookie=0x1ba59849, table=17,
>>> priority=33000,ip,reg0=0x80/0x80,reg14=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>>   cookie=0x845680ed, table=17,
>>> priority=33000,ip,reg0=0x80/0x80,reg14=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>>   cookie=0x5b88a1c8, table=17,
>>> priority=33000,ip,reg0=0x80/0x80,reg14=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>>
>>> cookie=0x43eec225, table=44,
>>> priority=33000,ip,reg0=0x80/0x80,reg15=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>>   cookie=0xb2e98e98, table=44,
>>> priority=33000,ip,reg0=0x80/0x80,reg15=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>>   cookie=0x22d2bdce, table=44,
>>> priority=33000,ip,reg0=0x80/0x80,reg15=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>>   cookie=0xcbc86a80, table=44,
>>> priority=33000,ip,reg0=0x80/0x80,reg15=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>>   cookie=0x788cf92d, table=44,
>>> priority=33000,ip,reg0=0x80/0x80,reg15=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>>   cookie=0x93fc9ed, table=44,
>>> priority=33000,ip,reg0=0x80/0x80,reg15=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>>   cookie=0xa41bd3e7, table=44,
>>> priority=33000,ip,reg0=0x80/0x80,reg15=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>>   cookie=0xb0c3031a, table=44,
>>> priority=33000,ip,reg0=0x100/0x100,reg15=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,45)
>>>   cookie=0x13552aa4, table=44,
>>> priority=33000,ip,reg0=0x100/0x100,reg15=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,45)
>>>   cookie=0xda88db5f, table=44,
>>> priority=33000,ip,reg0=0x100/0x100,reg15=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,45)
>>>   cookie=0xf29c5f9c, table=44,
>>> priority=33000,ip,reg0=0x100/0x100,reg15=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,45)
>>>   cookie=0x73e20386, table=44,
>>> priority=33000,ip,reg0=0x100/0x100,reg15=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,45)
>>>   cookie=0xe72b74cd, table=44,
>>> priority=33000,ip,reg0=0x100/0x100,reg15=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,45)
>>>   cookie=0x302c1113, table=44,
>>> priority=33000,ip,reg0=0x100/0x100,reg15=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112
>>>  actions=resubmit(,45)
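
When comparing this dump with the logical flows, it helps to translate the OpenFlow register notation back into the logical one. As far as I understand the register layout, reg0=0x80/0x80 corresponds to the lflow match reg0[7] == 1, reg0=0x100/0x100 to reg0[8] == 1, and load:0x1->NXM_NX_XXREG0[97] to the action reg0[1] = 1 (xxreg0 overlays reg0..reg3 with reg0 in the most significant 32 bits); reg14, reg15 and metadata carry the logical inport, outport and datapath keys. A minimal sketch of that arithmetic, with hypothetical function names:

```shell
#!/bin/sh
# Hypothetical helper: prints the bit index of a single-bit mask,
# e.g. the mask part of "reg0=0x80/0x80".
mask_to_bit() {
    mask=$(( $1 ))
    bit=0
    while [ "$mask" -gt 1 ]; do
        mask=$(( mask >> 1 ))
        bit=$(( bit + 1 ))
    done
    echo "$bit"
}

# Hypothetical helper: maps a bit index of xxreg0 to the 32-bit register
# and bit it overlays (reg0 holds bits 96..127, reg3 holds bits 0..31).
xxreg0_bit_to_reg() {
    bit=$1
    echo "reg$(( 3 - bit / 32 ))[$(( bit % 32 ))]"
}
```

For example, `mask_to_bit 0x100` gives the bit to look for in the ls_in_acl and ls_out_acl matches, and `xxreg0_bit_to_reg 97` gives the register bit set by the load action.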
>>>
>>>
>>> And as I said if I add the rule to the PG that has all the ports in it,
>>> then things work:
>>>
>>>
>>> ovn-nbctl acl-list pg_vcn4958117_net72295_sl42074
>>> from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (arp ||
>>> udp.dst == 67 || udp.dst == 68)) allow-related
>>> from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst
>>> == 169.254.0.2 && tcp.dst == 3260)) reject
>>> log(name=pg_vcn4958117_net72295_sl42074_reject,severity=info)
>>> from-lport 32766 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.src
>>> == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
>>> from-lport 32000 (inport == @lb_pg_vcn4958117_L650 && (ip.proto == 112))
>>> allow-related log(name=BJD)
>>> from-lport 32000 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst
>>> == 224.0.0.18 && ip.proto == 112)) allow-related log(name=BJD)
>>> from-lport 16000 (inport == @pg_vcn4958117_net72295_sl42074)
>>> allow-related
>>>
>>> from-lport     0 (inport == @pg_vcn4958117_net72295_sl42074) drop
>>> log(name=def-4,severity=info)
>>>    to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 && (arp ||
>>> udp.dst == 67 || udp.dst == 68)) allow-related
>>>    to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 &&
>>> (ip4.src
>>> == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
>>>    to-lport 32000 (outport == @lb_pg_vcn4958117_L650 && (ip.proto ==
>>> 112)) allow-related log(name=BJD)
>>>    to-lport 32000 (outport == @pg_vcn4958117_net72295_sl42074 &&
>>> (ip4.dst
>>> == 224.0.0.18 && ip.proto == 112)) allow-related log(name=BJD)
>>>    to-lport     0 (outport == @pg_vcn4958117_net72295_sl42074) drop
>>> log(name=def-10,severity=info)
>> An ovn-trace in the working scenario might help us spot the difference.
>>
>> Ideally, if you could share the whole northbound database content that
>> would make it easier to debug.
>>
>> Best regards,
>> Dumitru
>>
> 
> _______________________________________________
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
