On 14/05/2024 09:50, Dumitru Ceara wrote:
On 5/7/24 12:38, Brendan Doyle via discuss wrote:
Hi,

There seems to be a regression in the latest LTS release with Port Group
ACLs when ports are in multiple Port Groups. As an example, I have 3 ports
in a Port Group, and two of them are also in another Port Group that has an
ACL to allow IP protocol 112. This used to work, but now I am seeing lots of:

2024-05-07T05:24:06.692Z|01357|acl_log(ovn_pinctrl0)|INFO|name="def-10",
verdict=drop, severity=info, direction=to-lport:
ip,vlan_tci=0x0000,dl_src=00:13:97:ec:6a:3d,dl_dst=01:00:5e:00:00:12,nw_src=10.1.2.21,nw_dst=224.0.0.18,nw_proto=112,nw_tos=192,nw_ecn=0,nw_ttl=255,nw_frag=no

If I place the ACL in the PG that has all ports, then the protocol 112
packets are allowed.

Hi Brendan,

Thanks for reporting this!  I tried in a simplified setup and I can't
really reproduce the problem.  We probably need some more info to debug
this, please see below.

Hi Dumitru,

I'll need to reproduce it myself. I've noticed that it seems to be
intermittent: sometimes it works and other times it does not. But yes, once
I reproduce it I'll get an ovn-trace and a full NB DB dump.

Thanks

Here are the details (an example of just one)

The PG with all ports:

ovn-nbctl list Port_Group pg_vcn4958117_net72295_sl42074
_uuid               : a928b8f5-3fce-4d32-afaa-8a9cc46ac902
acls                : [29cd825f-73c5-461d-8603-c78a1e89e799,
                        30971674-3471-49aa-ab55-8df17528b250,
                        4f311ad6-0eae-4e23-aea8-a48ac8d503af,
                        6d142bf8-30fc-46df-adde-2318c55a62f2,
                        6d448692-ac65-4d40-aa30-c87fa478ed96,
                        a647d19a-eaf6-4240-a323-0065d3fc8b4e,
                        c323482c-b748-41f3-85a9-de67aa541ff5,
                        d3d923d1-0e89-4165-a747-05a5908e5f08]
external_ids        : {}
name                : pg_vcn4958117_net72295_sl42074

ports               : [192c7921-8091-4008-878a-5b2f6ae7aadf,   lbv4958117L650B
                        a3e67f89-a989-469d-b7f2-a01631ad4f46,   766e9eda-0943-11ef-9ba6-0010e0daa67a
                        ee13c9d9-576b-4fa7-9c5b-c2accbc0a811]   lbv4958117L650A

The Port Group that has just two Ports (which are also in the other PG)

ovn-nbctl list Port_Group lb_pg_vcn4958117_L650

_uuid               : 4b04b824-7d27-4a9d-addc-818919544b0f
acls                : [444c209a-8ef6-41c7-97d7-aa66a9c38d66,
                        6b76680a-395c-4be1-80df-dcde36426acd]
external_ids        : {}
name                : lb_pg_vcn4958117_L650
ports               : [192c7921-8091-4008-878a-5b2f6ae7aadf,   lbv4958117L650B
                        ee13c9d9-576b-4fa7-9c5b-c2accbc0a811]   lbv4958117L650A


The ACL associated with the PG:

ovn-nbctl acl-list lb_pg_vcn4958117_L650
from-lport 32000 (inport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112)) allow-related
   to-lport 32000 (outport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112)) allow-related

That should allow IP protocol 112, but it no longer does (as I said, it
used to work)!

The ACL associated with pg_vcn4958117_net72295_sl42074:

from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (arp || udp.dst == 67 || udp.dst == 68)) allow-related
from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst == 169.254.0.2 && tcp.dst == 3260)) reject log(name=pg_vcn4958117_net72295_sl42074_reject,severity=info)
from-lport 32766 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.src == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
from-lport 16000 (inport == @pg_vcn4958117_net72295_sl42074) allow-related
from-lport     0 (inport == @pg_vcn4958117_net72295_sl42074) drop log(name=def-4,severity=info)
   to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 && (arp || udp.dst == 67 || udp.dst == 68)) allow-related
   to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 && (ip4.src == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
   to-lport     0 (outport == @pg_vcn4958117_net72295_sl42074) drop log(name=def-10,severity=info)

So even though lb_pg_vcn4958117_L650 allows protocol 112, we are hitting
the drop in the ACL above!


I looked at the SB logical flows and there seem to be flows for the protocol 112 ACL entry:

ovn-sbctl lflow-list
--------------------
Datapath: "ls_vcn4958117_net72295" (7bba2e8d-6612-487b-8f1e-f2e98d281a85)  Pipeline: ingress

  table=9 (ls_in_acl          ), priority=33000, match=(reg0[7] == 1 && (inport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112))), action=(reg0[1] = 1; next;)
  table=9 (ls_in_acl          ), priority=33000, match=(reg0[8] == 1 && (inport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112))), action=(next;)

Datapath: "ls_vcn4958117_net72295" (7bba2e8d-6612-487b-8f1e-f2e98d281a85)  Pipeline: egress

  table=4 (ls_out_acl         ), priority=33000, match=(reg0[7] == 1 && (outport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112))), action=(reg0[1] = 1; next;)
  table=4 (ls_out_acl         ), priority=33000, match=(reg0[8] == 1 && (outport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112))), action=(next;)

Would it be possible to try an ovn-trace for the packets?  The output
should be quite accurate: VRRP (IP multicast) packets will always be
considered as having ct_state=+trk+new, if I'm not wrong.

An example of ovn-trace invocation (you need to replace macs, IPs and
ports to match your setup):

ovn-trace 'inport=="vm1" && eth.src == 00:00:00:00:00:01 && eth.dst ==
01:00:5e:00:00:12 && ip4.src == 42.42.42.2 && ip4.dst == 224.0.0.18 &&
ip.proto == 112'
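
For reference, here is a minimal Python sketch of how the microflow expression above could be assembled from the fields in the acl_log drop message earlier in this thread. The helper name build_microflow is hypothetical, and "vm1" is only a placeholder: it has to be replaced with the real logical port name of the sender. The resulting string is what would be passed to ovn-trace.

```python
def build_microflow(inport, eth_src, eth_dst, ip_src, ip_dst, proto):
    """Return an ovn-trace microflow expression for an IPv4 packet."""
    parts = [
        f'inport == "{inport}"',
        f"eth.src == {eth_src}",
        f"eth.dst == {eth_dst}",
        f"ip4.src == {ip_src}",
        f"ip4.dst == {ip_dst}",
        f"ip.proto == {proto}",
    ]
    return " && ".join(parts)


# Addresses are taken from the dropped packet in the acl_log line;
# "vm1" is a placeholder for the real logical port name (an assumption).
microflow = build_microflow(
    inport="vm1",
    eth_src="00:13:97:ec:6a:3d",
    eth_dst="01:00:5e:00:00:12",
    ip_src="10.1.2.21",
    ip_dst="224.0.0.18",
    proto=112,
)
print(microflow)
```

Running the same expression once in the failing state and once in the working state would give two traces that can be compared directly.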

And OVS flows also seem to have entries:
ovs-ofctl dump-flows br-int

cookie=0x48b6ec2c, table=17,
priority=33000,ip,reg0=0x100/0x100,reg14=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,18)
  cookie=0xbb05b3e, table=17,
priority=33000,ip,reg0=0x100/0x100,reg14=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,18)
  cookie=0x45ade6ea, table=17,
priority=33000,ip,reg0=0x100/0x100,reg14=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,18)
  cookie=0x9ea8bc7e, table=17,
priority=33000,ip,reg0=0x100/0x100,reg14=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,18)
  cookie=0xc3b6660c, table=17,
priority=33000,ip,reg0=0x100/0x100,reg14=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,18)
  cookie=0x1574390e, table=17,
priority=33000,ip,reg0=0x100/0x100,reg14=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,18)
  cookie=0xba0820b4, table=17,
priority=33000,ip,reg0=0x100/0x100,reg14=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,18)
  cookie=0x3ab5481d, table=17,
priority=33000,ip,reg0=0x80/0x80,reg14=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
  cookie=0x246464a0, table=17,
priority=33000,ip,reg0=0x80/0x80,reg14=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
  cookie=0xb7597534, table=17,
priority=33000,ip,reg0=0x80/0x80,reg14=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
  cookie=0xa39db4b2, table=17,
priority=33000,ip,reg0=0x80/0x80,reg14=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
  cookie=0x1ba59849, table=17,
priority=33000,ip,reg0=0x80/0x80,reg14=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
  cookie=0x845680ed, table=17,
priority=33000,ip,reg0=0x80/0x80,reg14=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
  cookie=0x5b88a1c8, table=17,
priority=33000,ip,reg0=0x80/0x80,reg14=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)

cookie=0x43eec225, table=44,
priority=33000,ip,reg0=0x80/0x80,reg15=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
  cookie=0xb2e98e98, table=44,
priority=33000,ip,reg0=0x80/0x80,reg15=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
  cookie=0x22d2bdce, table=44,
priority=33000,ip,reg0=0x80/0x80,reg15=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
  cookie=0xcbc86a80, table=44,
priority=33000,ip,reg0=0x80/0x80,reg15=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
  cookie=0x788cf92d, table=44,
priority=33000,ip,reg0=0x80/0x80,reg15=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
  cookie=0x93fc9ed, table=44,
priority=33000,ip,reg0=0x80/0x80,reg15=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
  cookie=0xa41bd3e7, table=44,
priority=33000,ip,reg0=0x80/0x80,reg15=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112
 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
  cookie=0xb0c3031a, table=44,
priority=33000,ip,reg0=0x100/0x100,reg15=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,45)
  cookie=0x13552aa4, table=44,
priority=33000,ip,reg0=0x100/0x100,reg15=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,45)
  cookie=0xda88db5f, table=44,
priority=33000,ip,reg0=0x100/0x100,reg15=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,45)
  cookie=0xf29c5f9c, table=44,
priority=33000,ip,reg0=0x100/0x100,reg15=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,45)
  cookie=0x73e20386, table=44,
priority=33000,ip,reg0=0x100/0x100,reg15=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,45)
  cookie=0xe72b74cd, table=44,
priority=33000,ip,reg0=0x100/0x100,reg15=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,45)
  cookie=0x302c1113, table=44,
priority=33000,ip,reg0=0x100/0x100,reg15=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112
 actions=resubmit(,45)
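
As a cross-check between the two dumps, the register matches in the OpenFlow rules appear to line up with the logical flows: a value/mask of 0x80/0x80 selects bit 7 and 0x100/0x100 selects bit 8, which corresponds to reg0[7] and reg0[8]. The load into NXM_NX_XXREG0[97] would then correspond to reg0[1] = 1, assuming xxreg0 overlays reg0..reg3 with reg0 in the most significant 32 bits. A small Python sketch of that arithmetic (the helper names are made up for illustration):

```python
def mask_bits(mask):
    """Return the bit positions that are set in a register mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def xxreg0_bit_to_reg(bit):
    """Map a bit index in the 128-bit xxreg0 to (32-bit register, bit).

    Assumption: reg0 occupies bits 96..127, reg1 bits 64..95,
    reg2 bits 32..63 and reg3 bits 0..31.
    """
    return 3 - bit // 32, bit % 32


print(mask_bits(0x80))        # bit matched by reg0=0x80/0x80
print(mask_bits(0x100))       # bit matched by reg0=0x100/0x100
print(xxreg0_bit_to_reg(97))  # register and bit set by load:0x1->NXM_NX_XXREG0[97]
```

Under that assumption, the OpenFlow rules in tables 17 and 44 are the installed form of the priority 33000 logical flows listed earlier, so the allow ACL does seem to have reached the chassis.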


And as I said, if I add the rule to the PG that has all the ports in it,
then things work:


ovn-nbctl acl-list pg_vcn4958117_net72295_sl42074
from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (arp || udp.dst == 67 || udp.dst == 68)) allow-related
from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst == 169.254.0.2 && tcp.dst == 3260)) reject log(name=pg_vcn4958117_net72295_sl42074_reject,severity=info)
from-lport 32766 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.src == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
from-lport 32000 (inport == @lb_pg_vcn4958117_L650 && (ip.proto == 112)) allow-related log(name=BJD)
from-lport 32000 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst == 224.0.0.18 && ip.proto == 112)) allow-related log(name=BJD)
from-lport 16000 (inport == @pg_vcn4958117_net72295_sl42074) allow-related
from-lport     0 (inport == @pg_vcn4958117_net72295_sl42074) drop log(name=def-4,severity=info)
   to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 && (arp || udp.dst == 67 || udp.dst == 68)) allow-related
   to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 && (ip4.src == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
   to-lport 32000 (outport == @lb_pg_vcn4958117_L650 && (ip.proto == 112)) allow-related log(name=BJD)
   to-lport 32000 (outport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst == 224.0.0.18 && ip.proto == 112)) allow-related log(name=BJD)
   to-lport     0 (outport == @pg_vcn4958117_net72295_sl42074) drop log(name=def-10,severity=info)

An ovn-trace in the working scenario might help us spot the difference.

Ideally, if you could share the whole northbound database content that
would make it easier to debug.

Best regards,
Dumitru

