On 5/14/24 12:06, Brendan Doyle via discuss wrote:
>
> On 14/05/2024 09:50, Dumitru Ceara wrote:
>> On 5/7/24 12:38, Brendan Doyle via discuss wrote:
>>> Hi,
>>>
>>> There seems to be a regression in the latest LTS release in how Port
>>> Group ACLs are applied when ports are in multiple Port Groups. As an
>>> example, I have 3 ports in a Port Group, and two of them are also in
>>> another Port Group that has an ACL to allow IP protocol 112. This
>>> used to work, but now I am seeing lots of:
>>>
>>> 2024-05-07T05:24:06.692Z|01357|acl_log(ovn_pinctrl0)|INFO|name="def-10",
>>> verdict=drop, severity=info, direction=to-lport:
>>> ip,vlan_tci=0x0000,dl_src=00:13:97:ec:6a:3d,dl_dst=01:00:5e:00:00:12,nw_src=10.1.2.21,nw_dst=224.0.0.18,nw_proto=112,nw_tos=192,nw_ecn=0,nw_ttl=255,nw_frag=no
>>>
>>> If I place the ACL in the PG that has all ports, then the 112 packets
>>> are allowed.
>>>
>> Hi Brendan,
>>
>> Thanks for reporting this! I tried in a simplified setup and I can't
>> really reproduce the problem. We probably need some more info to debug
>> this, please see below.
>
> Hi Dumitru,
>
> I'll need to reproduce it myself. I've noticed that it seems to be
> intermittent: sometimes it works and other times it does not. But yes,
> once I reproduce

Hmm, that's interesting information though. Can it be an incremental
processing bug in ovn-northd or ovn-controller?

If you reproduce the issue, the following additional info might help us
debug:

# Enable northd jsonrpc and i-p logs.
ovn-appctl -t ovn-northd vlog/disable-rate-limit
ovn-appctl -t ovn-northd vlog/set jsonrpc:DBG
ovn-appctl -t ovn-northd vlog/set inc_proc_eng:DBG
ovn-appctl -t ovn-northd vlog/set inc_proc_northd:DBG

# Enable ovn-controller jsonrpc and i-p logs.
ovn-appctl vlog/disable-rate-limit
ovn-appctl vlog/set jsonrpc:DBG
ovn-appctl vlog/set inc_proc_eng:DBG

# Trigger an ovn-controller recompute.
ovn-appctl inc-engine/recompute

# Check if traffic is allowed properly.
# If not, collect northd and ovn-controller logs and share them.

# Trigger an ovn-northd recompute.
ovn-appctl -t ovn-northd inc-engine/recompute

# Check if traffic is allowed properly.
# If not, collect northd and ovn-controller logs and share them.

> I'll get an
> ovn-trace and full NB DB dump.
>
> Thanks
>

Thanks!

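The steps above could be wrapped in a small shell helper so that they are replayed in the same order on every attempt. This is only a sketch: the function names and the RUN variable are made up for illustration, and the only ovn-appctl commands used are the ones already listed above. Setting RUN=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper around the debug steps from this thread.
# RUN is an illustrative knob: RUN=echo prints each command instead of
# executing it; leave it empty to really call ovn-appctl.
RUN="${RUN:-}"

# Enable northd jsonrpc and incremental-processing debug logs.
enable_northd_debug() {
    $RUN ovn-appctl -t ovn-northd vlog/disable-rate-limit
    for module in jsonrpc inc_proc_eng inc_proc_northd; do
        $RUN ovn-appctl -t ovn-northd vlog/set "$module:DBG"
    done
}

# Enable ovn-controller jsonrpc and incremental-processing debug logs.
enable_controller_debug() {
    $RUN ovn-appctl vlog/disable-rate-limit
    for module in jsonrpc inc_proc_eng; do
        $RUN ovn-appctl vlog/set "$module:DBG"
    done
}

# Force a full recompute in ovn-controller or in ovn-northd.
recompute_controller() { $RUN ovn-appctl inc-engine/recompute; }
recompute_northd() { $RUN ovn-appctl -t ovn-northd inc-engine/recompute; }
```

A possible order of use is enable_northd_debug, enable_controller_debug, recompute_controller, then a traffic check, then recompute_northd and a second traffic check, collecting the logs after whichever step leaves the traffic still dropped.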
>>> Here are the details (an example of just one)
>>>
>>> The PG with all ports:
>>>
>>> ovn-nbctl list Port_Group pg_vcn4958117_net72295_sl42074
>>> _uuid               : a928b8f5-3fce-4d32-afaa-8a9cc46ac902
>>> acls                : [29cd825f-73c5-461d-8603-c78a1e89e799,
>>>                        30971674-3471-49aa-ab55-8df17528b250,
>>>                        4f311ad6-0eae-4e23-aea8-a48ac8d503af,
>>>                        6d142bf8-30fc-46df-adde-2318c55a62f2,
>>>                        6d448692-ac65-4d40-aa30-c87fa478ed96,
>>>                        a647d19a-eaf6-4240-a323-0065d3fc8b4e,
>>>                        c323482c-b748-41f3-85a9-de67aa541ff5,
>>>                        d3d923d1-0e89-4165-a747-05a5908e5f08]
>>> external_ids        : {}
>>> name                : pg_vcn4958117_net72295_sl42074
>>>
>>> ports               : [192c7921-8091-4008-878a-5b2f6ae7aadf,   lbv4958117L650B
>>>                        a3e67f89-a989-469d-b7f2-a01631ad4f46,   766e9eda-0943-11ef-9ba6-0010e0daa67a
>>>                        ee13c9d9-576b-4fa7-9c5b-c2accbc0a811]   lbv4958117L650A
>>>
>>> The Port Group that has just two Ports (which are also in the other PG)
>>>
>>> ovn-nbctl list Port_Group lb_pg_vcn4958117_L650
>>>
>>> _uuid               : 4b04b824-7d27-4a9d-addc-818919544b0f
>>> acls                : [444c209a-8ef6-41c7-97d7-aa66a9c38d66,
>>>                        6b76680a-395c-4be1-80df-dcde36426acd]
>>> external_ids        : {}
>>> name                : lb_pg_vcn4958117_L650
>>> ports               : [192c7921-8091-4008-878a-5b2f6ae7aadf,   lbv4958117L650B
>>>                        ee13c9d9-576b-4fa7-9c5b-c2accbc0a811]   lbv4958117L650A
>>>
>>>
>>> The ACL associated with the PG:
>>>
>>> ovn-nbctl acl-list lb_pg_vcn4958117_L650
>>> from-lport 32000 (inport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112)) allow-related
>>> to-lport 32000 (outport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112)) allow-related
>>>
>>> That should allow IP 112, but it does not (anymore; like I said, it
>>> used to work)!
>>>
>>> The ACL associated with pg_vcn4958117_net72295_sl42074:
>>>
>>> from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (arp || udp.dst == 67 || udp.dst == 68)) allow-related
>>> from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst == 169.254.0.2 && tcp.dst == 3260)) reject log(name=pg_vcn4958117_net72295_sl42074_reject,severity=info)
>>> from-lport 32766 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.src == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
>>> from-lport 16000 (inport == @pg_vcn4958117_net72295_sl42074) allow-related
>>> from-lport 0 (inport == @pg_vcn4958117_net72295_sl42074) drop log(name=def-4,severity=info)
>>> to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 && (arp || udp.dst == 67 || udp.dst == 68)) allow-related
>>> to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 && (ip4.src == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
>>> to-lport 0 (outport == @pg_vcn4958117_net72295_sl42074) drop log(name=def-10,severity=info)
>>>
>>> So even though lb_pg_vcn4958117_L650 allows 112, we are hitting the
>>> drop in the ACL above!
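The expectation described above can be written down as a small shell sketch. It assumes a deliberately simplified model: for one packet and one direction, the highest-priority ACL among all Port Groups containing the port decides the verdict. That is not a description of how ovn-northd or ovn-controller actually evaluate ACLs. The function pick_verdict and the variable vrrp_to_lport_rules are hypothetical names, and the rule list holds only the two to-lport ACLs from this thread whose match covers the VRRP packet (the priority 32767 rules do not match it).

```shell
#!/bin/sh
# Simplified model (an assumption, not OVN's implementation): among the
# rules that already match the packet, the highest-priority rule whose
# Port Group contains the port wins.

# to-lport ACLs from the thread that match ip4.dst 224.0.0.18, proto 112.
# Format: <priority> <port group> <verdict>
vrrp_to_lport_rules='32000 lb_pg_vcn4958117_L650 allow-related
0 pg_vcn4958117_net72295_sl42074 drop'

# pick_verdict "<space-separated port groups of the port>"
# Reads rules on stdin and prints the verdict of the winning rule.
pick_verdict() {
    member_of=" $1 "
    sort -k1,1nr | while read -r prio group verdict; do
        case "$member_of" in
            *" $group "*) echo "$verdict"; break ;;
        esac
    done
}

# A port that is in both Port Groups:
printf '%s\n' "$vrrp_to_lport_rules" |
    pick_verdict "pg_vcn4958117_net72295_sl42074 lb_pg_vcn4958117_L650"

# A port that is only in the Port Group with all ports:
printf '%s\n' "$vrrp_to_lport_rules" |
    pick_verdict "pg_vcn4958117_net72295_sl42074"
```

In this model the first call prints allow-related and the second prints drop, so a port that is only in pg_vcn4958117_net72295_sl42074 falls through to the priority-0 rule.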
>>>
>>>
>>> I looked at the sbflows and it seems to have flows for the 112 ACL
>>> entry:
>>>
>>> ovn-sbctl lflow-list
>>> --------------------
>>> Datapath: "ls_vcn4958117_net72295" (7bba2e8d-6612-487b-8f1e-f2e98d281a85)  Pipeline: ingress
>>>
>>> table=9 (ls_in_acl ), priority=33000, match=(reg0[7] == 1 && (inport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112))), action=(reg0[1] = 1; next;)
>>> table=9 (ls_in_acl ), priority=33000, match=(reg0[8] == 1 && (inport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112))), action=(next;)
>>>
>>> Datapath: "ls_vcn4958117_net72295" (7bba2e8d-6612-487b-8f1e-f2e98d281a85)  Pipeline: egress
>>>
>>> table=4 (ls_out_acl ), priority=33000, match=(reg0[7] == 1 && (outport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112))), action=(reg0[1] = 1; next;)
>>> table=4 (ls_out_acl ), priority=33000, match=(reg0[8] == 1 && (outport == @lb_pg_vcn4958117_L650 && (ip4.dst == 224.0.0.18 && ip.proto == 112))), action=(next;)
>>>
>> Would it be possible to try an ovn-trace for the packets? The output
>> should be quite accurate for VRRP (IP multicast packets will always be
>> considered as having ct_state=+trk+new, if I'm not wrong).
>>
>> An example of ovn-trace invocation (you need to replace macs, IPs and
>> ports to match your setup):
>>
>> ovn-trace 'inport=="vm1" && eth.src == 00:00:00:00:00:01 && eth.dst ==
>> 01:00:5e:00:00:12 && ip4.src == 42.42.42.2 && ip4.dst == 224.0.0.18 &&
>> ip.proto == 112'
>>
>>> And OVS flows also seem to have entries:
>>> ovs-ofctl dump-flows br-int
>>>
>>> cookie=0x48b6ec2c, table=17, priority=33000,ip,reg0=0x100/0x100,reg14=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,18)
>>> cookie=0xbb05b3e, table=17, priority=33000,ip,reg0=0x100/0x100,reg14=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,18)
>>> cookie=0x45ade6ea, table=17, priority=33000,ip,reg0=0x100/0x100,reg14=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,18)
>>> cookie=0x9ea8bc7e, table=17, priority=33000,ip,reg0=0x100/0x100,reg14=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,18)
>>> cookie=0xc3b6660c, table=17, priority=33000,ip,reg0=0x100/0x100,reg14=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,18)
>>> cookie=0x1574390e, table=17, priority=33000,ip,reg0=0x100/0x100,reg14=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,18)
>>> cookie=0xba0820b4, table=17, priority=33000,ip,reg0=0x100/0x100,reg14=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,18)
>>> cookie=0x3ab5481d, table=17, priority=33000,ip,reg0=0x80/0x80,reg14=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>> cookie=0x246464a0, table=17, priority=33000,ip,reg0=0x80/0x80,reg14=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>> cookie=0xb7597534, table=17, priority=33000,ip,reg0=0x80/0x80,reg14=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>> cookie=0xa39db4b2, table=17,
>>> priority=33000,ip,reg0=0x80/0x80,reg14=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>> cookie=0x1ba59849, table=17, priority=33000,ip,reg0=0x80/0x80,reg14=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>> cookie=0x845680ed, table=17, priority=33000,ip,reg0=0x80/0x80,reg14=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>> cookie=0x5b88a1c8, table=17, priority=33000,ip,reg0=0x80/0x80,reg14=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,18)
>>>
>>> cookie=0x43eec225, table=44, priority=33000,ip,reg0=0x80/0x80,reg15=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>> cookie=0xb2e98e98, table=44, priority=33000,ip,reg0=0x80/0x80,reg15=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>> cookie=0x22d2bdce, table=44, priority=33000,ip,reg0=0x80/0x80,reg15=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>> cookie=0xcbc86a80, table=44, priority=33000,ip,reg0=0x80/0x80,reg15=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>> cookie=0x788cf92d, table=44, priority=33000,ip,reg0=0x80/0x80,reg15=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>> cookie=0x93fc9ed, table=44, priority=33000,ip,reg0=0x80/0x80,reg15=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>> cookie=0xa41bd3e7, table=44, priority=33000,ip,reg0=0x80/0x80,reg15=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)
>>> cookie=0xb0c3031a, table=44,
>>> priority=33000,ip,reg0=0x100/0x100,reg15=0x5,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,45)
>>> cookie=0x13552aa4, table=44, priority=33000,ip,reg0=0x100/0x100,reg15=0x26,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,45)
>>> cookie=0xda88db5f, table=44, priority=33000,ip,reg0=0x100/0x100,reg15=0x28,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,45)
>>> cookie=0xf29c5f9c, table=44, priority=33000,ip,reg0=0x100/0x100,reg15=0x7,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,45)
>>> cookie=0x73e20386, table=44, priority=33000,ip,reg0=0x100/0x100,reg15=0x31,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,45)
>>> cookie=0xe72b74cd, table=44, priority=33000,ip,reg0=0x100/0x100,reg15=0x32,metadata=0x7,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,45)
>>> cookie=0x302c1113, table=44, priority=33000,ip,reg0=0x100/0x100,reg15=0x4,metadata=0x61,nw_dst=224.0.0.18,nw_proto=112 actions=resubmit(,45)
>>>
>>>
>>> And as I said if I add the rule to the PG that has all the ports in it,
>>> then things work:
>>>
>>>
>>> ovn-nbctl acl-list pg_vcn4958117_net72295_sl42074
>>> from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (arp || udp.dst == 67 || udp.dst == 68)) allow-related
>>> from-lport 32767 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst == 169.254.0.2 && tcp.dst == 3260)) reject log(name=pg_vcn4958117_net72295_sl42074_reject,severity=info)
>>> from-lport 32766 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.src == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
>>> from-lport 32000 (inport == @lb_pg_vcn4958117_L650 && (ip.proto == 112)) allow-related log(name=BJD)
>>> from-lport 32000 (inport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst == 224.0.0.18 && ip.proto == 112)) allow-related log(name=BJD)
>>> from-lport 16000 (inport == @pg_vcn4958117_net72295_sl42074) allow-related
>>> from-lport 0 (inport == @pg_vcn4958117_net72295_sl42074) drop log(name=def-4,severity=info)
>>> to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 && (arp || udp.dst == 67 || udp.dst == 68)) allow-related
>>> to-lport 32767 (outport == @pg_vcn4958117_net72295_sl42074 && (ip4.src == 169.254.0.0/16 ||ip4.dst == 169.254.0.0/16)) allow-related
>>> to-lport 32000 (outport == @lb_pg_vcn4958117_L650 && (ip.proto == 112)) allow-related log(name=BJD)
>>> to-lport 32000 (outport == @pg_vcn4958117_net72295_sl42074 && (ip4.dst == 224.0.0.18 && ip.proto == 112)) allow-related log(name=BJD)
>>> to-lport 0 (outport == @pg_vcn4958117_net72295_sl42074) drop log(name=def-10,severity=info)
>> An ovn-trace in the working scenario might help us spot the difference.
>>
>> Ideally, if you could share the whole northbound database content that
>> would make it easier to debug.
>>
>> Best regards,
>> Dumitru
>>
>
> _______________________________________________
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss