Hi Dumitru,

I think there is a mix-up about the logical_switch_port ID.

> The a2b9537d-d8a1-4cb9-9582-f41e49ed22a3 logical switch port is part of
> the following port group.

This port does not belong to either of my two instances. It is a port from another instance.

As I mentioned, my topology (below) does not contain this ID: a2b9537d-d8a1-4cb9-9582-f41e49ed22a3.

Logical switch 1 id: 70974da0-2e9d-469a-9782-455a0380ab95
Logical switch 2 id: ec22da44-9964-49ff-9c29-770a26794ba4

Instance A:
port 1 (connect to ls1): 61a871bc-7709-4072-9991-8e3a1096b02a
port 2 (connect to ls2): 63d76c2b-2960-4a89-97ac-9f7a7d4bb718

Instance B:
port 1: 46848e3c-7a73-46ce-8b3a-b6331e14fc74
port 2: 7d39750a-29d6-40df-b42b-54a17efcc423

You can check in the DB that none of the four ports above belongs to any port group.

I hope you can check this out.

Best regards,
Ice Bear

On Mon, May 26, 2025 at 17:55, Dumitru Ceara <dce...@redhat.com> wrote:

> On 5/26/25 12:31 PM, Q Kay wrote:
> > Hi Dumitru,
> >
>
> Hi Ice Bear,
>
> > I think this is the file you want.
>
> Yes, that's it, thanks!
>
> > Thanks for guiding me.
>
> No problem.
>
> So, after looking at the DB contents I see that logical switch 1
> (70974da0-2e9d-469a-9782-455a0380ab95) has no ACLs applied (directly or
> indirectly through port groups).
>
> On the other hand, for logical switch 2:
>
> > ovn-nbctl show neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9
> switch ec22da44-9964-49ff-9c29-770a26794ba4 (neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9) (aka Logical_switch_2)
>     port b8f1e947-7d06-4899-8c1c-206e81e70e74
>         type: localport
>         addresses: ["fa:16:3e:55:88:90 10.10.20.2"]
>     port a2b9537d-d8a1-4cb9-9582-f41e49ed22a3
>         addresses: ["fa:16:3e:9e:4d:93 10.10.20.137"]
>     port 97f2c854-44e9-4558-a0ef-81e42a08f414
>         addresses: ["fa:16:3e:81:ed:92 10.10.20.102", "unknown"]
>     port 4b7aa4f3-d126-41b6-9f0e-591c6921698b
>         addresses: ["fa:16:3e:72:fd:e5 10.10.20.41", "unknown"]
>     port 43888846-637f-46e6-ad5d-0acd5e6d6064
>         addresses: ["unknown"]
>
> The a2b9537d-d8a1-4cb9-9582-f41e49ed22a3 logical switch port is part of
> the following port group:
>
> > ovn-nbctl list logical_switch_port 12869fa4-2f1f-4c2f-bf65-60ce796a1d51
> _uuid             : 12869fa4-2f1f-4c2f-bf65-60ce796a1d51 <<<<<< UUID
> addresses         : ["fa:16:3e:9e:4d:93 10.10.20.137"]
> dhcpv4_options    : 159d49d0-964f-4ba6-aa58-dfbb8bfeb463
> dhcpv6_options    : []
> dynamic_addresses : []
> enabled           : true
> external_ids      : {"neutron:cidrs"="10.10.20.137/24", "neutron:device_id"="1cda8c1a-b594-4942-8273-557c1e88c666", "neutron:device_owner"="compute:nova", "neutron:host_id"=khangtt-osp-compute-01-84, "neutron:mtu"="", "neutron:network_name"=neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9, "neutron:port_capabilities"="", "neutron:port_name"="", "neutron:project_id"="7f19299bb3bd43d4978fff45783e4346", "neutron:revision_number"="4", "neutron:security_group_ids"="940e2484-bb38-463b-a15f-d05b9dc9f5f0", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=normal}
> ha_chassis_group  : []
> mirror_rules      : []
> name              : "a2b9537d-d8a1-4cb9-9582-f41e49ed22a3"
> options           : {requested-chassis=khangtt-osp-compute-01-84}
> parent_name       : []
> peer              : []
> port_security     : ["fa:16:3e:9e:4d:93 10.10.20.137"]
> tag               : []
> tag_request       : []
> type              : ""
> up                : false
>
> > ovn-nbctl list port_group pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
> _uuid        : 6d232961-a51c-48cb-aa4f-84eb3108c71f
> acls         : [d7e20fdb-f613-4147-b605-64b8ffbe9742, dcae0790-6c86-4e4d-8f01-d9be12d26c48]
> external_ids : {"neutron:security_group_id"="940e2484-bb38-463b-a15f-d05b9dc9f5f0"}
> name         : pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
> ports        : [12869fa4-2f1f-4c2f-bf65-60ce796a1d51, 1972206b-327a-496b-88fc-d17625d013e1, 2fb22d1a-bbfc-4173-b6fc-1ae3adc5ddcd, 3947661b-4deb-4aed-bd15-65839933fea3, caf0fe63-61be-4b1a-b306-ff00fa578982, fbfaeb2b-6e42-458a-a65f-8d2ef29b8b69, fd662347-4013-4306-b222-e29545f866ec]
>
> And this port group does have allow-related (stateful) ACLs that require
> conntrack:
>
> > ovn-nbctl acl-list pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
>   from-lport  1002 (inport == @pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0 && ip4) allow-related
>     to-lport  1002 (outport == @pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0 && ip4 && ip4.src == 0.0.0.0/0) allow-related
>
> So, as suspected before, this explains why traffic works in one direction
> and doesn't work in the other direction. Only one logical switch has
> stateful ACLs and needs conntrack.
>
> This is an unsupported configuration (so not a bug). The only way to make
> it work is to set the use_ct_inv_match=false option in the NB.
>
> Just mentioning it again here to make sure it's not lost in the thread:
> "asymmetric conntrack" and use_ct_inv_match=false means the datapath might
> forward traffic with ct_state=+trk+inv and might cause HW offload to not
> work.
>
> If that's OK for the use case then it's fine to set the option in the NB
> database.
>
> Best regards,
> Dumitru
>
> > Best regards,
> > Ice Bear
> >
> > On Mon, May 26, 2025 at 17:05, Dumitru Ceara <dce...@redhat.com> wrote:
> >
> >> On 5/26/25 11:38 AM, Q Kay wrote:
> >>> Hi Dumitru,
> >>>
> >>
> >> Hi Ice Bear,
> >>
> >>> Here is the NB DB in JSON format (attachment).
> >>>
> >>
> >> Sorry, I think my request might have been confusing.
> >>
> >> I didn't mean running something like:
> >> ovsdb-client -f json dump <path-to-database-socket>
> >>
> >> Instead I meant just attaching the actual database file. That's a file
> >> (in json format) usually stored in /etc/ovn/ovnnb_db.db. For OpenStack
> >> that might be /var/lib/openvswitch/ovn/ovnnb_db.db on controller nodes.
> >>
> >> Hope that helps.
> >>
> >> Regards,
> >> Dumitru
> >>
> >>> Best regards,
> >>> Ice Bear
> >>>
> >>> On Mon, May 26, 2025 at 16:10, Dumitru Ceara <dce...@redhat.com> wrote:
> >>>
> >>>> On 5/22/25 9:05 AM, Q Kay wrote:
> >>>>> Hi Dumitru,
> >>>>>
> >>>>
> >>>> Hi Ice Bear,
> >>>>
> >>>> Please keep the ovs-discuss mailing list in CC.
> >>>>
> >>>>> I am very willing to provide the NB DB file for you (attached).
> >>>>> I will provide more information about the ports for you to check.
> >>>>>
> >>>>> Logical switch 1 id: 70974da0-2e9d-469a-9782-455a0380ab95
> >>>>> Logical switch 2 id: ec22da44-9964-49ff-9c29-770a26794ba4
> >>>>>
> >>>>> Instance A:
> >>>>> port 1 (connect to ls1): 61a871bc-7709-4072-9991-8e3a1096b02a
> >>>>> port 2 (connect to ls2): 63d76c2b-2960-4a89-97ac-9f7a7d4bb718
> >>>>>
> >>>>> Instance B:
> >>>>> port 1: 46848e3c-7a73-46ce-8b3a-b6331e14fc74
> >>>>> port 2: 7d39750a-29d6-40df-b42b-54a17efcc423
> >>>>>
> >>>>
> >>>> Thanks for the info. However, it's easier to investigate if you just
> >>>> share the actual NB DB (json) file instead of the ovsdb-client dump.
> >>>> It's probably located in a path similar to /etc/ovn/ovnnb_db.db.
> >>>>
> >>>> Like that I could just load it in a sandbox and run ovn-nbctl commands
> >>>> against it directly.
> >>>>
> >>>> Regards,
> >>>> Dumitru
> >>>>
> >>>>>
> >>>>> Best regards,
> >>>>> Ice Bear
> >>>>>
> >>>>> On Wed, May 21, 2025 at 16:19, Dumitru Ceara <dce...@redhat.com> wrote:
> >>>>>
> >>>>>> On 5/21/25 5:16 AM, Q Kay wrote:
> >>>>>>> Hi Dumitru,
> >>>>>>
> >>>>>> Hi Ice Bear,
> >>>>>>
> >>>>>> CC: ovs-discuss@openvswitch.org
> >>>>>>
> >>>>>>> Thanks for your answer. First, I will address some of your questions.
> >>>>>>>
> >>>>>>>>> The critical evidence is in the failed flow, where we see:
> >>>>>>>>>
> >>>>>>>>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
> >>>>>>>>> packets:48, bytes:4704, used:0.940s, actions:drop'
> >>>>>>>>> The packet is being marked as invalid (+inv) and subsequently dropped.
> >>>>>>>>> It's a bit weird though that this isn't a +rpl traffic. Is this hit by
> >>>>>>>>> the ICMP echo or by the ICMP echo-reply packet?
> >>>>>>>
> >>>>>>> This recirc flow is hit by the ICMP echo reply packet.
> >>>>>>>
> >>>>>>
> >>>>>> OK, that's good.
> >>>>>>
> >>>>>>> I understand what you mean. The outgoing and return traffic from
> >>>>>>> different logical switches will be flagged as inv. If that's the case,
> >>>>>>> it will work correctly with TCP (both are dropped). But for ICMP, I
> >>>>>>> notice something a bit strange.
> >>>>>>>
> >>>>>>>>> My hypothesis is that the handling of ct_state flags is causing the return
> >>>>>>>>> traffic to be dropped. This may be because the outgoing and return
> >>>>>>>>> connections do not share the same logical_switch datapath.
> >>>>>>>
> >>>>>>> According to your reasoning, ICMP reply packets from a different logical
> >>>>>>> switch than the request packets will be dropped. However, in practice,
> >>>>>>> when I initiate an ICMP request from 6.6.6.6 to 5.5.5.5, the result I
> >>>>>>> get is success (note that echo request and reply come from different
> >>>>>>> logical switches regardless of whether they are initiated by 5.5.5.5 or
> >>>>>>> 6.6.6.6). You can compare the two recirculation flows to see this
> >>>>>>> oddity. You can take a look at the attached image for better
> >>>>>>> visualization.
> >>>>>>>
> >>>>>>
> >>>>>> OK. From the ovn-trace command you shared
> >>>>>>
> >>>>>>> 2. Using OVN trace:
> >>>>>>> ovn-trace --no-leader-only 70974da0-2e9d-469a-9782-455a0380ab95 'inport == "319cd637-10fb-4b45-9708-d02beefd698a" && eth.src==fa:16:3e:ea:67:18 && eth.dst==fa:16:3e:04:28:c7 && ip4.src==6.6.6.6 && ip4.dst==5.5.5.5 && ip.proto==1 && ip.ttl==64'
> >>>>>>
> >>>>>> I'm guessing the fa:16:3e:ea:67:18 MAC is the one owned by 6.6.6.6.
> >>>>>>
> >>>>>> Now, after filtering only the ICMP ECHO reply flows in your initial
> >>>>>> datapath flow dump:
> >>>>>>
> >>>>>>> *For successful ping flow: 5.5.5.5 -> 6.6.6.6*
> >>>>>>
> >>>>>> Note: ICMP reply comes from 6.6.6.6 to 5.5.5.5 (B -> A).
> >>>>>>
> >>>>>>> *- On Compute 1 (containing source instance):*
> >>>>>>>
> >>>>>>> 'recirc_id(0),tunnel(tun_id=0x2,src=10.10.10.85,dst=10.10.10.84,geneve({class=0x102,type=0x80,len=4,0xb000a/0x7fffffff}),flags(-df+csum+key)),in_port(9),eth(src=fa:16:3e:ea:67:18,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=0/0xfe),
> >>>>>>> packets:55, bytes:5390, used:0.204s, actions:29'
> >>>>>>
> >>>>>> We see no conntrack fields in the match. So, based on the diagram you
> >>>>>> shared, I'm guessing there's no allow-related ACL or load balancer on
> >>>>>> logical switch 2.
> >>>>>>
> >>>>>> But then for the failed ping flow:
> >>>>>>
> >>>>>>> *For failed ping flow: 6.6.6.6 -> 5.5.5.5*
> >>>>>>
> >>>>>> Note: ICMP reply comes from 5.5.5.5 to 6.6.6.6 (A -> B).
> >>>>>>
> >>>>>>> *- On Compute 1:*
> >>>>>>
> >>>>>> [...]
> >>>>>>
> >>>>>>> 'recirc_id(0),in_port(28),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(proto=1,frag=no),
> >>>>>>> packets:48, bytes:4704, used:0.940s, actions:ct(zone=87),recirc(0x3d77)'
> >>>>>>>
> >>>>>>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
> >>>>>>> packets:48, bytes:4704, used:0.940s, actions:drop'
> >>>>>>
> >>>>>> In this case we _do_ have conntrack fields in the match/actions.
> >>>>>> Is it possible that logical switch 1 has allow-related ACLs or LBs?
> >>>>>>
> >>>>>> On the TCP side of things: it's kind of hard to tell what's going on
> >>>>>> without having the complete configuration of your OVN deployment.
> >>>>>>
> >>>>>> NOTE: if an ACL is applied to a port group, that is equivalent to applying
> >>>>>> the ACL to all logical switches that have ports in that port group.
> >>>>>>
> >>>>>>>>> I'd say it's not a bug. However, if you want to change the default
> >>>>>>>>> behavior you can use the NB_Global.options:use_ct_inv_match=false knob
> >>>>>>>>> to allow +inv packets in the logical switch pipeline.
> >>>>>>>
> >>>>>>> I tried setting the option use_ct_inv_match=. The result is just as you
> >>>>>>> said, everything works successfully with both ICMP and TCP.
> >>>>>>> Based on this experiment, I suspect there might be a small bug when OVN
> >>>>>>> handles ICMP packets. Could you please let me know if my experiment and
> >>>>>>> reasoning are correct?
> >>>>>>>
> >>>>>>
> >>>>>> As said above, it really depends on the full configuration. Maybe we can
> >>>>>> tell more if you can share the NB database? Or at least if you share the
> >>>>>> ACLs applied on the two logical switches (or port groups).
> >>>>>>
> >>>>>>>
> >>>>>>> Thanks for your support.
> >>>>>>>
> >>>>>>
> >>>>>> No problem.
> >>>>>>
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>> Ice Bear
> >>>>>>
> >>>>>> Regards,
> >>>>>> Dumitru
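
For readers following the thread: the quoted note says that an ACL applied to a port group is equivalent to applying it to every logical switch that has a port in that group. Here is a minimal Python sketch of that reasoning, using only the rows shown in the quoted ovn-nbctl output. The dict layout and the helper names (stateful_groups_for_switch, needs_conntrack) are assumptions made for illustration; they are not an OVN API, and only the one name-to-_uuid mapping shown in the thread is known.

```python
# Rows copied from the ovn-nbctl output quoted in this thread. The data layout
# and helper names are hypothetical, chosen for illustration only.

LS1 = "70974da0-2e9d-469a-9782-455a0380ab95"
LS2 = "ec22da44-9964-49ff-9c29-770a26794ba4"
PG = "pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0"

# Logical switch -> names of its ports. LS2 comes from "ovn-nbctl show";
# for LS1 only the single port named in the thread is listed.
SWITCH_PORTS = {
    LS1: ["61a871bc-7709-4072-9991-8e3a1096b02a"],
    LS2: [
        "b8f1e947-7d06-4899-8c1c-206e81e70e74",
        "a2b9537d-d8a1-4cb9-9582-f41e49ed22a3",
        "97f2c854-44e9-4558-a0ef-81e42a08f414",
        "4b7aa4f3-d126-41b6-9f0e-591c6921698b",
        "43888846-637f-46e6-ad5d-0acd5e6d6064",
    ],
}

# Logical_Switch_Port name -> row _uuid. Only this row was shown in the thread;
# ports with an unknown _uuid are treated as not being in any port group.
LSP_UUID = {
    "a2b9537d-d8a1-4cb9-9582-f41e49ed22a3": "12869fa4-2f1f-4c2f-bf65-60ce796a1d51",
}

# Port group -> member port _uuids (not port names) and the ACL verdicts.
PORT_GROUPS = {
    PG: {
        "ports": {
            "12869fa4-2f1f-4c2f-bf65-60ce796a1d51",
            "1972206b-327a-496b-88fc-d17625d013e1",
            "2fb22d1a-bbfc-4173-b6fc-1ae3adc5ddcd",
            "3947661b-4deb-4aed-bd15-65839933fea3",
            "caf0fe63-61be-4b1a-b306-ff00fa578982",
            "fbfaeb2b-6e42-458a-a65f-8d2ef29b8b69",
            "fd662347-4013-4306-b222-e29545f866ec",
        },
        "acl_actions": ["allow-related", "allow-related"],
    },
}


def stateful_groups_for_switch(switch):
    """Port groups with allow-related ACLs that have a member port on the switch."""
    port_uuids = {LSP_UUID[n] for n in SWITCH_PORTS[switch] if n in LSP_UUID}
    return sorted(
        name
        for name, row in PORT_GROUPS.items()
        if "allow-related" in row["acl_actions"] and port_uuids & row["ports"]
    )


def needs_conntrack(switch):
    """True if any stateful port group reaches this switch through any port."""
    return bool(stateful_groups_for_switch(switch))


if __name__ == "__main__":
    for sw in (LS1, LS2):
        print(sw, "needs conntrack:", needs_conntrack(sw))
```

Under these assumptions, logical switch 2 is flagged because port a2b9537d-d8a1-4cb9-9582-f41e49ed22a3 (row _uuid 12869fa4-2f1f-4c2f-bf65-60ce796a1d51) is a member of the port group, even though none of the four instance ports are. Note also that the port group references ports by row _uuid, so searching it for the port name finds nothing.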
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss