Hi Dumitru,

I think there's a mix-up about the logical_switch_port id.
> The a2b9537d-d8a1-4cb9-9582-f41e49ed22a3 logical switch port is part of
> the following port group.
That port does not belong to either of my two instances; it is just a port
from another instance.

As I mentioned, my instances' ports listed below do not include this id:
a2b9537d-d8a1-4cb9-9582-f41e49ed22a3.

Logical switch 1 id: 70974da0-2e9d-469a-9782-455a0380ab95
Logical switch 2 id: ec22da44-9964-49ff-9c29-770a26794ba4

Instance A:
port 1 (connected to ls1): 61a871bc-7709-4072-9991-8e3a1096b02a
port 2 (connected to ls2): 63d76c2b-2960-4a89-97ac-9f7a7d4bb718

Instance B:
port 1: 46848e3c-7a73-46ce-8b3a-b6331e14fc74
port 2: 7d39750a-29d6-40df-b42b-54a17efcc423

You can verify in the DB that none of the four ports above belongs to any
port group.
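In case it helps, this is roughly how I checked. Note that Port_Group.ports
references Logical_Switch_Port rows by their OVN row _uuid, while the IDs
above are Neutron port IDs stored in the LSP "name" column, so the name has
to be mapped to a row UUID first. This is a sketch against a live NB DB; the
set-membership syntax (`ports{>=}`) follows ovn-nbctl's generic database
commands:

```shell
# For each Neutron port ID, resolve the OVN row UUID of the matching
# Logical_Switch_Port, then print the name of any port group whose
# "ports" set contains that UUID (no output means no membership).
for nid in 61a871bc-7709-4072-9991-8e3a1096b02a \
           63d76c2b-2960-4a89-97ac-9f7a7d4bb718 \
           46848e3c-7a73-46ce-8b3a-b6331e14fc74 \
           7d39750a-29d6-40df-b42b-54a17efcc423; do
    uuid=$(ovn-nbctl --bare --columns=_uuid \
           find Logical_Switch_Port name="$nid")
    echo "== $nid -> ${uuid:-not found}"
    [ -n "$uuid" ] && \
        ovn-nbctl --bare --columns=name find Port_Group "ports{>=}$uuid"
done
```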

I hope you can take another look.


Best regards,
Ice Bear

On Mon, May 26, 2025 at 17:55 Dumitru Ceara <dce...@redhat.com>
wrote:

> On 5/26/25 12:31 PM, Q Kay wrote:
> > Hi Dumitru,
> >
>
> Hi Ice Bear,
>
> > I think this is the file you want.
>
>
> Yes, that's it, thanks!
>
> > Thanks for guiding me.
>
> No problem.
>
> So, after looking at the DB contents I see that logical switch 1
> (70974da0-2e9d-469a-9782-455a0380ab95) has no ACLs applied (directly or
> indirectly through port groups).
>
> On the other hand, for logical switch 2:
>
> > ovn-nbctl show neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9
> switch ec22da44-9964-49ff-9c29-770a26794ba4
> (neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9) (aka Logical_switch_2)
>     port b8f1e947-7d06-4899-8c1c-206e81e70e74
>         type: localport
>         addresses: ["fa:16:3e:55:88:90 10.10.20.2"]
>     port a2b9537d-d8a1-4cb9-9582-f41e49ed22a3
>         addresses: ["fa:16:3e:9e:4d:93 10.10.20.137"]
>     port 97f2c854-44e9-4558-a0ef-81e42a08f414
>         addresses: ["fa:16:3e:81:ed:92 10.10.20.102", "unknown"]
>     port 4b7aa4f3-d126-41b6-9f0e-591c6921698b
>         addresses: ["fa:16:3e:72:fd:e5 10.10.20.41", "unknown"]
>     port 43888846-637f-46e6-ad5d-0acd5e6d6064
>         addresses: ["unknown"]
>
> The a2b9537d-d8a1-4cb9-9582-f41e49ed22a3 logical switch port is part of
> the following port group:
>
> > ovn-nbctl list logical_switch_port 12869fa4-2f1f-4c2f-bf65-60ce796a1d51
> _uuid               : 12869fa4-2f1f-4c2f-bf65-60ce796a1d51    <<<<<< UUID
> addresses           : ["fa:16:3e:9e:4d:93 10.10.20.137"]
> dhcpv4_options      : 159d49d0-964f-4ba6-aa58-dfbb8bfeb463
> dhcpv6_options      : []
> dynamic_addresses   : []
> enabled             : true
> external_ids        : {"neutron:cidrs"="10.10.20.137/24",
> "neutron:device_id"="1cda8c1a-b594-4942-8273-557c1e88c666",
> "neutron:device_owner"="compute:nova",
> "neutron:host_id"=khangtt-osp-compute-01-84, "neutron:mtu"="",
> "neutron:network_name"=neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9,
> "neutron:port_capabilities"="", "neutron:port_name"="",
> "neutron:project_id"="7f19299bb3bd43d4978fff45783e4346",
> "neutron:revision_number"="4",
> "neutron:security_group_ids"="940e2484-bb38-463b-a15f-d05b9dc9f5f0",
> "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="",
> "neutron:vnic_type"=normal}
> ha_chassis_group    : []
> mirror_rules        : []
> name                : "a2b9537d-d8a1-4cb9-9582-f41e49ed22a3"
> options             : {requested-chassis=khangtt-osp-compute-01-84}
> parent_name         : []
> peer                : []
> port_security       : ["fa:16:3e:9e:4d:93 10.10.20.137"]
> tag                 : []
> tag_request         : []
> type                : ""
> up                  : false
>
> > ovn-nbctl list port_group pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
> _uuid               : 6d232961-a51c-48cb-aa4f-84eb3108c71f
> acls                : [d7e20fdb-f613-4147-b605-64b8ffbe9742,
> dcae0790-6c86-4e4d-8f01-d9be12d26c48]
> external_ids        :
> {"neutron:security_group_id"="940e2484-bb38-463b-a15f-d05b9dc9f5f0"}
> name                : pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
> ports               : [12869fa4-2f1f-4c2f-bf65-60ce796a1d51,
> 1972206b-327a-496b-88fc-d17625d013e1, 2fb22d1a-bbfc-4173-b6fc-1ae3adc5ddcd,
> 3947661b-4deb-4aed-bd15-65839933fea3, caf0fe63-61be-4b1a-b306-ff00fa578982,
> fbfaeb2b-6e42-458a-a65f-8d2ef29b8b69, fd662347-4013-4306-b222-e29545f866ec]
>
> And this port group does have allow-related (stateful) ACLs that require
> conntrack:
>
> > ovn-nbctl acl-list pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
> from-lport  1002 (inport == @pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0 &&
> ip4) allow-related
>   to-lport  1002 (outport == @pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0 &&
> ip4 && ip4.src == 0.0.0.0/0) allow-related
>
> So, as suspected before, this explains why traffic works in one direction
> and not in the other.  Only one logical switch has stateful ACLs and needs
> conntrack.
>
> This is an unsupported configuration (so not a bug).  The only way to make
> it work is to set the use_ct_inv_match=false option in the NB.
>
> Just mentioning it again here to make sure it's not lost in the thread:
> with "asymmetric conntrack" and use_ct_inv_match=false, the datapath might
> forward traffic with ct_state=+trk+inv, and HW offload might stop working.
>
> If that's OK for the use case then it's fine to set the option in the NB
> database.
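For reference, setting the knob is a one-liner against the NB database (a
config fragment; syntax per ovn-nbctl's generic `set` command, assuming
ovn-nbctl can reach the NB DB):

```shell
# Stop matching on ct.inv in the logical switch pipelines.
# Trade-off noted above: +trk+inv traffic may be forwarded and
# hardware offload may stop working.
ovn-nbctl set NB_Global . options:use_ct_inv_match=false
```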
>
> Best regards,
> Dumitru
>
> >
> > Best regards,
> > Ice Bear
> >
> > On Mon, May 26, 2025 at 17:05 Dumitru Ceara <
> dce...@redhat.com>
> > wrote:
> >
> >> On 5/26/25 11:38 AM, Q Kay wrote:
> >>> Hi Dumitru,
> >>>
> >>
> >> Hi Ice Bear,
> >>
> >>> Here is the NB DB in JSON format (attachment).
> >>>
> >>
> >> Sorry, I think my request might have been confusing.
> >>
> >> I didn't mean running something like:
> >> ovsdb-client -f json dump <path-to-database-socket>
> >>
> >> Instead I meant just attaching the actual database file.  That's a file
> >> (in json format) usually stored in /etc/ovn/ovnnb_db.db.  For OpenStack
> >> that might be /var/lib/openvswitch/ovn/ovnnb_db.db on controller nodes.
> >>
> >> Hope that helps.
> >>
> >> Regards,
> >> Dumitru
> >>
> >>> Best regards,
> >>> Ice Bear
> >>>
> >>> On Mon, May 26, 2025 at 16:10 Dumitru Ceara <
> >> dce...@redhat.com>
> >>> wrote:
> >>>
> >>>> On 5/22/25 9:05 AM, Q Kay wrote:
> >>>>> Hi Dumitru,
> >>>>>
> >>>>
> >>>> Hi Ice Bear,
> >>>>
> >>>> Please keep the ovs-discuss mailing list in CC.
> >>>>
> >>>>> I am very willing to provide NB DB file for you (attached).
> >>>>> I will provide more information about the ports for you to check.
> >>>>>
> >>>>> Logical switch 1 id: 70974da0-2e9d-469a-9782-455a0380ab95
> >>>>> Logical switch 2 id: ec22da44-9964-49ff-9c29-770a26794ba4
> >>>>>
> >>>>> Instance A:
> >>>>> port 1 (connect to ls1): 61a871bc-7709-4072-9991-8e3a1096b02a
> >>>>> port 2 (connect to ls2): 63d76c2b-2960-4a89-97ac-9f7a7d4bb718
> >>>>>
> >>>>>
> >>>>> Instance B:
> >>>>> port 1: 46848e3c-7a73-46ce-8b3a-b6331e14fc74
> >>>>> port 2: 7d39750a-29d6-40df-b42b-54a17efcc423
> >>>>>
> >>>>
> >>>> Thanks for the info.  However, it's easier to investigate if you just
> >>>> share the actual NB DB (json) file instead of the ovsdb-client dump.
> >>>> It's probably located in a path similar to /etc/ovn/ovnnb_db.db.
> >>>>
> >>>> Like that I could just load it in a sandbox and run ovn-nbctl commands
> >>>> against it directly.
> >>>>
> >>>> Regards,
> >>>> Dumitru
> >>>>
> >>>>>
> >>>>> Best regards,
> >>>>> Ice Bear
> >>>>> On Wed, May 21, 2025 at 16:19 Dumitru Ceara <
> >>>> dce...@redhat.com>
> >>>>> wrote:
> >>>>>
> >>>>>> On 5/21/25 5:16 AM, Q Kay wrote:
> >>>>>>> Hi Dumitru,
> >>>>>>
> >>>>>> Hi Ice Bear,
> >>>>>>
> >>>>>> CC: ovs-discuss@openvswitch.org
> >>>>>>
> >>>>>>> Thanks for your answer. First, I will address some of your
> >>>>>>> questions.
> >>>>>>>
> >>>>>>>>> The critical evidence is in the failed flow, where we see:
> >>>>>>>>>
> >>>>>>>>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
> >>>>>>>>> packets:48, bytes:4704, used:0.940s, actions:drop'
> >>>>>>>>> The packet is being marked as invalid (+inv) and subsequently
> >>>> dropped.
> >>>>>>>>> It's a bit weird though that this isn't +rpl traffic.  Is this
> >>>>>>>>> hit by the ICMP echo or by the ICMP echo-reply packet?
> >>>>>>>
> >>>>>>> This recirc flow is hit by the ICMP echo-reply packet.
> >>>>>>>
> >>>>>>
> >>>>>> OK, that's good.
> >>>>>>
> >>>>>>> I understand what you mean. The outgoing and return traffic from
> >>>>>>> different logical switches will be flagged as inv. If that's the
> >>>>>>> case, it will work correctly with TCP (both are dropped). But for
> >>>>>>> ICMP, I notice something a bit strange.
> >>>>>>>
> >>>>>>>>> My hypothesis is that the handling of ct_state flags is causing
> >>>>>>>>> the return traffic to be dropped. This may be because the
> >>>>>>>>> outgoing and return connections do not share the same
> >>>>>>>>> logical_switch datapath.
> >>>>>>>
> >>>>>>> According to your reasoning, ICMP reply packets from a different
> >>>>>>> logical switch than the request packets will be dropped. However,
> >>>>>>> in practice, when I initiate an ICMP request from 6.6.6.6 to
> >>>>>>> 5.5.5.5, the result I get is success (note that echo request and
> >>>>>>> reply come from different logical switches regardless of whether
> >>>>>>> they are initiated by 5.5.5.5 or 6.6.6.6). You can compare the two
> >>>>>>> recirculation flows to see this oddity. You can take a look at the
> >>>>>>> attached image for better visualization.
> >>>>>>>
> >>>>>>
> >>>>>> OK.  From the ovn-trace command you shared
> >>>>>>
> >>>>>>> 2. Using OVN trace:
> >>>>>>> ovn-trace --no-leader-only 70974da0-2e9d-469a-9782-455a0380ab95
> >>>>>>> 'inport == "319cd637-10fb-4b45-9708-d02beefd698a" &&
> >>>>>>> eth.src==fa:16:3e:ea:67:18 && eth.dst==fa:16:3e:04:28:c7 &&
> >>>>>>> ip4.src==6.6.6.6 && ip4.dst==5.5.5.5 && ip.proto==1 && ip.ttl==64'
> >>>>>>
> >>>>>> I'm guessing the fa:16:3e:ea:67:18 MAC is the one owned by 6.6.6.6.
> >>>>>>
> >>>>>> Now, after filtering only the ICMP ECHO reply flows in your initial
> >>>>>> datapath
> >>>>>> flow dump:
> >>>>>>
> >>>>>>> *For successful ping flow: 5.5.5.5 -> 6.6.6.6*
> >>>>>>
> >>>>>> Note: ICMP reply comes from 6.6.6.6 to 5.5.5.5 (B -> A).
> >>>>>>
> >>>>>>> *- On Compute 1 (containing source instance): *
> >>>>>>>
> >>>>>>> 'recirc_id(0),tunnel(tun_id=0x2,src=10.10.10.85,dst=10.10.10.84,geneve({class=0x102,type=0x80,len=4,0xb000a/0x7fffffff}),flags(-df+csum+key)),in_port(9),eth(src=fa:16:3e:ea:67:18,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=0/0xfe),
> >>>>>>> packets:55, bytes:5390, used:0.204s, actions:29'
> >>>>>>
> >>>>>> We see no conntrack fields in the match.  So, based on the diagram
> >>>>>> you shared, I'm guessing there's no allow-related ACL or load
> >>>>>> balancer on logical switch 2.
> >>>>>>
> >>>>>> But then for the failed ping flow:
> >>>>>>
> >>>>>>> *For failed ping flow: 6.6.6.6 -> 5.5.5.5*
> >>>>>>
> >>>>>> Note: ICMP reply comes from 5.5.5.5 to 6.6.6.6 (A -> B).
> >>>>>>
> >>>>>>> *- On Compute 1: *
> >>>>>>
> >>>>>> [...]
> >>>>>>
> >>>>>>>
> >>>>>>> 'recirc_id(0),in_port(28),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(proto=1,frag=no),
> >>>>>>> packets:48, bytes:4704, used:0.940s,
> >>>>>>> actions:ct(zone=87),recirc(0x3d77)'
> >>>>>>>
> >>>>>>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
> >>>>>>> packets:48, bytes:4704, used:0.940s, actions:drop'
> >>>>>>
> >>>>>> In this case we _do_ have conntrack fields in the match/actions.
> >>>>>> Is it possible that logical switch 1 has allow-related ACLs or LBs?
> >>>>>>
> >>>>>> On the TCP side of things: it's kind of hard to tell what's going on
> >>>>>> without having the complete configuration of your OVN deployment.
> >>>>>>
> >>>>>> NOTE: if an ACL is applied to a port group, that is equivalent to
> >>>>>> applying the ACL to all logical switches that have ports in that
> >>>>>> port group.
> >>>>>>
> >>>>>>>>> I'd say it's not a bug.  However, if you want to change the
> >>>>>>>>> default behavior you can use the
> >>>>>>>>> NB_Global.options:use_ct_inv_match=false knob to allow +inv
> >>>>>>>>> packets in the logical switch pipeline.
> >>>>>>>
> >>>>>>> I tried setting the option use_ct_inv_match=false. The result is
> >>>>>>> just as you said: everything works successfully with both ICMP
> >>>>>>> and TCP.
> >>>>>>> Based on this experiment, I suspect there might be a small bug in
> >>>>>>> how OVN handles ICMP packets. Could you please let me know if my
> >>>>>>> experiment and reasoning are correct?
> >>>>>>>
> >>>>>>
> >>>>>> As said above, it really depends on the full configuration.  Maybe
> >>>>>> we can tell more if you can share the NB database?  Or at least
> >>>>>> share the ACLs applied on the two logical switches (or port
> >>>>>> groups).
> >>>>>>
> >>>>>>>
> >>>>>>> Thanks for your support.
> >>>>>>>
> >>>>>>
> >>>>>> No problem.
> >>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>> Ice Bear
> >>>>>>
> >>>>>> Regards,
> >>>>>> Dumitru
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
