On Tue, Nov 30, 2021 at 12:13 PM Daniel Alvarez <[email protected]> wrote:
>
> Hey Christian
>
> > On 30 Nov 2021, at 18:06, Christian Stelter <[email protected]> wrote:
> >
> > 
> > Hi!
> >
> > We’re currently observing packet loss on a 3-node etcd cluster (all 3 nodes 
> > on different hypervisors) on one of our OpenStack clusters running the 
> > Victoria release, deployed via kolla-ansible.
> >
> > The Open vSwitch library is version 2.13.3, ovn-controller is version 
> > 20.03.2, and the underlying OS is Ubuntu 20.04 with current patches.
> >
> > We can reproduce the packet loss with this etcd setup in different projects 
> > on that cluster, but not on a second cluster (our stage env) with the same 
> > software versions and the same hardware components and same sizing.
> >
> > When we replace the default security group with a security group that uses 
> > the CIDR of the project network as the remote, instead of the remote 
> > security group “default”, in the ingress rule (IPv4, Any, Any), the etcd 
> > cluster performs without packet loss or recurring leader elections.
>
> I am confused, as the default SG will block ingress traffic in OpenStack by 
> default.
>
> As this is an OVS/OVN ML, I would suggest sharing the ACLs/Logical 
> Flows/OpenFlows for both cases. Framed like this, the question requires 
> OpenStack (maybe even kolla-ansible, if the default SG differs from the 
> reference implementation) and etcd knowledge, so I would advise isolating the 
> traffic pattern as much as possible, and providing the packet loss % and 
> other potentially useful data.
>
>
> >
> > Other projects or applications seem not to be impacted. At least none that 
> > we know of.
> >
> > Any hints as to what could cause such behavior? We suspect it's just a 
> > symptom of another problem that we are currently not aware of.
> >

In my opinion, this could be due to an old bug in ovn-controller
related to incorrect conjunction ID generation.

Is it possible for you to test with the latest OVN version?

If not, can you run the command below and see whether the packet loss
issue is resolved?

Run: ovn-appctl -t ovn-controller recompute

If running this command solves the issue, then it's definitely a known
issue which has been fixed in later versions.  If you can confirm
this works, I can share the commit which fixed this issue.
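
To make the test above easier to report back, here is a minimal sketch that
counts the conjunction-related OpenFlow flows on the integration bridge
before and after the recompute. It assumes the bridge is named br-int and
that ovs-ofctl and ovn-appctl are on the PATH; with kolla-ansible they
usually live inside containers, so the commands may need a container prefix.
The helper name count_conj_flows and the 5 second wait are illustrative
choices only. A changed count is just a hint that flows were regenerated,
not proof of the bug.

```shell
#!/bin/sh
# Sketch only: compare the number of conjunction flows around a recompute.

count_conj_flows() {
    # Reads an OpenFlow dump on stdin and counts the lines that mention
    # a conjunction (conj_id= in the match or conjunction(...) in actions).
    grep -c 'conj' || true
}

if command -v ovs-ofctl >/dev/null 2>&1 && command -v ovn-appctl >/dev/null 2>&1; then
    before=$(ovs-ofctl dump-flows br-int | count_conj_flows)
    # Force ovn-controller to recompute all flows from scratch.
    ovn-appctl -t ovn-controller recompute
    # Assumed settle time; adjust for the size of the deployment.
    sleep 5
    after=$(ovs-ofctl dump-flows br-int | count_conj_flows)
    echo "conjunction flows: before=$before after=$after"
else
    echo "ovs-ofctl/ovn-appctl not found; run this on the hypervisor" >&2
fi
```

Running the etcd workload again after this, and noting whether the leader
elections stop, is the result that matters for confirming the known issue.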

Thanks
Numan

> > Kind regards,
> >
> > Christian Stelter
> > _______________________________________________
> > discuss mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss