It's mostly on the NB. Yes, I had already set that value to 60000, but it didn't help!
On Sun, Oct 10, 2021 at 10:34 PM Han Zhou <[email protected]> wrote:
>
>
> On Sat, Oct 9, 2021 at 12:02 PM Seena Fallah <[email protected]> wrote:
> >
> > Also I get many logs like this in ovn:
> >
> > 2021-10-09T18:54:45.263Z|01151|jsonrpc|WARN|Dropped 6 log messages in last 8 seconds (most recently, 3 seconds ago) due to excessive rate
> > 2021-10-09T18:54:45.263Z|01152|jsonrpc|WARN|tcp:10.0.0.1:44454: receive error: Connection reset by peer
> > 2021-10-09T18:54:45.263Z|01153|reconnect|WARN|tcp:10.0.01:44454: connection dropped (Connection reset by peer)
> > 2021-10-09T18:54:46.798Z|01154|reconnect|WARN|tcp:10.0.0.2:50224: connection dropped (Connection reset by peer)
> > 2021-10-09T18:54:49.127Z|01155|reconnect|WARN|tcp:10.0.0.3:48514: connection dropped (Connection reset by peer)
> > 2021-10-09T18:54:51.241Z|01156|reconnect|WARN|tcp:10.0.0.3:48544: connection dropped (Connection reset by peer)
> > 2021-10-09T18:54:53.005Z|01157|reconnect|WARN|tcp:10.0.0.3:48846: connection dropped (Connection reset by peer)
> > 2021-10-09T18:54:53.246Z|01158|reconnect|WARN|tcp:10.0.0.3:48796: connection dropped (Connection reset by peer)
> >
> > What does it mean about excessive rate? How many req/s is going to be an excessive rate?
>
> Don't worry about "excessive rate", which is talking about the log rate limit itself.
> The "connection reset by peer" indicates client side inactivity probe is enabled and it disconnects when the server hasn't responded for a while.
> What server is this? NB or SB? Usually SB DB would have this problem if there are lots of nodes and if the inactivity probe is not adjusted on the nodes (ovn-controllers). Try: ovs-vsctl set open . external_ids:ovn-remote-probe-interval=100000 on each node.
>
> >
> > On Thu, Oct 7, 2021 at 12:46 AM Seena Fallah <[email protected]> wrote:
> >>
> >> Seems the most leader failure is for NB and the command you said is for SB.
> >>
> >> Do you have any benchmarks of how many ACLs can OVN perform normally?
> >> I see many failures after 100k ACLs.
> >>
> >> On Thu, Oct 7, 2021 at 12:14 AM Numan Siddique <[email protected]> wrote:
> >>>
> >>> On Wed, Oct 6, 2021 at 2:49 PM Seena Fallah <[email protected]> wrote:
> >>> >
> >>> > I'm using these versions on a centos container:
> >>> > ovsdb-server (Open vSwitch) 2.15.2
> >>> > ovn-nbctl 21.06.0
> >>> > Open vSwitch Library 2.15.90
> >>> > DB Schema 5.32.0
> >>> >
> >>> > Today I see the election timed out too and I should increase ovsdb election timeout too. I saw the commits but I didn't find any related change to my problem.
> >>> > If I use ovn 21.09 with ovsdb 2.16 Is there still any need to increase election timeout and disable the inactivity probe?
> >>>
> >>> Not sure on that. It's worth a try if you have a test environment.
> >>>
> >>> > Also is there any limitation on the number of ACLs that can OVN handle?
> >>>
> >>> I don't think there is any limitation on the number of ACLs. In general as the size of the SB DB increases, we have seen issues.
> >>>
> >>> Can you run the below command on each of your nodes where ovn-controller runs and see if that helps ?
> >>>
> >>> ---
> >>> ovs-vsctl set open . external_ids:ovn-monitor-all=true
> >>> ---
> >>>
> >>> Thanks
> >>> Numan
> >>>
> >>> >
> >>> > Thanks.
> >>> >
> >>> > On Wed, Oct 6, 2021 at 9:43 PM Numan Siddique <[email protected]> wrote:
> >>> >>
> >>> >> On Wed, Oct 6, 2021 at 12:15 PM Seena Fallah <[email protected]> wrote:
> >>> >> >
> >>> >> > Hi,
> >>> >> >
> >>> >> > I use ovn for OpenStack neutron plugin for my production. After days I see issues about losing a leader in ovsdb. It seems it was because of the failing inactivity probe and because I had 17k acls. After I disable the inactivity probe it works fine but when I did a scale test on it (about 40k ACLS) again it fails the leader.
> >>> >> > I saw many docs about ovn at scale issues that were raised by both RedHat and eBay and seems the solution is to rewrite ovn with ddlog. I checked it with northd-ddlog but nothing changes.
> >>> >> >
> >>> >> > My question is should I wait more for ovn to be stable for high scale or is there any tuning I miss in my deployment?
> >>> >> > Also, will the ovn-nb/sb rewrite with ddlog and can help the issues at a high scale? if yes is there any due time?
> >>> >>
> >>> >> What is the ovsdb-server version you're using ? There are many improvements in the ovsdb-server in 2.16.
> >>> >> Maybe that would help in your deployment. And also there were many improvements which went into OVN 21.09 if you want to test it out.
> >>> >>
> >>> >> Thanks
> >>> >> Numan
> >>> >>
> >>> >> >
> >>> >> > Thanks.
> >>> >> > _______________________________________________
> >>> >> > discuss mailing list
> >>> >> > [email protected]
> >>> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
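
On the question of which peers are behind the "connection reset by peer" warnings: a minimal sketch, assuming the log lines look exactly like the ones quoted in this thread, that tallies "connection dropped" warnings per peer address. The function name count_drops_by_peer and the regular expression are illustrative only and are not part of OVS or OVN; the sample lines are copied from the log excerpt above.

```python
import re
from collections import Counter

# Assumed format, taken from the excerpt quoted in this thread:
# <timestamp>|<seq>|reconnect|WARN|tcp:<host>:<port>: connection dropped (<reason>)
DROP_RE = re.compile(
    r"\|reconnect\|WARN\|tcp:(?P<host>[^:]+):(?P<port>\d+): "
    r"connection dropped \((?P<reason>[^)]*)\)"
)


def count_drops_by_peer(lines):
    """Return a Counter mapping peer host -> number of dropped connections."""
    counts = Counter()
    for line in lines:
        match = DROP_RE.search(line)
        if match:
            counts[match.group("host")] += 1
    return counts


# A few lines copied from the thread; the jsonrpc line is intentionally
# included to show that only reconnect "connection dropped" lines are counted.
SAMPLE = """\
2021-10-09T18:54:45.263Z|01152|jsonrpc|WARN|tcp:10.0.0.1:44454: receive error: Connection reset by peer
2021-10-09T18:54:46.798Z|01154|reconnect|WARN|tcp:10.0.0.2:50224: connection dropped (Connection reset by peer)
2021-10-09T18:54:49.127Z|01155|reconnect|WARN|tcp:10.0.0.3:48514: connection dropped (Connection reset by peer)
2021-10-09T18:54:51.241Z|01156|reconnect|WARN|tcp:10.0.0.3:48544: connection dropped (Connection reset by peer)
""".splitlines()

if __name__ == "__main__":
    # Print peers ordered by how often their connections were dropped.
    for host, n in count_drops_by_peer(SAMPLE).most_common():
        print(f"{host}: {n}")
```

Running it against the real NB and SB server logs separately (by reading the log file instead of SAMPLE) would show whether the resets come from a handful of clients or from every node, which may help decide where the probe interval needs adjusting.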
