On Wed, Nov 15, 2023 at 10:49 AM Шагов Георгий <gmsha...@cloud.ru> wrote:
> Hello Ales
>

Hi,

> I really appreciate your reply. It helps a lot.
>
> - ovn-appctl -t ovn-controller ct-zone-list
>
> This command produced over 7K records, so the roughly 6.5K ct-zone
> records in the Bridge table appear to be valid.
>
> Digging deeper, we found that the major cause of the 100% CPU usage in
> ovn-controller appears to be that it constantly performs a full recompute.
>
> In the logs of the ovsdb-server for openvswitch_db we constantly see this
> message:
>
> reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after
> 5 seconds, disconnecting
>
> So it seems that openvswitch_db keeps dropping the connection from
> ovn-controller due to inactivity.
>
> We tried to change the inactivity probe interval for openvswitch_db by
> setting the ovn-controller option ovn-openflow-probe-interval in
> external_ids of the Open_vSwitch table, as explained here:
> https://mail.openvswitch.org/pipermail/ovs-dev/2020-August/373671.html
>
> Yet it does not seem to work: regardless of the value we set (e.g.
> ovn-openflow-probe-interval:"60"), we still observe the same 5-second
> interval in the openvswitch_db logs:
>
> reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after
> 5 seconds, disconnecting
>
> This is very confusing.
>

There are multiple connections from ovn-controller to br-int (ovsdb).
Only one of them can be influenced by "ovn-openflow-probe-interval". This
was recently changed by "controller: disable OpenFlow inactivity probing"
[0]. The fact that the probe still fails after 5 seconds suggests that it
is one of the hardcoded ones. This is explained in the email thread [1].
You can try upgrading past the mentioned patch to see if that helps.
Unfortunately, it is only on main currently and will be available in 24.03.

> Do we miss anything here? Any hint is appreciated.
>
> Thanks in advance.
>
[0] https://github.com/ovn-org/ovn/commit/c16e5da803838fa66129eb61d7930fc84d237f85
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2023-May/404625.html

Hopefully this helps.

Best regards,
Ales

> *From: *Ales Musil <amu...@redhat.com>
> *Date: *Tuesday, 14 November 2023, 14:19
> *To: *Шагов Георгий <gmsha...@cloud.ru>
> *Cc: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] Enormous amount of records into openvswitch
> db Bridge table external_ids ct-zone
>
> On Tue, Nov 14, 2023 at 12:02 PM Шагов Георгий via discuss <
> ovs-discuss@openvswitch.org> wrote:
>
> > Hello All
>
> Hi,
>
> > We observe a strangely large (for our installation) number of records
> > in the openvswitch db Bridge table external_ids:ct-zone, i.e. 6.5K+
>
> A CT zone is allocated for most of the LSPs (there are some exceptions)
> and for every LR DNAT and SNAT that is local to the given controller.
> This means that either you have a lot of ports and possibly routers on
> that single controller, or the external-ids are not cleared on update
> (this would be a bug). You can check the zone list by running
> ovn-appctl -t ovn-controller ct-zone-list to see whether it matches the
> count of active zones that ovn-controller knows about.
>
> > grep -A20 '^Bridge table' ./ovs.dump | grep external_ids | sed 's/ct-zone-/\nct-zone-/g' | sort | uniq | wc -l
> > 6659
> >
> > Details:
> >   "Bridge" : {
> >     "06ef9e06-188e-4654-93b2-5242a324a5c7" : {
> >       "initial" : {
> >         "datapath_type" : "system",
> >         "external_ids" : [
> >           "map",
> >           [
> >             [
> >               "ct-zone-00368809-59f5-4408-8ae3-fb5401ff6ea4_dnat",
> >               "60"
> >             ],
> >             [
> >
> > At the same time, if I run:
> > ovs-dpctl ct-stats-show
> > Connections Stats:
> >     Total: 1672
> >     TCP: 1269
> >     UDP: 398
> >     ICMP: 5
> >
> > The questions are:
> > - Who is writing into openvswitch db Bridge table
> >   external_ids:ct-zone?
>
> ovn-controller writes those values so that it can restore the zones
> after a restart.
>
> > - Is there any way to manage these records in openvswitch db Bridge
> >   table external_ids? I want to purge them…
>
> If you purge them, ovn-controller will still write those that are new or
> changed. I would advise against purging if you care about restoration
> after a restart.
>
> > This actually drives ovn-controller to 100% CPU, since it gets a reply
> > from openvswitch with the full set of ct-zone records in external_ids
> > of the Bridge table:
> >
> > 2023-11-13T14:28:47.976Z|10019838|jsonrpc|DBG|tcp:127.0.0.1:6640:
> > received reply, result=[false,"00000000-0000-0000-0000-000000000000",{"
> >
> > Any help is greatly appreciated
> >
> > Yours truly,
> > George
> >
> > CONFIDENTIALITY NOTICE: This email and any files attached to it are
> > confidential. If you are not the intended recipient you are notified that
> > using, copying, distributing or taking any action in reliance on the
> > contents of this information is strictly prohibited. If you have received
> > this email in error please notify the sender and delete this email.
> >
> > _______________________________________________
> > discuss mailing list
> > disc...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
> Best regards,
> Ales
>
> --
> *Ales Musil*
> Senior Software Engineer - OVN Core
> Red Hat EMEA <https://www.redhat.com>
> amu...@redhat.com
> <https://red.ht/sig>
>

--
Ales Musil
Senior Software Engineer - OVN Core
Red Hat EMEA <https://www.redhat.com>
amu...@redhat.com
<https://red.ht/sig>
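
A note on the inactivity-probe messages discussed above: to confirm which probe interval is actually in effect, it can help to tally the disconnect messages in the ovsdb-server log by the interval they report. The following is only a rough sketch. The regular expression is modelled on the single log line quoted in this thread, the function name tally_probe_failures is hypothetical, and the sample lines (timestamps, sequence numbers, second port) are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the log line quoted in this thread. Whatever precedes
# "reconnect|ERR|" (timestamp, sequence number) is an assumption and is ignored.
PROBE_RE = re.compile(
    r"reconnect\|ERR\|(?P<peer>\S+): no response to inactivity probe "
    r"after (?P<secs>\d+(?:\.\d+)?) seconds, disconnecting"
)


def tally_probe_failures(lines):
    """Count inactivity-probe disconnects, keyed by the reported interval."""
    counts = Counter()
    for line in lines:
        match = PROBE_RE.search(line)
        if match:
            counts[float(match.group("secs"))] += 1
    return counts


# Made-up sample input; only the message text follows the quoted log line.
sample = [
    "2023-11-13T14:28:47.976Z|00012|reconnect|ERR|tcp:127.0.0.1:53560: "
    "no response to inactivity probe after 5 seconds, disconnecting",
    "2023-11-13T14:28:53.100Z|00013|reconnect|ERR|tcp:127.0.0.1:53562: "
    "no response to inactivity probe after 5 seconds, disconnecting",
    "2023-11-13T14:28:54.000Z|00014|jsonrpc|DBG|tcp:127.0.0.1:6640: "
    "received reply",
]

print(dict(tally_probe_failures(sample)))
```

If every failure still reports 5 seconds after ovn-openflow-probe-interval has been changed, that would be consistent with the failing connection being one of the hardcoded ones mentioned above.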
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
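
For the check suggested in the thread (does the number of ct-zone-* keys in the Bridge external_ids match what ovn-controller reports), a small script may be more robust than the grep/sed pipeline. This is a minimal sketch under several assumptions: the Bridge table has already been parsed into a dict shaped like the excerpt in the thread (columns nested under "initial"), ct-zone-list prints one "<name> <zone-id>" pair per line, and the helper names and the "hypothetical-stale-port" entry are invented for illustration.

```python
def ct_zone_entries(bridge_table):
    """Return {zone_name: zone_id} for the 'ct-zone-*' external_ids keys."""
    zones = {}
    for row in bridge_table.values():
        # The excerpt nests columns under "initial"; otherwise use the row.
        columns = row.get("initial", row)
        ext = columns.get("external_ids", ["map", []])
        # OVSDB JSON encodes a map as ["map", [[key, value], ...]].
        for key, value in ext[1]:
            if key.startswith("ct-zone-"):
                zones[key[len("ct-zone-"):]] = int(value)
    return zones


def parse_ct_zone_list(text):
    """Parse '<name> <zone-id>' lines (assumed ct-zone-list output format)."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            zones[parts[0]] = int(parts[1])
    return zones


# Sample data: the first key comes from the excerpt, the rest is hypothetical.
sample_bridge = {
    "06ef9e06-188e-4654-93b2-5242a324a5c7": {
        "initial": {
            "datapath_type": "system",
            "external_ids": ["map", [
                ["ct-zone-00368809-59f5-4408-8ae3-fb5401ff6ea4_dnat", "60"],
                ["ct-zone-hypothetical-stale-port", "61"],
                ["some-other-key", "x"],
            ]],
        }
    }
}
sample_list_output = "00368809-59f5-4408-8ae3-fb5401ff6ea4_dnat 60\n"

db_zones = ct_zone_entries(sample_bridge)
live_zones = parse_ct_zone_list(sample_list_output)
# Zones stored in the database that ovn-controller no longer reports.
stale = sorted(set(db_zones) - set(live_zones))
print(len(db_zones), len(live_zones), stale)
```

A non-empty stale list would point towards external_ids not being cleared on update, while matching counts would suggest the controller really does host that many ports and NAT entries.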