> On 15 Nov 2023, at 16:41, Шагов Георгий via discuss 
> <ovs-discuss@openvswitch.org> wrote:
> 
> Hello Vladislav
>  
> I really appreciate your reply to the problem
> AFAIK, ovn-controller doesn’t use TCP sockets to connect to local OVS, so it 
> seems that this ERR message is not related to OVS<->ovn-controller interaction
> Looking at this picture: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2020-August/373671.html
> I would say that ovn-controller establishes TCP connections to both the SBDB 
> and ovswitch_db, but it does indeed use a UNIX socket to the openvswitch daemon.

Ahh, I was talking about non-container deployment with systemd units.
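
One way to check which local process owns the TCP side of such a connection is to list TCP sockets together with their owning processes and filter by the client port from the log (53560 in this thread). This is only a minimal sketch, assuming a Linux host with `ss` from iproute2 and enough privileges to see process names; `filter_port` is a hypothetical helper name, not part of OVS or OVN.

```shell
# Hypothetical helper: read `ss` output on stdin and keep only the lines
# whose local or peer address uses the given TCP port.
filter_port() {
    grep -E ":$1([[:space:]]|\$)"
}

# Usage (run as root so that process names are shown):
#   ss -tnp | filter_port 53560
# -t = TCP sockets, -n = numeric ports, -p = owning process
```

If the matching line names ovn-controller, the connection in the ovsdb-server log belongs to it; in a containerized deployment the command would have to run in a namespace that can see both processes.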

>  
> If you see such errors in ovsdb-server, which handles Open_vSwitch database, 
> you should check which service has connected to it.
>  
> Getting back to the error from ovswitch_db
> reconnect|ERR|tcp:127.0.0.1:53560
>  
> We looked for port 53560 in the ovn-controller logs and found a corresponding 
> error with the same port and timing, so I am sure this is the ovn-controller 
> connection.
>  
> In addition, we reconfigured the connection from ovn-controller to 
> ovswitch_db to use a unix socket, which gave us a default probe interval of 
> 10 secs instead of the 5 secs for tcp. And that worked! 
> ovn-controller's CPU consumption dropped from 100% to 60%, though 
> ovswitch_db's increased by about 20%, I would say.
>  
>  
> I’d suggest enabling dbg logs for ovn-controller (via ovn-appctl vlog/set 
> command) to see which module makes most of CPU load.
> Hm… Yes, we did raise the log level to DBG, but the point about seeing which 
> module causes most of the CPU load slipped my attention. Thanks for the hint, 
> will check.

You can use ovn-appctl stopwatch/show to try to find which module takes the 
most time, along with its timing values.
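
As a rough sketch of how that could be done in practice, the stopwatch statistics can be printed a few times in a row so that the slow module stands out. It assumes ovn-appctl is on PATH and can reach the ovn-controller control socket; `sample_stopwatches`, the sample count and the delay are hypothetical names and values chosen for illustration.

```shell
# Hypothetical helper: print ovn-controller stopwatch statistics several
# times, with a pause between samples.
sample_stopwatches() {
    count=${1:-3}    # number of samples to take
    delay=${2:-5}    # seconds to wait between samples
    if ! command -v ovn-appctl >/dev/null 2>&1; then
        echo "ovn-appctl not found" >&2
        return 1
    fi
    i=0
    while [ "$i" -lt "$count" ]; do
        date
        # Timing statistics for each stopwatch registered in ovn-controller
        ovn-appctl -t ovn-controller stopwatch/show
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$delay"
        fi
    done
    return 0
}
```

Calling `sample_stopwatches 3 5` takes three samples five seconds apart; comparing them shows which stopwatch keeps accumulating time.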

>  
> What we did, and this seems to be working: we hacked ovs, changing 
> RECONNECT_DEFAULT_PROBE_INTERVAL to 60000. Checking at the moment…
>  
> Thanks in advance
>  
>  
>  
> From: Vladislav Odintsov <odiv...@gmail.com>
> Date: Wednesday, 15 November 2023, 15:10
> To: Шагов Георгий <gmsha...@cloud.ru>
> Cc: Ales Musil <amu...@redhat.com>, "ovs-discuss@openvswitch.org" 
> <ovs-discuss@openvswitch.org>
> Subject: Re: [ovs-discuss] Enormous amount of records into openvswitch db 
> Bridge table external_ids ct-zone
>  
> WARNING! EXTERNAL SENDER
> If the sender of this email is unknown, do not follow links, do not share 
> your password,
> do not open attachments, and report it to colleagues at ЦКЗ via 
> secur...@cloud.ru
> Hi,
> 
> 
> On 15 Nov 2023, at 12:49, Шагов Георгий via discuss 
> <ovs-discuss@openvswitch.org> wrote:
>  
> Hello Ales
>  
> I really appreciate your reply. It helps a lot.
>  
> ovn-appctl -t ovn-controller ct-zone-list
> This request produced 7K+ records, so the 6.5K ct-zone records in the 
> Bridge table seem to be valid.
>  
> Digging deeper, we found that a major cause of the 100% CPU in 
> ovn-controller appears to be that it constantly performs a full recompute.
> In the ovsdb-server logs of openvswitch_db we constantly see this message:
> reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after 5 seconds, disconnecting
>  
> So it seems that openvswitch_db constantly drops the connection from 
> ovn-controller due to inactivity.
>  
> We tried to change the inactivity probe interval for openvswitch_db using 
> the ovn-controller setting external_ids:ovn-openflow-probe-interval in the 
> Open_vSwitch table, as explained here: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2020-August/373671.html
> Yet it does not seem to work: regardless of the value we set (e.g. 
> ovn-openflow-probe-interval:"60"), we still observe the same 5-second 
> interval in the openvswitch_db logs:
> reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after 5 seconds, disconnecting
> This is very confusing.
>  
> AFAIK, ovn-controller doesn’t use TCP sockets to connect to local OVS, so it 
> seems that this ERR message is not related to OVS<->ovn-controller 
> interaction.
> TCP can be used to connect to OVN_Southbound database...
> If you see such errors in ovsdb-server, which handles Open_vSwitch database, 
> you should check which service has connected to it.
>  
> If you think the problem is about connections, it is important to 
> understand which socket causes these problems.
>  
> The ovn-controller <-> ovs-vswitchd OpenFlow connection is made via a unix 
> socket (the default for br-int is /var/run/openvswitch/br-int.mgmt). There is 
> a configuration knob, external_ids:ovn-openflow-probe-interval, for this 
> connection. 0 is its default, I’d leave it as is.
> The local ovsdb connection is made via a unix socket to ovsdb-server 
> (/var/run/openvswitch/db.sock).
>  
> I’d suggest enabling dbg logs for ovn-controller (via ovn-appctl vlog/set 
> command) to see which module makes most of CPU load.
> 
> 
>  
> Are we missing anything here? Any hint is appreciated.
> Thanks in advance.
>  
>  
> From: Ales Musil <amu...@redhat.com>
> Date: Tuesday, 14 November 2023, 14:19
> To: Шагов Георгий <gmsha...@cloud.ru>
> Cc: "ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> Subject: Re: [ovs-discuss] Enormous amount of records into openvswitch db 
> Bridge table external_ids ct-zone
>  
>  
>  
> On Tue, Nov 14, 2023 at 12:02 PM Шагов Георгий via discuss 
> <ovs-discuss@openvswitch.org> wrote:
> Hello All
>  
>  
> Hi,
> 
>  
> We observe a strangely large (for our installation) number of records in the 
> openvswitch db Bridge table external_ids:ct-zone, i.e. 6.5K+
>  
> CT zone is allocated for most of the LSPs (there are some exceptions) and for 
> all LR DNAT and SNAT that are local for the specified controller. Which means 
> that you have a lot of ports and possibly routers on that single controller 
> or the external-ids are not cleared on update (this would be a bug). You can 
> actually check the zone list by running: ovn-appctl -t ovn-controller 
> ct-zone-list, to see if that matches the count of active zones that 
> ovn-controller knows about.
>  
>  
> grep -A20 '^Bridge table' ./ovs.dump | grep external_ids | sed 's/ct-zone-/\nct-zone-/g' | sort | uniq | wc -l
>     6659
>  
> Details:
>   "Bridge" : {
>      "06ef9e06-188e-4654-93b2-5242a324a5c7" : {
>         "initial" : {
>            "datapath_type" : "system",
>            "external_ids" : [
>               "map",
>               [
>                  [
>                     "ct-zone-00368809-59f5-4408-8ae3-fb5401ff6ea4_dnat",
>                     "60"
>                  ],
>                  [
>  
> At the same time, if I run:
> ovs-dpctl ct-stats-show
> Connections Stats:
>     Total: 1672
>     TCP: 1269
>     UDP: 398
>     ICMP: 5
>  
> The questions are:
> Who is writing into the openvswitch db Bridge table external_ids:ct-zone?
>  
> ovn-controller is writing those values for the purpose of restoring the zones 
> after restart.
>  
>  
> Is there any way to manage these records in the openvswitch db Bridge table 
> external_ids? I want to purge them…
>  
> ovn-controller will still write those that are new/changed if you purge them. 
> I would advise against that if you care about the restoration after restart.
>  
>  
> This actually drives ovn-controller to 100% CPU, since it gets a reply from 
> openvswitch with the full set of ct-zone records in external_ids of the 
> Bridge table:
> 2023-11-13T14:28:47.976Z|10019838|jsonrpc|DBG|tcp:127.0.0.1:6640: received reply, result=[false,"00000000-0000-0000-0000-000000000000",{"
>  
> Any help is extremely appreciated
>  
> Yours truly,
> George
> CONFIDENTIALITY NOTICE: This email and any files attached to it are 
> confidential. If you are not the intended recipient you are notified that 
> using, copying, distributing or taking any action in reliance on the contents 
> of this information is strictly prohibited. If you have received this email 
> in error please notify the sender and delete this email.
> _______________________________________________
> discuss mailing list
> disc...@openvswitch.org <mailto:disc...@openvswitch.org>
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>  
> Best regards,
> Ales
> 
> -- 
> Ales Musil
> Senior Software Engineer - OVN Core
> Red Hat EMEA <https://www.redhat.com/>
> amu...@redhat.com <mailto:amu...@redhat.com> 
>  <https://red.ht/sig>
>  
>  
>  
> 
> 
> Regards,
> Vladislav Odintsov
>  


Regards,
Vladislav Odintsov

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss