Hello Ales and all

I felt obliged to close out this case with some explanation from my side.

We investigated the case further and concluded that the main root cause was the
change we made on 6 November: switching 'external_ids:ovn-monitor-all' from
false (the default) to true [3]. As a result, ovn-controller received the
datapaths of all nodes.

Fortunately, we were able to reproduce the case in our staging environment.
After reverting the setting, we observed the number of flows drop from 359340
to 86197 and the number of ct-zones drop from 4487 to 1044.
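
To put these numbers in proportion, here is a small sketch that computes the
relative decrease. The function name pct_drop is only illustrative; the inputs
are the figures quoted above.

```shell
# pct_drop BEFORE AFTER: print the relative decrease in percent,
# rounded to one decimal place.
pct_drop() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) * 100 / before }'
}

pct_drop 359340 86197   # flows:    prints 76.0
pct_drop 4487 1044      # ct-zones: prints 76.7
```

Both counters shrink by roughly three quarters, which is consistent with one
common cause behind them.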

So the hint [3] bought us some time, but it did not solve our problem.
We are going to implement relays for the SBDB and switch the setting from [3]
back to its default value.
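
For reference, a minimal sketch of how the setting could be switched back on a
chassis. It assumes ovs-vsctl is available on the node and that the option
lives in external_ids of the local Open_vSwitch record. The function name and
the OVS_VSCTL override (handy for a dry run) are illustrative only, not
something taken from our tooling.

```shell
# set_monitor_all VALUE: set external_ids:ovn-monitor-all on the local
# Open_vSwitch record. VALUE must be "true" or "false" (false is the default).
# OVS_VSCTL may be overridden; OVS_VSCTL=echo prints the arguments instead of
# touching the database.
set_monitor_all() {
    case "$1" in
        true|false) ;;
        *) echo "usage: set_monitor_all true|false" >&2; return 1 ;;
    esac
    "${OVS_VSCTL:-ovs-vsctl}" set open_vswitch . "external_ids:ovn-monitor-all=$1"
}

# Usage, dry run first:
#   OVS_VSCTL=echo set_monitor_all false
#   set_monitor_all false
```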



The only thing that still seems odd to me is why the probe interval appears
not to be configurable on the ovsdb-server side, and why it was decided to
hard-code it [0].


[3] https://mail.openvswitch.org/pipermail/ovs-discuss/2023-November/052798.html


From: Ales Musil <amu...@redhat.com>
Date: Wednesday, 15 November 2023, 13:09
To: Шагов Георгий <gmsha...@cloud.ru>
Cc: "ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
Subject: Re: [ovs-discuss] Enormous amount of records into openvswitch db 
Bridge table external_ids ct-zone


On Wed, Nov 15, 2023 at 10:49 AM Шагов Георгий <gmsha...@cloud.ru> wrote:
Hello Ales

Hi,


I really appreciate your reply. It helps a lot.


  *   ovn-appctl -t ovn-controller ct-zone-list
This request produced more than 7K records, so the 6.5K ct-zone records in the
Bridge table appear to be valid.

Digging deeper, we found that the major cause of the 100% CPU load in
ovn-controller appears to be that it constantly performs a full recompute.
Looking into the ovsdb-server logs of openvswitch_db, we see this message
constantly:
reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after 5 seconds, disconnecting

So it seems that openvswitch_db keeps dropping the connection from
ovn-controller due to inactivity.


We tried to change the inactivity probe interval on openvswitch_db by setting
external_ids:ovn-openflow-probe-interval in the Open_vSwitch table of
openvswitch_db, as explained here:
https://mail.openvswitch.org/pipermail/ovs-dev/2020-August/373671.html
Yet it does not seem to work: regardless of the value we set (e.g.
ovn-openflow-probe-interval="60"), we still observe the same 5-second interval
in the openvswitch_db logs:
reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after 5 seconds, disconnecting
This is very confusing.

There are multiple connections that we have from ovn-controller to br-int 
(ovsdb). Only one of them can be influenced by the 
"ovn-openflow-probe-interval". This was recently changed by "controller: 
disable OpenFlow inactivity probing" [0]. The fact that the probe is still 
failing after 5 seconds suggests that it is one of the hardcoded ones. This is 
explained in the email thread [1]. You can try upgrading past the mentioned
patch to see if that helps; unfortunately, it is currently only on main and
will be available in 24.03.
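
To check from the log side whether a changed setting has any effect, the
timeout values reported by ovsdb-server can be tallied. This is a minimal
sketch that assumes the log lines have exactly the format quoted above. The
function name probe_timeouts and the log path in the usage comment are
assumptions.

```shell
# probe_timeouts: read an ovsdb-server log on stdin and print
# "<seconds> <count>" for every inactivity-probe timeout value seen.
probe_timeouts() {
    sed -n 's/.*no response to inactivity probe after \([0-9][0-9]*\) seconds.*/\1/p' |
        sort -n |
        uniq -c |
        awk '{ print $2, $1 }'
}

# Usage (the log location depends on the installation):
#   probe_timeouts < /var/log/openvswitch/ovsdb-server.log
```

If only the value 5 ever shows up, the connection being dropped is most likely
one of the hardcoded ones rather than the one controlled by
ovn-openflow-probe-interval.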


Are we missing anything here? Any hint is appreciated.
Thanks in advance.

[0] https://github.com/ovn-org/ovn/commit/c16e5da803838fa66129eb61d7930fc84d237f85
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2023-May/404625.html
Hopefully this helps.
Best regards,
Ales



From: Ales Musil <amu...@redhat.com>
Date: Tuesday, 14 November 2023, 14:19
To: Шагов Георгий <gmsha...@cloud.ru>
Cc: "ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
Subject: Re: [ovs-discuss] Enormous amount of records into openvswitch db 
Bridge table external_ids ct-zone



On Tue, Nov 14, 2023 at 12:02 PM Шагов Георгий via discuss <ovs-discuss@openvswitch.org> wrote:
Hello All


Hi,

We observe a strangely large (for our installation) number of records in the
openvswitch db Bridge table external_ids:ct-zone, i.e. more than 6.5K.

A CT zone is allocated for most LSPs (with some exceptions) and for all LR
DNAT and SNAT that are local to the given controller. This means that either
you have a lot of ports, and possibly routers, on that single controller, or
the external-ids are not cleared on update (which would be a bug). You can
check the zone list by running: ovn-appctl -t ovn-controller ct-zone-list, to
see whether it matches the count of active zones that ovn-controller knows
about.


grep -A20 '^Bridge table' ./ovs.dump | grep external_ids | sed 's/ct-zone-/\nct-zone-/g' | sort | uniq | wc -l
    6659
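
A slightly more compact way to get a comparable count, as a sketch. It assumes
that the key names start with ct-zone- and end at the first double quote, '=',
',' or whitespace. The function name count_ct_zone_keys is illustrative only.

```shell
# count_ct_zone_keys: read text on stdin and print the number of distinct
# ct-zone-* keys found in it.
count_ct_zone_keys() {
    grep -o 'ct-zone-[^"=,[:space:]]*' | sort -u | awk 'END { print NR }'
}

# Usage:
#   count_ct_zone_keys < ./ovs.dump
```

Because it reads standard input, the same function can be fed from the dump
file or from any other text that contains the external_ids map.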

Details:
"Bridge" : {
   "06ef9e06-188e-4654-93b2-5242a324a5c7" : {
      "initial" : {
         "datapath_type" : "system",
         "external_ids" : [
            "map",
            [
               [
                  "ct-zone-00368809-59f5-4408-8ae3-fb5401ff6ea4_dnat",
                  "60"
               ],
               [

At the same time, if I run:
ovs-dpctl ct-stats-show
Connections Stats:
    Total: 1672
  TCP: 1269
  UDP: 398
  ICMP: 5

The questions are:

  *   Who writes to the openvswitch db Bridge table external_ids:ct-zone?

ovn-controller is writing those values for the purpose of restoring the zones 
after restart.



  *   Is there any way to manage these records in the openvswitch db Bridge
table external_ids? I want to purge them…

ovn-controller will still write those that are new/changed if you purge them. I 
would advise against that if you care about the restoration after restart.


This actually pins ovn-controller at 100% CPU, since the reply it gets from
openvswitch contains the full set of ct-zone records in the external_ids of
the Bridge table:
2023-11-13T14:28:47.976Z|10019838|jsonrpc|DBG|tcp:127.0.0.1:6640: received reply, result=[false,"00000000-0000-0000-0000-000000000000",{"

Any help is extremely appreciated

Yours truly,
George
CONFIDENTIALITY NOTICE: This email and any files attached to it are 
confidential. If you are not the intended recipient you are notified that 
using, copying, distributing or taking any action in reliance on the contents 
of this information is strictly prohibited. If you have received this email in 
error please notify the sender and delete this email.

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

Best regards,
Ales

--

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA<https://www.redhat.com>

amu...@redhat.com<mailto:amu...@redhat.com>