Hello Ales
* controller: disable OpenFlow inactivity probing

We run OVN-23.03.3 LTE, and the patch [0] you mentioned is not present in our code. I would therefore expect the option ovn-openflow-probe-interval="60" to take effect. To be sure, I checked the code:

grep -A6 ^ofctrl_init ./controller/ofctrl.c
ofctrl_init(struct ovn_extend_table *group_table,
            struct ovn_extend_table *meter_table,
            int inactivity_probe_interval)
{
    swconn = rconn_create(inactivity_probe_interval, 0,
                          DSCP_DEFAULT, 1 << OFP15_VERSION);
    tx_counter = rconn_packet_counter_create();

And yet we do NOT observe ANY trace of this option on the ovswitch_db side:

ovs-vsctl list open
external_ids : {ovn-openflow-probe-interval="60", ovn-remote-probe-interval="60000"}

# ovs-vsctl list controller
# ovs-vsctl list manager
# OVS-2.17.7

Do you think we could use a workaround such as ovn-openflow-probe-interval="0" to avoid the disconnects from the ovswitch_db side?

Thanks in advance

From: Ales Musil <amu...@redhat.com>
Date: Wednesday, 15 November 2023, 13:09
To: Шагов Георгий <gmsha...@cloud.ru>
Cc: "ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
Subject: Re: [ovs-discuss] Enormous amount of records into openvswitch db Bridge table external_ids ct-zone

ATTENTION! EXTERNAL SENDER
If the sender of this mail is unknown, do not follow links, do not disclose your password, do not open attachments, and report it to your colleagues at ЦКЗ via secur...@cloud.ru

On Wed, Nov 15, 2023 at 10:49 AM Шагов Георгий <gmsha...@cloud.ru> wrote:

> Hello Ales

Hi,

> I really appreciate your reply. It helps a lot.
>
> * ovn-appctl -t ovn-controller ct-zone-list
>
> This request produced about 7K+ records, so the roughly 6.5K ct-zone records in the Bridge table seem to be valid.
>
> Digging deeper, we found that the major cause of the 100% CPU in ovn-controller appears to be that it constantly performs a full recompute.
> Looking into the ovsdb-server logs for openvswitch_db, we constantly see messages like:
>
> reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after 5 seconds, disconnecting
>
> So it seems that openvswitch_db constantly drops the connection from ovn-controller due to inactivity. We tried to change the inactivity probe interval on the openvswitch_db side using the ovn-controller setting ovn-openflow-probe-interval in external_ids of the Open_vSwitch table, as explained here:
> https://mail.openvswitch.org/pipermail/ovs-dev/2020-August/373671.html
>
> Yet it does not seem to work: regardless of the value we set (e.g. ovn-openflow-probe-interval="60"), we still observe the same 5-second interval in the openvswitch_db logs:
>
> reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after 5 seconds, disconnecting
>
> This is very confusing.

There are multiple connections from ovn-controller to br-int (ovsdb). Only one of them can be influenced by "ovn-openflow-probe-interval". This was recently changed by "controller: disable OpenFlow inactivity probing" [0]. The fact that the probe is still failing after 5 seconds suggests that it is one of the hardcoded ones. This is explained in the email thread [1]. You can try to upgrade past the mentioned patch to see if that helps; unfortunately, it is currently only on main and will be available in 24.03.

> Do we miss anything here? Any hint is appreciated. Thanks in advance.

[0] https://github.com/ovn-org/ovn/commit/c16e5da803838fa66129eb61d7930fc84d237f85
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2023-May/404625.html

Hopefully this helps.
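One way to check whether a changed probe setting had any effect is to tally the timeout values that actually appear in the ovsdb-server log. The following is only a minimal Python sketch, assuming log lines in the format quoted above; the function name and the sample list are illustrative and are not part of OVS or OVN.

```python
import re
from collections import Counter

# Matches the message format quoted in this thread, for example:
# reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe
# after 5 seconds, disconnecting
PROBE_RE = re.compile(
    r"reconnect\|ERR\|(?P<peer>\S+): no response to inactivity probe "
    r"after (?P<seconds>\d+(?:\.\d+)?) seconds, disconnecting"
)


def tally_probe_timeouts(lines):
    """Count probe-timeout disconnects, keyed by the reported interval."""
    counts = Counter()
    for line in lines:
        match = PROBE_RE.search(line)
        if match:
            counts[float(match.group("seconds"))] += 1
    return counts


# Hypothetical sample input: two timeout lines and one unrelated line.
sample = [
    "reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe "
    "after 5 seconds, disconnecting",
    "reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe "
    "after 5 seconds, disconnecting",
    "jsonrpc|DBG|tcp:127.0.0.1:6640: received reply",
]
print(tally_probe_timeouts(sample))
```

If every counted disconnect still reports 5 seconds after the setting was changed, the affected connection is probably not the one governed by ovn-openflow-probe-interval, which would be consistent with the explanation above.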
Best regards,
Ales

From: Ales Musil <amu...@redhat.com>
Date: Tuesday, 14 November 2023, 14:19
To: Шагов Георгий <gmsha...@cloud.ru>
Cc: "ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
Subject: Re: [ovs-discuss] Enormous amount of records into openvswitch db Bridge table external_ids ct-zone

On Tue, Nov 14, 2023 at 12:02 PM Шагов Георгий via discuss <ovs-discuss@openvswitch.org> wrote:

> Hello All

Hi,

> We observe a strangely large (for our installation) number of records in the openvswitch db Bridge table under external_ids:ct-zone, i.e. 6K5+.

A CT zone is allocated for most of the LSPs (there are some exceptions) and for all LR DNAT and SNAT that are local to the given controller. This means that either you have a lot of ports and possibly routers on that single controller, or the external-ids are not cleared on update (which would be a bug). You can check the zone list by running "ovn-appctl -t ovn-controller ct-zone-list" to see whether it matches the count of active zones that ovn-controller knows about.

> grep -A20 '^Bridge table' ./ovs.dump | grep external_ids | sed 's/ct-zone-/\nct-zone-/g' | sort | uniq | wc -l
> 6659
>
> Details:
>
> "Bridge" : {
>   "06ef9e06-188e-4654-93b2-5242a324a5c7" : {
>     "initial" : {
>       "datapath_type" : "system",
>       "external_ids" : [
>         "map",
>         [
>           [
>             "ct-zone-00368809-59f5-4408-8ae3-fb5401ff6ea4_dnat",
>             "60"
>           ],
>           [
>
> At the same time, if I run:
>
> ovs-dpctl ct-stats-show
> Connections Stats:
>     Total: 1672
>     TCP: 1269
>     UDP: 398
>     ICMP: 5
>
> The questions are:
>
> * Who is writing into the openvswitch db Bridge table external_ids:ct-zone?
ovn-controller writes those values so that the zones can be restored after a restart.

> * Is there any way to manage these records in the openvswitch db Bridge table external_ids? I want to purge them…

If you purge them, ovn-controller will still write those that are new or changed. I would advise against purging if you care about restoration after a restart.

> This actually kills ovn-controller with 100% CPU, since it gets a reply from openvswitch containing the full set of ct-zone records in external_ids of the Bridge table:
>
> 2023-11-13T14:28:47.976Z|10019838|jsonrpc|DBG|tcp:127.0.0.1:6640: received reply, result=[false,"00000000-0000-0000-0000-000000000000",{"
>
> Any help is extremely appreciated.
>
> Yours truly, George
>
> CONFIDENTIALITY NOTICE: This email and any files attached to it are confidential. If you are not the intended recipient you are notified that using, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error please notify the sender and delete this email.
> _______________________________________________
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

Best regards,
Ales

--
Ales Musil
Senior Software Engineer - OVN Core
Red Hat EMEA <https://www.redhat.com>
amu...@redhat.com
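To cross-check the 6659 figure against the output of ct-zone-list, the ct-zone keys can also be counted from the parsed JSON instead of with grep and sed. This is a minimal Python sketch, assuming the Bridge row's external_ids uses the OVSDB JSON map encoding visible in the dump excerpt earlier in the thread (["map", [[key, value], ...]]). The function name is hypothetical, and only the first sample entry comes from the thread; the other two are invented for illustration.

```python
import json


def ct_zones_from_external_ids(external_ids):
    """Return {zone_name: zone_id} from an OVSDB-encoded external_ids map."""
    kind, pairs = external_ids
    if kind != "map":
        raise ValueError(f"expected an OVSDB map, got {kind!r}")
    prefix = "ct-zone-"
    return {
        key[len(prefix):]: int(value)
        for key, value in pairs
        if key.startswith(prefix)
    }


# First entry is taken from the dump quoted in the thread; the second and
# third entries are made up so the example has something to filter.
sample = json.loads("""
["map", [
  ["ct-zone-00368809-59f5-4408-8ae3-fb5401ff6ea4_dnat", "60"],
  ["ct-zone-00368809-59f5-4408-8ae3-fb5401ff6ea4_snat", "61"],
  ["unrelated-key", "value"]
]]
""")

zones = ct_zones_from_external_ids(sample)
print(len(zones))
```

Because the result is a dictionary keyed by zone name, duplicate names collapse automatically, so its length is the number of distinct ct-zone entries stored in the row.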