On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
<ovs-discuss@openvswitch.org> wrote:

Thanks Ilya for the quick and detailed response!

On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
<ovs-discuss@openvswitch.org> wrote:

On 3/7/23 00:15, Vladislav Odintsov wrote:
Hi Ilya,

I’m wondering whether there are any configuration parameters for the ovsdb 
relay -> main ovsdb server inactivity probe timer.
My cluster is experiencing issues where the relay disconnects from the main 
cluster due to the 5-second inactivity probe timeout.
The main cluster has quite a big database and a bunch of daemons connected to 
it, which makes it difficult to maintain connections in time.

For ovsdb relay as a remote I use in-db configuration (to provide inactivity 
probe and rbac configuration for ovn-controllers).
For ovsdb-server, which serves SB, I just set --remote=pssl:<port>.

I’d like to configure the remote for the ovsdb cluster via the DB so that I can 
set the inactivity probe, but I’m not sure about the correct way to do that.

For now I see only two options:
1. Set up a custom database schema with a connection table, serve it in the same 
SB cluster, and specify this connection when starting the ovsdb SB server.

There is an ovsdb/local-config.ovsschema shipped with OVS that can be
used for that purpose.  But you'll need to craft transactions for it
manually with ovsdb-client.
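
As a rough illustration of what such a hand-crafted transaction might look like, the sketch below builds an RFC 7047 "transact" parameter list in Python. The table and column names (Config, Connection, connections, target, inactivity_probe) are assumptions based on the Local_Config naming used in this thread and should be checked against the shipped local-config.ovsschema; the port is taken from the logs in this thread. It also assumes the Config table is still empty.

```python
import json


def local_config_txn(target, inactivity_probe_ms):
    """Build an OVSDB transaction that adds one Connection row.

    Table and column names are assumed, not verified against the schema.
    """
    return [
        "Local_Config",
        {
            "op": "insert",
            "table": "Connection",
            "uuid-name": "conn",  # temporary name, referenced below
            "row": {
                "target": target,
                "inactivity_probe": inactivity_probe_ms,
            },
        },
        {
            "op": "insert",
            "table": "Config",
            # Reference the Connection row created in the same transaction.
            "row": {"connections": ["named-uuid", "conn"]},
        },
    ]


print(json.dumps(local_config_txn("pssl:16642", 60000)))
```

The printed JSON is the kind of argument that could be passed to "ovsdb-client transact" against a server that serves the Local_Config database.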

There is a control tool prepared by Terry:
 
https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/

Thanks for pointing out the patch; I guess I’ll test it out.


But it's not in the repo yet (I need to get back to reviews on that
topic at some point).  The tool itself should be fine, but its name
may change.

Am I right that the in-DB remote configuration must be hosted in a database 
served by this same ovsdb-server?
What is the best way to configure an additional DB on ovsdb-server so that this 
configuration is permanent?
Also, do I understand correctly that this DB does not need to be clustered?


2. Set up a second connection in the OVN SB database to be used for the ovsdb 
cluster, and deploy the cluster separately from the ovsdb relay, because they 
both open the same connections and conflict on ports. (I don’t use Docker here, 
so I need a separate server for that.)

That's an easy option available right now, true.  If they are deployed
on different nodes, you may even use the same connection record.


Anyway, if I configure the ovsdb remote for the ovsdb cluster with a specified 
inactivity probe (say, 60k), I guess that’s still not enough to get ovsdb pings 
every 60 seconds. The inactivity probe must be the same on both ends, right? 
That is, also on the ovsdb relay process side.

Inactivity probes don't need to be the same.  They are separate for each
side of a connection and so configured separately.

You can set up inactivity probe for the server side of the connection via
database.  So, server will probe the relay every 60 seconds, but today
it's not possible to set inactivity probe for the relay-to-server direction.
So, relay will probe the server every 5 seconds.
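
To make the asymmetry concrete, here is a deliberately simplified Python model (not OVS code). It assumes a side drops the connection once its peer stays unresponsive for longer than that side's own probe interval; the 8-second server stall is a hypothetical number chosen for illustration.

```python
def sides_that_drop(server_probe_ms, relay_probe_ms,
                    server_stall_ms, relay_stall_ms):
    """Return which sides would drop the connection in this simplified model.

    Each side only looks at its own probe interval and at how long the
    peer stays unresponsive (e.g. during a long poll interval).
    """
    dropped = []
    if relay_stall_ms > server_probe_ms:
        dropped.append("server")
    if server_stall_ms > relay_probe_ms:
        dropped.append("relay")
    return dropped


# Server-side probe raised to 60 s via the database, relay left at 5 s.
# A hypothetical 8 s stall on the server is enough for the relay to drop.
print(sides_that_drop(60_000, 5_000, server_stall_ms=8_000, relay_stall_ms=0))
# prints ['relay']
```

In this model, raising only the server-side value changes nothing for the relay-to-server direction, which is the situation described above.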

The way out from this situation is to allow configuration of relays via
database as well, e.g. relay:db:Local_Config,Config,relays.  This will
require addition of a new table to the Local_Config database and allowing
relay config to be parsed from the database in the code.  That wasn't
implemented yet.

I saw your talk at the last OVSCON about this topic, and the solution was in 
progress then. But maybe there have been some changes since that time? I’m ready 
to test it if so. Or maybe there’s a workaround?

Sorry, we didn't move forward much on that topic since the presentation.
There are a few unanswered questions around the local config database.  Mainly
regarding upgrades from cmdline/main-db-based configuration to a
local-config-based one.  But I hope we can figure that out in the current
release time frame, i.e. before the 3.2 release.

Regarding the configuration method… Just an idea (I haven’t seen this variant 
among the possible ones).
Remote add/remove is possible via the ovsdb-server ctl socket. Could introducing 
a new command
"ovsdb-server/set-remote-param PARAM=VALUE" be a solution here?


There is also this workaround:
 
https://patchwork.ozlabs.org/project/openvswitch/patch/an2a4qcpihpcfukyt1uomqre.1.1641782536691.hmail.wentao....@easystack.cn/
It simply takes the server->relay inactivity probe value and applies it
to the relay->server connection.  But it's not a correct solution, because
it relies on certain database names.

Out of curiosity, what kind of poll intervals do you see on your main server
setup that trigger inactivity probe failures?  Could an upgrade to OVS 3.1
solve some of these issues?  3.1 should be noticeably faster than 2.17,
and also parallel compaction introduced in 3.0 removes one of the big
reasons for large poll intervals.  OVN upgrade to 22.09+ or even 23.03
should also help with database sizes.

We see failures on the OVSDB Relay side:

2023-03-06T22:19:32.966Z|00099|reconnect|ERR|ssl:xxx:16642: no response to inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:32.966Z|00100|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:19:40.989Z|00101|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:19:50.997Z|00102|reconnect|ERR|ssl:xxx:16642: no response to inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:50.997Z|00103|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:19:59.022Z|00104|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:20:09.026Z|00105|reconnect|ERR|ssl:xxx:16642: no response to inactivity probe after 5 seconds, disconnecting
2023-03-06T22:20:09.026Z|00106|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:20:17.052Z|00107|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:20:27.056Z|00108|reconnect|ERR|ssl:xxx:16642: no response to inactivity probe after 5 seconds, disconnecting
2023-03-06T22:20:27.056Z|00109|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:20:35.111Z|00110|reconnect|INFO|ssl:xxx:16642: connected
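
For anyone wanting to quantify this pattern, here is a small Python sketch that measures how long each session lasts between "connected" and the next probe timeout. The function name and regular expression are illustrative, and the input is assumed to have one log record per line (mail clients may wrap them).

```python
import re
from datetime import datetime

# Matches reconnect-module records such as the ones shown above.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z\|\d+\|reconnect\|"
    r"(?P<level>[A-Z]+)\|(?P<remote>\S+): (?P<msg>.*)$"
)


def session_lengths(log_text):
    """Return seconds from each 'connected' to the next probe timeout."""
    connected_at = None
    lengths = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
        msg = m.group("msg")
        if msg == "connected":
            connected_at = ts
        elif msg.startswith("no response to inactivity probe") and connected_at:
            lengths.append(round((ts - connected_at).total_seconds(), 3))
            connected_at = None
    return lengths


SAMPLE = """\
2023-03-06T22:19:40.989Z|00101|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:19:50.997Z|00102|reconnect|ERR|ssl:xxx:16642: no response to inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:50.997Z|00103|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:19:59.022Z|00104|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:20:09.026Z|00105|reconnect|ERR|ssl:xxx:16642: no response to inactivity probe after 5 seconds, disconnecting
"""

print(session_lengths(SAMPLE))
```

On the excerpt above each session lasts roughly 10 seconds, which would be consistent with about 5 seconds of idle time followed by a 5-second wait for the probe reply.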

On the DB cluster this looks like:

2023-03-06T22:19:04.208Z|00451|stream_ssl|WARN|SSL_read: unexpected SSL connection close
2023-03-06T22:19:04.211Z|00452|reconnect|WARN|ssl:xxx:52590: connection dropped (Protocol error)

Does this mean that configuring the inactivity probe on the DB cluster side will 
not help, and that it must be configured on the relay side?

We already run OVN 22.09.1 with some backports from next versions.
The OVS version is 2.17, so I think it’s possible to try upgrading OVS to 3.1. 
I’ll take a look at the changelog, thanks for pointing this out!


Best regards, Ilya Maximets.
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Regards,
Vladislav Odintsov

