On Thu, 9 Jul 2020 at 11:53, Matthew Booth <mbo...@redhat.com> wrote:
>
> I'm running a 3-node ovsdb raft cluster in kubernetes without using
> host networking, NET_ADMIN, or any special networking privileges. I'm
> using a StatefulSet, so I have persistent storage and a persistent
> network name. However, I don't have a persistent IP. I have studied 2
> existing implementations of OVN including [1], but as they are both
> focussed on providing SDN service to the cluster itself (which I'm
> not: I'm just a regular tenant of the cluster), they both legitimately
> use host networking and therefore don't suffer this issue.
>
> [1] 
> https://github.com/ovn-org/ovn-kubernetes/blob/master/dist/templates/ovnkube-db-raft.yaml.j2
>
> I finally managed to test what happens when a pod's IP changes, and
> the answer is: it breaks. Specifically, the logs are full of:
>
> 2020-07-09T10:09:16Z|06012|socket_util|ERR|Dropped 59 log messages in
> last 59 seconds (most recently, 1 seconds ago) due to excessive rate
> 2020-07-09T10:09:16Z|06013|socket_util|ERR|6644:10.131.0.4: bind:
> Cannot assign requested address
> 2020-07-09T10:09:16Z|06014|raft|WARN|Dropped 59 log messages in last
> 59 seconds (most recently, 1 seconds ago) due to excessive rate
> 2020-07-09T10:09:16Z|06015|raft|WARN|ptcp:6644:10.131.0.4: listen
> failed (Cannot assign requested address)
>
> The reason it can't bind to 10.131.0.4 is that it's no longer a local
> IP address.
>
> Note that this is binding the raft cluster port, not the client port.
> I have clients connecting to a service IP, which is static. I can't
> specifically test that it still works after the pod IPs change, but as
> it worked before there's no reason to suspect it won't.
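>
> For reference, I'd expect the client path to be checkable with
> something like the following against the static service IP
> (SB_SERVICE_IP is just a placeholder for it; I haven't verified this
> after an IP change):
>
>             ovsdb-client list-dbs "tcp:${SB_SERVICE_IP}:6642"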
>
> My first thought was to use service IPs for the raft cluster, too, but
> if ovsdb-server insists on binding its local cluster address that's
> never going to work, because the service IP is never a local IP
> address (traffic to it is forwarded by the cluster's service proxy,
> not delivered to an address configured in the pod).
>
> ovsdb-server is invoked in its container by ovn-ctl:
>
>             exec /usr/share/openvswitch/scripts/ovn-ctl \
>             --no-monitor \
>             --db-nb-create-insecure-remote=yes \
>             --db-nb-cluster-remote-addr="$(bracketify ${initialiser_ip})" \
>             --db-nb-cluster-local-addr="$(bracketify ${LOCAL_IP})" \
>             --db-nb-cluster-local-proto=tcp \
>             --db-nb-cluster-remote-proto=tcp \
>             --ovn-nb-log="-vconsole:${OVN_LOG_LEVEL} -vfile:off" \
>             run_nb_ovsdb
>
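> bracketify is a small helper in my wrapper script: roughly, it just
> wraps IPv6 literals in square brackets so they can be used in OVN's
> addr:port syntax, something like this:
>
>             bracketify() {
>                 # Wrap IPv6 literals in [] for tcp:addr:port strings;
>                 # leave IPv4 addresses and hostnames alone.
>                 case "$1" in
>                     *:*) echo "[$1]" ;;
>                     *)   echo "$1" ;;
>                 esac
>             }
>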
> initialiser_ip is the pod IP address of the pod which comes up first.
> This is a bootstrapping thing, and afaik isn't relevant once the
> cluster is initialised. It certainly doesn't appear in the command
> line below. LOCAL_IP is the current IP address of this pod.
> Surprisingly (to me), this doesn't appear in the ovsdb-server
> invocation either. The actual invocation is:
>
> ovsdb-server -vconsole:info -vfile:off
> --log-file=/var/log/openvswitch/ovsdb-server-sb.log
> --remote=punix:/pod-run/ovnsb_db.sock --pidfile=/pod-run/ovnsb_db.pid
> --unixctl=ovnsb_db.ctl
> --remote=db:OVN_Southbound,SB_Global,connections
> --private-key=db:OVN_Southbound,SSL,private_key
> --certificate=db:OVN_Southbound,SSL,certificate
> --ca-cert=db:OVN_Southbound,SSL,ca_cert
> --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols
> --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers
> --remote=ptcp:6642:0.0.0.0 /var/lib/openvswitch/ovnsb_db.db
>
> So it's getting its former IP address from somewhere. As the only
> local state is the database itself, I assume it's reading it from the
> DB's cluster table. Here's what it currently thinks about cluster
> state:
>
> # ovs-appctl -t /pod-run/ovnsb_db.ctl cluster/status OVN_Southbound
> 83c7
> Name: OVN_Southbound
> Cluster ID: 1524 (1524187a-8a7b-41d5-89cf-ad2d00141258)
> Server ID: 83c7 (83c771fd-d866-4324-bdd6-707c1bf72010)
> Address: tcp:10.131.0.4:6644
> Status: cluster member
> Role: candidate
> Term: 41039
> Leader: unknown
> Vote: self
>
> Log: [5526, 5526]
> Entries not yet committed: 0
> Entries not yet applied: 0
> Connections: (->7f46) (->66fc)
> Servers:
>     83c7 (83c7 at tcp:10.131.0.4:6644) (self) (voted for 83c7)
>     7f46 (7f46 at tcp:10.129.2.9:6644)
>     66fc (66fc at tcp:10.128.2.13:6644)
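>
> Incidentally, I believe the same stale address is recorded in the DB
> file itself and can be read back with ovsdb-tool, which would confirm
> where ovsdb-server is getting it from:
>
>             ovsdb-tool db-local-address /var/lib/openvswitch/ovnsb_db.db
>             ovsdb-tool db-sid /var/lib/openvswitch/ovnsb_db.db
>             ovsdb-tool db-cid /var/lib/openvswitch/ovnsb_db.db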
>
> This highlights the next problem, which is that both the other IPs
> have changed, too. I know the new IP addresses of the other 2 cluster
> nodes, although I don't know which one is 7f46 (but presumably it
> knows). Even if I did know, presumably I can't modify the db anyway
> while the cluster has no leader. The only way I can currently think
> of to recover this situation is:
>
> * Scale back the cluster to just node-0
> * node-0 converts itself to a standalone db
> * node-0 converts itself to a cluster db with a new local IP
> * Scale the cluster back up to 3 nodes, initialised from node-0
>
> I haven't tested this so there may be problems with it, but in any
> case it's not a realistic solution.
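>
> For completeness, I think the standalone round trip on node-0 would
> look roughly like this, using the SB paths from above (untested):
>
>             # With ovsdb-server stopped on node-0, flatten the
>             # clustered DB into a standalone copy.
>             ovsdb-tool cluster-to-standalone /tmp/ovnsb_standalone.db \
>                 /var/lib/openvswitch/ovnsb_db.db
>             mv /var/lib/openvswitch/ovnsb_db.db \
>                 /var/lib/openvswitch/ovnsb_db.db.bak
>             # Re-create the clustered DB with node-0's current pod IP
>             # as the raft local address.
>             ovsdb-tool create-cluster /var/lib/openvswitch/ovnsb_db.db \
>                 /tmp/ovnsb_standalone.db "tcp:${LOCAL_IP}:6644"
>             # Restart node-0, then wipe node-1 and node-2 and let them
>             # rejoin from node-0.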
>
> A much nicer solution would be to use a service IP for the raft
> cluster, but from the above error message I'm not expecting that to
> work, because ovsdb-server won't be able to bind that address. I'm
> going to test this today, and I'll update if it turns out otherwise.

Just to confirm: I tested this and, as expected, ovsdb-server fails to
start with:

2020-07-09T14:49:30Z|00013|socket_util|ERR|6643:172.30.84.58: bind:
Cannot assign requested address
2020-07-09T14:49:30Z|00014|raft|WARN|ptcp:6643:172.30.84.58: listen
failed (Cannot assign requested address)

In this case 172.30.84.58 is the stable service IP associated with
this raft member, but it is never assigned to an interface inside the
pod.
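
For what it's worth, this is easy to confirm from inside the container
(assuming iproute2 is in the image); only lo and the pod IP ever show
up, never the service IP:

    ip -o -4 addr show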

Matt
-- 
Matthew Booth
Red Hat OpenStack Engineer, Compute DFG

Phone: +442070094448 (UK)
