On 11/2/23 19:59, Joe Liu via discuss wrote:
> Hi community,
> 

Hi Joe,

> We hit an issue while upgrading OVS/OVN on ovn-central on the master node
> and ovn-controllers on worker nodes:
> 
> Before the upgrade, we have
> openvswitch-2.16.90
> ovn-21.09.0
> ovn-host-21.09.0
> ovn-central-21.09.0
> 
> After the upgrade, we have
> openvswitch-2.17.7
> ovn-22.03.2
> ovn-host-22.03.2
> ovn-central-22.03.2
> 
> However, as soon as the first ovn-controller on worker nodes got
> upgraded to 22.03.2 (before the ovn-central on master node upgraded),
> the I/O stopped from the worker node, including the active distributed
> gateway port on the node;
> 
> The reverse is true too: as soon as the ovn-central on master node got
> upgraded to 22.03.2, all the I/Os stopped on worker nodes, including
> distributed gateway ports.
> 
> It seems that OVN version 22.03.2 is not compatible with version
> 21.09.0: as long as the ovn-central and ovn-controller versions are
> mismatched, we hit a window of I/O disruption.
> 

OVN's supported upgrade procedures are:

https://github.com/ovn-org/ovn/blob/main/Documentation/intro/install/ovn-upgrades.rst#upgrade-procedures

We try to ensure backwards compatibility but that's hard when upgrading
from a version that's pre-22.03 (the first LTS).

> Is this expected -- incompatibility of OVN version between ovn-central
> and ovn-controller will cause a window of I/O disruption during the
> upgrade? If so, what can we do to avoid such disruptions?
> 

In your case, for this first upgrade to 22.03 (the first OVN LTS) to
succeed, you probably need to use the "Fail-safe upgrade":

https://github.com/ovn-org/ovn/blob/main/Documentation/intro/install/ovn-upgrades.rst#fail-safe-upgrade

That ensures that ovn-controller doesn't change the dataplane as long as
northd is still running the old version you're upgrading from.
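
As a rough illustration of the version-pinning step in that procedure:
the ovn-controller man page documents an
external_ids:ovn-match-northd-version option that is set in the local
Open vSwitch database on each chassis. The Python sketch below only
builds and prints the matching ovs-vsctl command, it does not run it.
The helper name pin_northd_version_cmd is made up for this example, and
the full procedure has more steps than this, so please follow the linked
document for your exact versions.

```python
import shlex


def pin_northd_version_cmd(enable=True):
    """Build the ovs-vsctl argv that toggles ovn-controller version pinning.

    Hypothetical helper for illustration; only the option name
    external_ids:ovn-match-northd-version comes from the OVN documentation.
    """
    value = "true" if enable else "false"
    return [
        "ovs-vsctl", "set", "open_vswitch", ".",
        f"external_ids:ovn-match-northd-version={value}",
    ]


if __name__ == "__main__":
    # Print the command instead of executing it, so it can be reviewed
    # (or pushed out by whatever tooling manages the worker nodes).
    print(shlex.join(pin_northd_version_cmd()))
```

The command would need to be applied on every worker node before any
OVN component is upgraded.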

For further upgrades (>=22.03 -> newer versions) it should be OK to use
the "Rolling upgrade" procedure:

https://github.com/ovn-org/ovn/blob/main/Documentation/intro/install/ovn-upgrades.rst#rolling-upgrade
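
If you script your upgrades, the rule of thumb above (fail-safe when
starting from a pre-22.03 version, rolling otherwise) could be encoded
roughly as below. This is only a sketch of the advice in this reply, not
an official OVN policy; the function names and the FIRST_LTS constant
are assumptions made for the example.

```python
FIRST_LTS = (22, 3)  # 22.03, the first OVN LTS release


def parse_ovn_version(version):
    """Turn a version string such as '21.09.0' into (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def upgrade_procedure(current_version):
    """Suggest an upgrade procedure based on the version being upgraded from."""
    if parse_ovn_version(current_version) < FIRST_LTS:
        return "fail-safe"
    return "rolling"


if __name__ == "__main__":
    for version in ("21.09.0", "22.03.2"):
        print(version, "->", upgrade_procedure(version))
```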

> Currently, we launch parallel processes to upgrade ovn-central on
> master nodes and all ovn-controllers on worker nodes, but still
> observe a window of I/O disruption of between 1 and 10 minutes.
> Any suggestions?

Right, "Fail-safe upgrade" should improve that.

> 
> Thanks in advance for your help!
> Joe
> 

Hope this helps!

Best regards,
Dumitru

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
