On 11/2/23 19:59, Joe Liu via discuss wrote:
> Hi community,
>

Hi Joe,

> We hit an issue during upgrading OVS/OVN on ovn-central on master node,
> and ovn-controllers on worker nodes:
>
> Before the upgrade, we have
> openvswitch-2.16.90
> ovn-21.09.0
> ovn-host-21.09.0
> ovn-central-21.09.0
>
> After the upgrade, we have
> openvswitch-2.17.7
> ovn-22.03.2
> ovn-host-22.03.2
> ovn-central-22.03.2
>
> However, as soon as the first ovn-controller on worker nodes got
> upgraded to 22.03.2 (before the ovn-central on master node upgraded),
> the I/O stopped from the worker node, including the active distributed
> gateway port on the node;
>
> The reverse is true too: as soon as the ovn-central on master node got
> upgraded to 22.03.2, all the I/Os stopped on worker nodes, including
> distributed gateway ports.
>
> It seemed that ovn version 22.03.2 is not compatible with version
> 21.09.0, as long as ovn-central and ovn-controllers version have
> mismatch, we hit a window of I/O disruption.
>

OVN's supported upgrade procedures are:

https://github.com/ovn-org/ovn/blob/main/Documentation/intro/install/ovn-upgrades.rst#upgrade-procedures

We try to ensure backwards compatibility, but that's hard when upgrading
from a version that's pre-22.03 (the first LTS).

> Is this expected -- incompatibility of OVN version between ovn-central
> and ovn-controller will cause a window of I/O disruption during the
> upgrade? If so, what can we do to avoid such disruptions?
>

In your case, to get this first successful upgrade to 22.03 (the first
OVN LTS), you probably need to use the "Fail-safe upgrade":

https://github.com/ovn-org/ovn/blob/main/Documentation/intro/install/ovn-upgrades.rst#fail-safe-upgrade

That ensures that ovn-controller doesn't change the dataplane as long as
northd is still running the old version you're upgrading from.

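As a rough sketch of the step that provides that guarantee: the linked
fail-safe procedure relies on the ovn-controller setting
external_ids:ovn-match-northd-version, enabled on every worker node before
the upgrade starts. The Python helper below only builds the ovs-vsctl
command and, by default, prints it instead of running it. The function
names pin_northd_version_cmd and apply_cmd are hypothetical, and the
procedure document remains the reference for the exact steps on your
versions.

```python
import shlex
import subprocess


def pin_northd_version_cmd(enable=True):
    """Build the ovs-vsctl command that toggles ovn-controller version pinning.

    Assumption: the option name is taken from the OVN fail-safe upgrade
    procedure; verify it against the documentation for your release.
    """
    value = "true" if enable else "false"
    return [
        "ovs-vsctl", "set", "open_vswitch", ".",
        f"external_ids:ovn-match-northd-version={value}",
    ]


def apply_cmd(cmd, dry_run=True):
    """Print the command (dry run) or execute it on the local node."""
    if dry_run:
        print(shlex.join(cmd))
        return None
    # Raises CalledProcessError if ovs-vsctl exits with a non-zero status.
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run only: shows what would be executed on a worker node.
    apply_cmd(pin_northd_version_cmd(enable=True))
```

Calling apply_cmd(..., dry_run=False) on each worker would apply the
setting; the remaining upgrade steps and their order are the ones listed in
the linked procedure.
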
For further upgrades (>=22.03 -> newer versions) it should be OK to use
the "Rolling upgrade" procedure:

https://github.com/ovn-org/ovn/blob/main/Documentation/intro/install/ovn-upgrades.rst#rolling-upgrade

> Currently, we attempt to launch a parallel multi-processes on upgrading
> ovn-central on master nodes and all ovn-controllers on worker nodes, but
> still observe a window of I/O disruption between 1 minute to 10 minutes.
> Any suggestions?

Right, "Fail-safe upgrade" should improve that.

>
> Thanks in advance for your help!
> Joe
>

Hope this helps!

Best regards,
Dumitru

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss