On 3/4/15 12:56 PM, Assaf Muller wrote:
Hello everyone,

An issue came up recently:
http://lists.openstack.org/pipermail/openstack-dev/2015-March/058280.html

Where a recent Kilo patch made a non-backwards-compatible change to the RPC
interface between the Neutron server and its agents. I'm trying to figure out
how much of an issue that really is.

The question is: Does anyone have any experience with performing a 'rolling
upgrade' for Neutron, specifically, upgrading the Neutron API server(s) first,
and upgrading Neutron agents later? Has anyone performed this from Icehouse to
Juno successfully? Would this typically work across the board for other
services as well?

When database migrations are involved, typically we shut down all producers/consumers of the database, then migrate the database, then bring up new code for producers/consumers.

This model works across all the services (except for swift, because... swift).

When database migrations are /not/ at play, the general desire is to do a rolling upgrade, in order to have services down for as little time as possible. It's not just doing all the APIs at once and then the agents; it's doing a subset of the APIs in batches, so that the API itself is never 100% down. This works in Nova, where there is a concept of upgrade_levels for the RPC message format, and there is a conductor service that can be upgraded first and can translate the internals of RPC messages for older services. The end scenario was that we could upgrade the conductors first in one swoop (since they are bus consumers and not API points), then roll through the APIs and other services, then finally roll through the computes. Once everything was updated we could bump the upgrade_levels for compute.
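
To make the upgrade_levels idea a bit more concrete, here is a rough Python sketch of a version cap: a newer sender strips fields that an older receiver would not understand until the cap is raised. This is only an illustration under my own assumptions, not Nova's actual code; the function names, the version numbers and the "mtu"/"qos_policy" fields are all made up.

```python
# Hypothetical sketch of an RPC version cap. None of these names, versions or
# fields come from Nova or Neutron; they only illustrate the idea.


def parse_version(version):
    """Turn '1.2' into (1, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Payload fields introduced by each (hypothetical) message version.
FIELDS_ADDED = {
    "1.1": ["mtu"],
    "1.2": ["qos_policy"],
}


def backport(message, cap):
    """Return a copy of message whose version does not exceed cap.

    Fields introduced after the capped version are dropped, so a service
    still running old code can consume the message.
    """
    if parse_version(message["version"]) <= parse_version(cap):
        return dict(message)
    payload = dict(message["payload"])
    for version, fields in FIELDS_ADDED.items():
        if parse_version(version) > parse_version(cap):
            for field in fields:
                payload.pop(field, None)
    return {"version": cap, "payload": payload}


if __name__ == "__main__":
    new_message = {
        "version": "1.2",
        "payload": {"port": "p1", "mtu": 1500, "qos_policy": "gold"},
    }
    # While old computes are still around, keep the cap at the old version.
    print(backport(new_message, "1.0"))
    # Once everything is upgraded, raise the cap and send the full message.
    print(backport(new_message, "1.2"))
```

Keeping the cap as a single setting that gets raised at the end is what lets the translation live in one place instead of in every old service.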

Without this sort of structure for Neutron it'll be... difficult to do mixed versions of individual API nodes as well as mixed versions of agents and APIs.

Given that agents aren't API listeners, an upgrade strategy could be to update the agents all at once to new code that's backwards compatible with the old API nodes, then roll through the API nodes. Or vice versa: roll through the API nodes to get to new code that is backwards compatible with old agents, then update all the agents.

Either way it's preferable to do things in as small "atomic" chunks as possible. In large clusters, with Nova, there is a 1:1 relationship between nova-compute and hypervisors, so anything that has to be atomic across compute is painful. Slow. With Neutron, depending on the setup, there is a similar relationship, so being able to break those up into batches, or at least being able to treat them at a different time from the public APIs, is desirable.
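
For what it's worth, the batching itself can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical upgrade() callable that upgrades one host and returns when it is healthy again; the host names and batch size are invented.

```python
# Hypothetical sketch of a batched rolling upgrade. The upgrade callable and
# the host names are placeholders, not part of any OpenStack tooling.


def batches(hosts, size):
    """Yield consecutive slices of hosts, each with at most size entries."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


def rolling_upgrade(hosts, size, upgrade):
    """Upgrade hosts one batch at a time and return the batches in order.

    Only the hosts in the current batch are touched at any moment, so the
    rest keep serving while the batch is down.
    """
    completed = []
    for batch in batches(hosts, size):
        for host in batch:
            upgrade(host)
        completed.append(batch)
    return completed


if __name__ == "__main__":
    api_nodes = ["api1", "api2", "api3", "api4", "api5"]
    rolling_upgrade(api_nodes, 2, lambda host: print("upgrading", host))
```

The same helper could be run once for the API nodes and again, later, for the agents, which is the "treat them at a different time" part.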



--
-jlk

_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
