model-migration: the what, why, and progress

Tim Penhey Sun, 28 Feb 2016 20:38:47 -0800

Hi folks,

I'm writing this to communicate what we are doing, why, and how it's
going. I hope it will prompt others to do the same.


I'm also keeping this on the internal juju list because I will be
mentioning JAAS.

What and why:

Model migration is a key component of JAAS and allows us to work with
the most up-to-date Juju Controllers without fear. With the new world
order where a controller can host multiple models, we are now in the
situation where upgrading a controller could have dramatic and drastic
problems for other models hosted on it should things go wrong. This is
not something we even want to have as an option where we are dealing
with models that are not ours.

Models hosted in a controller are not able to upgrade to versions of
Juju beyond that of the controller they are running on. So one of two
things needs to happen:
 1) we upgrade the controller
 2) we move the model to a newer controller

Once one of these things has happened, the hosted model can upgrade to a
new version (up to what the controller is running).
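As a rough illustration of the version rule above (not Juju code), the check can be sketched as follows. The function names are hypothetical, and the sketch assumes plain dotted numeric versions, which is simpler than Juju's real version strings.

```python
def parse_version(text):
    """Turn a dotted numeric version such as "2.0.1" into a tuple of ints.

    Assumption: versions are purely numeric; real Juju versions can carry
    extra tags that this sketch does not handle.
    """
    return tuple(int(part) for part in text.split("."))


def can_upgrade_model(target, controller):
    """A hosted model may move to `target` only if it is not newer than the
    version the hosting controller is running."""
    return parse_version(target) <= parse_version(controller)
```

With a controller at 2.1.0, a model could go to 2.0.1 or 2.1.0 but not to 2.2.0 until the controller is upgraded or the model is moved to a newer controller.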

Since we have already decided that (1) is not a viable option in the
JAAS world, (2) is what we are working on.

Model migration also lets us load balance: should one controller get
overloaded, we can move some of its models to a different controller.

Progress:

Team Onyx has been focusing on model migrations for some time now, and
many pieces are falling into place.

In order to migrate a model, as a user, you need to be a "controller
admin" in both the source and target controllers. There is a CLI command
to initiate the migration, "juju migrate". Status of the migration will
be reported as part of "juju status".

This starts a chain of events that will result in the model being moved
and all the agents looking at the new controller.

One of the first big changes was to change the machine agent to use the
new dependency engine for managing dependencies between the various
workers. This is needed to cleanly quiesce environments. This work
landed in master some time ago, and we are changing the individual
workers one at a time to fit into the new engine.

Some complexities and steps are not shown to avoid too much detail.

The general process of migration goes something like this:
 0) initial checks to confirm no large pending changes or error states
   - no machines being provisioned
   - no missing agents
   - no dying or dead entities
   - there is a grace period, but if things don't settle down, the
     migration is aborted
 1) the model is quiesced and agents enter a read-only state
 2) the general state is checked again to make sure all things are quiet
    and stable
 3) the model is serialized into a versioned, database-agnostic wire
    format and imported into the target controller
 4) binaries used by the model (tools, charms, resources, etc.) are added
    to the target controller
 5) logs are streamed across
 6) all agents are told to check in with the new controller to confirm
    access and that the state looks consistent with what they expect
 7) agents switch over to the new controller
 8) documents in the old controller are cleaned up

At any stage prior to switching the agents over, the migration may be
aborted and the model goes back into an active mode.
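
The steps and the abort rule above can be pictured as a small state machine. This is only a sketch under stated assumptions: the phase names, `can_abort` and `run_migration` are hypothetical and are not taken from the Juju code base, and each step is reduced to a callable that reports success or failure.

```python
from enum import Enum


class Phase(Enum):
    # Hypothetical names; the numbers mirror steps 0-8 in the list above.
    PRECHECK = 0
    QUIESCE = 1
    RECHECK = 2
    IMPORT_MODEL = 3
    COPY_BINARIES = 4
    STREAM_LOGS = 5
    VALIDATE_AGENTS = 6
    SWITCH_AGENTS = 7
    CLEANUP = 8
    DONE = 9
    ABORTED = 10


def can_abort(phase):
    """Aborting is only possible before the agents are switched over."""
    return phase.value < Phase.SWITCH_AGENTS.value


def run_migration(steps):
    """Walk the phases in order and return the list of phases visited.

    `steps` maps a Phase to a callable returning True on success; a phase
    with no entry is treated as succeeding.
    """
    phase = Phase.PRECHECK
    history = []
    while phase not in (Phase.DONE, Phase.ABORTED):
        history.append(phase)
        step = steps.get(phase, lambda: True)
        if step():
            phase = Phase(phase.value + 1)
        elif can_abort(phase):
            # The model would go back into an active mode here.
            phase = Phase.ABORTED
        else:
            # The email does not say what happens on a late failure;
            # this sketch simply raises.
            raise RuntimeError("failure after agents switched: %s" % phase.name)
    history.append(phase)
    return history
```

For example, `run_migration({Phase.RECHECK: lambda: False})` stops after the second stability check and ends in `Phase.ABORTED`, while `run_migration({})` runs through to `Phase.DONE`.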

Right now we are working like mad to get this working.

The model import/export is at a stage where by the end of this week we
should be able to deal with "simple" environments, where simple means
not dealing with the new features like: payloads, storage, networks, or
resources.  These extra parts will be added ASAP.
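
To show what a versioned, database-agnostic export can look like, here is a minimal sketch. It is not the actual Juju format: JSON is chosen purely for illustration, and `FORMAT_VERSION`, `export_model` and `import_model` are hypothetical names.

```python
import json

# Hypothetical version number for the export format.
FORMAT_VERSION = 1

# One importer per format version, so an older export can still be read
# once the format changes. Version 1 needs no conversion.
IMPORTERS = {
    1: lambda data: data,
}


def export_model(model):
    """Serialize a plain dict describing the model, tagged with a version."""
    return json.dumps({"version": FORMAT_VERSION, "model": model}, sort_keys=True)


def import_model(payload):
    """Read an export and dispatch on its version tag."""
    doc = json.loads(payload)
    importer = IMPORTERS.get(doc.get("version"))
    if importer is None:
        raise ValueError("unsupported export version: %r" % doc.get("version"))
    return importer(doc["model"])
```

The point of the version tag is that the target controller can refuse, or convert, an export it does not understand instead of importing it blindly.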

Several weeks ago I was able to do a proof of concept migration of
simple machines (this was before I got services, units and relations
going) between two controllers using a separate binary to do the export
and import and much manual-hacker-fu.

We are actively working on the controller managing the migration process.

We are making good progress, but I do feel that things are going to be tight.

The import/export work is in the feature branch "model-migrations".
The control aspect is in the feature branch "model-migration-control".
Moving all the environment workers to a nested dependency engine is in the
feature branch "MADE-state-workers" (MADE is shorthand for Machine
Agent Dependency Engine). The machine-dep-engine feature branch merged
into master back in January. The MADE-workers feature branch landed
earlier this month. Individual worker changes are now targeting master.

The feature branches are independent of each other for now.

A side benefit I see of the migration work is the potential to speed up
'juju status' significantly. The model export takes the entire db into
a structured object model, and it reads entire collections at a time, so
it is much faster. I'd love to test this in real world large environments.
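
The intuition behind reading whole collections can be sketched with a toy in-memory store that counts queries. Everything here is hypothetical (`FakeDB`, `load_one_by_one`, `load_bulk`), and it assumes the slower path issues one query per entity, which the paragraph above implies but does not state.

```python
class FakeDB:
    """Toy stand-in for a database that counts how many queries it serves."""

    def __init__(self, collections):
        self._collections = collections
        self.queries = 0

    def find_one(self, collection, key):
        self.queries += 1
        return self._collections[collection][key]

    def find_all(self, collection):
        self.queries += 1
        return dict(self._collections[collection])


def load_one_by_one(db, collection, keys):
    """One query per entity: the number of round trips grows with the model."""
    return {key: db.find_one(collection, key) for key in keys}


def load_bulk(db, collection):
    """A single query for the whole collection, as the export does."""
    return db.find_all(collection)
```

Both functions return the same data, but for a collection of N documents the first makes N queries and the second makes one.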

Cheers,
Tim
