On 22/08, Matthew Booth wrote:
> On Wed, 22 Aug 2018 at 10:47, Gorka Eguileor <gegui...@redhat.com> wrote:
> >
> > On 20/08, Matthew Booth wrote:
> > > For those who aren't familiar with it, nova's volume-update (also
> > > called swap volume by nova devs) is the nova part of the
> > > implementation of cinder's live migration (also called retype).
> > > Volume-update is essentially an internal cinder<->nova api, but as
> > > that's not a thing it's also unfortunately exposed to users. Some
> > > users have found it and are using it, but because it's essentially an
> > > internal cinder<->nova api it breaks pretty easily if you don't treat
> > > it like a special snowflake. It looks like we've finally found a way
> > > it's broken for non-cinder callers that we can't fix, even with a
> > > dirty hack.
> > >
> > > volume-update <server> <old> <new> essentially does a live copy of the
> > > data on <old> volume to <new> volume, then seamlessly swaps the
> > > attachment to <server> from <old> to <new>. The guest OS on <server>
> > > will not notice anything at all as the hypervisor swaps the storage
> > > backing an attached volume underneath it.
> > >
> > > When called by cinder, as intended, cinder does some post-operation
> > > cleanup such that <old> is deleted and <new> inherits the same
> > > volume_id; that is <old> effectively becomes <new>. When called any
> > > other way, however, this cleanup doesn't happen, which breaks a bunch
> > > of assumptions. One of these is that a disk's serial number is the
> > > same as the attached volume_id. Disk serial number, in KVM at least,
> > > is immutable, so can't be updated during volume-update. This is fine
> > > if we were called via cinder, because the cinder cleanup means the
> > > volume_id stays the same. If called any other way, however, they no
> > > longer match, at least until a hard reboot when it will be reset to
> > > the new volume_id. It turns out this breaks live migration, but
> > > probably other things too. We can't think of a workaround.
> > >
> > > I wondered why users would want to do this anyway. It turns out that
> > > sometimes cinder won't let you migrate a volume, but nova
> > > volume-update doesn't do those checks (as they're specific to cinder
> > > internals, none of nova's business, and duplicating them would be
> > > fragile, so we're not adding them!). Specifically we know that cinder
> > > won't let you migrate a volume with snapshots. There may be other
> > > reasons. If cinder won't let you migrate your volume, you can still
> > > move your data by using nova's volume-update, even though you'll end
> > > up with a new volume on the destination, and a slightly broken
> > > instance. Apparently the former is a trade-off worth making, but the
> > > latter has been reported as a bug.
> > >
> >
> > Hi Matt,
> >
> > As you know, I'm in favor of making this REST API call only authorized
> > for Cinder to avoid messing the cloud.
> >
> > I know you wanted Cinder to have a solution to do live migrations of
> > volumes with snapshots, and while this is not possible to do in a
> > reasonable fashion, I kept thinking about it given your strong feelings
> > to provide a solution for users that really need this, and I think we
> > may have a "reasonable" compromise.
> >
> > The solution is conceptually simple. We add a new API microversion in
> > Cinder that adds an optional parameter called "generic_keep_source"
> > (defaults to False) to both migrate and retype operations.
> >
> > This means that if the driver optimized migration cannot do the
> > migration and the generic migration code is the one doing the migration,
> > then, instead of our final step being to swap the volume id's and
> > deleting the source volume, what we would do is to swap the volume id's
> > and move all the snapshots to reference the new volume. Then we would
> > create a user message with the new ID of the volume.
> >
> > This way we can preserve the old volume with all its snapshots and do
> > the live migration.
> >
> > The implementation is a little bit tricky, as we'll have to add a new
> > "update_migrated_volume" mechanism to support the renaming of both
> > volumes, since the old one wouldn't work with this among other things,
> > but it's doable.
> >
> > Unfortunately I don't have the time right now to work on this...
>
> Sounds promising, and honestly more than I'd have hoped for.
>
> Matt
>

Hi Matt,

Reading Sean's reply, I noticed that I phrased that wrong.

The volume on the new storage backend wouldn't have any snapshots. The
result of the operation would be a new volume with the old ID and no
snapshots (this would be the one in use by Nova), and the old volume,
with all the snapshots, having a new ID in the DB.

Because of the mechanism Cinder uses to create this new volume, we
wouldn't return that new ID in the REST API response, but in a user
message instead.

Sorry for the confusion.

Cheers,
Gorka.

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
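
For anyone trying to picture the end state described above, here is a
rough in-memory sketch in Python. It is not Cinder code: the Volume,
Snapshot and MigrationResult classes and the finish_generic_migration
function are hypothetical names made up for illustration, and only
"generic_keep_source" comes from the proposal itself. It assumes the
ID swap works by each volume taking the other's ID.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Volume:
    id: str
    backend: str


@dataclass(frozen=True)
class Snapshot:
    id: str
    volume_id: str


@dataclass(frozen=True)
class MigrationResult:
    in_use: Volume               # the volume Nova stays attached to
    preserved: Optional[Volume]  # old data, kept only with generic_keep_source
    snapshots: List[Snapshot]    # snapshots after the operation
    user_message: Optional[str]  # where the caller learns the new ID


def finish_generic_migration(source, dest, snapshots,
                             generic_keep_source=False):
    """Hypothetical final step of a generic migration (illustration only)."""
    if snapshots and not generic_keep_source:
        # Per the thread: Cinder won't migrate a volume with snapshots.
        raise ValueError("volume has snapshots; migration refused")

    # The volume on the new backend takes over the old ID, so Nova and
    # the guest's disk serial keep matching the volume_id.
    in_use = replace(dest, id=source.id)

    if not generic_keep_source:
        # Today's behaviour: the source volume is deleted.
        return MigrationResult(in_use, None, [], None)

    # Proposed behaviour: the old volume survives under a new ID and
    # keeps all its snapshots, which are re-pointed at that new ID.
    preserved = replace(source, id=dest.id)
    moved = [replace(s, volume_id=preserved.id) for s in snapshots]
    message = f"Source volume preserved with new ID {preserved.id}"
    return MigrationResult(in_use, preserved, moved, message)


result = finish_generic_migration(
    Volume("vol-A", "old-backend"),
    Volume("vol-B", "new-backend"),
    [Snapshot("snap-1", "vol-A")],
    generic_keep_source=True,
)
print(result.in_use)
print(result.preserved)
print(result.user_message)
```

The point of the sketch is that the attached volume never changes ID,
while the snapshots follow the old data to its new ID, which the user
only learns from the user message.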