Hi,

I started to review patch series [1], which addresses the issue with live migration resources. While doing that I made some notes that may be useful for reviewers. I would like to share those notes and ask the community to look at them critically and check whether my conclusions are wrong.
** How does nova perform live migration (LM)?

*** Components of the LM workflow

The following components are involved in the LM process:

- nova-api
  Migration parameters are determined and validated at this level. The most important are:
  - instance - the source VM
  - host - the target hostname
  - block_migration
  - force
- conductor
  Some orchestration is done at this level:
  - creating the migration object
  - building and executing LiveMigrationTask
  - calling the scheduler
  - check_can_live_migrate_destination - an RPC request to the compute node to check that the destination environment is appropriate. On the destination node, a check_can_live_migrate_source call is made to check that rollback is possible.
  - the migration call to the source compute node
- scheduler
  The scheduler is involved in LM only if the destination host is empty (not specified). In that case, the scheduler's select_destinations function picks an appropriate host, and the conductor also calls check_can_live_migrate_destination on the picked host.
- compute source node
  This is where the migration starts and ends:
  - the pre_live_migration call to the destination node is made first
  - control is transferred to the underlying driver for the migration
  - the migration monitor is started
  - post_live_migration or rollback is performed
- compute destination node
  Calls from the conductor and the source node are processed here, and the check_can_live_migrate_source call is made to the source node.

*** Common calls diagram

http://amadev.ru/static/lm_diagram.png

*** Calls list for the libvirt case

The following list of calls can be used as a reference.
- nova.api.openstack.compute.migrate_server.MigrateServerController._migrate_live
- nova.compute.api.API.live_migrate
- nova.conductor.api.ComputeTaskAPI.live_migrate_instance
- nova.conductor.manager.ComputeTaskManager._live_migrate
- nova.conductor.manager.ComputeTaskManager._build_live_migrate_task
- nova.conductor.tasks.live_migrate.LiveMigrationTask._execute
- nova.conductor.tasks.live_migrate.LiveMigrationTask._find_destination
- nova.scheduler.manager.SchedulerManager.select_destinations
- nova.conductor.tasks.live_migrate.LiveMigrationTask._call_livem_checks_on_host
- nova.compute.manager.ComputeManager.check_can_live_migrate_destination
- nova.compute.manager.ComputeManager.live_migration
- nova.compute.manager.ComputeManager._do_live_migration
- nova.compute.manager.ComputeManager.pre_live_migration
- nova.virt.libvirt.driver.LibvirtDriver._live_migration_operation
- nova.virt.libvirt.guest.Guest.migrate
- libvirt: domain.migrateToURI{,2,3}
- nova.compute.manager.ComputeManager.post_live_migration_at_destination

** What is the problem with LM?

Nova doesn't claim resources during LM, so scheduling decisions can be wrong until the next periodic update_available_resource run completes. The problem is described well in bug [2].

** What changes were made in the patch?

A new live_migration_claim was added to the ResourceTracker, similar to the resize and rebuild claims. It was decided to initiate live_migration_claim within check_can_live_migrate_destination on the destination node. To make that possible, the migration (created in the conductor) and the resource limits for the destination node (obtained from the scheduler) must be passed to check_can_live_migrate_destination, which is why the conductor call and the compute RPC API were changed.

The overall intention of this patch is to take into account the amount of resources on the destination node, which can be a basis for future LM improvements related to NUMA, SR-IOV, and huge pages.
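To make the idea of a claim more concrete, here is a minimal toy sketch of what claiming resources on the destination could look like. It is not the code from the patch and not nova's real ResourceTracker: the names ToyResourceTracker, ClaimFailed, drop_claim and ram_limit_mb are hypothetical, and only memory is tracked, which is my simplifying assumption.

```python
from dataclasses import dataclass, field


class ClaimFailed(Exception):
    """Raised when the destination cannot fit the incoming instance."""


@dataclass
class ToyResourceTracker:
    """Hypothetical stand-in for a destination node's resource tracker."""

    total_ram_mb: int
    used_ram_mb: int = 0
    # Maps a migration id to the amount of RAM (MB) claimed for it.
    claims: dict = field(default_factory=dict)

    def free_ram_mb(self, ram_limit_mb=None):
        # A limit passed in from the scheduler (assumption) replaces the
        # physical total, e.g. to model memory overcommit.
        limit = self.total_ram_mb if ram_limit_mb is None else ram_limit_mb
        return limit - self.used_ram_mb

    def live_migration_claim(self, migration_id, ram_mb, ram_limit_mb=None):
        # Reserve memory immediately instead of waiting for a periodic
        # resource update to notice the new instance.
        if ram_mb > self.free_ram_mb(ram_limit_mb):
            raise ClaimFailed(
                f"migration {migration_id}: {ram_mb} MB requested, "
                f"{self.free_ram_mb(ram_limit_mb)} MB free"
            )
        self.used_ram_mb += ram_mb
        self.claims[migration_id] = ram_mb

    def drop_claim(self, migration_id):
        # Release the reservation, e.g. after a rollback.
        self.used_ram_mb -= self.claims.pop(migration_id, 0)


if __name__ == "__main__":
    dest = ToyResourceTracker(total_ram_mb=4096)
    dest.live_migration_claim("m1", 3072)
    try:
        # Without a claim for m1 this second migration would look fine
        # until the next periodic update; with the claim it is rejected.
        dest.live_migration_claim("m2", 2048)
    except ClaimFailed as exc:
        print("rejected:", exc)
```

In this sketch the second migration is refused because the first one already reserved its memory, which is the behaviour the claim is meant to provide on the destination node.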
[1] https://review.openstack.org/#/c/244489/
[2] https://bugs.launchpad.net/nova/+bug/1289064

--
Thanks,
Andrey Volkov,
Software Engineer, Mirantis, Inc.

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev