Nicolas Vazquez created CLOUDSTACK-10326:
--------------------------------------------
             Summary: Prevent hosts fall into Maintenance when there are running VMs on it
                 Key: CLOUDSTACK-10326
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-10326
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
    Affects Versions: 4.11.0.0
            Reporter: Nicolas Vazquez
            Assignee: Nicolas Vazquez
             Fix For: 4.11.1.0


This issue was discovered, fixed and tested on KVM, but applies to every hypervisor.

h2. Background

When maintenance mode is enabled on a host, the host state is set to 'PrepareForMaintenance' and its running VMs are migrated to another host. After every VM has been migrated, the host goes into the 'Maintenance' state.

The checks are performed in the ResourceManagerImpl.checkAndMaintain() method:
* List VMs with host_id = HOST_ID
* List VMs with last_host_id = HOST_ID and state = Migrating

When both queries return no VMs, the host can be put into Maintenance.

When a VM is being migrated to DEST_HOST, its host_id column is set to DEST_HOST, last_host_id = ORIGIN_HOST and state = Migrating. If the migration then fails, host_id = last_host_id = ORIGIN_HOST.

h2. Issue

The following sequence leaves a host in Maintenance while VMs are still running on it:
* Enable maintenance mode on ORIGIN_HOST
* VMs start being migrated to another host, say DEST_HOST
* checkAndMaintain() starts:
** The first check passes (no VM has host_id = ORIGIN_HOST, as those VMs are being migrated)
** Before the second check, one or more migrations fail
** The second check passes (the failed VMs are no longer in state Migrating), yet there are VMs running on the host because their migrations have failed
* The host goes into the Maintenance state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
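The sequence described under "Issue" can be reproduced with a small in-memory model. This is only a sketch: the class, enum and method names below are hypothetical and are not CloudStack's actual code, and it assumes that a VM whose migration failed returns to a running state on ORIGIN_HOST. The re-check at the end is one possible way to narrow the window, not necessarily the fix that was applied for this issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the two checks; not CloudStack's real classes.
class MaintenanceCheckSketch {
    enum State { RUNNING, MIGRATING }

    static final class Vm {
        long hostId;      // mirrors the host_id column
        long lastHostId;  // mirrors the last_host_id column
        State state;

        Vm(long hostId, long lastHostId, State state) {
            this.hostId = hostId;
            this.lastHostId = lastHostId;
            this.state = state;
        }
    }

    // First check: is any VM listed with host_id = hostId?
    static boolean hasVmsOnHost(List<Vm> vms, long hostId) {
        return vms.stream().anyMatch(vm -> vm.hostId == hostId);
    }

    // Second check: is any VM listed with last_host_id = hostId and state = Migrating?
    static boolean hasVmsMigratingFrom(List<Vm> vms, long hostId) {
        return vms.stream()
                .anyMatch(vm -> vm.lastHostId == hostId && vm.state == State.MIGRATING);
    }

    // Failed migration: host_id = last_host_id = ORIGIN_HOST.
    // Assumption: the VM is running again on the origin host afterwards.
    static void failMigration(Vm vm) {
        vm.hostId = vm.lastHostId;
        vm.state = State.RUNNING;
    }

    // Runs the two checks one after the other; betweenChecks simulates
    // whatever happens concurrently between the first and second query.
    static boolean racyCanEnterMaintenance(List<Vm> vms, long hostId, Runnable betweenChecks) {
        boolean firstEmpty = !hasVmsOnHost(vms, hostId);
        betweenChecks.run();
        boolean secondEmpty = !hasVmsMigratingFrom(vms, hostId);
        return firstEmpty && secondEmpty;
    }

    // Possible mitigation (assumption): repeat the host_id check after the
    // Migrating check, so a migration that failed in between is noticed.
    static boolean recheckedCanEnterMaintenance(List<Vm> vms, long hostId, Runnable betweenChecks) {
        boolean firstEmpty = !hasVmsOnHost(vms, hostId);
        betweenChecks.run();
        boolean secondEmpty = !hasVmsMigratingFrom(vms, hostId);
        boolean stillEmpty = !hasVmsOnHost(vms, hostId);
        return firstEmpty && secondEmpty && stillEmpty;
    }

    public static void main(String[] args) {
        final long originHost = 1L;
        final long destHost = 2L;

        Vm a = new Vm(destHost, originHost, State.MIGRATING);
        List<Vm> first = new ArrayList<>(List.of(a));
        System.out.println("two checks only: "
                + racyCanEnterMaintenance(first, originHost, () -> failMigration(a)));

        Vm b = new Vm(destHost, originHost, State.MIGRATING);
        List<Vm> second = new ArrayList<>(List.of(b));
        System.out.println("with re-check:   "
                + recheckedCanEnterMaintenance(second, originHost, () -> failMigration(b)));
    }
}
```

In the first scenario both checks pass even though the VM ends up back on ORIGIN_HOST, which matches the reported behaviour. In the second scenario the repeated host_id check sees the VM and blocks the transition.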