jgotteswinter opened a new issue, #13010:
URL: https://github.com/apache/cloudstack/issues/13010
### problem
When enabling maintenance mode, I see random instances getting stopped while
the host is being evacuated. The majority are migrated without any issues, but
occasionally an instance that should have been live migrated is stopped instead.
The management server logs the following:
`2026-04-13 10:26:54,986 INFO [c.c.h.HighAvailabilityManagerExtImpl]
(HA-Worker-1:[ctx-7abbe53d, work-3314]) (logid:5ce65c99) Migration attempt: for
VM VM instance
{"id":4930,"instanceName":"i-55-4930-VM","state":"Running","type":"User","uuid":"cf19-00b6-465e-98f1-c63b4860498d"}from
host Host
{"id":18,"name":"XXXch02","type":"Routing","uuid":"dc51-a18d-4f7d-9a2e-7dfbb7a1b908"}.
Starting attempt: 1/5 times.
2026-04-13 10:42:32,197 INFO [c.c.v.ClusteredVirtualMachineManagerImpl]
(Work-Job-Executor-21:[ctx-1e4a3543, job-742712/job-743693, ctx-ff1f267b])
(logid:279e8d1b) Migrating VM instance
{"id":4930,"instanceName":"i-55-4930-VM","state":"Running","type":"User","uuid":"cf19-00b6-465e-98f1-c63b4860498d"}
to
Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))]
: Dest[Zone(3)-Pod(3)-Cluster(3)-Host(18)-Storage()]
2026-04-13 10:42:32,349 WARN [c.c.v.ClusteredVirtualMachineManagerImpl]
(Work-Job-Executor-21:[ctx-1e4a3543, job-742712/job-743693, ctx-ff1f267b])
(logid:279e8d1b) Unable to migrate VM instance
{"id":4930,"instanceName":"i-55-4930-VM","state":"Running","type":"User","uuid":"cf19-00b6-465e-98f1-c63b4860498d"}
to Host
{"id":18,"name":"XXXch02","type":"Routing","uuid":"dc51-a18d-4f7d-9a2e-7dfbb7a1b908"}
due to [Resource [Host:18] is unreachable: Host 18: Operation timed out]
com.cloud.exception.AgentUnavailableException: Resource [Host:18] is
unreachable: Host 18: Operation timed out
2026-04-13 10:43:27,247 INFO [c.c.r.ResourceManagerImpl]
(AgentMonitor-1:[ctx-6e6b2b3f]) (logid:afd387b5) Attempting maintenance for
Host
{"id":21,"name":"XXXch03","type":"Routing","uuid":"eacf-b3e7-4aa9-b4ae-ff5a41862c06"}
found pending migration for VM instance
{"id":4930,"instanceName":"i-55-4930-VM","state":"Stopping","type":"User","uuid":"cf19-00b6-465e-98f1-c63b4860498d"}.
2026-04-13 10:43:40,248 ERROR [c.c.v.VmWorkJobHandlerProxy]
(Work-Job-Executor-21:[ctx-1e4a3543, job-742712/job-743693, ctx-ff1f267b])
(logid:279e8d1b) Invocation exception, caused by:
com.cloud.utils.exception.CloudRuntimeException: Unable to migrate VM instance
{"id":4930,"instanceName":"i-55-4930-VM","state":"Running","type":"User","uuid":"cf19-00b6-465e-98f1-c63b4860498d"}
2026-04-13 10:43:40,248 INFO [c.c.v.VmWorkJobHandlerProxy]
(Work-Job-Executor-21:[ctx-1e4a3543, job-742712/job-743693, ctx-ff1f267b])
(logid:279e8d1b) Rethrow exception
com.cloud.utils.exception.CloudRuntimeException: Unable to migrate VM instance
{"id":4930,"instanceName":"i-55-4930-VM","state":"Running","type":"User","uuid":"cf19-00b6-465e-98f1-c63b4860498d"}
2026-04-13 10:43:40,248 ERROR [c.c.v.VmWorkJobDispatcher]
(Work-Job-Executor-21:[ctx-1e4a3543, job-742712/job-743693]) (logid:279e8d1b)
Unable to complete AsyncJob
{"accountId":1,"cmd":"com.cloud.vm.VmWorkMigrateAway","cmdInfo":"rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAATQnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAFQ","cmdVersion":0,"completeMsid":null,"created":"Mon
Apr 13 10:42:31 UTC
2026","id":743693,"initMsid":90520733699643,"instanceId":null,"instanceType":null,"lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":1,"uuid":"1401-8cf9-4276-ab57-c6a844371dd2"},
job origin: 742712 com.cloud.utils.exception.CloudRuntimeException: Unable to
migrate VM instance
{"id":4930,"instanceName":"i-55-4930-VM","state":"Running","type":"User","uuid":"cf19-00b6-465e-98f1-c63b4860498d"}`
I would expect the instance to be left alone, up and running on its origin
host, and the maintenance mode request to fail instead.
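
For anyone trying to check whether they are hitting the same pattern, here is a minimal sketch of how the state changes could be pulled out of a management server log. It is not part of CloudStack: the function name `vm_state_changes` and the regex are hypothetical, and it assumes each VM JSON object sits on a single line of the real log file (unlike the mail-wrapped excerpt above).

```python
import json
import re

# Assumption: VM instances are logged as a flat JSON object that starts with
# "id" followed by "instanceName", as in the excerpt above. Host objects use
# "name" instead of "instanceName", so they are not matched.
VM_JSON = re.compile(r'\{"id":\d+,"instanceName":"[^"]+"[^{}]*\}')


def vm_state_changes(lines):
    """Yield (instanceName, previous_state, new_state) each time the state
    logged for a VM differs from the state last logged for that VM."""
    last_state = {}
    for line in lines:
        for match in VM_JSON.finditer(line):
            try:
                vm = json.loads(match.group(0))
            except json.JSONDecodeError:
                continue  # skip truncated or otherwise malformed objects
            name, state = vm.get("instanceName"), vm.get("state")
            if name is None or state is None:
                continue
            previous = last_state.get(name)
            if previous is not None and previous != state:
                yield name, previous, state
            last_state[name] = state
```

Passing an open log file to `vm_state_changes` and filtering for a new state of `Stopping` would list candidates such as `i-55-4930-VM`. Note that the logged state comes from whatever object the logging thread holds, so a later line can still show `Running` after `Stopping` was logged, as happens at 10:43:40 in the excerpt above.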
### versions
ACS 4.22
Ubuntu 24.04
KVM
### The steps to reproduce the bug
1.
2.
3.
...
### What to do about it?
_No response_