[ https://issues.apache.org/jira/browse/CLOUDSTACK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729442#comment-13729442 ]
Lennert den Teuling commented on CLOUDSTACK-3535:
-------------------------------------------------

This is the code that is responsible for nothing happening (UserVmDomRInvestigator.java):

    if (s_logger.isDebugEnabled()) {
        s_logger.debug("could not reach agent, could not reach agent's host, returning that we don't have enough information");
    }
    return null;

I think nothing happens because null is returned. I simply replaced this with "Status.Down" and HA works fine.

Maybe I'm looking at this issue too simply, but why would an unreachable agent and an unpingable host not be enough to trigger HA? The only logical reason I can think of is that ugly things could happen when network issues occur. But there is still the KVMHAChecker, which uses the filesystem to check for the node's heartbeat. So if you combined the output of UserVmDomRInvestigator with that of KVMHAChecker, would this be enough to return "host.down" instead of "null" and fix this issue?

Ideally you would turn off the host through IPMI to make sure it's dead, but for now could this be a solution?

> No HA actions are performed when a KVM host goes offline
> --------------------------------------------------------
>
>                 Key: CLOUDSTACK-3535
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3535
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Hypervisor Controller, KVM, Management Server
>    Affects Versions: 4.1.0, 4.1.1, 4.2.0
>         Environment: KVM (CentOS 6.3) with CloudStack 4.1
>            Reporter: Paul Angus
>            Priority: Blocker
>             Fix For: 4.2.0
>
>         Attachments: management-server.log.Agent
>
>
> If a KVM host 'goes down', CloudStack does not perform HA for instances which are marked as HA enabled on that host (including system VMs)
> CloudStack does not show the host as disconnected.
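To make the combination proposed in the comment concrete, here is a minimal, self-contained sketch of the decision logic. It is not CloudStack code: the class CombinedInvestigatorSketch, its nested Status enum, and the investigate method with its three parameters are hypothetical stand-ins, and the real UserVmDomRInvestigator and KVMHAChecker classes have their own signatures. The sketch assumes that a missing storage heartbeat, together with an unreachable agent and an unpingable host, is the evidence needed to report the host as down.

```java
public class CombinedInvestigatorSketch {

    // Hypothetical stand-in for a host status type; only the values needed here.
    enum Status { Up, Down }

    /**
     * Combines the network checks with a storage heartbeat check.
     *
     * @param agentReachable true if the agent on the host answered
     * @param hostPingable   true if the host itself answered a ping
     * @param heartbeatAlive TRUE/FALSE if the heartbeat check produced a result,
     *                       null if that check could not be performed
     * @return Up or Down when the evidence is conclusive, null otherwise
     */
    static Status investigate(boolean agentReachable, boolean hostPingable, Boolean heartbeatAlive) {
        if (agentReachable || hostPingable) {
            return Status.Up;
        }
        if (heartbeatAlive == null) {
            // No heartbeat result: same outcome as the quoted code, not enough information.
            return null;
        }
        if (heartbeatAlive) {
            // The host still writes its heartbeat, so this looks like a network problem.
            // Assumption of this sketch: stay inconclusive rather than trigger HA.
            return null;
        }
        // Agent unreachable, host unpingable, and no heartbeat on storage.
        return Status.Down;
    }

    public static void main(String[] args) {
        System.out.println(investigate(false, false, Boolean.FALSE));
        System.out.println(investigate(false, false, Boolean.TRUE));
        System.out.println(investigate(true, false, null));
    }
}
```

Returning null while the heartbeat is still being written keeps the cautious behaviour of the quoted code for the network-partition case, which is the "ugly things" scenario mentioned in the comment. Only the case where all three checks fail is treated as a dead host.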