GitHub user remibergsma opened a pull request: https://github.com/apache/cloudstack/pull/211
    return a state instead of null in AbstractInvestigatorImpl

    When a full cluster is down or unreachable, CloudStack currently reports everything the same as the last known state, which is usually Up. When it cannot reach a host and cannot reach another host in the same cluster either, it returns null and says "I don't know". This prevents it from reporting the problem.

    Now, we return an Alert or Disconnected state so proper action can be taken. Also logging was added, so we know what part of the code put it to Alert or Disconnected.

    When the host is available again, it goes from Alert state back to Up and CloudStack starts HA work to recover the VMs.

    I tested it on 4.6/master and it works fine now. As this is a nasty bug, we might want to fix this also in 4.5 and 4.4.

    Thanks to @dahn and @snuf for their help solving this issue.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/remibergsma/cloudstack investigator_null_state_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cloudstack/pull/211.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #211

----
commit 78e095e64b2344a49e96a7939ca6edd3b36d93dd
Author: Remi Bergsma <git...@remi.nl>
Date:   2015-04-29T18:14:14Z

    return a state instead of null

    When a full cluster is down or unreachable, CloudStack currently reports everything the same as the last known state, which is usually Up. When it cannot reach a host and cannot reach another host in the same cluster either, it returns null and says "I don't know". This prevents it from reporting the problem.

    Now, we return an Alert or Disconnected state so proper action can be taken. Also logging was added, so we know what part of the code put it to Alert or Disconnected.

----
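
For readers who have not opened the patch, the behaviour change described in this pull request can be sketched roughly as follows. This is an illustrative, self-contained Java sketch and not the actual AbstractInvestigatorImpl code: the class InvestigatorSketch, the HostState enum, the investigate method and its two probe parameters are all hypothetical names. Which situation maps to Alert and which to Disconnected is also an assumption, since the description only says that one of the two is returned instead of null.

```java
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Predicate;
import java.util.logging.Logger;

public class InvestigatorSketch {
    // Hypothetical subset of host states; not CloudStack's real Status enum.
    public enum HostState { UP, ALERT, DISCONNECTED }

    private static final Logger LOG =
            Logger.getLogger(InvestigatorSketch.class.getName());

    /**
     * Decide a state for {@code host} without ever returning null.
     *
     * @param host             the host under investigation
     * @param neighbours       other hosts in the same cluster
     * @param reachable        hypothetical probe: can we talk to this host?
     * @param neighbourCanPing hypothetical probe: can neighbour (1st arg) ping host (2nd arg)?
     */
    public static HostState investigate(String host,
                                        List<String> neighbours,
                                        Predicate<String> reachable,
                                        BiPredicate<String, String> neighbourCanPing) {
        if (reachable.test(host)) {
            return HostState.UP;
        }
        boolean askedSomeone = false;
        for (String neighbour : neighbours) {
            if (!reachable.test(neighbour)) {
                continue; // this neighbour cannot be asked either
            }
            askedSomeone = true;
            if (neighbourCanPing.test(neighbour, host)) {
                return HostState.UP;
            }
        }
        if (askedSomeone) {
            // Assumption: a reachable neighbour that cannot ping the host means Alert.
            LOG.warning("Host " + host + " not pingable from neighbours, returning ALERT");
            return HostState.ALERT;
        }
        // Previously the "I don't know" case (null). Assumption: report Disconnected.
        LOG.warning("No host in the cluster of " + host + " is reachable, returning DISCONNECTED");
        return HostState.DISCONNECTED;
    }
}
```

The point of the sketch is only that every path returns a concrete state and logs which branch made the decision, so the caller can act on a fully unreachable cluster instead of keeping the last known state.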