Right, sorry for using the terms interchangeably, I see what you mean. I'll do further testing then as VM HA was also not working in my setup.
I'll be back.

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Rohit Yadav" <rohit.ya...@shapeblue.com>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Wednesday, 17 January, 2018 09:09:19
> Subject: Re: HA issues

> Hi Lucian,
>
> The "Host HA" feature is entirely different from VM HA; however, they may work
> in tandem, so please stop using the terms interchangeably, as it may cause the
> community to believe a regression has been introduced.
>
> The "Host HA" feature currently ships with only one "Host HA" provider, for KVM,
> which is strictly tied to out-of-band management (IPMI for fencing, i.e. power
> off, and recovery, i.e. reboot) and NFS (as primary storage). (We also have a
> provider for the simulator, but that's for coverage/testing purposes.)
>
> Therefore, "Host HA" for KVM (+NFS) currently works only when OOBM is enabled.
> The framework allows interested parties to write their own HA providers for a
> hypervisor that can use a different strategy/mechanism for fencing/recovery of
> hosts (including writing a non-IPMI-based OOBM plugin) and a host/disk activity
> checker that is non-NFS-based.
>
> The "Host HA" feature ships disabled by default and does not cause any
> interference with VM HA. However, when enabled and configured correctly, it is
> a known limitation that when it is unable to successfully perform recovery or
> fencing tasks it may not trigger VM HA. We can discuss how to handle such cases
> (thoughts?). "Host HA" would try a couple of times to recover and, failing to do
> so, it would eventually trigger a host fencing task. If it's unable to fence a
> host, it will indefinitely attempt to fence the host (the host state will be
> stuck at the fencing state in the cloud.ha_config table, for example) and alerts
> will be sent to the admin, who can do some manual intervention to handle such
> situations (if you have email/SMTP enabled, you should see alert emails).
>
> We can discuss how to improve and have a workaround for the case you've hit,
> thanks for sharing.
>
> - Rohit
>
> ________________________________
> From: Nux! <n...@li.nux.ro>
> Sent: Tuesday, January 16, 2018 10:42:35 PM
> To: dev
> Subject: Re: HA issues
>
> Ok, reinstalled and re-tested.
>
> What I've learned:
>
> - HA only works now if OOB is configured, the old way HA no longer applies -
> this can be good and bad, not everyone has IPMIs
>
> - HA only works if IPMI is reachable. I've pulled the cord on a HV and HA failed
> to do its thing, leaving me with a HV down along with all the VMs running
> there. That's bad.
> I've opened this ticket for it:
> https://issues.apache.org/jira/browse/CLOUDSTACK-10234
>
> Let me know if you need any extra info or stuff to test.
>
> Regards,
> Lucian
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
>
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London WC2N 4HS UK
> @shapeblue
>
> ----- Original Message -----
>> From: "Nux!" <n...@li.nux.ro>
>> To: "dev" <dev@cloudstack.apache.org>
>> Sent: Tuesday, 16 January, 2018 11:35:58
>> Subject: Re: HA issues
>
>> I'll reinstall my setup and try again, just to be sure I'm working on a clean
>> slate.
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro
>>
>> ----- Original Message -----
>>> From: "Rohit Yadav" <rohit.ya...@shapeblue.com>
>>> To: "dev" <dev@cloudstack.apache.org>
>>> Sent: Tuesday, 16 January, 2018 11:29:51
>>> Subject: Re: HA issues
>>
>>> Hi Lucian,
>>>
>>> If you're talking about the new HostHA feature (with KVM+nfs+ipmi), please
>>> refer to the following docs:
>>>
>>> http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/latest/hosts.html#out-of-band-management
>>>
>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA
>>>
>>> We'll need you to look at the logs and perhaps create a JIRA ticket with the
>>> logs and details. If you saw an IPMI-based reboot, then Host HA indeed tried
>>> to recover, i.e. reboot, the host. Once Host HA has done its work, it would
>>> schedule HA for the VM as soon as the recovery operation succeeds (we have
>>> simulator- and KVM-based Marvin tests for such scenarios).
>>>
>>> Can you see it making an attempt to schedule VM HA in the logs, or any
>>> failure?
>>>
>>> - Rohit
>>>
>>> ________________________________
>>> From: Nux! <n...@li.nux.ro>
>>> Sent: Tuesday, January 16, 2018 12:47:56 AM
>>> To: dev
>>> Subject: [4.11] HA issues
>>>
>>> Hi,
>>>
>>> I see there's a new HA engine for KVM and IPMI support which is really nice,
>>> however it seems hit and miss.
>>> I have created an instance with an HA offering, kernel panicked one of the
>>> hypervisors - after a while the server was rebooted via IPMI probably, but the
>>> instance never moved to a running hypervisor and even after the original
>>> hypervisor came back it was still left in Stopped state.
>>> Are there any extra things I need to set up to have proper HA?
>>>
>>> Regards,
>>> Lucian
>>>
>>> --
>>> Sent from the Delta quadrant using Borg technology!
>>>
>>> Nux!
>>> www.nux.ro
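
For anyone reading this thread later, the recovery-then-fencing sequence Rohit describes above can be sketched roughly as follows. This is only an illustrative model, not CloudStack code: the function `run_host_ha`, the `HostHaOutcome` type, the state strings, and the attempt limits are all hypothetical names chosen for the example. The thread says fencing is retried indefinitely; the sketch caps the retries so that it terminates. It also assumes that VM HA is scheduled once recovery or fencing succeeds, which is how the messages above read.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HostHaOutcome:
    """Result of one simulated Host HA run (hypothetical type, not a CloudStack class)."""
    final_state: str
    vm_ha_scheduled: bool
    alerts: List[str] = field(default_factory=list)


def run_host_ha(
    recover: Callable[[], bool],
    fence: Callable[[], bool],
    max_recovery_attempts: int = 2,  # "a couple of times", per the thread
    max_fence_attempts: int = 5,     # assumption: capped here; described as indefinite
) -> HostHaOutcome:
    """Model the flow: try recovery, fall back to fencing, alert on fence failures."""
    alerts: List[str] = []

    # Step 1: attempt recovery (e.g. an out-of-band reboot) a few times.
    for _ in range(max_recovery_attempts):
        if recover():
            return HostHaOutcome("Recovered", True, alerts)

    # Step 2: recovery failed, so attempt fencing (e.g. out-of-band power off).
    for attempt in range(1, max_fence_attempts + 1):
        if fence():
            return HostHaOutcome("Fenced", True, alerts)
        # Each failed fence attempt raises an alert for the admin.
        alerts.append(f"fencing attempt {attempt} failed")

    # Step 3: the host stays in the fencing state and VM HA is never scheduled.
    return HostHaOutcome("Fencing", False, alerts)


if __name__ == "__main__":
    # Models the pulled-power-cord case: out-of-band management is unreachable,
    # so neither recovery nor fencing can succeed.
    outcome = run_host_ha(recover=lambda: False, fence=lambda: False)
    print(outcome.final_state, outcome.vm_ha_scheduled, len(outcome.alerts))
```

With both callbacks failing, the model ends in the fencing state without scheduling VM HA, which matches the behaviour reported in CLOUDSTACK-10234.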