Great, thanks for the clarification.

---- On Thu, 25 Oct 2018 13:07:58 +0100 Simone Tiraboschi <[email protected]> wrote ----

On Thu, Oct 25, 2018 at 1:31 PM Alan G <[email protected]> wrote:

Hi,

I have a 4.1 cluster with FC block storage and hosted engine. Last night a
host went unreachable due to a driver/firmware issue with the NIC card. The
Engine spotted this, the host was fenced and everything behaved as expected.

However, it got me thinking: if the affected host had been the one running
the Engine, what would have happened? I'm assuming the Engine would have
failed the liveness check on the other hosted engine hosts and they would
attempt to start the Engine. But as the "failed" host still had access to the
storage (I believe the HBA was still working), they would not be able to get
a lock on the storage.

In which case I'm in a catch-22: the Engine cannot fence the failed host
because its network is isolated, but the Engine cannot be restarted elsewhere
until the failed host is fenced. At this point it requires human intervention
to fence the failed host.

Is my understanding correct on this? If so, is there any way to mitigate
this risk?

ovirt-ha-agent implements a specific test for this kind of failure,
continuously trying to ping a specific IPv4 address (usually the network
gateway) to check network connectivity on each involved host. On failed pings,
each host penalises itself by a certain number of points. The HA score of each
host is written into the hosted-engine metadata volume on the shared storage,
so each host can also see the scores of the other hosts; in your case this
would work, since all the hosts can still access the storage via FC.

Once the difference between the score of the host running the engine VM and
that of the best candidate host is large enough, a migration to the best host
(or, if that is not possible, as in your case, a shutdown and restart there)
will be triggered.

If you want, you can easily try to reproduce this scenario.

Thanks,
Alan
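
The scoring logic described in the reply above can be sketched roughly as
follows. This is only a toy model, not the ovirt-ha-agent code: the names
(HostState, decide, other_penalty) are hypothetical, and the base score,
penalty and threshold are illustrative placeholders rather than values read
from the agent's configuration.

```python
from dataclasses import dataclass

# Illustrative placeholders only (assumptions, not the agent's real settings).
BASE_SCORE = 3400
GATEWAY_PENALTY = 1600
MIGRATION_THRESHOLD = 800


@dataclass
class HostState:
    name: str
    gateway_reachable: bool      # result of the ping test on this host
    runs_engine_vm: bool = False
    other_penalty: int = 0       # hypothetical stand-in for any other checks

    def score(self) -> int:
        """Base score minus self-imposed penalties, never below zero."""
        penalty = self.other_penalty
        if not self.gateway_reachable:
            penalty += GATEWAY_PENALTY
        return max(BASE_SCORE - penalty, 0)


def decide(hosts):
    """Return (action, host_name) for the engine VM.

    Every host is assumed to see every score, as if read from the shared
    metadata volume.
    """
    current = next(h for h in hosts if h.runs_engine_vm)
    candidates = [h for h in hosts if not h.runs_engine_vm]
    if not candidates:
        return ("stay", current.name)
    best = max(candidates, key=lambda h: h.score())
    if best.score() - current.score() >= MIGRATION_THRESHOLD:
        # A network-isolated host cannot migrate the VM over the network,
        # so in this model the VM is shut down and restarted on the best host.
        action = "migrate" if current.gateway_reachable else "restart"
        return (action, best.name)
    return ("stay", current.name)


if __name__ == "__main__":
    hosts = [
        HostState("host1", gateway_reachable=False, runs_engine_vm=True),
        HostState("host2", gateway_reachable=True),
    ]
    print(decide(hosts))
```

In this model the host with the broken NIC loses enough points for the other
host to become the preferred one, even though both can still reach the FC
storage.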
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:
https://lists.ovirt.org/archives/list/[email protected]/message/6AGWQYGYLXJ4RBO2UWVZLZTJTRAB7S26/
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/67PRPVU2JMIKKA2DGVOPR5EV4BG2GO54/
