[ https://issues.apache.org/jira/browse/CLOUDSTACK-10234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nux updated CLOUDSTACK-10234:
-----------------------------
    Description:
To simulate PSU failure I pulled the power from the server physically; HA fails to do the right thing and move the affected VMs to other HVs.
I waited a good while, but alas nothing happened. The VM and VR running on the affected hypervisor were never moved to another one (I have another 2 running).
Is there any way to at least force the system to mark that HV as bad/offline?

This is what I see in the management server logs:
{code:java}
Caused by: com.cloud.utils.exception.CloudRuntimeException: Out-of-band Management action (OFF) on host (57bf86e0-e1cd-484e-a4f1-78b3ca2da125) failed with error: Get Auth Capabilities error Error issuing Get Channel Authentication Capabilities request Error: Unable to establish IPMI v2 / RMCP+ session
    at org.apache.cloudstack.outofbandmanagement.OutOfBandManagementServiceImpl.executePowerOperation(OutOfBandManagementServiceImpl.java:423)
    at sun.reflect.GeneratedMethodAccessor199.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    ... 21 more
2018-01-16 17:00:13,396 WARN [o.a.c.alerts] (pool-5-thread-7:null) (logid:4f7299f6) AlertType:: 30 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: HA Fencing of host id=1, in dc id=1 performed
2018-01-16 17:00:15,375 DEBUG [c.c.a.t.Request] (pool-2-thread-27:null) (logid:6b21a8c1) Seq 5-9115285645797884785: Sending { Cmd , MgmtId: 161334379813, via: 5(hv03.cloud.local), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckOnHostCommand":{"host":{"guid":"598d48ef-158d-3e14-ad68-8d02c9368ddf-LibvirtComputingResource","privateNetwork":{"ip":"172.16.25.101","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:f6","isSecurityGroupEnabled":false},"publicNetwork":{"ip":"172.16.25.101","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:f6","isSecurityGroupEnabled":false},"storageNetwork1":{"ip":"172.16.25.101","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:f6","isSecurityGroupEnabled":false}},"wait":20}}] }
2018-01-16 17:00:15,380 DEBUG [c.c.a.t.Request] (pool-2-thread-5:null) (logid:bb993597) Seq 4-6582855280332112812: Sending { Cmd , MgmtId: 161334379813, via: 4(hv02.cloud.local), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckOnHostCommand":{"host":{"guid":"6ebb3010-9c49-3a9c-b620-ecbc9731aca2-LibvirtComputingResource","privateNetwork":{"ip":"172.16.25.100","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:8e","isSecurityGroupEnabled":false},"publicNetwork":{"ip":"172.16.25.100","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:8e","isSecurityGroupEnabled":false},"storageNetwork1":{"ip":"172.16.25.100","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:8e","isSecurityGroupEnabled":false}},"wait":20}}] }
2018-01-16 17:00:15,423 DEBUG [c.c.a.t.Request] (AgentManager-Handler-4:null) (logid:) Seq 5-9115285645797884785: Processing: { Ans: , MgmtId: 161334379813, via: 5, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"Heart is beating...","wait":0}}] }
2018-01-16 17:00:15,423 DEBUG [c.c.a.t.Request] (pool-2-thread-27:null) (logid:6b21a8c1) Seq 5-9115285645797884785: Received: { Ans: , MgmtId: 161334379813, via: 5(hv03.cloud.local), Ver: v1, Flags: 10, { Answer } }
2018-01-16 17:00:15,423 DEBUG [c.c.a.m.AgentManagerImpl] (pool-2-thread-27:null) (logid:6b21a8c1) Details from executing class com.cloud.agent.api.CheckOnHostCommand: Heart is beating...
2018-01-16 17:00:15,427 DEBUG [c.c.a.t.Request] (AgentManager-Handler-6:null) (logid:) Seq 4-6582855280332112812: Processing: { Ans: , MgmtId: 161334379813, via: 4, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"Heart is beating...","wait":0}}] }
2018-01-16 17:00:15,427 DEBUG [c.c.a.t.Request] (pool-2-thread-5:null) (logid:bb993597) Seq 4-6582855280332112812: Received: { Ans: , MgmtId: 161334379813, via: 4(hv02.cloud.local), Ver: v1, Flags: 10, { Answer } }
2018-01-16 17:00:15,427 DEBUG [c.c.a.m.AgentManagerImpl] (pool-2-thread-5:null) (logid:bb993597) Details from executing class com.cloud.agent.api.CheckOnHostCommand: Heart is beating...
2018-01-16 17:00:16,217 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-d9c2c841) (logid:1b093681) Begin cleanup expired async-jobs
2018-01-16 17:00:16,218 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-d9c2c841) (logid:1b093681) End cleanup expired async-jobs
2018-01-16 17:00:17,392 WARN [o.a.c.o.PowerOperationTask] (pool-6-thread-29:null) (logid:f9788c38) Out-of-band management background task operation=STATUS for host id=1 failed with: Out-of-band Management action (STATUS) on host (57bf86e0-e1cd-484e-a4f1-78b3ca2da125) failed with error: Get Auth Capabilities error Error issuing Get Channel Authentication Capabilities request Error: Unable to establish IPMI v2 / RMCP+ session
2018-01-16 17:00:17,422 DEBUG [o.a.c.o.OutOfBandManagementServiceImpl] (pool-5-thread-6:ctx-65225bcc) (logid:665de20f) Out-of-band Management action (OFF) on host (57bf86e0-e1cd-484e-a4f1-78b3ca2da125) failed with error: Get Auth Capabilities error Error issuing Get Channel Authentication Capabilities request Error: Unable to establish IPMI v2 / RMCP+ session
2018-01-16 17:00:17,438 WARN [o.a.c.k.h.KVMHAProvider] (pool-5-thread-6:ctx-65225bcc) (logid:665de20f) OOBM service is not configured or enabled for this host hv01.cloud.local error is Out-of-band Management action (OFF) on host (57bf86e0-e1cd-484e-a4f1-78b3ca2da125) failed with error: Get Auth Capabilities error Error issuing Get Channel Authentication Capabilities request Error: Unable to establish IPMI v2 / RMCP+ session
2018-01-16 17:00:17,438 WARN [o.a.c.h.t.BaseHATask] (pool-5-thread-9:null) (logid:ff44841a) Exception occurred while running FenceTask on a resource: org.apache.cloudstack.ha.provider.HAFenceException: OOBM service is not configured or enabled for this host hv01.cloud.local
org.apache.cloudstack.ha.provider.HAFenceException: OOBM service is not configured or enabled for this host hv01.cloud.local
    at org.apache.cloudstack.kvm.ha.KVMHAProvider.fence(KVMHAProvider.java:99)
    at org.apache.cloudstack.kvm.ha.KVMHAProvider.fence(KVMHAProvider.java:42)
    at org.apache.cloudstack.ha.task.FenceTask.performAction(FenceTask.java:42)
    at org.apache.cloudstack.ha.task.BaseHATask$1.call(BaseHATask.java:86)
    at org.apache.cloudstack.ha.task.BaseHATask$1.call(BaseHATask.java:83)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.cloud.utils.exception.CloudRuntimeException: Out-of-band Management action (OFF) on host (57bf86e0-e1cd-484e-a4f1-78b3ca2da125) failed with error: Get Auth Capabilities error Error issuing Get Channel Authentication Capabilities request Error: Unable to establish IPMI v2 / RMCP+ session
    at org.apache.cloudstack.outofbandmanagement.OutOfBandManagementServiceImpl.executePowerOperation(OutOfBandManagementServiceImpl.java:423)
    at sun.reflect.GeneratedMethodAccessor199.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    ... 21 more
2018-01-16 17:00:17,439 WARN [o.a.c.alerts] (pool-5-thread-9:null) (logid:ff44841a) AlertType:: 30 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: HA Fencing of host id=1, in dc id=1 performed
2018-01-16 17:00:17,903 DEBUG [o.a.c.s.SecondaryStorageManagerImpl] (secstorage-1:ctx-ccb33721) (logid:722404aa) Zone 1 is ready to launch secondary storage VM
2018-01-16 17:00:17,935 DEBUG [c.c.c.ConsoleProxyManagerImpl] (consoleproxy-1:ctx-22a69a02) (logid:393fab21) Zone 1 is ready to launch console proxy
{code}

> HA fails in cases of PSU failure.
> ---------------------------------
>
> Key: CLOUDSTACK-10234
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-10234
> Project: CloudStack
> Issue Type: Improvement
> Security Level: Public (Anyone can view this level - this is the default.)
> Components: Management Server
> Affects Versions: 4.11.0.0
> Environment: 4.11 RC1, NFS storage, CentOS 7 management server and hypervisors
> Reporter: Nux
> Priority: Major
> Labels: HA, KVM

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)