[ https://issues.apache.org/jira/browse/CLOUDSTACK-10234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nux updated CLOUDSTACK-10234:
-----------------------------
    Description: 
To simulate a PSU failure, I physically pulled the power from the server. HA
fails to do the right thing and move the affected VMs to other HVs.

I waited a good while, but alas nothing happened. The VM and VR running on the
affected hypervisor were never moved to another one (I have two others running).

Is there any way to at least force the system to mark that HV as bad/offline?

This is what I see in the management server logs:
{code:java}
Caused by: com.cloud.utils.exception.CloudRuntimeException: Out-of-band Management action (OFF) on host (57bf86e0-e1cd-484e-a4f1-78b3ca2da125) failed with error: Get Auth Capabilities error Error issuing Get Channel Authentication Capabilities request Error: Unable to establish IPMI v2 / RMCP+ session
    at org.apache.cloudstack.outofbandmanagement.OutOfBandManagementServiceImpl.executePowerOperation(OutOfBandManagementServiceImpl.java:423)
    at sun.reflect.GeneratedMethodAccessor199.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    ... 21 more
2018-01-16 17:00:13,396 WARN  [o.a.c.alerts] (pool-5-thread-7:null) (logid:4f7299f6) AlertType:: 30 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: HA Fencing of host id=1, in dc id=1 performed
2018-01-16 17:00:15,375 DEBUG [c.c.a.t.Request] (pool-2-thread-27:null) (logid:6b21a8c1) Seq 5-9115285645797884785: Sending  { Cmd , MgmtId: 161334379813, via: 5(hv03.cloud.local), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckOnHostCommand":{"host":{"guid":"598d48ef-158d-3e14-ad68-8d02c9368ddf-LibvirtComputingResource","privateNetwork":{"ip":"172.16.25.101","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:f6","isSecurityGroupEnabled":false},"publicNetwork":{"ip":"172.16.25.101","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:f6","isSecurityGroupEnabled":false},"storageNetwork1":{"ip":"172.16.25.101","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:f6","isSecurityGroupEnabled":false}},"wait":20}}] }
2018-01-16 17:00:15,380 DEBUG [c.c.a.t.Request] (pool-2-thread-5:null) (logid:bb993597) Seq 4-6582855280332112812: Sending  { Cmd , MgmtId: 161334379813, via: 4(hv02.cloud.local), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckOnHostCommand":{"host":{"guid":"6ebb3010-9c49-3a9c-b620-ecbc9731aca2-LibvirtComputingResource","privateNetwork":{"ip":"172.16.25.100","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:8e","isSecurityGroupEnabled":false},"publicNetwork":{"ip":"172.16.25.100","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:8e","isSecurityGroupEnabled":false},"storageNetwork1":{"ip":"172.16.25.100","netmask":"255.255.255.240","mac":"0c:c4:7a:40:8e:8e","isSecurityGroupEnabled":false}},"wait":20}}] }
2018-01-16 17:00:15,423 DEBUG [c.c.a.t.Request] (AgentManager-Handler-4:null) (logid:) Seq 5-9115285645797884785: Processing:  { Ans: , MgmtId: 161334379813, via: 5, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"Heart is beating...","wait":0}}] }
2018-01-16 17:00:15,423 DEBUG [c.c.a.t.Request] (pool-2-thread-27:null) (logid:6b21a8c1) Seq 5-9115285645797884785: Received:  { Ans: , MgmtId: 161334379813, via: 5(hv03.cloud.local), Ver: v1, Flags: 10, { Answer } }
2018-01-16 17:00:15,423 DEBUG [c.c.a.m.AgentManagerImpl] (pool-2-thread-27:null) (logid:6b21a8c1) Details from executing class com.cloud.agent.api.CheckOnHostCommand: Heart is beating...
2018-01-16 17:00:15,427 DEBUG [c.c.a.t.Request] (AgentManager-Handler-6:null) (logid:) Seq 4-6582855280332112812: Processing:  { Ans: , MgmtId: 161334379813, via: 4, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"Heart is beating...","wait":0}}] }
2018-01-16 17:00:15,427 DEBUG [c.c.a.t.Request] (pool-2-thread-5:null) (logid:bb993597) Seq 4-6582855280332112812: Received:  { Ans: , MgmtId: 161334379813, via: 4(hv02.cloud.local), Ver: v1, Flags: 10, { Answer } }
2018-01-16 17:00:15,427 DEBUG [c.c.a.m.AgentManagerImpl] (pool-2-thread-5:null) (logid:bb993597) Details from executing class com.cloud.agent.api.CheckOnHostCommand: Heart is beating...
2018-01-16 17:00:16,217 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-d9c2c841) (logid:1b093681) Begin cleanup expired async-jobs
2018-01-16 17:00:16,218 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-d9c2c841) (logid:1b093681) End cleanup expired async-jobs
2018-01-16 17:00:17,392 WARN  [o.a.c.o.PowerOperationTask] (pool-6-thread-29:null) (logid:f9788c38) Out-of-band management background task operation=STATUS for host id=1 failed with: Out-of-band Management action (STATUS) on host (57bf86e0-e1cd-484e-a4f1-78b3ca2da125) failed with error: Get Auth Capabilities error Error issuing Get Channel Authentication Capabilities request Error: Unable to establish IPMI v2 / RMCP+ session
2018-01-16 17:00:17,422 DEBUG [o.a.c.o.OutOfBandManagementServiceImpl] (pool-5-thread-6:ctx-65225bcc) (logid:665de20f) Out-of-band Management action (OFF) on host (57bf86e0-e1cd-484e-a4f1-78b3ca2da125) failed with error: Get Auth Capabilities error Error issuing Get Channel Authentication Capabilities request Error: Unable to establish IPMI v2 / RMCP+ session
2018-01-16 17:00:17,438 WARN  [o.a.c.k.h.KVMHAProvider] (pool-5-thread-6:ctx-65225bcc) (logid:665de20f) OOBM service is not configured or enabled for this host hv01.cloud.local error is Out-of-band Management action (OFF) on host (57bf86e0-e1cd-484e-a4f1-78b3ca2da125) failed with error: Get Auth Capabilities error Error issuing Get Channel Authentication Capabilities request Error: Unable to establish IPMI v2 / RMCP+ session
2018-01-16 17:00:17,438 WARN  [o.a.c.h.t.BaseHATask] (pool-5-thread-9:null) (logid:ff44841a) Exception occurred while running FenceTask on a resource: org.apache.cloudstack.ha.provider.HAFenceException: OOBM service is not configured or enabled for this host hv01.cloud.local
org.apache.cloudstack.ha.provider.HAFenceException: OOBM service is not configured or enabled for this host hv01.cloud.local
    at org.apache.cloudstack.kvm.ha.KVMHAProvider.fence(KVMHAProvider.java:99)
    at org.apache.cloudstack.kvm.ha.KVMHAProvider.fence(KVMHAProvider.java:42)
    at org.apache.cloudstack.ha.task.FenceTask.performAction(FenceTask.java:42)
    at org.apache.cloudstack.ha.task.BaseHATask$1.call(BaseHATask.java:86)
    at org.apache.cloudstack.ha.task.BaseHATask$1.call(BaseHATask.java:83)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.cloud.utils.exception.CloudRuntimeException: Out-of-band Management action (OFF) on host (57bf86e0-e1cd-484e-a4f1-78b3ca2da125) failed with error: Get Auth Capabilities error Error issuing Get Channel Authentication Capabilities request Error: Unable to establish IPMI v2 / RMCP+ session
    at org.apache.cloudstack.outofbandmanagement.OutOfBandManagementServiceImpl.executePowerOperation(OutOfBandManagementServiceImpl.java:423)
    at sun.reflect.GeneratedMethodAccessor199.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    ... 21 more
2018-01-16 17:00:17,439 WARN  [o.a.c.alerts] (pool-5-thread-9:null) (logid:ff44841a) AlertType:: 30 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: HA Fencing of host id=1, in dc id=1 performed
2018-01-16 17:00:17,903 DEBUG [o.a.c.s.SecondaryStorageManagerImpl] (secstorage-1:ctx-ccb33721) (logid:722404aa) Zone 1 is ready to launch secondary storage VM
2018-01-16 17:00:17,935 DEBUG [c.c.c.ConsoleProxyManagerImpl] (consoleproxy-1:ctx-22a69a02) (logid:393fab21) Zone 1 is ready to launch console proxy
{code}
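
For anyone who wants to pick the WARN entries out of a wrapped management server log like the one above, here is a rough sketch, not part of CloudStack. It assumes each entry starts with a "YYYY-MM-DD HH:MM:SS,mmm LEVEL" prefix, as in the excerpt; the function names split_entries and entries_at_level are made up for illustration.

```python
import re

# Assumed entry prefix, as seen in the excerpt: "2018-01-16 17:00:13,396 WARN ".
TS = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ([A-Z]+) ")


def split_entries(text):
    """Return (level, entry) pairs, one per timestamped log entry.

    Line wrapping is undone by collapsing all whitespace runs to a single
    space, so double spaces inside an entry are lost as well. Anything
    before the first timestamp (for example a truncated stack trace) is
    dropped.
    """
    flat = " ".join(text.split())
    matches = list(TS.finditer(flat))
    # Each entry runs from its own timestamp to the start of the next one.
    ends = [m.start() for m in matches[1:]] + [len(flat)]
    return [(m.group(1), flat[m.start():end].strip())
            for m, end in zip(matches, ends)]


def entries_at_level(text, level):
    """Return only the entries logged at the given level, e.g. "WARN"."""
    return [entry for lvl, entry in split_entries(text) if lvl == level]
```

Calling entries_at_level(log_text, "WARN") on the pasted log should leave only the alert, OOBM and fencing warnings, which makes the failure chain easier to follow.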



> HA fails in cases of PSU failure.
> ---------------------------------
>
>                 Key: CLOUDSTACK-10234
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-10234
>             Project: CloudStack
>          Issue Type: Improvement
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Management Server
>    Affects Versions: 4.11.0.0
>         Environment: 4.11 RC1, NFS storage, CentOS 7 management server and 
> hypervisors
>            Reporter: Nux
>            Priority: Major
>              Labels: HA, KVM



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
