I experienced similar behaviour with KVM. It seems that restarting libvirtd and giving it some time to settle helps, and then the agent connects on its own.
Sent from Google Nexus 4

On Nov 21, 2014 4:43 PM, "Steve Searles" <[email protected]> wrote:

> For some reason it is affecting every host: VMware, KVM, and XenServer.
> No hosts will come out of maintenance; same NPE for all. Storage will go in
> and out of maintenance fine. Weird. Any ideas? The only way to get the
> host back online is to remove it and re-add it.
>
> Steven Searles
>
> On Nov 21, 2014, at 9:04 AM, Steve Searles <[email protected]> wrote:
>
> Yea, tried all that. Now it's affecting KVM as well. Thanks for the
> reply, I will dig a bit deeper.
>
> Steven Searles
>
> On Nov 20, 2014, at 4:00 PM, Motty Cruz <[email protected]> wrote:
>
> Hi Steve,
> Have you tried stopping and restarting ACS? I would also do the following
> in XenServer:
> xe-toolstack-restart
> It won't affect your VMs.
>
> To restart CloudStack:
> service cloudstack-management restart (in CentOS)
>
> Thanks,
> Motty
>
> On 11/20/2014 12:55 PM, Steve Searles wrote:
>
> Found this in the catalina.out log on the management server.
>
> INFO [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-4:ctx-8faa1563 job-12695) Add job-12695 into job monitoring
> WARN [c.c.a.d.ParamGenericValidationWorker] (API-Job-Executor-4:ctx-8faa1563 job-12695 ctx-81dfab11) Received unknown parameters for command cancelHostMaintenance. Unknown parameters : signatureversion expires
> ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-4:ctx-8faa1563 job-12695) Unexpected exception while executing org.apache.cloudstack.api.command.admin.host.CancelMaintenanceCmd
>
> Anyone know what the unknown parameters are all about?
>
> - Steve
>
> On Nov 20, 2014, at 3:14 PM, Steve Searles <[email protected]> wrote:
>
> CS 4.4.1 - 4.4.2
> I am having a problem with my XenServer hosts getting stuck in
> maintenance. Trying to cancel the maintenance produces the following NPE.
>
> 2014-11-20 15:04:28,575 INFO [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-14:ctx-4e8a63d4 job-12626) Add job-12626 into job monitoring
> 2014-11-20 15:04:28,576 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-14:ctx-4e8a63d4 job-12626) Executing AsyncJobVO {id:12626, userId: 2, accountId: 2, instanceType: Host, instanceId: 114, cmd: org.apache.cloudstack.api.command.admin.host.CancelMaintenanceCmd, cmdInfo: {"id":"189c3843-8d92-419b-a8b2-e343ea02c8fd","response":"json","sessionkey":"OEXANRcg2kzKJrfGXpvCK3E6k28\u003d","ctxDetails":"{\"com.cloud.host.Host\":\"189c3843-8d92-419b-a8b2-e343ea02c8fd\"}","cmdEventType":"MAINT.CANCEL","ctxUserId":"2","httpmethod":"GET","_":"1416513869627","uuid":"189c3843-8d92-419b-a8b2-e343ea02c8fd","ctxAccountId":"2","ctxStartEventId":"140662"}, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 345049793560, completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
> 2014-11-20 15:04:28,577 DEBUG [c.c.a.ApiServlet] (catalina-exec-12:ctx-78ec5c48 ctx-e7500b2b) ===END=== 172.23.0.1 -- GET command=cancelHostMaintenance&id=189c3843-8d92-419b-a8b2-e343ea02c8fd&response=json&sessionkey=OEXANRcg2kzKJrfGXpvCK3E6k28%3D&_=1416513869627
> 2014-11-20 15:04:28,601 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-14:ctx-4e8a63d4 job-12626) Unexpected exception while executing org.apache.cloudstack.api.command.admin.host.CancelMaintenanceCmd
> java.lang.NullPointerException
>     at com.cloud.resource.ResourceManagerImpl.doCancelMaintenance(ResourceManagerImpl.java:2083)
>     at com.cloud.resource.ResourceManagerImpl.cancelMaintenance(ResourceManagerImpl.java:2140)
>     at com.cloud.resource.ResourceManagerImpl.cancelMaintenance(ResourceManagerImpl.java:1127)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> I have tried upgrading to the latest git build 4.4.2 and the problem still
> exists.
> I think it started in 4.4.1 because it used to work properly in
> 4.4.0. I also deleted and re-created the SSVM but that did not help
> either. Does anyone have a solution or workaround? Is there a way to
> manually take a host out of maintenance? I think there is more to it than
> setting the status in the DB?
>
> - Steve
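
For anyone triaging a similar log, here is a minimal Python sketch that pulls each ERROR entry out of a management server log together with the exception and stack frames that follow it. It is not part of CloudStack: the function name `extract_errors` and the `SAMPLE` variable are made up for illustration, the sample lines are copied from the log quoted in this thread, and the regex assumes the timestamped entry format seen there (the catalina.out excerpt without timestamps would not match).

```python
import re

# Matches the start of a timestamped log entry as it appears in this thread,
# e.g. "2014-11-20 15:04:28,601 ERROR [c.c.a.ApiAsyncJobDispatcher] ...".
ENTRY_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (?P<level>[A-Z]+)\s"
)


def extract_errors(lines):
    """Group each ERROR entry with the exception and stack lines after it."""
    errors = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY_RE.match(line)
        if match:
            # A new timestamped entry closes any ERROR block being collected.
            if current is not None:
                errors.append(current)
            current = [line] if match.group("level") == "ERROR" else None
        elif current is not None and line.strip():
            # Continuation line: exception class name or an "at ..." frame.
            current.append(line.strip())
    if current is not None:
        errors.append(current)
    return errors


# Sample input copied from the log excerpt quoted in this thread.
SAMPLE = """\
2014-11-20 15:04:28,575 INFO [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-14:ctx-4e8a63d4 job-12626) Add job-12626 into job monitoring
2014-11-20 15:04:28,601 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-14:ctx-4e8a63d4 job-12626) Unexpected exception while executing org.apache.cloudstack.api.command.admin.host.CancelMaintenanceCmd
java.lang.NullPointerException
at com.cloud.resource.ResourceManagerImpl.doCancelMaintenance(ResourceManagerImpl.java:2083)
at com.cloud.resource.ResourceManagerImpl.cancelMaintenance(ResourceManagerImpl.java:2140)
""".splitlines()

if __name__ == "__main__":
    for block in extract_errors(SAMPLE):
        print("\n".join(block))
        print("-" * 40)
```

To run it against a real file, pass an open file handle instead of `SAMPLE`; the path depends on where your management server writes its log.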
