Thanks Rohit, I will give feedback as soon as I have tried it.
On Thu, Jun 19, 2014 at 10:38 PM, Rohit Yadav <[email protected]> wrote:

> Hi Dimas,
>
> Looks like the VM is in starting state and CloudStack is unable to contact
> the agent. Hope you've removed the VR from CloudStack using the UI. You can
> try restarting the management server. The issue is of sync, where one party
> (mgmt server) has different view of the world than the other (the
> host/agent). In such cases, do not remove the host else when you re-add it,
> it may destroy all the (user) VMs on it or simply fail.
>
> If restarting won't fix the problem, in global settings reduce the expunge
> timeout (that's when CloudStack marks a VM as removed, since you've just
> destroyed it, it can take some time to get expunged) and try again.
>
> As a final course of action I would stop the management server, then ssh to
> the host and destroy SSVMs, using mysql client I would change db entries
> for SSVM to removed/expunged (simply mark by updating row, do not remove
> the row), start the mgmt server again and hope it would work this time.
>
> Suggestions anyone in such a case?
>
> Regards.
>
>
> On Thu, Jun 19, 2014 at 8:29 PM, dimas yoga pratama <[email protected]> wrote:
>
> > management log :
> >
> > 2014-06-19 21:49:21,312 WARN [c.c.u.n.Link] (AgentManager-Selector:null) SSL: Fail to find the generated keystore. Loading fail-safe one to continue.
> >
> > 2014-06-19 21:49:11,585 DEBUG [c.c.v.VirtualMachineManagerImpl] (AgentConnectTaskPool-344:ctx-07431045) Ignoring VM in starting mode: r-71-VM
> > 2014-06-19 21:49:11,585 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentConnectTaskPool-344:ctx-07431045) VM does not require investigation so I'm marking it as Stopped: VM[DomainRouter|r-71-VM]
> > 2014-06-19 21:49:11,585 WARN [o.a.c.f.j.AsyncJobExecutionContext] (AgentConnectTaskPool-344:ctx-07431045) Job is executed without a context, setup psudo job for the executing thread
> > 2014-06-19 21:49:11,651 DEBUG [c.c.v.VirtualMachineManagerImpl] (AgentConnectTaskPool-344:ctx-07431045) Unable to transition the state but we're moving on because it's forced stop
> > 2014-06-19 21:49:11,651 DEBUG [c.c.v.VirtualMachineManagerImpl] (AgentConnectTaskPool-344:ctx-07431045) Unable to cleanup VM: VM[DomainRouter|r-71-VM] ,since outstanding work item is not found
> > 2014-06-19 21:49:11,651 ERROR [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-344:ctx-07431045) Monitor ClusteredVirtualMachineManagerImpl says there is an error in the connect process for 2 due to Work item not found, We cannot stop VM[DomainRouter|r-71-VM] when it is in state Starting
> > com.cloud.utils.exception.CloudRuntimeException: Work item not found, We cannot stop VM[DomainRouter|r-71-VM] when it is in state Starting
> >     at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1415)
> >     at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStop(VirtualMachineManagerImpl.java:1344)
> >     at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1312)
> >     at com.cloud.ha.HighAvailabilityManagerImpl.scheduleRestart(HighAvailabilityManagerImpl.java:346)
> >     at com.cloud.vm.VirtualMachineManagerImpl.compareState(VirtualMachineManagerImpl.java:2827)
> >     at com.cloud.vm.VirtualMachineManagerImpl.fullHostSync(VirtualMachineManagerImpl.java:2384)
> >     at com.cloud.vm.VirtualMachineManagerImpl.processConnect(VirtualMachineManagerImpl.java:3035)
> >     at com.cloud.agent.manager.AgentManagerImpl.notifyMonitorsOfConnection(AgentManagerImpl.java:495)
> >     at com.cloud.agent.manager.AgentManagerImpl.handleConnectedAgent(AgentManagerImpl.java:999)
> >     at com.cloud.agent.manager.AgentManagerImpl.access$000(AgentManagerImpl.java:117)
> >     at com.cloud.agent.manager.AgentManagerImpl$HandleAgentConnectTask.runInContext(AgentManagerImpl.java:1082)
> >     at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> >     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> >     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> >     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> >     at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >     at java.lang.Thread.run(Thread.java:744)
> > 2014-06-19 21:49:08,251 DEBUG [c.c.s.s.SecondaryStorageManagerImpl] (secstorage-1:ctx-c407a559) Zone 1 is not ready to launch secondary storage VM yet
> > 2014-06-19 21:49:08,282 DEBUG [c.c.c.ConsoleProxyManagerImpl] (consoleproxy-1:ctx-60f4b3c9) Zone 1 is not ready to launch console proxy yet
> > 2014-06-19 21:49:01,197 DEBUG [c.c.n.NetworkUsageManagerImpl] (AgentConnectTaskPool-342:ctx-b66d3294) Disconnected called on 2 with status Alert
> > 2014-06-19 21:49:01,197 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-342:ctx-b66d3294) Sending Disconnect to listener: com.cloud.consoleproxy.ConsoleProxyListener
> > 2014-06-19 21:49:01,198 DEBUG [c.c.h.Status] (AgentConnectTaskPool-342:ctx-b66d3294) Transition:[Resource state = Enabled, Agent event = AgentDisconnected, Host id = 2, name = host1.cloud.priv]
> > 2014-06-19 21:49:01,259 DEBUG [c.c.h.Status] (AgentConnectTaskPool-342:ctx-b66d3294) Agent status update: [id = 2; name = host1.cloud.priv; old status = Connecting; event = AgentDisconnected; new status = Alert; old update count = 404; new update count = 405]
> > 2014-06-19 21:49:01,259 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentConnectTaskPool-342:ctx-b66d3294) Notifying other nodes of to disconnect
> > 2014-06-19 21:49:01,260 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-342:ctx-b66d3294) Failed to handle host connection: com.cloud.utils.exception.CloudRuntimeException: Unable to connect 2
> > 2014-06-19 21:49:01,261 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-342:ctx-b66d3294) Can not send command com.cloud.agent.api.ReadyCommand due to Host 2 is not up
> >
> >
> > Host log:
> >
> > 2014-06-19 21:19:35,890 INFO [cloud.agent.Agent] (Agent-Handler-3:null) Lost connection to the server. Dealing with the remaining commands...
> > 2014-06-19 21:19:40,891 INFO [cloud.agent.Agent] (Agent-Handler-3:null) Reconnecting...
> > 2014-06-19 21:19:40,891 INFO [utils.nio.NioClient] (Agent-Selector:null) Connecting to 10.151.32.51:8250
> > 2014-06-19 21:19:40,989 INFO [utils.nio.NioClient] (Agent-Selector:null) SSL: Handshake done
> > 2014-06-19 21:19:40,989 INFO [utils.nio.NioClient] (Agent-Selector:null) Connected to 10.151.32.51:8250
> > 2014-06-19 21:19:41,084 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Proccess agent startup answer, agent id = 0
> > 2014-06-19 21:19:41,084 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Set agent id 0
> > 2014-06-19 21:19:41,085 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 0
> > 2014-06-19 21:19:45,990 INFO [cloud.agent.Agent] (Agent-Handler-3:null) Connected to the server
> > 2014-06-19 21:19:46,595 INFO [cloud.agent.Agent] (Agent-Handler-3:null) Lost connection to the server. Dealing with the remaining commands...
> > 2014-06-19 21:19:51,596 INFO [cloud.agent.Agent] (Agent-Handler-3:null) Reconnecting...
> >
> >
> >
> > I'm using Centos 6.5 and Cloudstack 4.3 with basic networking.
> >
> > Please help me..
> >
> >
> > On Thu, Jun 19, 2014 at 9:42 PM, Rohit Yadav <[email protected]> wrote:
> >
> > > I'm not sure what could be the specific issue. You can tail the
> > > management server logs to see what is failing. After you figure out
> > > the specific issue, you may share it with us with your host os,
> > > CloudStack version details and the connected host details.
> > >
> > > Regards.
> > >
> > >
> > > On Thu, Jun 19, 2014 at 8:01 PM, dimas yoga pratama <[email protected]> wrote:
> > >
> > > > Hi, from the infrastructure tab I can detect the hosts, but both of
> > > > the hosts show alert state., I already try to force reconnect but it
> > > > fails. What should I do? Now the CPVM fail to start too.
> > > >
> > > >
> > > > On Thu, Jun 19, 2014 at 9:17 PM, Rohit Yadav <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > SSVMs and VRs are stateless so if restarts are not working for
> > > > > you, you may (force) stop and remove them. The CloudStack HA
> > > > > thread(s) would kickstart new ones after a certain timeout, to
> > > > > speed this behaviour you may restart CloudStack as well.
> > > > >
> > > > > If your problem still persists after trying above you may try
> > > > > debugging the issue:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM,+templates,+Secondary+storage+troubleshooting
> > > > >
> > > > > Regards.
> > > > >
> > > > >
> > > > > On Thu, Jun 19, 2014 at 7:34 PM, dimas yoga pratama <[email protected]> wrote:
> > > > >
> > > > > > OK this is my problem, after blackout I can't start virtual
> > > > > > router, and ssvm not detected in my cloudstack system. SSVM
> > > > > > recreated itself but stuck in starting state.
> > > > > >
> > > > > > What should I do? Please help me..
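
The last-resort database step suggested earlier in this thread ("simply mark by updating row, do not remove the row") could look roughly like the sketch below. It only illustrates the idea of marking rows rather than deleting them. It runs against an in-memory SQLite table standing in for the real CloudStack MySQL database, and the table layout (`vm_instance` with `name`, `type`, `state`, `removed`), the type value `SecondaryStorageVm`, the state value `Expunging`, the VM names, and the helper `mark_removed` are all assumptions made for illustration, so the real schema should be checked (and backed up) first.

```python
import sqlite3
from datetime import datetime, timezone


def mark_removed(conn, vm_type):
    """Mark stuck rows of one VM type as removed; never DELETE them."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        # Only touch rows still stuck in 'Starting' that are not yet marked.
        "UPDATE vm_instance SET state = 'Expunging', removed = ? "
        "WHERE type = ? AND state = 'Starting' AND removed IS NULL",
        (now, vm_type),
    )
    conn.commit()
    return cur.rowcount  # number of rows that were marked


# Stand-in table and data (hypothetical, not the real CloudStack schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vm_instance ("
    "id INTEGER PRIMARY KEY, name TEXT, type TEXT, state TEXT, removed TEXT)"
)
conn.executemany(
    "INSERT INTO vm_instance (name, type, state) VALUES (?, ?, ?)",
    [("s-1-VM", "SecondaryStorageVm", "Starting"), ("i-2-3-VM", "User", "Running")],
)

changed = mark_removed(conn, "SecondaryStorageVm")
total = conn.execute("SELECT COUNT(*) FROM vm_instance").fetchone()[0]
print(f"marked {changed} row(s); table still holds {total} row(s)")
```

The update is restricted by both type and state so that user VMs are left alone, and the row count stays the same because nothing is deleted.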
