Thanks Rohit, I will give feedback as soon as I have tried it.
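
A minimal sketch, assuming the management log format quoted further down in this thread, of how the host's agent status transitions could be pulled out of a pasted log excerpt. The function name `agent_transitions` and the regular expression are illustrative assumptions, not part of CloudStack; the sample text is copied from the quoted log, including the mail client's line wrapping and quote markers.

```python
import re

# Matches "Agent status update" entries as they appear in the quoted management log.
STATUS_RE = re.compile(
    r"Agent status update: \[id = (?P<id>\d+); name = (?P<name>[^;]+); "
    r"old status = (?P<old>\w+); event = (?P<event>\w+); "
    r"new status = (?P<new>\w+);"
)


def agent_transitions(log_text):
    """Return (host id, host name, old status, event, new status) tuples."""
    # Mail clients wrap long log lines and prepend "> " quote markers;
    # strip the markers, then collapse all whitespace so entries are contiguous.
    unquoted = re.sub(r"^[> ]+", "", log_text, flags=re.MULTILINE)
    flat = " ".join(unquoted.split())
    return [
        (int(m["id"]), m["name"].strip(), m["old"], m["event"], m["new"])
        for m in STATUS_RE.finditer(flat)
    ]


sample = """\
> > 2014-06-19 21:49:01,259 DEBUG [c.c.h.Status]
> > (AgentConnectTaskPool-342:ctx-b66d3294) Agent status update: [id = 2;
> name
> > = host1.cloud.priv; old status = Connecting; event = AgentDisconnected;
> new
> > status = Alert; old update count = 404; new update count = 405]
"""

for host_id, name, old, event, new in agent_transitions(sample):
    print(f"host {host_id} ({name}): {old} -> {new} on {event}")
```

Run against a longer excerpt, this would list each Connecting/Alert flip in order, which may make it easier to see whether the host ever reaches a stable state after a management server restart.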

On Thu, Jun 19, 2014 at 10:38 PM, Rohit Yadav <[email protected]> wrote:

> Hi Dimas,
>
> Looks like the VM is in the Starting state and CloudStack is unable to contact
> the agent. I hope you've removed the VR from CloudStack using the UI. You can
> try restarting the management server. This is a sync issue, where one party
> (mgmt server) has a different view of the world than the other (the
> host/agent). In such cases, do not remove the host; otherwise, when you re-add
> it, it may destroy all the (user) VMs on it or simply fail.
>
> If restarting doesn't fix the problem, reduce the expunge timeout in global
> settings and try again. (That timeout is when CloudStack marks a VM as
> removed; since you've just destroyed it, it can take some time to get
> expunged.)
>
> As a final course of action, I would stop the management server, then ssh to
> the host and destroy the SSVMs. Using the mysql client, I would mark the db
> entries for the SSVM as removed/expunged (update the row, do not delete
> it), then start the mgmt server again and hope it works this time.
>
> Does anyone have suggestions for such a case?
>
> Regards.
>
>
> On Thu, Jun 19, 2014 at 8:29 PM, dimas yoga pratama <[email protected]>
> wrote:
>
> > Management log:
> >
> > 2014-06-19 21:49:21,312 WARN  [c.c.u.n.Link] (AgentManager-Selector:null)
> > SSL: Fail to find the generated keystore. Loading fail-safe one to
> > continue.
> >
> > 2014-06-19 21:49:11,585 DEBUG [c.c.v.VirtualMachineManagerImpl]
> > (AgentConnectTaskPool-344:ctx-07431045) Ignoring VM in starting mode:
> > r-71-VM
> > 2014-06-19 21:49:11,585 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> > (AgentConnectTaskPool-344:ctx-07431045) VM does not require investigation
> > so I'm marking it as Stopped: VM[DomainRouter|r-71-VM]
> > 2014-06-19 21:49:11,585 WARN  [o.a.c.f.j.AsyncJobExecutionContext]
> > (AgentConnectTaskPool-344:ctx-07431045) Job is executed without a context,
> > setup psudo job for the executing thread
> > 2014-06-19 21:49:11,651 DEBUG [c.c.v.VirtualMachineManagerImpl]
> > (AgentConnectTaskPool-344:ctx-07431045) Unable to transition the state but
> > we're moving on because it's forced stop
> > 2014-06-19 21:49:11,651 DEBUG [c.c.v.VirtualMachineManagerImpl]
> > (AgentConnectTaskPool-344:ctx-07431045) Unable to cleanup VM:
> > VM[DomainRouter|r-71-VM] ,since outstanding work item is not found
> > 2014-06-19 21:49:11,651 ERROR [c.c.a.m.AgentManagerImpl]
> > (AgentConnectTaskPool-344:ctx-07431045) Monitor
> > ClusteredVirtualMachineManagerImpl says there is an error in the connect
> > process for 2 due to Work item not found, We cannot stop
> > VM[DomainRouter|r-71-VM] when it is in state Starting
> > com.cloud.utils.exception.CloudRuntimeException: Work item not found, We
> > cannot stop VM[DomainRouter|r-71-VM] when it is in state Starting
> >         at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1415)
> >         at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStop(VirtualMachineManagerImpl.java:1344)
> >         at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1312)
> >         at com.cloud.ha.HighAvailabilityManagerImpl.scheduleRestart(HighAvailabilityManagerImpl.java:346)
> >         at com.cloud.vm.VirtualMachineManagerImpl.compareState(VirtualMachineManagerImpl.java:2827)
> >         at com.cloud.vm.VirtualMachineManagerImpl.fullHostSync(VirtualMachineManagerImpl.java:2384)
> >         at com.cloud.vm.VirtualMachineManagerImpl.processConnect(VirtualMachineManagerImpl.java:3035)
> >         at com.cloud.agent.manager.AgentManagerImpl.notifyMonitorsOfConnection(AgentManagerImpl.java:495)
> >         at com.cloud.agent.manager.AgentManagerImpl.handleConnectedAgent(AgentManagerImpl.java:999)
> >         at com.cloud.agent.manager.AgentManagerImpl.access$000(AgentManagerImpl.java:117)
> >         at com.cloud.agent.manager.AgentManagerImpl$HandleAgentConnectTask.runInContext(AgentManagerImpl.java:1082)
> >         at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> >         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> >         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> >         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> >         at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >         at java.lang.Thread.run(Thread.java:744)
> > 2014-06-19 21:49:08,251 DEBUG [c.c.s.s.SecondaryStorageManagerImpl]
> > (secstorage-1:ctx-c407a559) Zone 1 is not ready to launch secondary storage
> > VM yet
> > 2014-06-19 21:49:08,282 DEBUG [c.c.c.ConsoleProxyManagerImpl]
> > (consoleproxy-1:ctx-60f4b3c9) Zone 1 is not ready to launch console proxy
> > yet
> > 2014-06-19 21:49:01,197 DEBUG [c.c.n.NetworkUsageManagerImpl]
> > (AgentConnectTaskPool-342:ctx-b66d3294) Disconnected called on 2 with
> > status Alert
> > 2014-06-19 21:49:01,197 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentConnectTaskPool-342:ctx-b66d3294) Sending Disconnect to listener:
> > com.cloud.consoleproxy.ConsoleProxyListener
> > 2014-06-19 21:49:01,198 DEBUG [c.c.h.Status]
> > (AgentConnectTaskPool-342:ctx-b66d3294) Transition:[Resource state =
> > Enabled, Agent event = AgentDisconnected, Host id = 2, name =
> > host1.cloud.priv]
> > 2014-06-19 21:49:01,259 DEBUG [c.c.h.Status]
> > (AgentConnectTaskPool-342:ctx-b66d3294) Agent status update: [id = 2; name
> > = host1.cloud.priv; old status = Connecting; event = AgentDisconnected; new
> > status = Alert; old update count = 404; new update count = 405]
> > 2014-06-19 21:49:01,259 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > (AgentConnectTaskPool-342:ctx-b66d3294) Notifying other nodes of to
> > disconnect
> > 2014-06-19 21:49:01,260 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentConnectTaskPool-342:ctx-b66d3294) Failed to handle host connection:
> > com.cloud.utils.exception.CloudRuntimeException: Unable to connect 2
> > 2014-06-19 21:49:01,261 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentConnectTaskPool-342:ctx-b66d3294) Can not send command
> > com.cloud.agent.api.ReadyCommand due to Host 2 is not up
> >
> >
> > Host log:
> >
> > 2014-06-19 21:19:35,890 INFO  [cloud.agent.Agent] (Agent-Handler-3:null)
> > Lost connection to the server. Dealing with the remaining commands...
> > 2014-06-19 21:19:40,891 INFO  [cloud.agent.Agent] (Agent-Handler-3:null)
> > Reconnecting...
> > 2014-06-19 21:19:40,891 INFO  [utils.nio.NioClient] (Agent-Selector:null)
> > Connecting to 10.151.32.51:8250
> > 2014-06-19 21:19:40,989 INFO  [utils.nio.NioClient] (Agent-Selector:null)
> > SSL: Handshake done
> > 2014-06-19 21:19:40,989 INFO  [utils.nio.NioClient] (Agent-Selector:null)
> > Connected to 10.151.32.51:8250
> > 2014-06-19 21:19:41,084 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> > Proccess agent startup answer, agent id = 0
> > 2014-06-19 21:19:41,084 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> > Set agent id 0
> > 2014-06-19 21:19:41,085 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> > Startup Response Received: agent id = 0
> > 2014-06-19 21:19:45,990 INFO  [cloud.agent.Agent] (Agent-Handler-3:null)
> > Connected to the server
> > 2014-06-19 21:19:46,595 INFO  [cloud.agent.Agent] (Agent-Handler-3:null)
> > Lost connection to the server. Dealing with the remaining commands...
> > 2014-06-19 21:19:51,596 INFO  [cloud.agent.Agent] (Agent-Handler-3:null)
> > Reconnecting...
> >
> >
> >
> > I'm using CentOS 6.5 and CloudStack 4.3 with basic networking.
> >
> > Please help me.
> >
> >
> > On Thu, Jun 19, 2014 at 9:42 PM, Rohit Yadav <[email protected]> wrote:
> >
> > > I'm not sure what the specific issue could be. You can tail the management
> > > server logs to see what is failing. After you figure out the specific
> > > issue, you may share it with us along with your host OS, CloudStack
> > > version and the connected host details.
> > >
> > > Regards.
> > >
> > >
> > > > On Thu, Jun 19, 2014 at 8:01 PM, dimas yoga pratama <[email protected]>
> > > > wrote:
> > >
> > > > > Hi, from the infrastructure tab I can detect the hosts, but both of the
> > > > > hosts show the Alert state. I already tried to force reconnect but it
> > > > > fails. What should I do? Now the CPVM fails to start too.
> > > >
> > > >
> > > > > On Thu, Jun 19, 2014 at 9:17 PM, Rohit Yadav <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > SSVMs and VRs are stateless, so if restarts are not working for you,
> > > > > > you may (force) stop and remove them. The CloudStack HA thread(s) would
> > > > > > kickstart new ones after a certain timeout; to speed this up you may
> > > > > > restart CloudStack as well.
> > > > >
> > > > > > If your problem still persists after trying the above, you may try
> > > > > > debugging the issue:
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM,+templates,+Secondary+storage+troubleshooting
> > > > >
> > > > > Regards.
> > > > >
> > > > >
> > > > > > On Thu, Jun 19, 2014 at 7:34 PM, dimas yoga pratama <[email protected]>
> > > > > > wrote:
> > > > >
> > > > > > > OK, this is my problem: after a blackout I can't start the virtual
> > > > > > > router, and the SSVM is not detected in my CloudStack system. The SSVM
> > > > > > > recreated itself but is stuck in the Starting state.
> > > > > > >
> > > > > > > What should I do? Please help me.
> > > > > >
> > > > >
> > > >
> > >
> >
>
