Re: System VM Capacity Issue

John Burwell Thu, 25 Jul 2013 13:00:33 -0700

Wei,

Yes, the host has been rebooted.  It is a Xen host, and the hypervisor status 
looks good.  Do you have any suggestions for additional checks I could run to verify 
the state of the host?


Thanks,
-John

On Jul 25, 2013, at 3:57 PM, Wei ZHOU <[email protected]> wrote:

> John,
> 
> I guess something is wrong on your hosts. Did you reinstall the agent and
> clean up the settings (like firewall rules, VMs) on the host (or reboot the host)?
> 
> -Wei
> 
> 
> 2013/7/25 John Burwell <[email protected]>
> 
>> Marcus,
>> 
>> According to the management UI, the host is up and the cluster is enabled.  I
>> have verified that I can ping the host from the management server machine.
>> What could cause the host to land in the avoid set?
>> 
>> Thanks,
>> -John
>> 
>>> For some reason it doesn't like your cluster:
>>> 
>>> 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner]
>>> (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ])
>>> Removing from the clusterId list these clusters from avoid set: [1]
>>> 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner]
>>> (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) No
>>> clusters found after removing disabled clusters and clusters in avoid
>>> list, returning.
>>> 
>>> I'm not sure from this why the cluster is in the avoid set... maybe
>>> all hosts are down, or it's disabled?
>>> 
>>> On Thu, Jul 25, 2013 at 12:38 PM, John Burwell <[email protected]> wrote:
>>>> All,
>>>> 
>>>> After pulling and building the latest from the 4.2 branch (around 10am on 25
>>>> July 2013), the SSVM and CPVM are being created, but will not start due to
>>>> capacity issues.  Before this build, the system VMs were allocating and
>>>> starting as expected.  For this test, I completely rebuilt the branch
>>>> including the system VM image (i.e. mvn -P developer,systemvm clean
>>>> install) and rebuilt the database from scratch.  The following is an extract
>>>> of the log from the startup:
>>>> 
>>>> 2013-07-25 12:35:22,888 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Deploy avoids pods: null, clusters: [1], hosts: [1]
>>>> 2013-07-25 12:35:22,889 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) DeploymentPlanner allocation algorithm: com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_208ecd91@3a67ba87
>>>> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Trying to allocate a host and storage pools from dc:1, pod:1,cluster:null, requested cpu: 100, requested ram: 104857600
>>>> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Is ROOT volume READY (pool already allocated)?: No
>>>> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Searching resources only under specified Pod: 1
>>>> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1
>>>> 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Removing from the clusterId list these clusters from avoid set: [1]
>>>> 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) No clusters found after removing disabled clusters and clusters in avoid list, returning.
>>>> 2013-07-25 12:35:22,906 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: null new host id: null host id before state transition: 1
>>>> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Hosts's actual total CPU: 2565 and CPU after applying overprovisioning: 76950
>>>> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Hosts's actual total RAM: 2977418304 and RAM after applying overprovisioning: 89322545152
>>>> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) release cpu from host: 1, old used: 100,reserved: 0, actual total: 2565, total with overprovisioning: 76950; new used: 0,reserved:0; movedfromreserved: false,moveToReserveredfalse
>>>> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) release mem from host: 1, old used: 104857600,reserved: 0, total: 89322545152; new used: 0,reserved:0; movedfromreserved: false,moveToReserveredfalse
>>>> 2013-07-25 12:35:22,921 WARN [storage.secondary.SecondaryStorageManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Exception while trying to start secondary storage vm
>>>> com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM[SecondaryStorageVm|s-40-VM]Scope=interface com.cloud.dc.DataCenter; id=1
>>>>       at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:882)
>>>>       at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:618)
>>>>       at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:611)
>>>>       at com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:265)
>>>>       at com.cloud.server.ManagementServerImpl.startSecondaryStorageVm(ManagementServerImpl.java:2941)
>>>>       at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>>>>       at com.cloud.server.ManagementServerImpl.startSystemVM(ManagementServerImpl.java:3069)
>>>>       at org.apache.cloudstack.api.command.admin.systemvm.StartSystemVMCmd.execute(StartSystemVMCmd.java:106)
>>>>       at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
>>>>       at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
>>>>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>>>>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>>>> 
>>>> The null host ids in the following line, roughly halfway through this
>>>> snippet, appear suspect:
>>>> 
>>>> 56152bf4a9a3 ]) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: null new host id: null host id before state transition: 1
>>>> 
>>>> The vmops.log is too large to attach, but I have attached the Marvin
>>>> configuration.  Any ideas as to why the SSVM and CPVM are not starting?
>>>> 
>>>> Thanks for your help,
>>>> -John
>>>> 
>>>> 
>>>> 
>>>> 
>> 
>>
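
Since the vmops.log was too large to attach, the planner's avoid-set entries can be pulled out of it with a short script. What follows is only a minimal sketch: it assumes the "Deploy avoids pods: ..., clusters: ..., hosts: ..." message format is exactly as quoted in this thread, with each log entry on a single line as in the raw log. The names parse_ids and find_avoid_sets are hypothetical helpers, not part of CloudStack.

```python
import re

# Matches the DeploymentPlanningManagerImpl message quoted in the thread.
# Each field is either the literal "null" or a bracketed list of ids.
AVOID_RE = re.compile(
    r"Deploy avoids pods: (?P<pods>null|\[[^\]]*\]), "
    r"clusters: (?P<clusters>null|\[[^\]]*\]), "
    r"hosts: (?P<hosts>null|\[[^\]]*\])"
)


def parse_ids(field):
    """Turn 'null' or a list such as '[1, 2]' into a list of ints."""
    if field == "null":
        return []
    inner = field.strip("[]").strip()
    return [int(part) for part in inner.split(",")] if inner else []


def find_avoid_sets(log_text):
    """Return one dict per 'Deploy avoids' message found in log_text."""
    return [
        {name: parse_ids(value) for name, value in match.groupdict().items()}
        for match in AVOID_RE.finditer(log_text)
    ]


# The first log entry from the thread, unwrapped onto one line.
SAMPLE = (
    "2013-07-25 12:35:22,888 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] "
    "(Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) "
    "Deploy avoids pods: null, clusters: [1], hosts: [1]"
)

if __name__ == "__main__":
    for entry in find_avoid_sets(SAMPLE):
        print(entry)
```

Run against the full log (for example, find_avoid_sets(open("vmops.log").read())), this would show each deployment attempt where cluster 1 or host 1 was already in the avoid set before the planner ran, which is the condition Marcus pointed out.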
