Wei,

Yes, the host has been rebooted. It is a Xen host, and the hypervisor status looks good. Do you have any suggestions for additional checks I could run to verify the state of the host?

Thanks,
-John

On Jul 25, 2013, at 3:57 PM, Wei ZHOU <ustcweiz...@gmail.com> wrote:

> John,
>
> I guess something wrong on your hosts. Did you reinstall the agent, and
> clean the setting (like firewall rules,vms) on host (or reboot the host)?
>
> -Wei
>
>
> 2013/7/25 John Burwell <jburw...@basho.com>
>
>> Marcus,
>>
>> According the management UI, the host is up and the cluster is enabled. I
>> have verified that I can ping the host from the management server machine.
>> What could cause the host to land in the avoid set?
>>
>> Thanks,
>> -John
>>
>>> For some reason it doesn't like your cluster:
>>>
>>> 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Removing from the clusterId list these clusters from avoid set: [1]
>>> 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) No clusters found after removing disabled clusters and clusters in avoid list, returning.
>>>
>>> I'm not sure from this why the cluster is in the avoid set... maybe
>>> all hosts are down, or it's disabled?
>>>
>>> On Thu, Jul 25, 2013 at 12:38 PM, John Burwell <jburw...@basho.com> wrote:
>>>> All,
>>>>
>>>> After pulling and building the latest from the 4.2 branch (around 10am on 25
>>>> July 2013), the SSVM and CPVM are being created, but will not start due to
>>>> capacity issues. Before this build, the system VMs were allocating and
>>>> starting as expected. For this test, I completely rebuilt the branch
>>>> including the system vm image (i.e. mvn -P developer, systemvm clean
>>>> install) and rebuilt the database from scratch. The following is an extract
>>>> of the log form the startup:
>>>>
>>>> 2013-07-25 12:35:22,888 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Deploy avoids pods: null, clusters: [1], hosts: [1]
>>>> 2013-07-25 12:35:22,889 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) DeploymentPlanner allocation algorithm: com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_208ecd91@3a67ba87
>>>> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Trying to allocate a host and storage pools from dc:1, pod:1,cluster:null, requested cpu: 100, requested ram: 104857600
>>>> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Is ROOT volume READY (pool already allocated)?: No
>>>> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Searching resources only under specified Pod: 1
>>>> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1
>>>> 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Removing from the clusterId list these clusters from avoid set: [1]
>>>> 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) No clusters found after removing disabled clusters and clusters in avoid list, returning.
>>>> 2013-07-25 12:35:22,906 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: null new host id: null host id before state transition: 1
>>>> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Hosts's actual total CPU: 2565 and CPU after applying overprovisioning: 76950
>>>> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Hosts's actual total RAM: 2977418304 and RAM after applying overprovisioning: 89322545152
>>>> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) release cpu from host: 1, old used: 100,reserved: 0, actual total: 2565, total with overprovisioning: 76950; new used: 0,reserved:0; movedfromreserved: false,moveToReserveredfalse
>>>> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) release mem from host: 1, old used: 104857600,reserved: 0, total: 89322545152; new used: 0,reserved:0; movedfromreserved: false,moveToReserveredfalse
>>>> 2013-07-25 12:35:22,921 WARN [storage.secondary.SecondaryStorageManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Exception while trying to start secondary storage vm
>>>> com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM[SecondaryStorageVm|s-40-VM]Scope=interface com.cloud.dc.DataCenter; id=1
>>>>     at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:882)
>>>>     at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:618)
>>>>     at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:611)
>>>>     at com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:265)
>>>>     at com.cloud.server.ManagementServerImpl.startSecondaryStorageVm(ManagementServerImpl.java:2941)
>>>>     at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>>>>     at com.cloud.server.ManagementServerImpl.startSystemVM(ManagementServerImpl.java:3069)
>>>>     at org.apache.cloudstack.api.command.admin.systemvm.StartSystemVMCmd.execute(StartSystemVMCmd.java:106)
>>>>     at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
>>>>     at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>>>>
>>>> The following null host ids in the following line roughly half way through
>>>> this snippet appears suspect:
>>>>
>>>> 56152bf4a9a3 ]) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: null new host id: null host id before state transition: 1
>>>>
>>>> The vmops.log is too large to attach, but I have attached the Marvin
>>>> configuration. Any ideas as to why the SSVm and CPVM are not starting?
>>>>
>>>> Thanks, for your help,
>>>> -John
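
Since vmops.log is too large to read by hand, the avoid set reported by the planner can be pulled out of it with a short script. This is only a minimal sketch, assuming each log entry sits on one line and uses the "Deploy avoids pods: ..., clusters: [...], hosts: [...]" format quoted above. The names `parse_avoid_set` and `_parse_ids` are hypothetical helpers, not part of CloudStack.

```python
import re

# Matches the "Deploy avoids ..." message as it appears in the quoted log.
# Each field is either the literal "null" or a bracketed list of ids.
AVOID_RE = re.compile(
    r"Deploy avoids pods: (?P<pods>null|\[[^\]]*\]), "
    r"clusters: (?P<clusters>null|\[[^\]]*\]), "
    r"hosts: (?P<hosts>null|\[[^\]]*\])"
)


def _parse_ids(field):
    """Turn 'null' or '[1, 2]' into a list of ints (hypothetical helper)."""
    if field == "null":
        return []
    inner = field.strip("[]").strip()
    return [int(x) for x in inner.split(",")] if inner else []


def parse_avoid_set(line):
    """Return the avoided pods, clusters and hosts for one log line, or None."""
    match = AVOID_RE.search(line)
    if match is None:
        return None
    return {name: _parse_ids(value) for name, value in match.groupdict().items()}


# Sample entry copied from the log extract in this thread.
sample = (
    "2013-07-25 12:35:22,888 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] "
    "(Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) "
    "Deploy avoids pods: null, clusters: [1], hosts: [1]"
)
print(parse_avoid_set(sample))
```

Running this over every line of the log would show when cluster 1 and host 1 first enter the avoid set, which may help narrow down what put them there.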