John, I guess something is wrong on your host. Did you reinstall the agent and clean up the settings on the host (firewall rules, VMs), or reboot the host?
-Wei

2013/7/25 John Burwell <jburw...@basho.com>

> Marcus,
>
> According to the management UI, the host is up and the cluster is enabled.
> I have verified that I can ping the host from the management server
> machine. What could cause the host to land in the avoid set?
>
> Thanks,
> -John
>
> > For some reason it doesn't like your cluster:
> >
> > 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Removing from the clusterId list these clusters from avoid set: [1]
> > 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) No clusters found after removing disabled clusters and clusters in avoid list, returning.
> >
> > I'm not sure from this why the cluster is in the avoid set... maybe
> > all hosts are down, or it's disabled?
> >
> > On Thu, Jul 25, 2013 at 12:38 PM, John Burwell <jburw...@basho.com> wrote:
> >> All,
> >>
> >> After pulling and building the latest from the 4.2 branch (around 10am
> >> on 25 July 2013), the SSVM and CPVM are being created, but will not
> >> start due to capacity issues. Before this build, the system VMs were
> >> allocating and starting as expected. For this test, I completely
> >> rebuilt the branch including the system vm image (i.e. mvn -P
> >> developer,systemvm clean install) and rebuilt the database from
> >> scratch. The following is an extract of the log from the startup:
> >>
> >> 2013-07-25 12:35:22,888 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Deploy avoids pods: null, clusters: [1], hosts: [1]
> >> 2013-07-25 12:35:22,889 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) DeploymentPlanner allocation algorithm: com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_208ecd91@3a67ba87
> >> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Trying to allocate a host and storage pools from dc:1, pod:1,cluster:null, requested cpu: 100, requested ram: 104857600
> >> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Is ROOT volume READY (pool already allocated)?: No
> >> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Searching resources only under specified Pod: 1
> >> 2013-07-25 12:35:22,890 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1
> >> 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Removing from the clusterId list these clusters from avoid set: [1]
> >> 2013-07-25 12:35:22,895 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) No clusters found after removing disabled clusters and clusters in avoid list, returning.
> >> 2013-07-25 12:35:22,906 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: null new host id: null host id before state transition: 1
> >> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Hosts's actual total CPU: 2565 and CPU after applying overprovisioning: 76950
> >> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Hosts's actual total RAM: 2977418304 and RAM after applying overprovisioning: 89322545152
> >> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) release cpu from host: 1, old used: 100,reserved: 0, actual total: 2565, total with overprovisioning: 76950; new used: 0,reserved:0; movedfromreserved: false,moveToReserveredfalse
> >> 2013-07-25 12:35:22,916 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) release mem from host: 1, old used: 104857600,reserved: 0, total: 89322545152; new used: 0,reserved:0; movedfromreserved: false,moveToReserveredfalse
> >> 2013-07-25 12:35:22,921 WARN [storage.secondary.SecondaryStorageManagerImpl] (Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) Exception while trying to start secondary storage vm
> >> com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM[SecondaryStorageVm|s-40-VM]Scope=interface com.cloud.dc.DataCenter; id=1
> >>     at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:882)
> >>     at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:618)
> >>     at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:611)
> >>     at com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:265)
> >>     at com.cloud.server.ManagementServerImpl.startSecondaryStorageVm(ManagementServerImpl.java:2941)
> >>     at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
> >>     at com.cloud.server.ManagementServerImpl.startSystemVM(ManagementServerImpl.java:3069)
> >>     at org.apache.cloudstack.api.command.admin.systemvm.StartSystemVMCmd.execute(StartSystemVMCmd.java:106)
> >>     at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
> >>     at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
> >>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
> >>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> >>
> >> The null host ids in the following line, roughly halfway through this
> >> snippet, appear suspect:
> >>
> >> 56152bf4a9a3 ]) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: null new host id: null host id before state transition: 1
> >>
> >> The vmops.log is too large to attach, but I have attached the Marvin
> >> configuration. Any ideas as to why the SSVM and CPVM are not starting?
> >>
> >> Thanks for your help,
> >> -John
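
Since the full vmops.log was too large to attach, a small filter can pull out just the lines that show what the planner is avoiding for each job. The following is only a sketch: the regular expressions are written against the "Deploy avoids pods: ..., clusters: ..., hosts: ..." line exactly as it appears in the excerpt quoted in this thread, and the helper names (parse_ids, find_avoid_sets) are made up for illustration, not part of CloudStack.

```python
import re

# Pattern for the "Deploy avoids ..." line as it appears in the quoted excerpt.
# Each field is either the literal "null" or a bracketed list such as "[1]".
AVOID_RE = re.compile(
    r"Deploy avoids pods: (?P<pods>null|\[[^\]]*\]), "
    r"clusters: (?P<clusters>null|\[[^\]]*\]), "
    r"hosts: (?P<hosts>null|\[[^\]]*\])"
)

# Pattern for the job tag, e.g. "(Job-Executor-1:job-63 = [ ... ])".
JOB_RE = re.compile(r"\((?P<job>Job-Executor-\d+:job-\d+)")


def parse_ids(field):
    """Turn 'null' or '[1, 2]' into a list of ints (hypothetical helper)."""
    if field == "null":
        return []
    inner = field.strip("[]").strip()
    return [int(part) for part in inner.split(",")] if inner else []


def find_avoid_sets(lines):
    """Return one dict per 'Deploy avoids' line found in the given lines."""
    results = []
    for line in lines:
        match = AVOID_RE.search(line)
        if match is None:
            continue
        job = JOB_RE.search(line)
        results.append({
            "job": job.group("job") if job else None,
            "pods": parse_ids(match.group("pods")),
            "clusters": parse_ids(match.group("clusters")),
            "hosts": parse_ids(match.group("hosts")),
        })
    return results


# Sample line copied from the excerpt in this thread.
SAMPLE = (
    "2013-07-25 12:35:22,888 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] "
    "(Job-Executor-1:job-63 = [ b8beac94-df09-495e-a373-56152bf4a9a3 ]) "
    "Deploy avoids pods: null, clusters: [1], hosts: [1]"
)
```

To run it over a whole log, pass an open file handle to find_avoid_sets, for example `find_avoid_sets(open("vmops.log"))`. Note that it expects each log entry on a single line, so it will not match entries that a mail client has re-wrapped.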