[ https://issues.apache.org/jira/browse/CLOUDSTACK-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sadhu suresh updated CLOUDSTACK-5713: ------------------------------------- Comment: was deleted (was: tested on upgrade environment (upgrade from 4.2.1. to 4.3).KVM noticed runtime exception while synching the shutdown vm state which is shut-downed outside of CS. 14-01-23 19:58:04,010 DEBUG [c.c.c.CapacityManagerImpl] (AgentConnectTaskPool-3:ctx-776109f5) VM state transitted from :Starting to Stopped with event: AgentReportStoppedvm's original host id: 1 new host id: null host id before state transition: 1 2014-01-23 19:58:04,023 INFO [o.a.c.s.v.VolumeServiceImpl] (AgentConnectTaskPool-2:ctx-6f12c788) Volume Sync did not find ROOT-10 ready on image store 1, will request download to start/resume shortly 2014-01-23 19:58:04,023 INFO [o.a.c.s.v.VolumeServiceImpl] (AgentConnectTaskPool-2:ctx-6f12c788) Skip downloading volume 18 since no download url is specified. 2014-01-23 19:58:04,023 INFO [o.a.c.s.v.VolumeServiceImpl] (AgentConnectTaskPool-2:ctx-6f12c788) Skip downloading volume 16 since no download url is specified. 2014-01-23 19:58:04,023 INFO [o.a.c.s.v.VolumeServiceImpl] (AgentConnectTaskPool-2:ctx-6f12c788) Skip downloading volume 14 since no download url is specified. 2014-01-23 19:58:04,023 INFO [o.a.c.s.v.VolumeServiceImpl] (AgentConnectTaskPool-2:ctx-6f12c788) Skip downloading volume 18 since no download url is specified. 2014-01-23 19:58:04,024 INFO [o.a.c.s.v.VolumeServiceImpl] (AgentConnectTaskPool-2:ctx-6f12c788) Skip downloading volume 16 since no download url is specified. 2014-01-23 19:58:04,024 INFO [o.a.c.s.v.VolumeServiceImpl] (AgentConnectTaskPool-2:ctx-6f12c788) Skip downloading volume 14 since no download url is specified. 2014-01-23 19:58:04,212 DEBUG [c.c.a.t.Request] (AgentConnectTaskPool-2:ctx-6f12c788) Seq 3-1996685317: Sending { Cmd , MgmtId: 7322717913215, via: 3(s-1-VM), Ver: v1, Flags: 100111, [{"com.cloud.agent.api.storage.ListTemplateCommand":{"store":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.147.28.7/export/home/sadhu/felton/secondary","_role":"Image"}},"wait":0}}] } 2014-01-23 19:58:04,291 DEBUG [c.c.c.CapacityManagerImpl] (AgentConnectTaskPool-4:ctx-67b75d21) Hosts's actual total CPU: 12404 and CPU after applying overprovisioning: 12404 2014-01-23 19:58:04,291 DEBUG [c.c.c.CapacityManagerImpl] (AgentConnectTaskPool-4:ctx-67b75d21) Hosts's actual total RAM: 8244576256 and RAM after applying overprovisioning: 8244576256 2014-01-23 19:58:04,293 DEBUG [c.c.c.CapacityManagerImpl] (AgentConnectTaskPool-3:ctx-776109f5) Hosts's actual total CPU: 12404 and CPU after applying overprovisioning: 12404 2014-01-23 19:58:04,293 DEBUG [c.c.c.CapacityManagerImpl] (AgentConnectTaskPool-3:ctx-776109f5) Hosts's actual total RAM: 8244576256 and RAM after applying overprovisioning: 8244576256 2014-01-23 19:58:04,365 DEBUG [c.c.c.CapacityManagerImpl] (AgentConnectTaskPool-3:ctx-776109f5) release cpu from host: 1, old used: 1000,reserved: 0, actual total: 12404, total with overprovisioning: 12404; new used: 500,reserved:0; movedfromreserved: false,moveToReserveredfalse 2014-01-23 19:58:04,366 DEBUG [c.c.c.CapacityManagerImpl] (AgentConnectTaskPool-3:ctx-776109f5) release mem from host: 1, old used: 1073741824,reserved: 0, total: 8244576256; new used: 536870912,reserved:0; movedfromreserved: false,moveToReserveredfalse 2014-01-23 19:58:04,371 DEBUG [c.c.c.CapacityManagerImpl] (AgentConnectTaskPool-4:ctx-67b75d21) release cpu from host: 4, old used: 2000,reserved: 0, actual total: 12404, total with overprovisioning: 12404; new used: 1500,reserved:0; movedfromreserved: false,moveToReserveredfalse 2014-01-23 19:58:04,371 DEBUG [c.c.c.CapacityManagerImpl] (AgentConnectTaskPool-4:ctx-67b75d21) release mem from host: 4, old used: 2281701376,reserved: 0, total: 8244576256; new used: 1744830464,reserved:0; movedfromreserved: false,moveToReserveredfalse 2014-01-23 19:58:04,654 ERROR [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-4:ctx-67b75d21) Monitor ClusteredVirtualMachineManagerImpl says there is an error in the connect process for 4 due to We cannot proceed with stop VM VM[User|VM-f77326e9-ecc5-4963-9ae5-3730c5df60dd] since it is not in 'Stopping' state, current state: Stopped com.cloud.utils.exception.CloudRuntimeException: We cannot proceed with stop VM VM[User|VM-f77326e9-ecc5-4963-9ae5-3730c5df60dd] since it is not in 'Stopping' state, current state: Stopped at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1434) at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStop(VirtualMachineManagerImpl.java:1334) at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1302) at com.cloud.ha.HighAvailabilityManagerImpl.scheduleRestart(HighAvailabilityManagerImpl.java:346) at com.cloud.vm.VirtualMachineManagerImpl.compareState(VirtualMachineManagerImpl.java:2810) at com.cloud.vm.VirtualMachineManagerImpl.fullHostSync(VirtualMachineManagerImpl.java:2367) at com.cloud.vm.VirtualMachineManagerImpl.processConnect(VirtualMachineManagerImpl.java:3018) at com.cloud.agent.manager.AgentManagerImpl.notifyMonitorsOfConnection(AgentManagerImpl.java:495) at com.cloud.agent.manager.AgentManagerImpl.handleConnectedAgent(AgentManagerImpl.java:999) at com.cloud.agent.manager.AgentManagerImpl.access$000(AgentManagerImpl.java:117) at com.cloud.agent.manager.AgentManagerImpl$HandleAgentConnectTask.runInContext(AgentManagerImpl.java:1082) at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:679) 2014-01-23 19:58:04,659 ERROR [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-3:ctx-776109f5) Monitor ClusteredVirtualMachineManagerImpl says there is an error in the connect process for 1 due to We cannot proceed with stop VM VM[User|VM-5c4b6ff8-7d49-4de2-bbc3-fba4159d8c0e] since it is not in 'Stopping' state, current state: Stopped com.cloud.utils.exception.CloudRuntimeException: We cannot proceed with stop VM VM[User|VM-5c4b6ff8-7d49-4de2-bbc3-fba4159d8c0e] since it is not in 'Stopping' state, current state: Stopped at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1434) at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStop(VirtualMachineManagerImpl.java:1334) at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1302) at com.cloud.ha.HighAvailabilityManagerImpl.scheduleRestart(HighAvailabilityManagerImpl.java:346) at com.cloud.vm.VirtualMachineManagerImpl.compareState(VirtualMachineManagerImpl.java:2810) at com.cloud.vm.VirtualMachineManagerImpl.fullHostSync(VirtualMachineManagerImpl.java:2367) at com.cloud.vm.VirtualMachineManagerImpl.processConnect(VirtualMachineManagerImpl.java:3018) at com.cloud.agent.manager.AgentManagerImpl.notifyMonitorsOfConnection(AgentManagerImpl.java:495) at com.cloud.agent.manager.AgentManagerImpl.handleConnectedAgent(AgentManagerImpl.java:999) at com.cloud.agent.manager.AgentManagerImpl.access$000(AgentManagerImpl.java:117) at com.cloud.agent.manager.AgentManagerImpl$HandleAgentConnectTask.runInContext(AgentManagerImpl.java:1082) at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) ) > vmsync:KVM:NPE when agentmanager tries to update the destroyed vm state > during vmsync > -------------------------------------------------------------------------------------- > > Key: CLOUDSTACK-5713 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5713 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: KVM > Affects Versions: 4.3.0 > Reporter: sadhu suresh > Assignee: Kelven Yang > Priority: Critical > Fix For: 4.3.0 > > Attachments: management-server.rar > > > env:4.3,kvm > steps: > 1. create a advanced zone and add a host > 2. deploy 2 vms > 3. from host destroy the vm > 4. check the vm state in Cloudstack > actual result: > during vm sync ,agent manager throws NULLpointer exception and due to this > it fail to update the vm state. > VM":{"state":"PowerOn","host":"centos62"}},"_gatewayAccessible":true,"_vnetAccessible":true,"hostType":"Routing","hostId":3,"wait":0}}] > } > 2014-01-02 08:56:14,614 DEBUG [c.c.v.VirtualMachineManagerImpl] > (AgentManager-Handler-13:null) VM i-2-7-VM: cs state = Running and realState > = Stopped > 2014-01-02 08:56:14,615 DEBUG [c.c.v.VirtualMachineManagerImpl] > (AgentManager-Handler-13:null) VM i-2-7-VM: cs state = Running and realState > = Stopped > 2014-01-02 08:56:14,672 DEBUG [c.c.h.HighAvailabilityManagerImpl] > (AgentManager-Handler-13:null) VM does not require investigation so I'm > marking it as Stopped: VM[User|vm2] > 2014-01-02 08:56:14,673 WARN [o.a.c.alerts] (AgentManager-Handler-13:null) > alertType:: 8 // dataCenterId:: 1 // podId:: 1 // clusterId:: null // > message:: VM (name: vm2, id: 7) stopped unexpectedly on host id:3, > availability zone id:1, pod id:1 > 2014-01-02 08:56:14,786 DEBUG [c.c.h.HighAvailabilityManagerImpl] > (AgentManager-Handler-13:null) VM is not HA enabled so we're done. > 2014-01-02 08:56:14,825 WARN [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-13:null) Caught: > java.lang.NullPointerException > at > com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1268) > at > com.cloud.ha.HighAvailabilityManagerImpl.scheduleRestart(HighAvailabilityManagerImpl.java:346) > at > com.cloud.vm.VirtualMachineManagerImpl.compareState(VirtualMachineManagerImpl.java:2719) > at > com.cloud.vm.VirtualMachineManagerImpl.deltaHostSync(VirtualMachineManagerImpl.java:2325) > at > com.cloud.vm.VirtualMachineManagerImpl.processCommands(VirtualMachineManagerImpl.java:2855) > at > com.cloud.agent.manager.AgentManagerImpl.handleCommands(AgentManagerImpl.java:292) > at > com.cloud.agent.manager.AgentManagerImpl$AgentHandler.processRequest(AgentManagerImpl.java:1184) > at > com.cloud.agent.manager.AgentManagerImpl$AgentHandler.doTask(AgentManagerImpl.java:1270) > at > com.cloud.agent.manager.ClusteredAgentManagerImpl$ClusteredAgentHandler.doTask(ClusteredAgentManagerImpl.java:677) > at com.cloud.utils.nio.Task.run(Task.java:83) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:701) > 2014-01-02 08:56:14,827 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-13:null) SeqA 3-196: Sending Seq 3-196: { Ans: , > MgmtId: 7020459589851, via: 3, Ver: v1, Flags: 100010, > [{"com.cloud.agent.api.Answer":{"result":false,"wait":0}}] } > 2014-01-02 08:56:15,657 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-14:null) SeqA 4-1523: Processing Seq 4-1523: { Cmd , > MgmtId: -1, via: 4, Ver: v1, Flags: 11, > [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":3,"_loadInfo":"{\n > \"connections\": []\n}","wait":0}}] } > 2014-01-02 08:56:15,755 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-14:null) SeqA 4-1523: Sending Seq 4-1523: { Ans: , > MgmtId: 7020459589851, via: 4, Ver: v1, Flags: 100010, > [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] } > virsh # destroy i-2-7-VM > Domain i-2-7-VM destroyed > virsh # list > Id Name State > ---------------------------------------------------- > 1 s-4-VM running > 2 v-3-VM running > 3 r-6-VM running > but on cloudstack it still shows as running. > mysql> select * from vm_instance where id=7\G; > *************************** 1. row *************************** > id: 7 > name: vm2 > uuid: ff79a39f-0309-47c9-be1a-62b7f9223943 > instance_name: i-2-7-VM > state: Running > vm_template_id: 4 > guest_os_id: 112 > private_mac_address: 02:00:52:5d:00:03 > private_ip_address: 10.1.1.240 > pod_id: 1 > data_center_id: 1 > host_id: 3 > last_host_id: 3 > proxy_id: NULL > proxy_assign_time: NULL > vnc_password: U8wzZMnrqJ/JXjGSV5QeStfk3S6tTpeN > ha_enabled: 0 > limit_cpu_use: 0 > update_count: 3 > update_time: 2014-01-02 11:11:39 > created: 2014-01-02 11:11:36 > removed: NULL > type: User > vm_type: User > account_id: 2 > domain_id: 1 > service_offering_id: 1 > reservation_id: ab656607-c35a-409e-90aa-9a029703b3ef > hypervisor_type: KVM > disk_offering_id: NULL > owner: 2 > host_name: vm2 > display_name: vm2 > desired_state: NULL > dynamically_scalable: 0 > display_vm: 1 > power_state: NULL > power_state_update_time: NULL > power_state_update_count: 0 > power_host: NULL > 1 row in set (0.00 sec) -- This message was sent by Atlassian JIRA (v6.1.5#6160)