[ https://issues.apache.org/jira/browse/CLOUDSTACK-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13807863#comment-13807863 ]
Likitha Shetty commented on CLOUDSTACK-4975: -------------------------------------------- In the reported issue an NPE is thrown during the ROOT volume deletion of a VM because - There are 2 jobs that are running in parallel. First one is the account cleanup job that deletes the account including all its resources. Second one is the VM expunge task that runs periodically to expunge any VM that has been destroyed or is in an expunging state. In this case both the jobs are trying to cleanup the same VM simultaneously. When the second job is trying to cleanup the ROOT volume of VM 205, since the volume has not yet been expunged it makes a call to the hypervisor host to delete the volume. But in the mean time the host has already deleted the volume because of the expunge request made by the first job. And this results in an NPE during state transition being made by the second job. But both jobs exit successfully 2013-10-28 14:24:25,572 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-101:job-95 = [ 805d2fd9-ce54-49db-a2a0-3479c0eeee84 ]) Done executing org.apache.cloudstack.api.command.admin.account.DeleteAccountCmd for job-95 = [ 805d2fd9-ce54-49db-a2a0-3479c0eeee84 ] 2013-10-28 14:24:25,306 DEBUG [cloud.vm.UserVmManagerImpl] (UserVm-Scavenger-1:null) Successfully cleaned up vm VM[User|newuser1f5instance1] resources as a part of expunge process > NPE while deleting the volume > ----------------------------- > > Key: CLOUDSTACK-4975 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4975 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: Storage Controller, VMware > Affects Versions: 4.2.1 > Reporter: Sailaja Mada > Assignee: Likitha Shetty > Priority: Critical > Fix For: 4.2.1 > > Attachments: newlogs.rar > > > Steps: > 1. Configure Adv zone with VMWARE Hypervisor > 2. Created shared network with ROOT domain as scope > 3. Created a user account and deployed VM using this shared network > 4. After VM is up and running , Destroy the VM > Observation: > Observed NPE while deleting the volume. > 2013-10-28 14:24:07,209 INFO [storage.resource.VmwareStorageProcessor] > (DirectAgent-442:10.102.192.23) Executing resource DestroyCommand: > {"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"739f096d-1b1d-4c20-895f-c416c95bace5","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"7e18404c-acd1-3b13-9c9c-bbaf2d28bccd","id":4,"poolType":"NetworkFilesystem","host":"10.102.192.100","path":"/cpg_vol/sailaja/newdixonps3","port":2049}},"name":"ROOT-205","size":2147483648,"path":"ROOT-205","volumeId":217,"vmName":"i-6-205-userinstance","accountId":6,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[7e18404cacd13b139c9cbbaf2d28bccd] > > i-6-205-userinstance/ROOT-205.vmdk\"]}","format":"OVA","id":217,"hypervisorType":"VMware"}},"wait":0} > 2013-10-28 14:24:07,343 INFO [storage.resource.VmwareStorageProcessor] > (DirectAgent-442:10.102.192.23) Destroy root volume and VM itself. vmName > i-6-205-userinstance > 2013-10-28 14:24:07,376 DEBUG [cloud.api.ApiServlet] (catalina-exec-9:null) > ===START=== 10.104.255.45 -- GET > command=queryAsyncJobResult&jobId=805d2fd9-ce54-49db-a2a0-3479c0eeee84&response=json&sessionkey=t%2FFTuAD%2Fvg3MVIIB7VszPAxZw40%3D&_=1382950771190 > 2013-10-28 14:24:07,383 DEBUG [vmware.mo.VirtualMachineMO] > (DirectAgent-442:10.102.192.23) Retrieved 1 networks with key : 2 > 2013-10-28 14:24:07,414 DEBUG [cloud.api.ApiServlet] (catalina-exec-9:null) > ===END=== 10.104.255.45 -- GET > command=queryAsyncJobResult&jobId=805d2fd9-ce54-49db-a2a0-3479c0eeee84&response=json&sessionkey=t%2FFTuAD%2Fvg3MVIIB7VszPAxZw40%3D&_=1382950771190 > 2013-10-28 14:24:10,374 DEBUG [cloud.api.ApiServlet] (catalina-exec-2:null) > ===START=== 10.104.255.45 -- GET > command=queryAsyncJobResult&jobId=805d2fd9-ce54-49db-a2a0-3479c0eeee84&response=json&sessionkey=t%2FFTuAD%2Fvg3MVIIB7VszPAxZw40%3D&_=1382950774191 > 2013-10-28 14:24:10,398 DEBUG [cloud.api.ApiServlet] (catalina-exec-2:null) > ===END=== 10.104.255.45 -- GET > command=queryAsyncJobResult&jobId=805d2fd9-ce54-49db-a2a0-3479c0eeee84&response=json&sessionkey=t%2FFTuAD%2Fvg3MVIIB7VszPAxZw40%3D&_=1382950774191 > 2013-10-28 14:24:12,978 DEBUG [agent.manager.AgentManagerImpl] > (AgentManager-Handler-8:null) SeqA 3-66860: Processing Seq 3-66860: { Cmd , > MgmtId: -1, via: 3, Ver: v1, Flags: 11, > [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":2,"_loadInfo":"{\n > \"connections\": []\n}","wait":0}}] } > 2013-10-28 14:24:12,984 DEBUG [agent.manager.AgentManagerImpl] > (AgentManager-Handler-8:null) SeqA 3-66860: Sending Seq 3-66860: { Ans: , > MgmtId: 187767034175903, via: 3, Ver: v1, Flags: 100010, > [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] } > 2013-10-28 14:24:13,371 DEBUG [agent.manager.DirectAgentAttache] > (DirectAgent-375:null) Ping from 5 > 2013-10-28 14:24:13,585 DEBUG [cloud.api.ApiServlet] (catalina-exec-10:null) > ===START=== 10.104.255.45 -- GET > command=queryAsyncJobResult&jobId=805d2fd9-ce54-49db-a2a0-3479c0eeee84&response=json&sessionkey=t%2FFTuAD%2Fvg3MVIIB7VszPAxZw40%3D&_=1382950777196 > 2013-10-28 14:24:13,606 DEBUG [cloud.api.ApiServlet] (catalina-exec-10:null) > ===END=== 10.104.255.45 -- GET > command=queryAsyncJobResult&jobId=805d2fd9-ce54-49db-a2a0-3479c0eeee84&response=json&sessionkey=t%2FFTuAD%2Fvg3MVIIB7VszPAxZw40%3D&_=1382950777196 > 2013-10-28 14:24:23,403 DEBUG [cloud.vm.VirtualMachineManagerImpl] > (UserVm-Scavenger-1:null) Stopped called on VM[User|newuser1f5instance1] but > the state is Expunging > 2013-10-28 14:24:23,412 DEBUG [cloud.capacity.CapacityManagerImpl] > (UserVm-Scavenger-1:null) VM state transitted from :Expunging to Expunging > with event: ExpungeOperationvm's original host id: 1 new host id: null host > id before state transition: null > 2013-10-28 14:24:23,412 DEBUG [cloud.vm.VirtualMachineManagerImpl] > (UserVm-Scavenger-1:null) Destroying vm VM[User|newuser1f5instance1] > 2013-10-28 14:24:23,412 DEBUG [cloud.vm.VirtualMachineManagerImpl] > (UserVm-Scavenger-1:null) Cleaning up NICS > 2013-10-28 14:24:23,414 DEBUG [cloud.network.NetworkManagerImpl] > (UserVm-Scavenger-1:null) Cleaning network for vm: 205 > 2013-10-28 14:24:23,417 DEBUG [cloud.storage.VolumeManagerImpl] > (UserVm-Scavenger-1:null) Cleaning storage for vm: 205 > 2013-10-28 14:24:23,419 DEBUG [cloud.storage.VolumeManagerImpl] > (UserVm-Scavenger-1:null) Skipping destroy for the volume > Vol[217|vm=205|ROOT] as its in state Expunging > 2013-10-28 14:24:23,451 DEBUG [agent.transport.Request] > (UserVm-Scavenger-1:null) Seq 1-1580729924: Sending { Cmd , MgmtId: > 187767034175903, via: 1, Ver: v1, Flags: 100011, > [{"org.apache.cloudstack.storage.command.DeleteCommand":{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"739f096d-1b1d-4c20-895f-c416c95bace5","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"7e18404c-acd1-3b13-9c9c-bbaf2d28bccd","id":4,"poolType":"NetworkFilesystem","host":"10.102.192.100","path":"/cpg_vol/sailaja/newdixonps3","port":2049}},"name":"ROOT-205","size":2147483648,"path":"ROOT-205","volumeId":217,"vmName":"i-6-205-userinstance","accountId":6,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[7e18404cacd13b139c9cbbaf2d28bccd] > > i-6-205-userinstance/ROOT-205.vmdk\"]}","format":"OVA","id":217,"hypervisorType":"VMware"}},"wait":0}}] > } > 2013-10-28 14:24:23,452 DEBUG [agent.transport.Request] > (UserVm-Scavenger-1:null) Seq 1-1580729924: Executing: { Cmd , MgmtId: > 187767034175903, via: 1, Ver: v1, Flags: 100011, > [{"org.apache.cloudstack.storage.command.DeleteCommand":{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"739f096d-1b1d-4c20-895f-c416c95bace5","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"7e18404c-acd1-3b13-9c9c-bbaf2d28bccd","id":4,"poolType":"NetworkFilesystem","host":"10.102.192.100","path":"/cpg_vol/sailaja/newdixonps3","port":2049}},"name":"ROOT-205","size":2147483648,"path":"ROOT-205","volumeId":217,"vmName":"i-6-205-userinstance","accountId":6,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[7e18404cacd13b139c9cbbaf2d28bccd] > > i-6-205-userinstance/ROOT-205.vmdk\"]}","format":"OVA","id":217,"hypervisorType":"VMware"}},"wait":0}}] > } > 2013-10-28 14:24:23,452 DEBUG [agent.manager.DirectAgentAttache] > (DirectAgent-375:null) Seq 1-1580729924: Executing request > 2013-10-28 14:24:23,452 INFO [storage.resource.VmwareStorageProcessor] > (DirectAgent-375:10.102.192.23) Executing resource DestroyCommand: > {"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"739f096d-1b1d-4c20-895f-c416c95bace5","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"7e18404c-acd1-3b13-9c9c-bbaf2d28bccd","id":4,"poolType":"NetworkFilesystem","host":"10.102.192.100","path":"/cpg_vol/sailaja/newdixonps3","port":2049}},"name":"ROOT-205","size":2147483648,"path":"ROOT-205","volumeId":217,"vmName":"i-6-205-userinstance","accountId":6,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[7e18404cacd13b139c9cbbaf2d28bccd] > > i-6-205-userinstance/ROOT-205.vmdk\"]}","format":"OVA","id":217,"hypervisorType":"VMware"}},"wait":0} > 2013-10-28 14:24:23,572 INFO [storage.resource.VmwareStorageProcessor] > (DirectAgent-375:10.102.192.23) Destroy root volume and VM itself. vmName > i-6-205-userinstance > 2013-10-28 14:24:23,605 DEBUG [vmware.mo.VirtualMachineMO] > (DirectAgent-375:10.102.192.23) Retrieved 1 networks with key : 2 > 2013-10-28 14:24:25,165 DEBUG [agent.manager.DirectAgentAttache] > (DirectAgent-442:null) Seq 1-1580729923: Response Received: > 2013-10-28 14:24:25,166 DEBUG [agent.transport.Request] > (DirectAgent-442:null) Seq 1-1580729923: Processing: { Ans: , MgmtId: > 187767034175903, via: 1, Ver: v1, Flags: 10, > [{"com.cloud.agent.api.Answer":{"result":true,"details":"Success","wait":0}}] > } > 2013-10-28 14:24:25,166 DEBUG [agent.transport.Request] > (Job-Executor-101:job-95 = [ 805d2fd9-ce54-49db-a2a0-3479c0eeee84 ]) Seq > 1-1580729923: Received: { Ans: , MgmtId: 187767034175903, via: 1, Ver: v1, > Flags: 10, { Answer } } > 2013-10-28 14:24:25,178 INFO [storage.volume.VolumeServiceImpl] > (Job-Executor-101:job-95 = [ 805d2fd9-ce54-49db-a2a0-3479c0eeee84 ]) Volume > 217 is not referred anywhere, remove it from volumes table > 2013-10-28 14:24:25,189 ERROR [storage.resource.VmwareStorageProcessor] > (DirectAgent-375:10.102.192.23) delete volume failed due to Exception: > javax.xml.ws.soap.SOAPFaultException > Message: The object has already been deleted or has not been completely > created > javax.xml.ws.soap.SOAPFaultException: The object has already been deleted or > has not been completely created > at > com.sun.xml.internal.ws.fault.SOAP11Fault.getProtocolException(SOAP11Fault.java:178) > at > com.sun.xml.internal.ws.fault.SOAPFaultBuilder.createException(SOAPFaultBuilder.java:119) > at > com.sun.xml.internal.ws.client.sei.SyncMethodHandler.invoke(SyncMethodHandler.java:108) > at > com.sun.xml.internal.ws.client.sei.SyncMethodHandler.invoke(SyncMethodHandler.java:78) > at com.sun.xml.internal.ws.client.sei.SEIStub.invoke(SEIStub.java:107) > at $Proxy91.destroyTask(Unknown Source) > at com.cloud.hypervisor.vmware.mo.BaseMO.destroy(BaseMO.java:73) > at > com.cloud.storage.resource.VmwareStorageProcessor.deleteVolume(VmwareStorageProcessor.java:1508) > at > com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.execute(StorageSubsystemCommandHandlerBase.java:114) > at > com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.handleStorageCommands(StorageSubsystemCommandHandlerBase.java:53) > at > com.cloud.hypervisor.vmware.resource.VmwareResource.executeRequest(VmwareResource.java:559) > at > com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:186) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > at java.lang.Thread.run(Thread.java:679) > 2013-10-28 14:24:25,195 DEBUG [agent.manager.DirectAgentAttache] > (DirectAgent-375:null) Seq 1-1580729924: Response Received: > 2013-10-28 14:24:25,195 DEBUG [agent.transport.Request] > (DirectAgent-375:null) Seq 1-1580729924: Processing: { Ans: , MgmtId: > 187767034175903, via: 1, Ver: v1, Flags: 10, > [{"com.cloud.agent.api.Answer":{"result":false,"details":"delete volume > failed due to Exception: javax.xml.ws.soap.SOAPFaultException\nMessage: The > object has already been deleted or has not been completely > created\n","wait":0}}] } > 2013-10-28 14:24:25,196 DEBUG [agent.transport.Request] > (UserVm-Scavenger-1:null) Seq 1-1580729924: Received: { Ans: , MgmtId: > 187767034175903, via: 1, Ver: v1, Flags: 10, { Answer } } > 2013-10-28 14:24:25,197 DEBUG [storage.volume.VolumeObject] > (UserVm-Scavenger-1:null) Failed to update state > java.lang.NullPointerException > at com.cloud.utils.fsm.StateMachine2.transitTo(StateMachine2.java:99) > at > org.apache.cloudstack.storage.volume.VolumeObject.stateTransit(VolumeObject.java:153) > at > org.apache.cloudstack.storage.volume.VolumeObject.processEvent(VolumeObject.java:288) > at > org.apache.cloudstack.storage.volume.VolumeServiceImpl.deleteVolumeCallback(VolumeServiceImpl.java:311) > at sun.reflect.GeneratedMethodAccessor130.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:616) > at > org.apache.cloudstack.framework.async.AsyncCallbackDispatcher.dispatch(AsyncCallbackDispatcher.java:142) > at > org.apache.cloudstack.framework.async.InplaceAsyncCallbackDriver.performCompletionCallback(InplaceAsyncCallbackDriver.java:26) > at > org.apache.cloudstack.framework.async.AsyncCallbackDispatcher.complete(AsyncCallbackDispatcher.java:120) > at > org.apache.cloudstack.storage.datastore.driver.CloudStackPrimaryDataStoreDriverImpl.deleteAsync(CloudStackPrimaryDataStoreDriverImpl.java:177) > at > org.apache.cloudstack.storage.volume.VolumeServiceImpl.expungeVolumeAsync(VolumeServiceImpl.java:293) > at > com.cloud.storage.VolumeManagerImpl.cleanupVolumes(VolumeManagerImpl.java:2180) > at > com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125) > at > com.cloud.vm.VirtualMachineManagerImpl.advanceExpunge(VirtualMachineManagerImpl.java:479) > at com.cloud.vm.UserVmManagerImpl.expunge(UserVmManagerImpl.java:1598) > at > com.cloud.vm.UserVmManagerImpl$ExpungeTask.run(UserVmManagerImpl.java:1767) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > at java.lang.Thread.run(Thread.java:679) > 2013-10-28 14:24:25,199 DEBUG [storage.volume.VolumeServiceImpl] > (UserVm-Scavenger-1:null) ignore delete volume status update failure, it will > be picked up by storage clean up thread later > java.lang.NullPointerException > at > org.apache.cloudstack.storage.volume.VolumeObject.processEvent(VolumeObject.java:294) > at > org.apache.cloudstack.storage.volume.VolumeServiceImpl.deleteVolumeCallback(VolumeServiceImpl.java:311) > at sun.reflect.GeneratedMethodAccessor130.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:616) > at > org.apache.cloudstack.framework.async.AsyncCallbackDispatcher.dispatch(AsyncCallbackDispatcher.java:142) > at > org.apache.cloudstack.framework.async.InplaceAsyncCallbackDriver.performCompletionCallback(InplaceAsyncCallbackDriver.java:26) > at > org.apache.cloudstack.framework.async.AsyncCallbackDispatcher.complete(AsyncCallbackDispatcher.java:120) > at > org.apache.cloudstack.storage.datastore.driver.CloudStackPrimaryDataStoreDriverImpl.deleteAsync(CloudStackPrimaryDataStoreDriverImpl.java:177) > at > org.apache.cloudstack.storage.volume.VolumeServiceImpl.expungeVolumeAsync(VolumeServiceImpl.java:293) > at > com.cloud.storage.VolumeManagerImpl.cleanupVolumes(VolumeManagerImpl.java:2180) > at > com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125) > at > com.cloud.vm.VirtualMachineManagerImpl.advanceExpunge(VirtualMachineManagerImpl.java:479) > at com.cloud.vm.UserVmManagerImpl.expunge(UserVmManagerImpl.java:1598) > at > com.cloud.vm.UserVmManagerImpl$ExpungeTask.run(UserVmManagerImpl.java:1767) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > at java.lang.Thread.run(Thread.java:679) > 2013-10-28 14:24:25,206 DEBUG [cloud.vm.VirtualMachineManagerImpl] > (Job-Executor-101:job-95 = [ 805d2fd9-ce54-49db-a2a0-3479c0eeee84 ]) Expunged > VM[User|newuser1f5instance1] > 2013-10-28 14:24:25,210 DEBUG [cloud.vm.VirtualMachineManagerImpl] > (UserVm-Scavenger-1:null) Expunged VM[User|newuser1f5instance1] -- This message was sent by Atlassian JIRA (v6.1#6144)