le cong duan created CLOUDSTACK-4902:
----------------------------------------

             Summary: Fail to create snapshot with KVM when run multiple Hosts in Cluster
                 Key: CLOUDSTACK-4902
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4902
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: KVM, Management Server
    Affects Versions: 4.2.0
         Environment: Management server: CentOS 6.3 64bit
                      KVM Host: Ubuntu 12.04.1 64bit
            Reporter: le cong duan
            Priority: Critical
             Fix For: 4.2.0


When there is only one Host in the Cluster, I can always create a snapshot of the volume, both manually and automatically.
When there are multiple Hosts in the Cluster, snapshot creation fails and the snapshot is left in status "CreatedOnPrimary", although it sometimes succeeds.

The following are the error logs on the Management server when the error occurs. There are three error situations.

------> First Situation

2013-10-18 22:46:27,163 DEBUG [agent.transport.Request] (Job-Executor-122:job-120 = [ d07688b0-bd86-4259-bdc2-441c36c4727d ]) Seq 11-714015661: Received: { Ans: , MgmtId: 113353561884, via: 11, Ver: v1, Flags: 110, { CopyCmdAnswer } }
2013-10-18 22:46:27,175 DEBUG [storage.snapshot.SnapshotManagerImpl] (Job-Executor-122:job-120 = [ d07688b0-bd86-4259-bdc2-441c36c4727d ]) Failed to create snapshot
com.cloud.utils.exception.CloudRuntimeException: org.libvirt.LibvirtException: Domain snapshot not found: no snapshot with matching name '8a1d6db7-9ffe-43b5-a7df-df627329a168'
    at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
    at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
    at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
    at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1307)
    at com.cloud.storage.VolumeManagerImpl.takeSnapshot(VolumeManagerImpl.java:2719)
    at org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
    at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
    at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
2013-10-18 22:46:27,181 DEBUG [storage.volume.VolumeServiceImpl] (Job-Executor-122:job-120 = [ d07688b0-bd86-4259-bdc2-441c36c4727d ]) Take snapshot: 12 failed
com.cloud.utils.exception.CloudRuntimeException: Failed to create snapshot
    at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1040)
    at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
    at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1307)
    at com.cloud.storage.VolumeManagerImpl.takeSnapshot(VolumeManagerImpl.java:2719)
    at org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
    at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
    at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Caused by: com.cloud.utils.exception.CloudRuntimeException: org.libvirt.LibvirtException: Domain snapshot not found: no snapshot with matching name '8a1d6db7-9ffe-43b5-a7df-df627329a168'
    at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
    at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
    ... 16 more
2013-10-18 22:46:27,189 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-122:job-120 = [ d07688b0-bd86-4259-bdc2-441c36c4727d ]) Complete async job-120 = [ d07688b0-bd86-4259-bdc2-441c36c4727d ], jobStatus: 2, resultCode: 530, result: Error Code: 530 Error text: Failed to create snapshot due to an internal error creating snapshot for volume 12

------> Second Situation

2013-10-18 23:30:23,010 DEBUG [agent.transport.Request] (AgentManager-Handler-10:null) Seq 8-1133380250: Processing: { Ans: , MgmtId: 113353561884, via: 8, Ver: v1, Flags: 110, [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"result":false,"details":"Failed to backup snapshot: /usr/share/cloudstack-common/scripts/storage/qcow2/managesnapshot.sh: line 121: 17900 Segmentation fault (core dumped) $qemu_img snapshot -d \"$snapshotname\" $diskFailed to delete snapshot 0c6ec28f-eaaf-487a-a96c-cd58e6771367 for path /mnt/primary2/208ba7ed-ffcd-4cd7-9fd9-924a3196cc08","wait":0}}] }
2013-10-18 23:30:23,010 DEBUG [agent.manager.AgentAttache] (AgentManager-Handler-10:null) Seq 8-1133380250: No more commands found
2013-10-18 23:30:23,010 DEBUG [agent.transport.Request] (Job-Executor-131:job-129 = [ 1bd4b8c8-0680-470b-8577-baf7fbcf0fd2 ]) Seq 8-1133380250: Received: { Ans: , MgmtId: 113353561884, via: 8, Ver: v1, Flags: 110, { CopyCmdAnswer } }
2013-10-18 23:30:23,022 DEBUG [storage.snapshot.SnapshotManagerImpl] (Job-Executor-131:job-129 = [ 1bd4b8c8-0680-470b-8577-baf7fbcf0fd2 ]) Failed to create snapshot
com.cloud.utils.exception.CloudRuntimeException: Failed to backup snapshot: /usr/share/cloudstack-common/scripts/storage/qcow2/managesnapshot.sh: line 121: 17900 Segmentation fault (core dumped) $qemu_img snapshot -d "$snapshotname" $diskFailed to delete snapshot 0c6ec28f-eaaf-487a-a96c-cd58e6771367 for path /mnt/primary2/208ba7ed-ffcd-4cd7-9fd9-924a3196cc08
    at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
    at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
    at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
    at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1307)
    at com.cloud.storage.VolumeManagerImpl.takeSnapshot(VolumeManagerImpl.java:2719)
    at org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
    at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
    at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
2013-10-18 23:30:23,028 DEBUG [storage.volume.VolumeServiceImpl] (Job-Executor-131:job-129 = [ 1bd4b8c8-0680-470b-8577-baf7fbcf0fd2 ]) Take snapshot: 11 failed
com.cloud.utils.exception.CloudRuntimeException: Failed to create snapshot
    at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1040)
    at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
    at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1307)
    at com.cloud.storage.VolumeManagerImpl.takeSnapshot(VolumeManagerImpl.java:2719)
    at org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
    at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
    at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Caused by: com.cloud.utils.exception.CloudRuntimeException: Failed to backup snapshot: /usr/share/cloudstack-common/scripts/storage/qcow2/managesnapshot.sh: line 121: 17900 Segmentation fault (core dumped) $qemu_img snapshot -d "$snapshotname" $diskFailed to delete snapshot 0c6ec28f-eaaf-487a-a96c-cd58e6771367 for path /mnt/primary2/208ba7ed-ffcd-4cd7-9fd9-924a3196cc08
    at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
    at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
    ... 16 more
2013-10-18 23:30:23,036 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-131:job-129 = [ 1bd4b8c8-0680-470b-8577-baf7fbcf0fd2 ]) Complete async job-129 = [ 1bd4b8c8-0680-470b-8577-baf7fbcf0fd2 ], jobStatus: 2, resultCode: 530, result: Error Code: 530 Error text: Failed to create snapshot due to an internal error creating snapshot for volume 11

------> Third Situation

2013-10-19 10:39:03,220 DEBUG [storage.snapshot.SnapshotManagerImpl] (Job-Executor-43:job-206 = [ 661fcd14-8569-4e13-9e89-496adea6d054 ]) Failed to create snapshot
com.cloud.utils.exception.CloudRuntimeException: Failed to backup 8ef1d6d8-50af-4204-8f79-8b8cfd907d50 for disk /mnt/primary2/208ba7ed-ffcd-4cd7-9fd9-924a3196cc08 to /mnt/144a992e-7532-316b-a15d-8e0ae22724c0/snapshots/2/11
    at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
    at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
    at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
    at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1307)
    at com.cloud.storage.VolumeManagerImpl.takeSnapshot(VolumeManagerImpl.java:2719)
    at org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
    at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
    at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
2013-10-19 10:39:03,227 DEBUG [storage.volume.VolumeServiceImpl] (Job-Executor-43:job-206 = [ 661fcd14-8569-4e13-9e89-496adea6d054 ]) Take snapshot: 11 failed
com.cloud.utils.exception.CloudRuntimeException: Failed to create snapshot
    at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1040)
    at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
    at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1307)
    at com.cloud.storage.VolumeManagerImpl.takeSnapshot(VolumeManagerImpl.java:2719)
    at org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
    at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
    at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Caused by: com.cloud.utils.exception.CloudRuntimeException: Failed to backup 8ef1d6d8-50af-4204-8f79-8b8cfd907d50 for disk /mnt/primary2/208ba7ed-ffcd-4cd7-9fd9-924a3196cc08 to /mnt/144a992e-7532-316b-a15d-8e0ae22724c0/snapshots/2/11
    at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
    at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
    at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
    ... 16 more
2013-10-19 10:39:03,235 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-43:job-206 = [ 661fcd14-8569-4e13-9e89-496adea6d054 ]) Complete async job-206 = [ 661fcd14-8569-4e13-9e89-496adea6d054 ], jobStatus: 2, resultCode: 530, result: Error Code: 530 Error text: Failed to create snapshot due to an internal error creating snapshot for volume 11
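
For triage, the three situations above can be told apart by their exception messages. The following is a minimal sketch in Python and is not part of the original report: the labels and the function names `classify` and `count_failures` are hypothetical, and the patterns are taken only from the messages quoted in this issue, so they may need adjusting for other CloudStack versions.

```python
import re
from collections import Counter

# Hypothetical labels; each pattern is copied from an exception message
# quoted in this issue, not from any CloudStack documentation.
SIGNATURES = [
    ("libvirt-snapshot-not-found",
     re.compile(r"Domain snapshot not found")),
    ("qemu-img-segfault",
     re.compile(r"managesnapshot\.sh: line \d+: \d+ Segmentation fault")),
    ("backup-copy-failed",
     re.compile(r"Failed to backup \S+ for disk \S+ to \S+")),
]


def classify(line):
    """Return the label of the first signature found in the line, or None."""
    for label, pattern in SIGNATURES:
        if pattern.search(line):
            return label
    return None


def count_failures(lines):
    """Count how many lines match each failure signature."""
    labels = (classify(line) for line in lines)
    return Counter(label for label in labels if label is not None)
```

Passing the lines of a management server log to `count_failures` gives a rough picture of which situation dominates. Each failed job logs its message more than once (in the exception and again under "Caused by"), so the counts are per log line, not per job.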