France created CLOUDSTACK-6060:
----------------------------------

             Summary: Excessive use of LVM snapshots on XenServer, that leads to snapshot failure and unnecessary disk usage.
                 Key: CLOUDSTACK-6060
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-6060
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server, XenServer
    Affects Versions: 4.1.1
         Environment: CS 4.1.1, XS S602E027
            Reporter: France

When a user created multiple snapshots in the CS GUI (in my case 3 daily, 2 weekly and 2 monthly), snapshot creation soon failed because the maximum number of LVM snapshots on XenServer was reached.

From SMlog on XenServer:

[9294] 2014-02-07 15:16:58.326838 ***** vdi_snapshot: EXCEPTION SR.SROSError, The snapshot chain is too long
  File "/opt/xensource/sm/SRCommand.py", line 94, in run
    return self._run_locked(sr)
  File "/opt/xensource/sm/SRCommand.py", line 131, in _run_locked
    return self._run(sr, target)
  File "/opt/xensource/sm/SRCommand.py", line 170, in _run
    return target.snapshot(self.params['sr_uuid'], self.vdi_uuid)
  File "/opt/xensource/sm/LVHDSR.py", line 1440, in snapshot
    return self._snapshot(snapType)
  File "/opt/xensource/sm/LVHDSR.py", line 1509, in _snapshot
    raise xs_errors.XenError('SnapshotChainTooLong')
  File "/opt/xensource/sm/xs_errors.py", line 49, in __init__
    raise SR.SROSError(errorcode, errormessage)

From CS:

WARN [xen.resource.CitrixResourceBase] (DirectAgent-150:) ManageSnapshotCommand operation: create Failed for snapshotId: 489, reason: SR_BACKEND_FAILURE_109The snapshot chain is too long
SR_BACKEND_FAILURE_109The snapshot chain is too long
    at com.xensource.xenapi.Types.checkResponse(Types.java:1936)
    at com.xensource.xenapi.Connection.dispatch(Connection.java:368)
    at com.cloud.hypervisor.xen.resource.XenServerConnectionPool$XenServerConnection.dispatch(XenServerConnectionPool.java:909)
    at com.xensource.xenapi.VDI.miamiSnapshot(VDI.java:1217)
    at com.xensource.xenapi.VDI.snapshot(VDI.java:1192)
    at com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:6293)
    at com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:487)
    at com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:73)
    at com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:186)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:701)

Here is the snapshot list for the VM:

[root@x1 ~]# xe vdi-list is-a-snapshot=true | grep XZY
    name-label ( RW): XZY_ROOT-385_20140125020342
    name-label ( RW): XZY_ROOT-385_20140121020342
    name-label ( RW): XZY_ROOT-385_20140121020342
    name-label ( RW): XZY_ROOT-385_20140124020342
    name-label ( RW): XZY_ROOT-385_20140122020342
    name-label ( RW): XZY_ROOT-385_20140125020342
    name-label ( RW): XZY_ROOT-385_20140123020342
    name-label ( RW): XZY_ROOT-385_20140122020342
    name-label ( RW): XZY_ROOT-385_20140125020342
    name-label ( RW): XZY_ROOT-385_20140124020342
    name-label ( RW): XZY_ROOT-385_20140120020341
    name-label ( RW): XZY_ROOT-385_20140123020342
    name-label ( RW): XZY_ROOT-385_20140124020342
    name-label ( RW): XZY_ROOT-385_20140121020342
    name-label ( RW): XZY_ROOT-385_20140122020342
    name-label ( RW): XZY_ROOT-385_20140120020341
    name-label ( RW): XZY_ROOT-385_20140122020342
    name-label ( RW): XZY_ROOT-385_20140120020341
    name-label ( RW): XZY_ROOT-385_20140123020342
    name-label ( RW): XZY_ROOT-385_20140123020342
    name-label ( RW): XZY_ROOT-385_20140122020342
    name-label ( RW): XZY_ROOT-385_20140120020341
    name-label ( RW): XZY_ROOT-385_20140123020342
    name-label ( RW): XZY_ROOT-385_20140124020342
    name-label ( RW): XZY_ROOT-385_20140121020342
    name-label ( RW): XZY_ROOT-385_20140124020342
    name-label ( RW): XZY_ROOT-385_20140121020342
    name-label ( RW): XZY_ROOT-385_20140120020341

I see lots of other LVM snapshots:

xe vdi-list is-a-snapshot=true | grep name-label
    name-label ( RW): XYZ_ROOT-385_20140125020342
    name-label ( RW): Template c2b3a07f-d16f-4abb-9162-55e4130a417c
    name-label ( RW): Template e80af9a4-e087-4220-977b-868fa4ec75b6
    name-label ( RW): XYZ_ROOT-385_20140121020342
    name-label ( RW): XYZ_ROOT-385_20140121020342
    name-label ( RW): XBBBC_20140206040441
    name-label ( RW): XYZ_ROOT-385_20140124020342
    name-label ( RW): OCWWW_20140112000011
    name-label ( RW): XYZ_ROOT-385_20140122020342
    name-label ( RW): Template routing-1
    name-label ( RW): Template aa0bcd7c-4b03-4778-a038-da80fdfb7a43
    name-label ( RW): OCWWWXXXXX_ROOT-330_20140201130342
    name-label ( RW): Template e80af9a4-e087-4220-977b-868fa4ec75b6
    name-label ( RW): XYZ_ROOT-385_20140125020342
    name-label ( RW): XYZ_ROOT-385_20140123020342
    name-label ( RW): XYZ_ROOT-385_20140122020342
    name-label ( RW): Template 58e13a51-affa-4fe2-a66b-19e89091290d
    name-label ( RW): ABCCCDDDD_ROOT-334_20140201160342
    name-label ( RW): ANON_ROOT-324_20131121124532
    name-label ( RW): Template fc0262f2-7609-498b-a1ac-ed71e1ebe7f9
    name-label ( RW): XYZ_ROOT-385_20140125020342
    name-label ( RW): Template d768db3f-6d42-48f9-bdfb-7dceccef9f3e
    name-label ( RW): XYZ_ROOT-385_20140124020342
    name-label ( RW): Template fc0262f2-7609-498b-a1ac-ed71e1ebe7f9
    name-label ( RW): XYZ_ROOT-385_20140120020341
    name-label ( RW): detached_hrosci_20130513190437
    name-label ( RW): XYZ_ROOT-385_20140123020342
    name-label ( RW): XYZ_ROOT-385_20140124020342
    name-label ( RW): Template routing-1
    name-label ( RW): NGGQQQ_ROOT-423_20140202030342
    name-label ( RW): XYZ_ROOT-385_20140121020342
    name-label ( RW): SOME_work_ROOT-295_20130322150148
    name-label ( RW): XYZ_ROOT-385_20140122020342
    name-label ( RW): XYZ_ROOT-385_20140120020341
    name-label ( RW): Template 57d7c73c-ca06-4225-8a9f-7cc5776c5610
    name-label ( RW): XYZ_ROOT-385_20140122020342
    name-label ( RW): Template e80af9a4-e087-4220-977b-868fa4ec75b6
    name-label ( RW): Template 90d42566-b956-4d9d-9685-91e19b693f86
    name-label ( RW): Template c2b3a07f-d16f-4abb-9162-55e4130a417c
    name-label ( RW): XYZ_ROOT-385_20140120020341
    name-label ( RW): DDGGWW_ROOT-234_20140118030341
    name-label ( RW): Template 8f99eaf7-6d33-4097-8d19-9cafd681f124
    name-label ( RW): Template afa9d0a0-8242-443b-ac53-b0b1c760559c
    name-label ( RW): XYZ_ROOT-385_20140123020342
    name-label ( RW): NNGGNN_ROOT-351_20140205020342
    name-label ( RW): OOIIOO_ROOT-313_20131203084833
    name-label ( RW): Template routing-1
    name-label ( RW): Template 61d4df5b-bccf-4457-ad08-0ae57ea16a7e
    name-label ( RW): XYZ_ROOT-385_20140123020342
    name-label ( RW): TTGGTT_ROOT-233_20121213090209
    name-label ( RW): XYZ_ROOT-385_20140122020342
    name-label ( RW): XYZ_ROOT-385_20140120020341
    name-label ( RW): XYZ_ROOT-385_20140123020342
    name-label ( RW): Template 1feab759-b573-4227-8b12-8e9846ee4bd6
    name-label ( RW): Template 4e444f15-ac78-4b53-9899-c406478f99b2
    name-label ( RW): Template 690fa285-3317-45d6-a563-35ddd5af493e
    name-label ( RW): XYZ_ROOT-385_20140124020342
    name-label ( RW): XYZ_ROOT-385_20140121020342
    name-label ( RW): ABCCCDDDD_DATA-334_20140207020441
    name-label ( RW): Template 4e444f15-ac78-4b53-9899-c406478f99b2
    name-label ( RW): XYZ_ROOT-385_20140124020342
    name-label ( RW): Template c2b3a07f-d16f-4abb-9162-55e4130a417c
    name-label ( RW): Template c1e4d036-bd10-4b33-803a-9e34f2c755fe
    name-label ( RW): XYZ_ROOT-385_20140121020342
    name-label ( RW): XYZ_ROOT-385_20140120020341

Is there a reason why the LVM snapshot is not destroyed after the actual backup is made to NFS storage?
If it's really necessary to keep snapshots, we must limit their number to below 32, or whatever the limit in XenServer is.
It's also crazy to keep snapshots if each uses the same amount of storage as the original VM on the production iSCSI cluster. So from one VM of 30GB with 28 LVM snapshots I get 840GB of usage?
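
A rough sketch of how the listings above could be tallied per volume, assuming each snapshot name-label ends in an underscore plus a 14-digit timestamp, as in the output shown. The function names and the `limit` parameter are hypothetical (the real chain limit in XenServer is not confirmed here), and the usage estimate takes the worst case raised in the question, where every snapshot occupies the full size of the volume; actual LVHD allocation may differ.

```python
import re
from collections import Counter

# Assumed line shape: "name-label ( RW): <volume>_<YYYYMMDDhhmmss>"
# Template entries have no timestamp suffix, so they are skipped.
SNAPSHOT_LABEL = re.compile(r"name-label \( RW\): (?P<volume>.+)_(?P<stamp>\d{14})\s*$")


def count_snapshots(xe_output):
    """Count snapshot VDIs per volume from `xe vdi-list is-a-snapshot=true` text."""
    counts = Counter()
    for line in xe_output.splitlines():
        match = SNAPSHOT_LABEL.search(line)
        if match:
            counts[match.group("volume")] += 1
    return counts


def volumes_at_or_over(counts, limit):
    """Return volume names whose snapshot count has reached `limit`, sorted."""
    return sorted(name for name, n in counts.items() if n >= limit)


def worst_case_usage_gb(volume_size_gb, snapshot_count):
    """Worst case only: assumes each snapshot takes the full volume size."""
    return volume_size_gb * snapshot_count
```

The text of `xe vdi-list is-a-snapshot=true` would be passed to `count_snapshots`, and `worst_case_usage_gb(30, 28)` reproduces the 840GB figure from the question.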