Hi Min and Edison,

I hope you don't mind me addressing you directly. I see that you two have done most of the work on the Snapshot parts of CS. We've been having production-impacting issues due to the bug I tried to describe below (and in ticket 692). Yes, it's my first time engaging in the community, so I hope I've taken the right approach. :)

I've also done some digging around in the CS 4.0, 4.1 and 4.2 code bases and see that large parts of the Snapshot process have been revised in 4.2. The issues we have been having were with the 4.0 and 4.1 code bases and, more particularly, were due to "snapshot ... is not recorded in DB, remove it" in CleanupSnapshotBackupCommand of NfsSecondaryStorageResource.java. Because CleanupSnapshotBackupCommand has been removed in commit https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=commit;h=27133fba7daefcea6ddba943efb9c96f23dacef2, I wonder whether this bug has therefore also been solved.
Thanks in advance.

Kind regards,
Joris van Lieshout

-----Original Message-----
From: Joris van Lieshout [mailto:jvanliesh...@schubergphilis.com]
Sent: Tuesday, 17 September 2013 15:56
To: 'dev@cloudstack.apache.org'
Subject: (CLOUDSTACK-692) The CleanupSnapshotBackup process on SSVM deletes snapshots that are still in the process of being copied to secondary storage

Hi there,

I was wondering if anyone can help us with this issue. There seems to be a situation where the CleanupSnapshotBackup process deletes vhd files belonging to an active BackupSnapshot process. I've created CLOUDSTACK-692 for it and logged as much info as possible, including the steps I use to clean the mess up after we have hit this.

We have seen it happen in CS 4.0 and 4.1.1, and from what I have seen in the code it probably also exists in 4.2. I haven't reproduced the issue in a lab because we are hitting it quite often in production and UAT, so I have all the examples I need. :) But I guess the best way to reproduce it is to create a VM with a lot of I/O activity (so snapshots will be big), enable hourly snapshots, and shorten the storage.cleanup.interval global setting so the cleanup process gets triggered more frequently.

We are hitting this on XenServer 6.0.2, but if the snapshot cleanup and backup process is generally the same across other hypervisor types, I would imagine this is relevant for them as well.

Kind regards,
Joris van Lieshout