Hi Jérôme,

About this concern:

> But my Op retains my arm and asks: "Are you sure that the snapshot is safe
> and will be restored before truncating data we have?"

Make sure snapshot on truncate is enabled (auto_snapshot: true in cassandra.yaml), or take the snapshot manually before truncating. That way, if the restored dataset turns out to be worse than the current one (the one you plan to truncate), you can always roll back this truncate / restore action.

So you can tell your "Op" that this is perfectly safe: no data would be lost, even in the worst-case scenario (not counting the downtime that would be induced). Plus, this snapshot is cheap (hard links) and does not need to be moved around, or kept once you are sure the old backup fits your needs.

Truncate is definitely the way to go before restoring a backup. Parsing the data to delete it all is not really an option imho.

Then, about the technical question "how to know that a snapshot is clean", it would be good to define "clean".

You can make sure the backup is readable, consistent enough and corresponds to what you want by loading all the sstables into a testing cluster and performing some reads there before doing it in production. For example, you can use AWS EC2 machines with big EBS volumes attached, or whatever, and use sstableloader to load the data into them.

If you are just worried about SSTable format validity, there is no tool I am aware of that checks that sstables are well formed, but it might exist or be doable. Another option might be to compute a checksum of each sstable before uploading it elsewhere, and make sure it matches when downloaded back.

Those are the first things that come to my mind. Hope that is helpful. Hopefully, someone else will be able to point you to an existing tool to do this work.

Cheers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-01-12 11:33 GMT+01:00 Jérôme Mainaud <jer...@mainaud.com>:

> Hello,
>
> Is there any tool to test the integrity of a snapshot?
>
> Suppose I have a snapshot based backup stored in an external low cost
> storage system that I want to restore to a database after someone deleted
> important data by mistake.
>
> Before restoring the files, I will truncate the table to remove the
> problematic tombstones.
>
> But my Op retains my arm and asks: "Are you sure that the snapshot is safe
> and will be restored before truncating data we have?"
>
> If this scenario is a theoretical, the question is good. How can I verify
> that a snapshot is clean?
>
> Thank you,
>
> --
> Jérôme Mainaud
> jer...@mainaud.com
>
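
PS: the checksum option mentioned in the reply above could look roughly like the Python sketch below. It is not an existing Cassandra tool. The function names (sha256_of, build_manifest, verify_manifest) and the choice of SHA-256 are assumptions made for illustration. The idea is to record one checksum per file of the snapshot directory before uploading it, then recompute and compare after downloading it back.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(snapshot_dir):
    """Map each file path (relative to snapshot_dir) to its checksum."""
    root = Path(snapshot_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(snapshot_dir, manifest):
    """Return the sorted list of files that are missing or whose checksum differs."""
    root = Path(snapshot_dir)
    bad = []
    for rel_path, expected in manifest.items():
        candidate = root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            bad.append(rel_path)
    return sorted(bad)
```

The manifest returned by build_manifest would be stored next to the backup (as JSON, for instance). After the download, an empty list from verify_manifest means every file matches. Note that this only shows the files were not altered in storage or transit; it says nothing about whether the sstables were valid when the snapshot was taken, which is what the test cluster load is for.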