Hi Jérôme,

About this concern:

> But my Op retains my arm and asks: "Are you sure that the snapshot is safe
> and will be restored before truncating data we have?"


Make sure snapshot on truncate is enabled (auto_snapshot in cassandra.yaml)
or take a snapshot manually (nodetool snapshot) before truncating. That way,
if the restored dataset turns out to be worse than the current one (the one
you plan to truncate), you can always roll back this truncate / restore
action. You can then tell your "Op" that this is safe anyway: no data would
be lost, even in the worst-case scenario (not considering the downtime that
would be induced). Plus, this snapshot is cheap (hard links) and does not
need to be moved around, or kept once you are sure the old backup fits your
needs.
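
To illustrate why a hard-link snapshot is cheap and still protects you, here
is a minimal Python sketch. It is not Cassandra code: the file names are made
up, and os.remove() only stands in for what truncate does to the live file.
It assumes a filesystem that supports hard links.

```python
import os


def demo_hard_link_snapshot(workdir):
    """Show that a hard link keeps the data alive after the original name is removed."""
    # Hypothetical file names, not the real SSTable naming scheme.
    live = os.path.join(workdir, "live-Data.db")
    snap = os.path.join(workdir, "snapshot-Data.db")

    with open(live, "wb") as handle:
        handle.write(b"rows we might want back")

    # A hard link is a second name for the same inode: no data is copied.
    os.link(live, snap)
    same_inode = os.stat(live).st_ino == os.stat(snap).st_ino

    # Stand-in for truncate removing the live file.
    os.remove(live)

    # The content is still reachable through the snapshot name.
    with open(snap, "rb") as handle:
        surviving = handle.read()
    return same_inode, surviving
```

Both names point at the same inode, so the snapshot costs almost no extra
disk space until the live files are removed, and removing them does not
touch the snapshot.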

Truncate is definitely the way to go before restoring a backup. Parsing the
data to delete it all is not really an option imho.

Then, about the technical question "how to know that a snapshot is clean":
it would be good to define "clean" first. You can make sure the backup is
readable, consistent enough and corresponds to what you want by loading all
the SSTables into a test cluster and performing some reads there before
doing it in production. For example, you can use AWS EC2 machines with big
EBS volumes attached, or whatever you have, and use sstableloader to load
the data into them.

If you are just worried about the validity of the SSTable format, there is
no tool I am aware of that checks that SSTables are well formed, but it
might exist or be doable. Another option might be to compute a checksum of
each SSTable before uploading it elsewhere and make sure it matches when
downloaded back. Those are the first things that come to my mind.
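
The checksum idea could look like the Python sketch below. It is only an
illustration under my own assumptions: the function names are made up, and
SHA-256 is an arbitrary choice of digest. It records a digest for every file
in a snapshot directory before upload, then reports which files are missing
or differ after download.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(snapshot_dir):
    """Map each file path (relative to snapshot_dir) to its digest."""
    root = Path(snapshot_dir)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(snapshot_dir, manifest):
    """Return the sorted relative paths that are missing or whose digest differs."""
    root = Path(snapshot_dir)
    bad = []
    for rel_path, expected in manifest.items():
        candidate = root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            bad.append(rel_path)
    return sorted(bad)
```

You would call build_manifest() on the snapshot directory right after taking
the snapshot, store the result next to the backup (as JSON, for example),
and call verify_manifest() on the downloaded copy before truncating. Note
that this only proves the files did not change in storage or transfer, not
that they were valid SSTables to begin with.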

Hope that is helpful. Hopefully, someone else will be able to point you to
an existing tool to do this work.

Cheers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-01-12 11:33 GMT+01:00 Jérôme Mainaud <jer...@mainaud.com>:

> Hello,
>
> Is there any tool to test the integrity of a snapshot?
>
> Suppose I have a snapshot based backup stored in an external low cost
> storage system that I want to restore to a database after someone deleted
> important data by mistake.
>
> Before restoring the files, I will truncate the table to remove the
> problematic tombstones.
>
> But my Op retains my arm and asks: "Are you sure that the snapshot is safe
> and will be restored before truncating data we have?"
>
> If this scenario is a theoretical, the question is good. How can I verify
> that a snapshot is clean?
>
> Thank you,
>
> --
> Jérôme Mainaud
> jer...@mainaud.com
>
