> 1. Is it feasible to run directly against a Cassandra data directory
> restored from an EBS snapshot? (as opposed to nodetool snapshots restored
> from an EBS snapshot).

Assuming EBS is not buggy, including honoring write barriers, and the
same for the Linux guest kernel etc., then yes. EBS snapshots of a
single volume are promised to be atomic. As such, a restore from an
EBS snapshot should be semantically identical to recovering after a
power outage or sudden reboot of the node.

I make no claims as to how well EBS snapshot atomicity is actually
tested in practice.
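
For illustration only (none of this tooling is implied above, and the
IDs and availability zone are placeholders), the snapshot/restore
cycle might look roughly like this with boto3:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Take a crash-consistent snapshot of the EBS volume holding the
    # Cassandra data directory (equivalent to the on-disk state after
    # a sudden power loss).
    snap = ec2.create_snapshot(
        VolumeId="vol-0123456789abcdef0",           # placeholder
        Description="cassandra data dir snapshot",
    )
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

    # To restore: create a volume from the snapshot, attach it to the
    # replacement node, mount it, and start Cassandra as if the node
    # had simply rebooted uncleanly.
    vol = ec2.create_volume(
        SnapshotId=snap["SnapshotId"],
        AvailabilityZone="us-east-1a",              # placeholder
    )
    ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])
    ec2.attach_volume(
        VolumeId=vol["VolumeId"],
        InstanceId="i-0123456789abcdef0",           # placeholder
        Device="/dev/xvdf",
    )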

> 2. Noting the wiki's consistent Cassandra backups advice; if I schedule
> nodetool snapshots across the cluster, should the relative age of the
> 'sibling' snapshots be a concern? How far apart can they be before its a
> problem? (seconds? minutes? hours?)

The only strict requirement from Cassandra's point of view, that I can
think of, is the tombstone problem. It is the same as for a node going
offline for an extended period: if GC grace times are exceeded, then
bringing a node back up can cause data that was deleted to re-appear
in the cluster. The same is true when restoring a node from an EBS
snapshot (essentially equivalent to the node being down for a while).
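
For illustration, a hypothetical guard (not something Cassandra or
nodetool provides) that captures this requirement would be to only
restore a snapshot that is younger than the GC grace period:

    from datetime import datetime, timezone

    # Cassandra's default gc_grace is 864000 seconds (10 days); use
    # whatever your column families are actually configured with.
    GC_GRACE_SECONDS = 864000

    def safe_to_restore(snapshot_taken_at):
        """True if the snapshot is still within the GC grace period."""
        age = (datetime.now(timezone.utc) - snapshot_taken_at).total_seconds()
        return age < GC_GRACE_SECONDS

    # e.g. safe_to_restore(datetime(2011, 4, 1, tzinfo=timezone.utc))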

Once you have satisfied that requirement, the remaining concern is
mostly application-level: to what extent is it acceptable for your
application that the cluster contains data representing different
points in time? Remember that any data not on the same row key will
essentially have its own "timeline" with respect to backup/restore,
since different rows are never guaranteed to be stored on overlapping
nodes in the cluster.

Also be aware that while per-node restores from EBS snapshots are
probably a pretty good catastrophic failure recovery technique, a
"total loss and restore" event will have an impact on consistency
beyond just going back in time - unless you can strictly co-ordinate a
fully synchronized snapshot across all nodes in the cluster (not
really feasible on EC2 without extensive mucking about in userland and
temporarily bringing down the cluster). For example, if you do one
QUORUM write to row key A followed by a QUORUM write to row key B, and
you rely on referential integrity of the data in B referring to the
data in A, that integrity can be broken after a
non-globally-consistent restore.

Whether that is a problem will be entirely up to your application.
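
To make the scenario concrete, here is a sketch of that write pattern
using the DataStax Python driver (the contact point, keyspace and
table are made up for the example). If the nodes holding row A are
restored from an older point in time than the nodes holding row B, B
can survive while the A it refers to has gone back in time:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    cluster = Cluster(["cassandra-node1"])      # placeholder contact point
    session = cluster.connect("my_keyspace")    # made-up keyspace

    write = SimpleStatement(
        "INSERT INTO items (key, value) VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.QUORUM,
    )

    session.execute(write, ("A", "payload"))      # QUORUM write to row key A
    session.execute(write, ("B", "refers-to:A"))  # QUORUM write to row key B, referring to A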

In any case, after a restore from snapshots, you'll want to do rolling
'nodetool repair' runs to make sure all data is replicated as fully as
possible, as soon as possible. At least, again, if your application
benefits from this. The only hard requirement is the repair schedule
relative to GC grace time, and that requirement does not change - just
be mindful of the timing of the EBS snapshots and what that means for
your repair schedule.
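
For illustration, a rolling repair can be as simple as something like
the following (host names are placeholders); repairing one node at a
time, and waiting for each run to finish, keeps the repairs "rolling"
rather than simultaneous:

    import subprocess

    NODES = ["cassandra-node1", "cassandra-node2", "cassandra-node3"]

    for host in NODES:
        # 'nodetool -h <host> repair' runs an anti-entropy repair on
        # that node; check=True stops the loop if a repair fails.
        subprocess.run(["nodetool", "-h", host, "repair"], check=True)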

-- 
/ Peter Schuller
