That will not necessarily scale, and I wouldn't recommend it - your "backup
node" will need as much disk space as an entire replica of the cluster
data. For a cluster with a couple of nodes that may be OK, for dozens of
nodes, probably not. You also lose the ability to restore individual nodes
- the only way to replace a dead node is with a full repair.
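
To put rough numbers on the disk-space point, here is a minimal sketch. It assumes data is spread evenly across the nodes, the main datacentre uses a single replication factor, and the backup datacentre keeps one full copy; the function name and the sample figures are hypothetical, not taken from any real cluster.

```python
def full_replica_size_gb(node_count: int, avg_load_per_node_gb: float,
                         replication_factor: int) -> float:
    """Estimate the size of one complete copy of the cluster's data.

    Assumes (hypothetically) an even data distribution: the cluster stores
    node_count * avg_load_per_node_gb in total, and that total contains
    replication_factor copies of every row.
    """
    if node_count < 1 or replication_factor < 1:
        raise ValueError("node_count and replication_factor must be >= 1")
    return node_count * avg_load_per_node_gb / replication_factor


if __name__ == "__main__":
    # Illustrative figures only: 500 GB per node, replication factor 3.
    for nodes in (3, 12, 48):
        size = full_replica_size_gb(nodes, 500.0, 3)
        print(f"{nodes} nodes -> single backup node needs about {size:.0f} GB")
```

Under those assumptions the backup node's disk grows linearly with the size of the main cluster, while each main-cluster node stays at roughly the same load.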


On Thu, Jun 12, 2014 at 1:38 PM, Jabbar Azam <[email protected]> wrote:

> There is another way. You create a Cassandra node in its own datacentre,
> then any changes going to the main cluster will be replicated to this node.
> You can back up from this node. In the event of a disaster, the data from
> both clusters is wiped and then replayed to the individual node. The data
> will then be replicated to the main cluster.
>
> This will also work for the case when the main cluster increases or
> decreases in size.
>
> Thanks
>
> Jabbar Azam
>
>
> On 12 June 2014 18:27, Andrew <[email protected]> wrote:
>
>> There isn’t a lot of “actual documentation” on the act of backing up, but
>> I did research for my own company into the act of backing up and
>> unfortunately, you’re not going to have a similar setup as Oracle.  There
>> are reasons for this, however.
>>
>> If you have more than one replica of the data, that means each node in
>> the cluster will likely be holding its own unique set of data.  So you
>> would need to back up the ENTIRE set of nodes in order to get an accurate
>> snapshot.  Likewise, you would need to restore it to a cluster of the
>> same size (and then run nodetool refresh to tell Cassandra to reload the
>> tables from disk).
>>
>> Copying the snapshots is easy: it’s just a bunch of files in your data
>> directory.  It’s even smaller if you use incremental backups.  I’ll
>> admit, I’m no expert on tape drives, but I’d imagine it’s as easy as
>> copy/pasting the snapshots to the drive (or whatever the equivalent tape
>> drive operation is).
>>
>> What you (and I, admittedly) would really like to see is a way to back up
>> all the logical *data*, and then simply replay it.  This is possible on
>> Oracle because it’s typically restricted to one instance (plus maybe one
>> or two standbys) that doesn’t “share” any data.  What you could do, in
>> theory, is literally select all the data in the entire cluster, dump it
>> to a file, and then simply re-load it, but the dump alone could take
>> hours, days, or even weeks to complete, depending on the size of your
>> data.  This is probably not a great solution, but hey, maybe it will work
>> for you.
>>
>> Netflix (thankfully) has posted a lot of their operational observations
>> and what not, including their utility Priam.  In their documentation, they
>> include some overviews of what they use:
>> https://github.com/Netflix/Priam/wiki/Backups
>>
>> Hope this helps!
>>
>> Andrew
>>
>> On June 12, 2014 at 6:18:57 AM, Jack Krupansky ([email protected])
>> wrote:
>>
>>   The doc for backing up – and restoring – Cassandra is here:
>>
>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_restore_c.html
>>
>> That doesn’t tell you how to move the “snapshot” to or from tape, but a
>> snapshot is the starting point for backing up Cassandra.
>>
>> -- Jack Krupansky
>>
>>  *From:* Camacho, Maria (NSN - FI/Espoo) <[email protected]>
>> *Sent:* Thursday, June 12, 2014 4:57 AM
>> *To:* [email protected]
>> *Subject:* Backup Cassandra to
>>
>>
>>  Hi there,
>>
>>
>>
>>  I'm trying to find information/instructions about backing up and
>> restoring a Cassandra DB to and from a tape unit.
>>
>>
>>
>>  I was hoping someone in this forum could help me with this since I
>> could not find anything useful in Google :(
>>
>>
>>
>>  Thanks in advance,
>>
>>  Maria
>>
>>
>>
>>
>
