Thanks, Rob, for clarifying!
- Takenori
(2013/09/18 10:01), Robert Coli wrote:
On Tue, Sep 17, 2013 at 5:46 PM, Takenori Sato <ts...@cloudian.com> wrote:
> So in fact, Cassandra's incremental backup just hard-links all the
new SSTable files generated during the incremental backup period.
Those files could contain any data, not just the data updated,
inserted, or deleted in this period, correct?
Correct.
But over time, SSTable files that are old enough usually end up
shared across multiple snapshots.
To be clear, the "incremental backup" feature backs up the data
modified in that period: between full snapshots, it writes only the
newly flushed files to the incremental backup dir, as hard links.
http://www.datastax.com/docs/1.0/operations/backup_restore
"
When incremental backups are enabled (disabled by default), Cassandra
hard-links each flushed SSTable to a backups directory under the
keyspace data directory. This allows you to store backups offsite
without transferring entire snapshots. Also, incremental backups
combine with snapshots to provide a dependable, up-to-date backup
mechanism.
"
What Takenori is referring to is that a full snapshot is in some ways
an "incremental backup", because it shares hard-linked SSTables with
other snapshots.
=Rob
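
For anyone reading this thread later, here is a minimal sketch of the hard-link behaviour discussed above. It is not Cassandra's code: the file name, the directory names, and the helper hard_link_into are hypothetical and chosen only for illustration. It shows that a file linked into a "backups" dir and a "snapshots" dir shares one inode with the original (so no data is copied), and that the linked copies stay readable after the live file is deleted, as would happen after compaction.

```python
import os
import tempfile


def hard_link_into(src_path, dest_dir):
    """Hard-link src_path into dest_dir and return the new path (hypothetical helper)."""
    os.makedirs(dest_dir, exist_ok=True)
    dest_path = os.path.join(dest_dir, os.path.basename(src_path))
    os.link(src_path, dest_path)  # creates a second name for the same inode, no data copy
    return dest_path


def demo():
    with tempfile.TemporaryDirectory() as data_dir:
        # Stand-in for a freshly flushed SSTable; the name is made up.
        sstable = os.path.join(data_dir, "example-Data.db")
        with open(sstable, "wb") as f:
            f.write(b"x" * 1024)

        # Assumed layout for illustration: one link for the incremental
        # backup, one for a full snapshot.
        backup = hard_link_into(sstable, os.path.join(data_dir, "backups"))
        snapshot = hard_link_into(sstable, os.path.join(data_dir, "snapshots", "snap1"))

        inodes = {os.stat(p).st_ino for p in (sstable, backup, snapshot)}
        same_inode = len(inodes) == 1
        link_count = os.stat(sstable).st_nlink

        # Removing the live file only drops one name; the data survives
        # as long as any hard link to it remains.
        os.remove(sstable)
        survives = os.path.getsize(backup) == 1024 and os.path.getsize(snapshot) == 1024

        return same_inode, link_count, survives


if __name__ == "__main__":
    print(demo())
```

This is also why a snapshot costs almost no extra disk space at first: the space is only "owned" by the snapshot once the live SSTable has been compacted away and the link in the snapshot dir is the last remaining name for that data.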