Dumb question, but it's been referenced twice now: which files are the SSTables, and why is backing them up incrementally a win?
Or should I not bother to understand the internals, and instead just roll with the "back up my keyspace(s) and system in a compressed tar" strategy? It may be excessive, but it's guaranteed to work, and to work easily (which I like a great deal).

will

On Fri, Apr 29, 2011 at 4:58 AM, Daniel Doubleday <daniel.double...@gmx.net> wrote:

> What we are about to set up is a time-machine-like backup. This is more
> like an add-on to the S3 backup.
>
> Our boxes have an additional larger drive for local backup. We create a new
> backup snapshot every x hours which hardlinks the files in the previous
> snapshot (a bit like Cassandra's incremental_backups thing) and then we sync
> that snapshot dir with the Cassandra data dir. We can do archiving / backup
> to an external system from there without impacting the main data RAID.
>
> But the main reason to do this is to have an 'omg we screwed up big time
> and deleted / corrupted data' recovery.
>
> On Apr 28, 2011, at 9:53 PM, William Oberman wrote:
>
> Even with N nodes for redundancy, I still want to have backups. I'm an
> Amazon person, so naturally I'm thinking S3. Reading over the docs, and
> messing with nodetool, it looks like each new snapshot contains the previous
> snapshot as a subset (and I've read how Cassandra uses hard links to avoid
> excessive disk use). When does that pattern break down?
>
> I'm basically debating if I can do an "rsync"-like backup, or if I should do
> a compressed tar backup. And I obviously want multiple points in time. S3
> does allow file versioning, if a file or file name is changed/reused over
> time (only matters in the rsync case). My only concerns with compressed
> tars are that I'll have to have free space to create the archive and I get no
> "delta" space savings on the backup (the former is solved by not allowing
> the disk space to get so low and/or adding more nodes to bring down the
> space, the latter is solved by S3 being really cheap anyways).
>
> --
> Will Oberman
> Civic Science, Inc.
> 3030 Penn Avenue., First Floor
> Pittsburgh, PA 15201
> (M) 412-480-7835
> (E) ober...@civicscience.com

--
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com
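
For reference, the hard-link snapshot scheme quoted above could be sketched roughly like this in Python. This is only an illustration under assumptions, not Daniel's actual scripts: the function names (make_snapshot, _unchanged) and the "same size and same mtime means unchanged" check are hypothetical choices, and the snapshot directories are assumed to live on the same filesystem as each other (hard links cannot cross filesystems). The idea is that a file which has not changed since the previous snapshot is hard-linked instead of copied, so it costs no extra space on the backup drive, while new files are copied from the data dir and files that have disappeared are simply not carried over.

```python
import os
import shutil


def _unchanged(live_path, old_path):
    """Assumption: a file with the same size and mtime has not changed."""
    a, b = os.stat(live_path), os.stat(old_path)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def make_snapshot(data_dir, prev_snapshot, new_snapshot):
    """Mirror data_dir into new_snapshot.

    Files that are unchanged relative to prev_snapshot (which may be None)
    are hard-linked from it; everything else is copied from data_dir.
    Returns a (linked, copied) tuple of file counts.
    """
    linked = copied = 0
    for root, _dirs, files in os.walk(data_dir):
        rel = os.path.relpath(root, data_dir)
        target_dir = os.path.normpath(os.path.join(new_snapshot, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            live = os.path.join(root, name)
            dst = os.path.join(target_dir, name)
            old = None
            if prev_snapshot is not None:
                old = os.path.normpath(os.path.join(prev_snapshot, rel, name))
            if old and os.path.isfile(old) and _unchanged(live, old):
                os.link(old, dst)         # no extra disk space used
                linked += 1
            else:
                shutil.copy2(live, dst)   # copy2 preserves mtime for the next run
                copied += 1
    return linked, copied
```

Each snapshot directory then looks like a full copy of the data dir at that point in time, and deleting an old snapshot only frees the files no newer snapshot still links to.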