Thanks for the details. I think we were slowly arriving at a similar pattern, but you definitely helped fill in the gaps: home-brew rsync with lzop in the middle. We have RAID 1 system/commit log drives we copy to once a day, and off-cluster... maybe once a week.
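Roughly what we have in mind is something like the sketch below (untested, and the paths/hosts are just placeholders): daily, lzop-compress any new snapshot SSTables onto the RAID 1 backup drive; weekly, rsync the compressed copies off-cluster.

import os
import subprocess

# Placeholders -- adjust for the real keyspace/snapshot layout.
SNAPSHOT_DIR = "/var/lib/cassandra/data/Keyspace1/snapshots/latest"
LOCAL_BACKUP = "/mnt/raid1_backup/cassandra"
REMOTE = "backuphost:/backups/cassandra/node01"

def compress_new_sstables():
    # SSTables are immutable, so anything we already compressed never
    # needs to be copied again.
    for name in os.listdir(SNAPSHOT_DIR):
        src = os.path.join(SNAPSHOT_DIR, name)
        dst = os.path.join(LOCAL_BACKUP, name + ".lzo")
        if os.path.isfile(src) and not os.path.exists(dst):
            with open(dst, "wb") as out:
                subprocess.check_call(["lzop", "-c", src], stdout=out)

def push_off_cluster():
    # Weekly: ship the compressed copies to an off-cluster box.
    subprocess.check_call(["rsync", "-a", "--partial",
                           LOCAL_BACKUP + "/", REMOTE])

if __name__ == "__main__":
    compress_new_sstables()
    push_off_cluster()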
Thanks

On Tue, Nov 9, 2010 at 12:04 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> On Tue, Nov 9, 2010 at 8:15 AM, Wayne <wav...@gmail.com> wrote:
> > I got some very good advice on manual compaction, so I thought I would
> > throw out another question on RAID/backup strategies for production
> > clusters.
> >
> > We are debating going with RAID 0 vs. RAID 10 on our nodes for data
> > storage. Currently all storage we use is RAID 10, since drives always
> > fail and RAID 10 basically makes a drive failure a non-event. With
> > Cassandra and a replication factor of 3, we are starting to think that
> > maybe RAID 0 is good enough. Also, since we are buying many more
> > inexpensive servers, RAID 0 just seems to hit that price point a lot
> > better.
> >
> > The problem then becomes: how do we deal with the drives that WILL fail
> > in a RAID 0 node? We are trying to use snapshots etc. to back up the
> > data, but it is slow (hours) and slows down the entire node. We assume
> > this will work if we back up at least every 2 days, in that hinted
> > handoff and reads could help bring the data back into sync. If we
> > cannot back up every 1-2 days, then we are stuck with nodetool repair,
> > decommission, etc., and relying on some of Cassandra's built-in
> > capabilities, but there things become more out of our control and we
> > are "afraid" to trust it. Like many in recent posts, we have been less
> > than successful in testing this out on the 0.6.x branch.
> >
> > Can anyone share the decisions they made here and how they dealt with
> > these issues? Coming from the relational world, RAID 10 has been an
> > "assumption" for years, and we are not sure whether this assumption
> > should be dropped or held on to. Our nodes in dev are currently around
> > 500 GB, so for us the question is how we can restore a node with this
> > amount of data and how long it will take. Drives can and will fail; how
> > can we make recovery a non-event? What is our total recovery time
> > window? We want it to be in hours after drive replacement (which will
> > be in minutes).
> >
> > Thanks.
> >
> > Wayne
>
> Wayne,
>
> We were more worried about a DR scenario.
>
> Since SSTables are write-once, they make good candidates for incremental
> and/or differential backups. One option is to run Cassandra snapshots and
> do incremental backups of that directory.
>
> We are doing something somewhat cool that I wanted to share. I hacked
> together an application that is something like cassandra/hadoop/rsync.
> Essentially, it takes the SSTables from each node that are not yet in
> Hadoop and copies them there, and writes an index file of what SSTables
> lived on that node at the time of the snapshot. This gives us a couple of
> days of retention as well.
>
> Snapshots X times daily and off cluster once a day. Makes me feel safer
> about our RAID 0.
>
> I have seen you mention in two threads that you are looking to do
> 500 GB/node. You have brought up the point yourself: "How long will it
> take to recover a 500 GB node?" Good question. Neighbour nodes need to
> anti-compact and stream data to the new node. (This is being optimized in
> 0.7 but still involves some heavy lifting.) You may want to look at more
> nodes with less storage per node if you are worried about how long
> recovering a RAID 0 node will take. These things can take time (depending
> on hardware and load) and pretty much need to restart from zero if they
> do not complete.
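P.S. My rough reading of the snapshot-to-Hadoop copy step you describe is the sketch below. This is just how I picture it, not your actual code; the hadoop fs invocations and all paths here are my own guesses.

import os
import socket
import subprocess
import tempfile
import time

# Assumed/placeholder locations -- not from the real tool.
SNAPSHOT_DIR = "/var/lib/cassandra/data/Keyspace1/snapshots/latest"
HDFS_DIR = "/backups/cassandra/" + socket.gethostname()

def in_hdfs(path):
    # SSTables are write-once, so "already in HDFS" means "already backed up".
    return subprocess.call(["hadoop", "fs", "-test", "-e", path]) == 0

def backup_snapshot():
    names = sorted(n for n in os.listdir(SNAPSHOT_DIR)
                   if os.path.isfile(os.path.join(SNAPSHOT_DIR, n)))
    for name in names:
        dst = HDFS_DIR + "/" + name
        if not in_hdfs(dst):
            subprocess.check_call(["hadoop", "fs", "-put",
                                   os.path.join(SNAPSHOT_DIR, name), dst])
    # Index file: the full SSTable list this node held at snapshot time,
    # so a restore knows exactly which files to pull back.
    with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
        tmp.write("\n".join(names) + "\n")
    subprocess.check_call(["hadoop", "fs", "-put", tmp.name,
                           "%s/INDEX-%d" % (HDFS_DIR, int(time.time()))])
    os.unlink(tmp.name)

if __name__ == "__main__":
    backup_snapshot()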