Re: Rules for Major Compaction

2012-06-19 Thread Jonathan Ellis
On Tue, Jun 19, 2012 at 2:30 PM, Edward Capriolo wrote:
> Your final two sentences are good ground rules. In our case we have
> some column families that have high churn, for example a gc_grace
> period of 4 days but the data is re-written completely every day.
> Write activity over time will event

Re: Rules for Major Compaction

2012-06-19 Thread Raj N
Thanks Ed. I am on 0.8.4, so I don't have the Leveled option, only SizeTiered. I have a strange problem. I have a 6-node cluster (DC1=3, DC2=3). One of the nodes has 105 GB of data whereas every other node has 60 GB, in spite of each one being a replica of the other. And I am contemplating whether I shoul

Re: Rules for Major Compaction

2012-06-19 Thread Edward Capriolo
Hey, my favorite question! It is a loaded question and it depends on your workload. The answer has evolved over time. In the old days (<0.6.5) the only way to remove tombstones was major compaction. This is not true in any modern version. (Also, in the old days you had to run cleanup to clear hints)

Rules for Major Compaction

2012-06-19 Thread Raj N
DataStax recommends against running major compactions. Edward Capriolo's Cassandra High Performance book suggests that major compaction is a good thing and should be run on a regular basis. Are there any ground rules about running major compactions? For example, if you have write-once kind of data that