Re: Scalability question

2011-08-16 Thread Teijo Holzer
Hi, Unfortunately my data set really does grow because it's a time series. I'm going to add a trick to aggregate old data but it will still grow. That's fine, then you need to scale horizontally. Simply add a new node when the load on a node exceeds a threshold (ballpark figure here is a max

Re: Scalability question

2011-08-16 Thread Philippe
Teijo, Unfortunately my data set really does grow because it's a time series. I'm going to add a trick to aggregate old data but it will still grow. How often do you repair per day (or is it really continuous?) I've been running experiments and I wonder if your decision to perform continuous rep

Re: Scalability question

2011-08-15 Thread Teijo Holzer
Hi, we have come across this as well. We continuously run rolling repairs followed by major compactions followed by a gc() (or node restart) to get rid of all these SSTable files. Combined with aggressive TTLs on most inserts, the cluster stays nice and lean. You don't want your working

Re: Scalability question

2011-08-15 Thread Jonathan Ellis
This is more an artifact of repair's problems than compaction per se. We're addressing these in https://issues.apache.org/jira/browse/CASSANDRA-2816 and https://issues.apache.org/jira/browse/CASSANDRA-2280. On Mon, Aug 15, 2011 at 3:06 PM, Philippe wrote: >> It's another reason to avoid major / m

Re: Scalability question

2011-08-15 Thread Philippe
Forgot to mention that stopping & restarting the server brought the data directory down to 283 GB in less than 1 minute. Philippe 2011/8/15 Philippe > It's another reason to avoid major / manual compactions which create a single big SSTable. Minor compactions keep things in buckets which mea

Re: Scalability question

2011-08-15 Thread Philippe
> It's another reason to avoid major / manual compactions which create a single big SSTable. Minor compactions keep things in buckets, which means newer SSTables can be compacted without needing to read the bigger older tables. I've never run a major/manual compaction on this ring. In my case runni

Re: Scalability question

2011-08-14 Thread aaron morton
Multi-threaded compaction helps there: https://issues.apache.org/jira/browse/CASSANDRA-2191. It's another reason to avoid major / manual compactions which create a single big SSTable. Minor compactions keep things in buckets, which means newer SSTables can be compacted without needing to read the bigger

Scalability question

2011-08-14 Thread Philippe
Hi, As on-disk SSTables become bigger and bigger because more data is added to the ring, compactions take longer and longer because each file is becoming bigger. Isn't there a point where compaction will take so long that it just can't keep up with the amount of data? It looks to me like t