Re: Practical node size limits

2012-09-09 Thread aaron morton
> The bottleneck now seems to be the repair time. If any node becomes
> too inconsistent, or needs to be replaced, the rebuild time is over
> a week.

This is why I've recommended 300GB to 400GB per node in the past. It's not a hard limit, but it seems to be a nice balance. You need to take into consideration …
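A rough back-of-the-envelope sketch of how node size drives rebuild time; the effective streaming rate below is an illustrative assumption, not a figure from this thread:

    # Illustrative rebuild-time estimate; the streaming rate is assumed.
    node_size_gb = 2500       # ~2.5TB per node, as reported in this thread
    effective_mb_per_s = 4    # assumed net rate once validation/compaction
                              # overhead is included
    seconds = node_size_gb * 1024 / effective_mb_per_s
    print("rebuild ~ %.1f days" % (seconds / 86400.0))   # ~7.4 days

At 300-400GB per node the same arithmetic lands near one day, which is roughly where the recommendation above sits.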

Re: Practical node size limits

2012-09-06 Thread Dustin Wenz
This is actually another problem that we've encountered with Cassandra; the range of platforms it can be deployed on is fairly limited. If you want to run with Oracle's JRE (which is apparently recommended), you are pretty much stuck with Linux on x86/64 (I haven't tried the new JDK on ARM yet, …)

Re: Practical node size limits

2012-09-05 Thread Rob Coli
On Sun, Jul 29, 2012 at 7:40 PM, Dustin Wenz wrote:
> We've just set up a new 7-node cluster with Cassandra 1.1.2 running
> under OpenJDK6.

It's worth noting that the Cassandra project recommends the Sun JRE. Without the Sun JRE, you might not be able to use JAMM to determine the live ratio. Very few people …
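For reference, JAMM is wired in through the JVM options in conf/cassandra-env.sh; the jar version below is what 1.1-era releases shipped, so treat it as an assumption and check your lib/ directory:

    # conf/cassandra-env.sh -- JAMM only works when this agent is loaded;
    # the jar version is release-dependent (check lib/):
    JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"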

Re: Practical node size limits

2012-09-05 Thread Віталій Тимчишин
You can try increasing the streaming throttle.

2012/9/4 Dustin Wenz:
> I'm following up on this issue, which I've been monitoring for the
> last several weeks. I thought people might find my observations
> interesting.
>
> Ever since increasing the heap size to 64GB, we've had no OOM
> conditions that …
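The throttle lives in conf/cassandra.yaml; this is the stock option (400 is the usual default, and 0 disables the cap entirely):

    # conf/cassandra.yaml -- outbound streaming throttle in megabits/sec;
    # raise it, or set 0 to uncap, if repair/bootstrap streaming is slow:
    stream_throughput_outbound_megabits_per_sec: 400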

Re: Practical node size limits

2012-09-04 Thread Dustin Wenz
I'm following up on this issue, which I've been monitoring for the last several weeks. I thought people might find my observations interesting. Ever since increasing the heap size to 64GB, we've had no OOM conditions that resulted in a JVM termination. Our nodes have around 2.5TB of data each, …
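For reference, a heap that size would be pinned in conf/cassandra-env.sh roughly as below (HEAP_NEWSIZE is an illustrative value, not one reported in the thread); a 64GB heap is far beyond the usual ~8GB guidance and mostly shifts the pressure onto GC:

    # conf/cassandra-env.sh -- 64GB matches what this thread describes;
    # HEAP_NEWSIZE below is illustrative only:
    MAX_HEAP_SIZE="64G"
    HEAP_NEWSIZE="800M"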

Re: Practical node size limits

2012-07-30 Thread Tyler Hobbs
On Mon, Jul 30, 2012 at 2:04 PM, Dustin Wenz wrote:
> CFStats reports that the bloom filter size is currently several
> gigabytes.

Just so you know, you can now control bloom filter sizes with the per-CF bloom_filter_fp_chance attribute.

--
Tyler Hobbs
DataStax
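As a hypothetical cassandra-cli example (MyCF is a placeholder name; a larger false-positive chance yields a smaller, less precise filter):

    # run inside cassandra-cli; "MyCF" is a placeholder column family.
    # raising the false-positive chance shrinks the in-memory filter:
    update column family MyCF with bloom_filter_fp_chance = 0.1;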

Re: Practical node size limits

2012-07-30 Thread Dustin Wenz
Thanks for the pointer! It sounds likely that's what I'm seeing. CFStats reports that the bloom filter size is currently several gigabytes. Is there any way to estimate how much heap space a repair would require? Is it a function of simply adding up the filter file sizes, plus some fraction of …
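One crude way to get the adding-up-the-filter-files number is to sum the *-Filter.db components on disk; a sketch, assuming the default data directory layout (on-disk size only approximates the deserialized in-heap size):

    # Sum on-disk bloom filter components; the path is an assumption,
    # adjust to your data_file_directories. On-heap size will differ.
    import glob, os

    total = sum(os.path.getsize(f)
                for f in glob.glob("/var/lib/cassandra/data/*/*-Filter.db"))
    print("bloom filters on disk: %.1f GB" % (total / 1e9))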

Re: Practical node size limits

2012-07-29 Thread Edward Capriolo
Yikes. You should read: http://wiki.apache.org/cassandra/LargeDataSetConsiderations

Essentially, what it sounds like you are now running into is this: the BloomFilters for each SSTable must exist in main memory. Repair tends to create some extra data, which normally gets compacted away later. You …
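To see how much of that memory is bloom filters, nodetool reports a per-CF figure (the exact label varies a little between versions):

    # per-column-family bloom filter footprint; label/format may vary:
    nodetool -h localhost cfstats | grep "Bloom Filter Space Used"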

Practical node size limits

2012-07-29 Thread Dustin Wenz
I'm trying to determine if there are any practical limits on the amount of data that a single node can handle efficiently, and if so, whether I've hit that limit or not. We've just set up a new 7-node cluster with Cassandra 1.1.2 running under OpenJDK6. Each node is a 12-core Xeon with 24GB of RAM …