Thank you, Bryan and Mark. I have redesigned my schema so that I only have 
50 CFs, and I've given the heap 2GB. Now it's working fine.

From: Mark Reddy [mailto:mark.l.re...@gmail.com]
Sent: Tuesday, 28 October 2014 18:31
To: user@cassandra.apache.org
Subject: Re: OldGen saturation

Hi Adrià,

We have about 50.000 CFs of varying size

Before I read any further, having 50,000 CFs is something that I would highly 
discourage. Each column family is allocated 1MB of available memory 
(CASSANDRA-2252<https://issues.apache.org/jira/browse/CASSANDRA-2252>), so 
having anything over a few hundred on a 1GB heap would be the first thing I 
would reconsider. Also, 1GB isn't a heap size I'd run a production or load-test 
Cassandra node on. If your test machine has only 4GB, give it half the total 
memory (2GB); for a production system you would want something more than a 4GB 
machine.
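
To put rough numbers on that, here is a minimal back-of-the-envelope sketch. 
It assumes the 1MB-per-CF figure quoted above holds as a flat cost; the real 
overhead depends on the Cassandra version and configuration. The function and 
constant names are hypothetical, chosen only for illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Assumption: the ~1MB-per-CF figure quoted in this message (CASSANDRA-2252).
PER_CF_BYTES = 1 * MIB


def cf_overhead_bytes(num_cfs, per_cf_bytes=PER_CF_BYTES):
    """Rough fixed memory reserved across all column families."""
    return num_cfs * per_cf_bytes


def heap_ratio(num_cfs, heap_bytes, per_cf_bytes=PER_CF_BYTES):
    """Per-CF overhead expressed as a multiple of the configured heap."""
    return cf_overhead_bytes(num_cfs, per_cf_bytes) / heap_bytes


if __name__ == "__main__":
    # Scenarios: the original setup, "a few hundred" CFs, and a 50-CF schema.
    for num_cfs, heap_gib in [(50_000, 1), (300, 1), (50, 2)]:
        ratio = heap_ratio(num_cfs, heap_gib * GIB)
        print(f"{num_cfs:>6} CFs on a {heap_gib}GB heap: {ratio:.2f}x the heap")
```

Under these assumptions, 50,000 CFs would want roughly 49 times a 1GB heap, 
while 50 CFs would take only about 2% of a 2GB heap.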

Here are some JIRAs and mailing list topics on the subject of large quantities 
of CFs:

https://issues.apache.org/jira/browse/CASSANDRA-7643
https://issues.apache.org/jira/browse/CASSANDRA-6794
https://issues.apache.org/jira/browse/CASSANDRA-7444
http://mail-archives.apache.org/mod_mbox/cassandra-user/201407.mbox/%3C10D771CCF4F243149C928D0CB32BCD78@JackKrupansky14%3E
http://mail-archives.apache.org/mod_mbox/cassandra-user/201408.mbox/%3ccaazu44m87c1yuffz08nzvtkqnww95yaw9bosy_ugu0fswl7...@mail.gmail.com%3E
http://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201408.mbox/%3CCALRai9Ao=mdkrklowrbyajjp+fc4h5tpx-ejgdqxtayqj5u...@mail.gmail.com%3E


Regards,
Mark

On 28 October 2014 17:19, Bryan Talbot <bryan.tal...@playnext.com> wrote:
On Tue, Oct 28, 2014 at 9:02 AM, Adria Arcarons 
<adria.arcar...@greenpowermonitor.com> wrote:
Hi,



We have about 50.000 CFs of varying size



The writing test consists of a continuous flow of inserts. The inserts are done 
inside BATCH statements in groups of 1.000 to a single CF at a time to make 
them faster.



The problem I'm experiencing is that, eventually, when the script has been 
running for almost 40 minutes, the heap gets saturated. OldGen fills up and 
then there is intensive GC activity trying to free OldGen objects, but it can 
only free very little space in each pass. Then GC saturates the CPU. Here are 
the graphs obtained with VisualVM that show this behavior:


My total heap size is 1GB, with a NewGen region of 256MB. The C* node has 
4GB RAM. Intel Xeon CPU E5520 @


Without looking at your VM graphs, I'm going to go out on a limb here and say 
that your host is woefully underpowered to host fifty-thousand column families 
and batch writes of one-thousand statements.

A 1 GB Java heap is sometimes acceptable for a unit test or for playing around, 
but you can't actually expect it to be adequate for a load test, can you?

Every CF consumes some permanent heap space for its metadata. Too many CFs are 
a bad thing. You probably have ten times more CFs than would be recommended as 
an upper limit.

-Bryan

