Hi Adrià,

> We have about 50,000 CFs of varying size

Before I read any further: having 50,000 CFs is something I would strongly
discourage. Each column family is allocated 1MB of memory (
CASSANDRA-2252 <https://issues.apache.org/jira/browse/CASSANDRA-2252>), so
having more than a few hundred on a 1GB heap is the first thing I would
reconsider. Also, 1GB isn't a heap size I'd run a production or load-test
Cassandra node on. If your test machine has only 4GB, give Cassandra half
the total memory (2GB). For a production system you would want something
more than a 4GB machine.
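
As a rough check of those numbers, here is a minimal sketch in Python. It
assumes the 1MB-per-CF figure is the only per-CF cost, which understates
real usage, and the function names are made up for illustration.

```python
# Back-of-the-envelope estimate, not a Cassandra API.
# Assumption: each CF costs a flat 1MB (figure quoted from CASSANDRA-2252).
MB_PER_CF = 1


def cf_memory_mb(num_cfs, mb_per_cf=MB_PER_CF):
    """Memory (in MB) reserved by num_cfs column families."""
    return num_cfs * mb_per_cf


def heap_oversubscription(num_cfs, heap_mb, mb_per_cf=MB_PER_CF):
    """How many times over the per-CF allocations exceed the heap."""
    return cf_memory_mb(num_cfs, mb_per_cf) / heap_mb


if __name__ == "__main__":
    print(cf_memory_mb(50_000))                           # 50000 (MB)
    print(round(heap_oversubscription(50_000, 1024), 1))  # 48.8
    print(round(heap_oversubscription(50_000, 2048), 1))  # 24.4
```

Even with a 2GB heap, 50,000 CFs would ask for roughly 24 times the entire
heap under this assumption.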

Here are some JIRAs and mailing list topics on the subject of large
quantities of CFs:

https://issues.apache.org/jira/browse/CASSANDRA-7643
https://issues.apache.org/jira/browse/CASSANDRA-6794
https://issues.apache.org/jira/browse/CASSANDRA-7444
http://mail-archives.apache.org/mod_mbox/cassandra-user/201407.mbox/%3C10D771CCF4F243149C928D0CB32BCD78@JackKrupansky14%3E
http://mail-archives.apache.org/mod_mbox/cassandra-user/201408.mbox/%3ccaazu44m87c1yuffz08nzvtkqnww95yaw9bosy_ugu0fswl7...@mail.gmail.com%3E
http://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201408.mbox/%3CCALRai9Ao=mdkrklowrbyajjp+fc4h5tpx-ejgdqxtayqj5u...@mail.gmail.com%3E


Regards,
Mark

On 28 October 2014 17:19, Bryan Talbot <bryan.tal...@playnext.com> wrote:

> On Tue, Oct 28, 2014 at 9:02 AM, Adria Arcarons <
> adria.arcar...@greenpowermonitor.com> wrote:
>
>> Hi,
>>
>> We have about 50,000 CFs of varying size
>>
>> The writing test consists of a continuous flow of inserts. The inserts
>> are done inside BATCH statements in groups of 1,000 to a single CF at a
>> time to make them faster.
>>
>> The problem I’m experiencing is that, eventually, after the script has
>> been running for almost 40 minutes, the heap gets saturated. OldGen fills
>> up and then there is intensive GC activity trying to free OldGen objects,
>> but it can free only very little space in each pass. Then GC saturates the
>> CPU. Here are the graphs obtained with VisualVM that show this behavior:
>>
>> My total heap size is 1GB, with a NewGen region of 256MB. The C* node
>> has 4GB RAM. Intel Xeon CPU E5520 @
>>
>
>
> Without looking at your VM graphs, I'm going to go out on a limb here and
> say that your host is woefully underpowered to host fifty-thousand column
> families and batch writes of one-thousand statements.
>
> A 1 GB Java heap is sometimes acceptable for a unit test or for playing
> around, but you can't actually expect it to be adequate for a load test,
> can you?
>
> Every CF consumes some permanent heap space for its metadata. Too many CFs
> are a bad thing. You probably have ten times more CFs than would be
> recommended as an upper limit.
>
> -Bryan
>
>
