Hi Adrià,

> We have about 50.000 CFs of varying size

Before I read any further: having 50,000 CFs is something that I would strongly discourage. Each column family is allocated 1MB of available memory (CASSANDRA-2252 <https://issues.apache.org/jira/browse/CASSANDRA-2252>), so having anything over a few hundred on a 1GB heap would be the first thing I would reconsider.

Also, a 1GB heap isn't something I'd run a production or load-test Cassandra node on. If your test machine has only 4GB, give the heap half of the total memory (2GB). For a production system you would want something more than a 4GB machine.

Here are some JIRAs and mailing list topics on the subject of large quantities of CFs:

https://issues.apache.org/jira/browse/CASSANDRA-7643
https://issues.apache.org/jira/browse/CASSANDRA-6794
https://issues.apache.org/jira/browse/CASSANDRA-7444
http://mail-archives.apache.org/mod_mbox/cassandra-user/201407.mbox/%3C10D771CCF4F243149C928D0CB32BCD78@JackKrupansky14%3E
http://mail-archives.apache.org/mod_mbox/cassandra-user/201408.mbox/%3ccaazu44m87c1yuffz08nzvtkqnww95yaw9bosy_ugu0fswl7...@mail.gmail.com%3E
http://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201408.mbox/%3CCALRai9Ao=mdkrklowrbyajjp+fc4h5tpx-ejgdqxtayqj5u...@mail.gmail.com%3E

Regards,
Mark

On 28 October 2014 17:19, Bryan Talbot <bryan.tal...@playnext.com> wrote:

> On Tue, Oct 28, 2014 at 9:02 AM, Adria Arcarons <
> adria.arcar...@greenpowermonitor.com> wrote:
>
>> Hi,
>
> Hi
>
>> We have about 50.000 CFs of varying size
>>
>> The writing test consists of a continuous flow of inserts. The inserts
>> are done inside BATCH statements in groups of 1.000 to a single CF at a
>> time to make them faster.
>>
>> The problem I’m experiencing is that, eventually, when the script has
>> been running for almost 40mins, the heap gets saturated. OldGen gets full
>> and then there is an intensive GC activity trying to free OldGen objects,
>> but it can only free very little space in each pass. Then GC saturates the
>> CPU.
>> Here are the graphs obtained with VisualVM that show this behavior:
>>
>> My total heap size is 1GB and the NewGen region is 256MB. The C* node
>> has 4GB RAM. Intel Xeon CPU E5520 @
>
> Without looking at your VM graphs, I'm going to go out on a limb here and
> say that your host is woefully underpowered to host fifty-thousand column
> families and batch writes of one-thousand statements.
>
> A 1 GB java heap size is sometimes acceptable for a unit test or playing
> around with but you can't actually expect it to be adequate for a load test
> can you?
>
> Every CF consumes some permanent heap space for its metadata. Too many CF
> are a bad thing. You probably have ten times more CF than would be
> recommended as an upper limit.
>
> -Bryan
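
As a rough back-of-the-envelope check on the figures in this thread, the sketch below multiplies the 1MB-per-CF allocation cited from CASSANDRA-2252 by the 50,000 CFs and compares the result with the 1GB heap. It assumes the worst case where every CF holds its 1MB at the same time; the function names are made up for illustration and are not part of any Cassandra tool.

```python
# Rough heap estimate using only the numbers quoted in this thread.
# Assumption: every CF holds its full 1MB allocation simultaneously (worst case).

MB_PER_CF = 1       # per-CF allocation cited from CASSANDRA-2252
NUM_CFS = 50_000    # column families in the test described above
HEAP_MB = 1024      # 1GB heap


def cf_overhead_mb(num_cfs, mb_per_cf=MB_PER_CF):
    """Memory (in MB) needed if each CF holds mb_per_cf at once."""
    return num_cfs * mb_per_cf


def max_cfs_for_heap(heap_mb, mb_per_cf=MB_PER_CF):
    """Number of CFs whose allocations alone would fill the whole heap."""
    return heap_mb // mb_per_cf


overhead = cf_overhead_mb(NUM_CFS)
print(f"Estimated per-CF overhead: {overhead} MB")
print(f"Heap available:            {HEAP_MB} MB")
print(f"Overhead / heap ratio:     {overhead / HEAP_MB:.1f}x")
print(f"CFs that would fill heap:  {max_cfs_for_heap(HEAP_MB)}")
```

Under that assumption the 50,000 CFs would want on the order of 50GB against a 1GB heap. Around 1,000 CFs would already consume the entire heap with nothing left for anything else, which fits the advice to reconsider anything over a few hundred.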