> My last concern, and for me it is a flaw for Cassandra, and I am sad to admit
> it because I love Cassandra: how come that for 6 MB of data, Cassandra feels
> the need to fill 500 MB of RAM? I can understand the need for, let's say,
> 100 MB because of caches and several memtables being alive at the same time.
> But 500 MB of RAM is 80 times the total amount of data I have. Redis, which
> you mentioned, uses 50 MB.
It's just the way the JVM works, particularly when using the CMS GC. For efficiency reasons it will tend to use up to your maximum heap size. In the case of the default Cassandra options, the initial heap size is even specified to be equal to the maximum.

It's not that Cassandra needs that much memory for 6 MB of data; it's that the way Cassandra is configured and run by default, in combination with JVM behavior, means you end up committing a significant amount of memory per node. Increasing your 6 MB of data to 12 MB doesn't double the amount of memory the node needs.

You can change VM settings and tweak things like memtable thresholds and in-memory compaction limits to get memory use down and get away with a smaller heap, but honestly I don't recommend doing so unless you're willing to spend some time getting it right, and probably to repeat some of that work with future versions of Cassandra.

I think the bottom line is that Cassandra isn't primarily intended, out of the box, to run as one of N database servers on a single machine, and no particular effort is put into making Cassandra useful at very small heap sizes.
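To see the JVM side of this in isolation, here is a minimal sketch (the class name, the heap sizes on the command lines, and the 6 MB allocation are all illustrative, not anything Cassandra ships). Run it once with the initial heap below the maximum, and once with the two equal, as the default Cassandra options specify; the committed heap tracks the flags, not the live data:

    // HeapDemo.java (hypothetical example): shows that resident/committed
    // heap follows the -Xms/-Xmx flags rather than the live data set.
    //
    //   java -Xms64m  -Xmx512m HeapDemo
    //   java -Xms512m -Xmx512m HeapDemo   // Xms == Xmx, as Cassandra defaults do
    public class HeapDemo {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();

            // Roughly 6 MB of live data, the same order as the data set above.
            byte[] data = new byte[6 * 1024 * 1024];
            data[0] = 1; // touch it so it is actually allocated

            long mb = 1024 * 1024;
            System.out.println("live data     : ~" + (data.length / mb) + " MB");
            System.out.println("committed heap: " + (rt.totalMemory() / mb) + " MB");
            System.out.println("max heap      : " + (rt.maxMemory() / mb) + " MB");
        }
    }

In the second run the committed heap reports on the order of 512 MB even though only about 6 MB of it holds live data, which is essentially the 500 MB observation above.

-- 
/ Peter Schuller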