I should note up front that the JVM simply does not handle heap sizes above
20G very well because the GC starts to become problematic.

Do you read rows in a uniformly random way?  If not, caching is your best
bet for reducing read latencies.  You should have enough space to cache all
of your keys, and you may be able to have an effective row cache, depending
on your row sizes and access patterns.  Obviously the OS buffer cache will
help out tremendously once warm.

Some people do run Cassandra with really high memory boxes like yours, but
it's not optimal for Cassandra.  More nodes with less hardware have many
advantages.  I suppose splitting your CF is an attempt to do something
similar.

-- 
Tyler Hobbs
Software Engineer, DataStax <http://datastax.com/>
Maintainer of the pycassa <http://github.com/pycassa/pycassa> Cassandra
Python client library

Reply via email to