I should note up front that the JVM simply does not handle heap sizes above roughly 20 GB very well, because garbage collection starts to become problematic.

Do you read rows in a uniformly random way? If not, caching is your best bet for reducing read latencies. You should have enough space to cache all of your keys, and you may be able to have an effective row cache, depending on your row sizes and access patterns. The OS buffer cache will also help tremendously once it is warm.

Some people do run Cassandra on very high-memory boxes like yours, but that setup is not optimal for Cassandra. Running more nodes, each with less hardware, has many advantages. I suppose splitting your CF is an attempt to do something similar.

--
Tyler Hobbs
Software Engineer, DataStax <http://datastax.com/>
Maintainer of the pycassa <http://github.com/pycassa/pycassa> Cassandra Python client library