> > 1.) What have you found to be the best ratio of Cassandra row cache to memory free on the system for filesystem cache? Are you tuning it like an RDBMS so Cassandra has the vast majority of the RAM in the system or are you letting the filesystem cache do some of the work?
This depends on your exact case: how many rows are in the hot set. Giving too much memory to the JVM cache results in slower garbage collection with no gain in performance. There are cases (for example, large rows that are mostly read partially using get_slice) where the row cache makes things worse. I took a try-and-watch approach, changing the size of the row cache and watching the row cache hit ratio and ops/s. A hit ratio of 0.9 was enough for my case.

> > 2.) Is the Cassandra cache write-through (ie are new records held in the row cache as they're written to disk)?

Not exactly. Cassandra keeps recent writes (not rows) in memory, but after the memtable is flushed, it rereads the whole row from disk (and reconstructs it) into the row cache on the first read of that data.

> > 3.) When using the random partitioner how much difference should be expected (or has been observed) between nodes? 2%? 10%?

This depends on the data. It distributes keys almost equally between nodes, but the sizes of row data can differ between keys. In my case the difference was about 0.2%.
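
To illustrate the point about key distribution, here is a minimal Python sketch of hash-based placement. It is a simplification, not Cassandra's actual code: it assumes tokens are MD5 hashes reduced to a 2^127 ring, that node tokens are evenly spaced, and that a key belongs to the first node whose token is at or after the key's token (wrapping around). The function names and the `user%d` keys are made up for the example. It only counts keys per node; differences in row size per key would come on top of that.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def key_token(key):
    """Map a key to a ring position using its MD5 hash (simplified)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def node_tokens(node_count):
    """Evenly spaced node tokens, i.e. a manually balanced ring."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def owner(token, tokens):
    """Index of the first node whose token is >= the key token, wrapping."""
    return bisect.bisect_left(tokens, token) % len(tokens)


def keys_per_node(keys, node_count):
    """Count how many of the given keys land on each node."""
    tokens = node_tokens(node_count)
    counts = [0] * node_count
    for key in keys:
        counts[owner(key_token(key), tokens)] += 1
    return counts


def relative_spread(counts):
    """(max - min) / mean of the per-node key counts."""
    mean = sum(counts) / len(counts)
    return (max(counts) - min(counts)) / mean


if __name__ == "__main__":
    counts = keys_per_node(("user%d" % i for i in range(100000)), 4)
    print(counts, "spread: %.2f%%" % (100 * relative_spread(counts)))
```

Running it with more keys should shrink the relative spread in key counts, which is why any remaining imbalance between nodes tends to come from row sizes rather than from the number of keys.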