Re: Cassandra Scaling Questions

Oleg Anastasjev Thu, 05 Aug 2010 01:43:05 -0700

> 
> 1.) What have you found to be the best ratio of Cassandra row cache to memory
free on the system for filesystem cache?  Are you tuning it like an RDBMS so
Cassandra has the vast majority of the RAM in the system or are you letting the
filesystem cache do some of the work?


This depends on your exact case, mainly on how many rows are in the hot set.
Giving too much memory to the JVM cache slows garbage collection with no gain
in performance. There are also cases (for example, large rows that are mostly
read partially using get_slice) where the row cache makes things worse. I took
a try-and-watch approach: I changed the row cache size and watched the row
cache hit ratio and ops/s. A hit ratio of 0.9 was enough for my case.
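
As a rough sketch of that try-and-watch loop (not part of Cassandra itself):
assume you record the row cache hit and request counters for each cache size
you try, then pick the smallest size that reaches your target hit ratio. The
function names and the sample numbers below are hypothetical, for illustration
only.

```python
def hit_ratio(hits, requests):
    """Fraction of row reads that were served from the row cache."""
    return hits / requests if requests else 0.0


def smallest_sufficient_cache(samples, target=0.9):
    """Return the smallest tried cache size whose hit ratio meets the target.

    samples: list of (cache_size_in_rows, hits, requests) tuples, one per
    trial run. Returns None if no trial reached the target.
    """
    good = [size for size, hits, requests in samples
            if hit_ratio(hits, requests) >= target]
    return min(good) if good else None


if __name__ == "__main__":
    # Hypothetical observations, not real measurements.
    trials = [
        (10000, 600, 1000),
        (50000, 850, 1000),
        (100000, 910, 1000),
        (200000, 930, 1000),
    ]
    for size, hits, requests in trials:
        print(size, hit_ratio(hits, requests))
    print("smallest sufficient size:", smallest_sufficient_cache(trials))
```

Watching ops/s alongside the hit ratio still matters: a larger cache that
raises the hit ratio but lengthens GC pauses may not be a net win.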

> 
> 2.) Is the Cassandra cache write-through (i.e., are new records held in the row
cache as they're written to disk)?

Not exactly. Cassandra keeps recent writes (not whole rows) in memory, in the
memtable. After the memtable is flushed, the whole row is reread from disk (and
reconstructed) into the row cache on the first read of that data.

> 
> 3.) When using the random partitioner how much difference should be expected
(or has been observed) between nodes?  2%? 10%?

This depends on the data. The random partitioner distributes keys almost equally
between nodes, but the size of the row data can differ from key to key. In my
case the difference was about 0.2%.
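
To see why key counts come out nearly equal, here is a small simulation. It is
a simplified model, not Cassandra's code: it assumes keys are placed on a
2**127 token ring by their MD5 hash (folded with a modulo, which is my
simplification) and that node tokens are evenly spaced. The key names and
function names are made up for the example.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127  # token space assumed for this model


def token_for(key):
    """Map a key to a ring token via MD5 (simplified: digest mod ring size)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % RING_SIZE


def node_tokens(n_nodes):
    """Evenly spaced tokens; node i owns the range ending at its token."""
    return [(i + 1) * RING_SIZE // n_nodes for i in range(n_nodes)]


def owner(token, tokens):
    """Index of the first node whose token is >= the key's token."""
    return bisect_left(tokens, token) % len(tokens)


def keys_per_node(n_nodes, n_keys):
    """Count how many synthetic keys land on each node."""
    tokens = node_tokens(n_nodes)
    counts = [0] * n_nodes
    for i in range(n_keys):
        counts[owner(token_for("key-%d" % i), tokens)] += 1
    return counts


if __name__ == "__main__":
    counts = keys_per_node(4, 100000)
    mean = sum(counts) / len(counts)
    print(counts)
    print("max deviation from mean: %.2f%%"
          % (100 * max(abs(c - mean) for c in counts) / mean))
```

This only models the number of keys per node. As noted above, the bytes stored
per node can still differ if some rows are much larger than others.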
