> Is your entire keyset active? If not, set a sane starting point (default for
> key cache is 200,000 http://wiki.apache.org/cassandra/StorageConfiguration )
> and see what the cache hits are like. How many keys do you have? What
> was your hit rate with 100% key cache?

Also, keep in mind that the key cache will only eliminate one seek
(finding the row position in the index is exactly one seek, unless
cached by the OS). Even if you dedicate your entire memory to JVM heap
and fill it with key cache, you will never do better than avoiding
that *one* seek per read. If all your memory is spent on key cache,
nothing is left for the OS to cache row data, so you'll take the row
seek anyway and will have eliminated at most half the seek overhead.

In the best case, the row is already cached and the key cache saves
you from going to disk at all. In such a case, the key cache gave you
quite a lot. But keep in mind that if your data size is such that most
row reads are cached by the OS, then most index accesses probably
would be too, assuming the rows are significantly bigger than the
index (which is normal).
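
To put rough numbers on the argument above, here is a back-of-envelope
sketch. It assumes a simplified model that is mine, not something taken
from the Cassandra code: each read costs at most one index seek plus one
row seek, and the three hit rates are independent. The function and
parameter names are made up for illustration.

```python
def expected_seeks_per_read(key_cache_hit, os_index_hit, os_row_hit):
    """Expected disk seeks per read under a simplified two-seek model.

    Assumptions (illustrative only): one index seek and one row seek
    per read at most, and independent hit rates in the range 0..1.
    """
    for rate in (key_cache_hit, os_index_hit, os_row_hit):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("hit rates must be between 0 and 1")
    # The index seek happens only if both the key cache and the OS
    # page cache miss.
    index_seek = (1.0 - key_cache_hit) * (1.0 - os_index_hit)
    # The row seek happens whenever the OS page cache misses; the key
    # cache cannot help with this one.
    row_seek = 1.0 - os_row_hit
    return index_seek + row_seek


# Nothing cached anywhere: two seeks per read.
print(expected_seeks_per_read(0.0, 0.0, 0.0))
# Perfect key cache but cold row data: still one seek per read.
print(expected_seeks_per_read(1.0, 0.0, 0.0))
# Rows mostly in the OS cache, index even more so: compare the result
# without and with a perfect key cache.
print(expected_seeks_per_read(0.0, 0.95, 0.9))
print(expected_seeks_per_read(1.0, 0.95, 0.9))
```

In the last pair the key cache only removes the small index-miss term,
which is the point about index accesses already being served by the OS.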

I'd say the key cache is most effective when your active set is small
enough that a reasonably sized key cache will eliminate the majority
of seeks on reads without blowing away significant amounts of memory.
Especially now in 0.7 (is it backported to 0.6.x?) where the key cache
is efficiently saved and re-loaded on start, giving you guaranteed
hotness of the key cache. Also, the bigger the discrepancy between row
size and key size, the more useful I would expect the key cache to be
(i.e., the fatter the rows, the more useful the key cache).
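
A quick sketch of the "fatter rows" point: for a fixed memory budget,
compare how many key cache entries fit against how many whole rows would
fit if the same memory were left to cache row data. The per-entry and
per-row sizes below are hypothetical placeholders, not measured
Cassandra figures, and the function names are made up.

```python
def entries_that_fit(memory_bytes, bytes_per_entry):
    """How many fixed-size entries fit into a memory budget."""
    if bytes_per_entry <= 0:
        raise ValueError("bytes_per_entry must be positive")
    return memory_bytes // bytes_per_entry


def key_vs_row_coverage(memory_bytes, key_entry_bytes, row_bytes):
    """Return (keys cached, rows cached) for the same memory budget."""
    keys = entries_that_fit(memory_bytes, key_entry_bytes)
    rows = entries_that_fit(memory_bytes, row_bytes)
    return keys, rows


budget = 1024 * 1024 * 1024   # 1 GiB, hypothetical budget
key_entry = 64                # hypothetical bytes per key cache entry

# Thin rows (1 KiB): the key cache covers 16x as many reads as row data.
keys, rows = key_vs_row_coverage(budget, key_entry, 1024)
print(keys, rows, keys // rows)

# Fat rows (64 KiB): the key cache covers 1024x as many reads.
keys, rows = key_vs_row_coverage(budget, key_entry, 64 * 1024)
print(keys, rows, keys // rows)
```

Each covered read saves one seek either way, so the larger the ratio,
the more seeks a given amount of memory saves when spent on key cache.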

Ok, that was a bit unclearly stated... I'm not sure how to phrase it
sensibly. I guess the bottom line is that unless you specifically know
you need to and there are special circumstances, the key cache should
likely not be huge in comparison to available memory.

-- 
/ Peter Schuller
