After a bit of searching, I think I've found the answer I've been looking for. I guess I didn't search hard enough before sending out this email. Thank you all for the responses.
According to the DataStax documentation [1], there are two types of row cache providers:

row_cache_provider
(Default: SerializingCacheProvider) Specifies what kind of implementation to use for the row cache.

SerializingCacheProvider: Serializes the contents of the row and stores it in native memory, that is, off the JVM Heap. Serialized rows take significantly less memory than live rows in the JVM, so you can cache more rows in a given memory footprint. Storing the cache off-heap means you can use smaller heap sizes, which reduces the impact of garbage collection pauses. It is valid to specify the fully-qualified class name to a class that implements org.apache.cassandra.cache.IRowCacheProvider.

ConcurrentLinkedHashCacheProvider: Rows are cached using the JVM heap, providing the same row cache behavior as Cassandra versions prior to 0.8.

The SerializingCacheProvider is 5 to 10 times more memory-efficient than ConcurrentLinkedHashCacheProvider for applications that are not blob-intensive. However, SerializingCacheProvider may perform worse in update-heavy workload situations because it invalidates cached rows on update instead of updating them in place as ConcurrentLinkedHashCacheProvider does.

The off-heap row cache provider does indeed invalidate rows. We're going to look into using the ConcurrentLinkedHashCacheProvider. Time to read some source code! :)

Faraaz

[1] http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/configuration/configCassandra_yaml_r.html#reference_ds_qfg_n1r_1k__row_cache_provider

On Thursday, August 22, 2013 at 7:40 PM, Boris Yen wrote:
> If you are using off-heap memory for row cache, "all writes invalidate the
> entire row" should be correct.
>
> Boris
>
>
> On Fri, Aug 23, 2013 at 8:32 AM, Robert Coli <rc...@eventbrite.com> wrote:
> > On Wed, Aug 14, 2013 at 10:56 PM, Faraaz Sareshwala
> > <fsareshw...@quantcast.com> wrote:
> > > All writes invalidate the entire row (updates thrown out the cached row)
> > This is not correct. Writes are added to the row, if it is in the row
> > cache. If it's not in the row cache, the row is not added to the cache.
> >
> > Citation from jbellis on stackoverflow, because I don't have time to find a
> > better one and the code is not obvious about it :
> >
> > http://stackoverflow.com/a/12499422
> >
> > > I have yet to go through the source code for the row cache. I do plan to
> > > do that. Can someone point me to documentation on the row cache
> > > internals? All I've found online so far is small discussion about it and
> > > how to enable it.
> >
> > There is no such documentation, or at least if it exists I am unaware of it.
> >
> > In general, the rule of thumb is that the Row Cache should not be used
> > unless the rows in question are :
> >
> > 1) Very hot in terms of access
> > 2) Uniform in size
> > 3) "Small"
> >
> > =Rob
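
For anyone skimming the thread later, the two write behaviors discussed above (invalidate the cached row on update versus update it in place) can be illustrated with a toy model. This is only a sketch of the semantics as the documentation excerpt and Rob's reply describe them, not Cassandra's actual implementation; the class and function names (InvalidatingRowCache, UpdatingRowCache, run_demo) are made up for illustration.

```python
class InvalidatingRowCache:
    """Toy model: a write throws the cached row away (the behavior the
    docs attribute to SerializingCacheProvider). Hypothetical class."""

    def __init__(self):
        self._rows = {}

    def get(self, key, load_row):
        # On a miss, load the row from the backing store and cache a copy.
        if key not in self._rows:
            self._rows[key] = dict(load_row(key))
        return self._rows[key]

    def on_write(self, key, column, value):
        # Invalidate: the next read has to reload the whole row.
        self._rows.pop(key, None)

    def is_cached(self, key):
        return key in self._rows


class UpdatingRowCache(InvalidatingRowCache):
    """Toy model: a write is applied to the cached row in place (the
    behavior the docs attribute to ConcurrentLinkedHashCacheProvider)."""

    def on_write(self, key, column, value):
        # Update only if the row is already cached; a write never
        # pulls an uncached row into the cache.
        if key in self._rows:
            self._rows[key][column] = value


def run_demo(cache):
    store = {"k": {"a": 1}}           # stand-in for the on-disk row
    load = lambda key: store[key]
    cache.get("k", load)              # warm the cache
    store["k"]["b"] = 2               # the write hits the store...
    cache.on_write("k", "b", 2)       # ...and the cache is notified
    still_cached = cache.is_cached("k")
    return still_cached, cache.get("k", load)


print(run_demo(InvalidatingRowCache()))  # (False, {'a': 1, 'b': 2})
print(run_demo(UpdatingRowCache()))      # (True, {'a': 1, 'b': 2})
```

Both caches return the same data on the next read; the difference is that the invalidating one pays for a full row reload after every write, which is why it may do worse on update-heavy workloads.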