On Tue, Jul 1, 2014 at 6:06 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> you know.. one thing I failed to mention.. .is that this is going into a
> "bucket" and while it's a logical row, the physical row is like 500MB …
> according to compaction logs.
>
> is the ENTIRE physical row going into the cache as one unit? That's
> definitely going to be a problem in this model. 500MB is a big atomic unit.

Yes, the row cache is a row cache. It caches what the storage engine calls rows, which CQL calls "partitions." [1] Rows have to be assembled from all of their row fragments in Memtables/SSTables.

This is a big part of why the "off-heap" row cache's behavior of invalidation on write is so bad for its overall performance. Updating a single column in your 500MB row invalidates it and forces you to assemble the entire 500MB row from disk again.

The only valid use case for the current off-heap row cache seems to be rows that are: very small, very uniform in size, very hot, and very rarely modified.

https://issues.apache.org/jira/browse/CASSANDRA-5357

is the ticket for replacing the row cache and its unexpected characteristics with something more like an actual query cache.

> also.. I assume it's having to do a binary search within the physical row ?

Since the column-level bloom filter's removal in 1.2, the only way it can get to specific columns is via the index.

=Rob

[1] https://issues.apache.org/jira/browse/CASSANDRA-6632
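P.S. If it helps to see the invalidation cost concretely, here is a toy model in plain Python (not Cassandra internals — the class, counter, and key names are all made up for illustration). It shows how one single-column write to a cached partition throws away the whole cached copy and forces a full reassembly on the next read:

```python
# Toy model of an invalidate-on-write row cache. "Disk" is just a dict
# standing in for memtables/SSTables; DISK_READS counts how many column
# reads had to go to storage instead of being served from the cache.

DISK_READS = 0

class InvalidateOnWriteRowCache:
    def __init__(self, storage):
        self.storage = storage   # partition_key -> {column: value}
        self.cache = {}          # fully-assembled cached partitions

    def read_partition(self, key):
        global DISK_READS
        if key not in self.cache:
            # Cache miss: assemble the ENTIRE partition, every column.
            DISK_READS += len(self.storage[key])
            self.cache[key] = dict(self.storage[key])
        return self.cache[key]

    def write_column(self, key, column, value):
        self.storage.setdefault(key, {})[column] = value
        # Invalidate-on-write: drop the whole partition from the cache.
        self.cache.pop(key, None)

# Your "500MB" partition, modeled here as 1000 columns.
storage = {"bucket1": {f"col{i}": i for i in range(1000)}}
cache = InvalidateOnWriteRowCache(storage)

cache.read_partition("bucket1")            # cold read: 1000 column reads
cache.read_partition("bucket1")            # cache hit: no extra reads
cache.write_column("bucket1", "col0", -1)  # one tiny update invalidates all
cache.read_partition("bucket1")            # full 1000-column reassembly
print(DISK_READS)                          # 2000: two full assemblies
```

At real scale, that second full assembly is your 500MB of row fragments pulled back off disk because one column changed — which is why the cache only pays off for small, rarely-modified partitions.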