I'm interested. :)

On Thu, Jun 30, 2011 at 11:44 AM, Daniel Doubleday <daniel.double...@gmx.net> wrote:
> Hi all - or rather devs
>
> we have been working on an alternative implementation to the existing row
> cache(s)
>
> We have 2 main goals:
>
> - Decrease memory -> get more rows in the cache without suffering a huge
>   performance penalty
> - Reduce gc pressure
>
> This sounds a lot like we should be using the new serializing cache in 0.8.
> Unfortunately our workload consists of loads of updates which would
> invalidate the cache all the time.
>
> The second unfortunate thing is that the idea we came up with doesn't fit the
> new cache provider api...
>
> It looks like this:
>
> Like the serializing cache we basically only cache the serialized byte
> buffer. we don't serialize the bloom filter and try to do some other minor
> compression tricks (var ints etc not done yet). The main difference is that
> we don't deserialize but use the normal sstable iterators and filters as in
> the regular uncached case.
>
> So the read path looks like this:
>
> return filter.collectCollatedColumns(memtable iter, cached row iter)
>
> The write path is not affected. It does not update the cache
>
> During flush we merge all memtable updates with the cached rows.
>
> These are early test results:
>
> - Depending on row width and value size the serialized cache takes between
>   30% - 50% of memory compared with cached CF. This might be optimized further
> - Read times increase by 5 - 10%
>
> We haven't tested the effects on gc but hope that we will see improvements
> there because we only cache a fraction of objects (in terms of numbers) in
> old gen heap which should make gc cheaper. Of course there's also the option
> to use native mem like serializing cache does.
>
> We believe that this approach is quite promising but as I said it is not
> compatible with the current cache api.
>
> So my question is: does that sound interesting enough to open a jira or has
> that idea already been considered and rejected for some reason?
>
> Cheers,
> Daniel
>
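
For illustration, the collated read path described in the quoted message could look roughly like the sketch below. It is not Cassandra code. `CachedRowReadSketch`, `Column` and `collate` are hypothetical names, the columns live in plain in-memory lists rather than a serialized byte buffer read through sstable iterators, and two rules are assumptions of the sketch: the higher timestamp wins, and the memtable wins on a tie. Tombstones and filters are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CachedRowReadSketch {

    /** Hypothetical stand-in for a column: name, value and write timestamp. */
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return name + "=" + value + "@" + timestamp;
        }
    }

    /** Returns the next element of the iterator, or null when it is exhausted. */
    private static Column advance(Iterator<Column> it) {
        return it.hasNext() ? it.next() : null;
    }

    /**
     * Merges two iterators that each yield columns sorted by name.
     * If both sides hold the same column name, the newer timestamp wins
     * (assumption: the memtable side wins on equal timestamps).
     */
    static List<Column> collate(Iterator<Column> memtable, Iterator<Column> cached) {
        List<Column> out = new ArrayList<>();
        Column m = advance(memtable);
        Column c = advance(cached);
        while (m != null || c != null) {
            int cmp = (m == null) ? 1 : (c == null) ? -1 : m.name.compareTo(c.name);
            if (cmp < 0) {            // column only in the memtable at this position
                out.add(m);
                m = advance(memtable);
            } else if (cmp > 0) {     // column only in the cached row at this position
                out.add(c);
                c = advance(cached);
            } else {                  // same name on both sides: reconcile by timestamp
                out.add(m.timestamp >= c.timestamp ? m : c);
                m = advance(memtable);
                c = advance(cached);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Column> memtable = Arrays.asList(
                new Column("b", "new", 2), new Column("d", "x", 5));
        List<Column> cachedRow = Arrays.asList(
                new Column("a", "1", 1), new Column("b", "old", 1), new Column("c", "3", 1));
        // prints [a=1@1, b=new@2, c=3@1, d=x@5]
        System.out.println(collate(memtable.iterator(), cachedRow.iterator()));
    }
}
```

Under the same assumptions, the merge at flush time could reuse `collate` and store its result as the new cached row, which would fit the statement that the write path itself never touches the cache.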
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com