Re: Alternative Row Cache Implementation

Jonathan Ellis Thu, 30 Jun 2011 18:46:50 -0700

I'm interested. :)

On Thu, Jun 30, 2011 at 11:44 AM, Daniel Doubleday
<daniel.double...@gmx.net> wrote:
> Hi all - or rather devs
>
> We have been working on an alternative implementation to the existing row
> cache(s).
>
> We have 2 main goals:
>
> - Decrease memory use -> get more rows in the cache without suffering a huge
> performance penalty
> - Reduce GC pressure
>
> This sounds a lot like we should be using the new serializing cache in 0.8.
> Unfortunately our workload consists of loads of updates which would 
> invalidate the cache all the time.
>
> The second unfortunate thing is that the idea we came up with doesn't fit the 
> new cache provider api...
>
> It looks like this:
>
> Like the serializing cache, we basically only cache the serialized byte
> buffer. We don't serialize the bloom filter, and we try to do some other minor
> compression tricks (varints etc., not done yet). The main difference is that
> we don't deserialize the cached row but use the normal sstable iterators and
> filters on it, as in the regular uncached case.
>
> So the read path looks like this:
>
> return filter.collectCollatedColumns(memtable iter, cached row iter)
>
> The write path is not affected. It does not update the cache.
>
> During flush we merge all memtable updates with the cached rows.
>
> These are early test results:
>
> - Depending on row width and value size, the serialized cache takes between
> 30% and 50% of the memory used by a cached CF. This might be optimized further.
> - Read times increase by 5-10%.
>
> We haven't tested the effects on GC but hope that we will see improvements
> there because we only cache a fraction of the objects (in terms of numbers) in
> old gen heap, which should make GC cheaper. Of course there's also the option
> to use native memory like the serializing cache does.
>
> We believe that this approach is quite promising but as I said it is not 
> compatible with the current cache api.
>
> So my question is: does that sound interesting enough to open a jira or has 
> that idea already been considered and rejected for some reason?
>
> Cheers,
> Daniel
>
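
For readers of the archive, the read path quoted above (collating a memtable
iterator with a cached row iterator) can be sketched roughly as follows. This
is only an illustration of the idea, not the proposed patch and not
Cassandra's API: the class CollatedRead, the Column type, and the collate
method are hypothetical names, columns are assumed to arrive sorted by name,
and the rule that the higher timestamp wins (memtable on a tie) is a
simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class CollatedRead {
    /** Hypothetical stand-in for a column: name, value, write timestamp. */
    public static final class Column {
        public final String name;
        public final String value;
        public final long timestamp;

        public Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /**
     * Merge two iterators that are each sorted by column name.
     * When both sides hold the same column, keep the one with the higher
     * timestamp (assumption: the memtable side wins on a tie).
     */
    public static List<Column> collate(Iterator<Column> memtable,
                                       Iterator<Column> cached) {
        List<Column> out = new ArrayList<>();
        Column m = next(memtable);
        Column c = next(cached);
        while (m != null || c != null) {
            // An exhausted side sorts after everything on the other side.
            int cmp = (m == null) ? 1 : (c == null) ? -1 : m.name.compareTo(c.name);
            if (cmp < 0) {
                out.add(m);
                m = next(memtable);
            } else if (cmp > 0) {
                out.add(c);
                c = next(cached);
            } else {
                out.add(m.timestamp >= c.timestamp ? m : c);
                m = next(memtable);
                c = next(cached);
            }
        }
        return out;
    }

    /** Return the next element, or null once the iterator is exhausted. */
    private static Column next(Iterator<Column> it) {
        return it.hasNext() ? it.next() : null;
    }
}
```

The same kind of two-way merge would presumably apply at flush time, when
memtable updates are merged into the cached rows, but the quoted message does
not spell out that step, so treat that as a guess.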




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
