Rowcache is typically turned off because it is only useful in very specific 
situations: the rows need to fit in memory, and the access patterns have to 
fit as well.

If all the rows you're accessing can fit, Rowcache is a great thing. Otherwise, 
not so great.
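
For context, the row cache is off by default (row_cache_size_in_mb defaults to 
0 in cassandra.yaml), and each table also has to opt in through its caching 
option.  A rough sketch of what that looks like, assuming 2.0-era CQL syntax 
and made-up names:

    -- requires row_cache_size_in_mb > 0 in cassandra.yaml before the row
    -- cache will hold anything at all
    ALTER TABLE my_ks.hot_lookups WITH caching = 'all';

    -- the 2.1+ equivalent uses a map:
    -- ALTER TABLE my_ks.hot_lookups
    --     WITH caching = { 'keys' : 'ALL', 'rows_per_partition' : 'ALL' };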

--
Colin
320-221-9531


> On Jul 1, 2014, at 10:40 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> 
> WOW.. so based on your advice, and a test, I disabled the row cache for the 
> table.
> 
> The query was instantly 20x faster.
> 
> So this is definitely an anti-pattern. I suspect Cassandra just tries to 
> read the entire physical row into memory, and since my physical row is 
> rather big... ha.  Well, that wasn't very fun :)
> 
> BIG win though ;)
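> 
> For reference, disabling the row cache for a single table is just the 
> caching option again; something like this, with a placeholder table name and 
> 2.0-era syntax:
> 
>     -- keep the key cache, drop the row cache for this table
>     ALTER TABLE my_ks.content_buckets WITH caching = 'keys_only';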
> 
> 
>> On Tue, Jul 1, 2014 at 6:52 PM, Kevin Burton <bur...@spinn3r.com> wrote:
>> A workaround for this is the VFS page cache: basically, disabling 
>> compression and then letting the VFS page cache keep your data in 
>> memory.
>> 
>> The only downside is the per-column overhead.  But if you can store 
>> everything in a 'blob', which is optionally compressed, you're generally 
>> going to be OK.
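>> 
>> Something like this, as a rough sketch (names are placeholders, and the 
>> exact compression options depend on the Cassandra version):
>> 
>>     -- one application-side (optionally compressed) blob per row, with
>>     -- SSTable compression turned off so the OS page cache holds raw data
>>     CREATE TABLE my_ks.documents (
>>         bucket bigint,
>>         id     timeuuid,
>>         body   blob,
>>         PRIMARY KEY (bucket, id)
>>     ) WITH compression = { 'sstable_compression' : '' };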
>> 
>> Kevin
>> 
>> 
>>> On Tue, Jul 1, 2014 at 6:50 PM, Kevin Burton <bur...@spinn3r.com> wrote:
>>> so.. caching the *queries* ?
>>> 
>>> It seems like a better mechanism would be to cache the actual logical 
>>> row, not the physical row.
>>> 
>>> Query caches just don't work in production.  If you re-word your query or 
>>> structure it a different way, you get a miss…
>>> 
>>> In my experience.. query caches have a 0% hit rate.
>>> 
>>> 
>>>> On Tue, Jul 1, 2014 at 6:40 PM, Robert Coli <rc...@eventbrite.com> wrote:
>>>>> On Tue, Jul 1, 2014 at 6:06 PM, Kevin Burton <bur...@spinn3r.com> wrote:
>>>>> You know, one thing I failed to mention is that this is going into a 
>>>>> "bucket", and while it's a logical row, the physical row is like 500MB, 
>>>>> according to the compaction logs.
>>>>> 
>>>>> is the ENTIRE physical row going into the cache as one unit?  That's 
>>>>> definitely going to be a problem in this model.  500MB is a big atomic 
>>>>> unit.
>>>> 
>>>> Yes, the row cache is a row cache. It caches what the storage engine calls 
>>>> rows, which CQL calls "partitions." [1] Rows have to be assembled from all 
>>>> of their row fragments in Memtables/SSTables.
>>>> 
>>>> This is a big part of why the "off-heap" row cache's behavior of 
>>>> invalidation on write is so bad for its overall performance. Updating a 
>>>> single column in your 500MB row invalidates it and forces you to assemble 
>>>> the entire 500MB row from disk. 
>>>> 
>>>> The only valid use case for the current off-heap row cache seems to be: 
>>>> rows that are very small, very uniform in size, very hot, and very rarely 
>>>> modified.
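>>>> 
>>>> A sketch of the kind of table that profile describes, with made-up names 
>>>> (2.0-era syntax, and it still needs row_cache_size_in_mb > 0 to matter):
>>>> 
>>>>     -- tiny, read-mostly lookup table: every partition is a handful of
>>>>     -- columns and the whole working set fits in the row cache
>>>>     CREATE TABLE my_ks.country_codes (
>>>>         code text PRIMARY KEY,
>>>>         name text
>>>>     ) WITH caching = 'all';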
>>>> 
>>>> https://issues.apache.org/jira/browse/CASSANDRA-5357 is the ticket for 
>>>> replacing the row cache and its unexpected characteristics with something 
>>>> more like an actual query cache.
>>>> 
>>>>> Also, I assume it's having to do a binary search within the physical 
>>>>> row?
>>>> 
>>>> Since the column-level bloom filter's removal in 1.2, the only way it can 
>>>> get to specific columns is via the index.
>>>> 
>>>> =Rob
>>>> [1] https://issues.apache.org/jira/browse/CASSANDRA-6632
