Re: Row caching + Wide row column family == almost crashed?

aaron morton Tue, 04 Dec 2012 01:48:33 -0800

I responded on your other thread. 

Cheers


-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/12/2012, at 5:31 PM, Yiming Sun <yiming....@gmail.com> wrote:

> I ran into a different problem with the row cache recently and sent a message
> to the list, but it didn't get picked up.  I am hoping someone can help me
> understand the issue.  Our data also has rather wide rows, not necessarily in
> the thousands range, but definitely in the upper hundreds.  They are hosted
> on v1.1.1.  I was doing a performance test and enabled a 1GB off-heap row
> cache on each of our Cassandra nodes (each node has at least 16GB of
> memory).  The test code requested a fixed set of 5000 rows from the cluster
> and ran a few times, but according to nodetool info, the row cache hit rate
> was very low, and a few of the nodes had 0 hits even though the row cache
> was full.
> 
> So what I am trying to understand is how the row cache can be full but have
> 0 hits?
> 
> 
> On Mon, Dec 3, 2012 at 6:55 PM, Bill de hÓra <b...@dehora.net> wrote:
> A Cassandra JVM will generally not function well with caches and wide 
> rows. Probably the most important thing to understand is Ed's point, that the 
> row cache caches the entire row, not just the slice that was read out. What 
> you've seen is almost exactly the observed behaviour I'd expect with enabling 
> either cache provider over wide rows.
> 
>  - the on-heap cache will result in evictions that crush the JVM trying to 
> manage garbage. This is also the case if the rows have an uneven size 
> distribution (as small rows can push out a single large row, large rows push 
> out many small ones, etc).
> 
>  - the off heap cache will spend a lot of time serializing and deserializing 
> wide rows, such that it can increase latency relative to just reading from 
> disk and leveraging the filesystem's cache directly.
> 
> The cache resizing behaviour does exist to preserve the server's memory, but 
> it can also cause a death spiral in the on-heap case, because a relatively 
> smaller cache may result in data being evicted more frequently.  I've seen 
> cases where sizing up the cache can stabilise a server's memory.
> 
> This isn't just a Cassandra thing, it simply happens to be very evident with 
> that system - generally to get an effective benefit from a cache, the data 
> should be contiguously sized and not too large to allow effective cache 
> 'lining'.
> 
> Bill
> 
> 
> On 02/12/12 21:36, Mike wrote:
> Hello,
> 
> We recently hit an issue within our Cassandra-based application.  We
> have a relatively new Column Family with some very wide rows (tens of
> thousands of columns, or more in some cases).  During a periodic
> activity, we query the range of columns to retrieve various pieces of
> information, a segment at a time.
> 
> We do these same queries frequently at various stages of the process,
> and I thought the application could see a performance benefit from row
> caching.  We have a small row cache (100MB per node) already enabled,
> and I enabled row caching on the new column family.
> 
> The results were very negative.  When performing range queries with a
> limit of 200 results, for a small minority of the rows in the new column
> family, performance plummeted.  CPU utilization on the Cassandra node
> went through the roof, and it started chewing up memory.  Some queries
> to this column family hung completely.
> 
> According to the logs, we started getting frequent GCInspector
> messages.  Cassandra started flushing the largest memtables due to
> hitting the "flush_largest_memtables_at" threshold of 75%, and scaling
> back the key/row caches.  However, to Cassandra's credit, it did not die
> with an OutOfMemory error.  Its emergency measures to conserve
> memory worked, and the cluster stayed up and running.  No real errors
> showed in the logs, except for messages getting dropped, which I believe
> was caused by what was going on with CPU and memory.
> 
> Disabling row caching on this new column family has resolved the issue
> for now, but is there something fundamental about row caching that I am
> missing?
> 
> We are running Cassandra 1.1.2 with a 6 node cluster, with a replication
> factor of 3.
> 
> Thanks,
> -Mike
> 
> 
> 
>
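
To illustrate two points from Bill's reply quoted above (the row cache
holds the entire row rather than the slice that was read, and uneven row
sizes mean one large row can push out many small ones), here is a minimal
Python sketch. It is a toy model under stated assumptions, not Cassandra's
implementation: the class WholeRowLRUCache, the helpers make_row and
load_row, and every size below are hypothetical and chosen only for
illustration.

```python
from collections import OrderedDict


def make_row(num_columns):
    """Build a fake row: 8-byte column names with 8-byte values (hypothetical sizes)."""
    return [(f"c{i:07d}", "v" * 8) for i in range(num_columns)]


def load_row(key):
    """Hypothetical stand-in for a disk read: 'wide*' keys get 50,000 columns, others 10."""
    return make_row(50_000 if key.startswith("wide") else 10)


class WholeRowLRUCache:
    """Toy LRU cache bounded by total bytes that always stores entire rows."""

    def __init__(self, capacity_bytes, loader):
        self.capacity_bytes = capacity_bytes
        self.loader = loader
        self.rows = OrderedDict()  # key -> (columns, size_in_bytes), oldest first
        self.used_bytes = 0
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    @staticmethod
    def _size_of(columns):
        return sum(len(name) + len(value) for name, value in columns)

    def get_slice(self, key, start, limit):
        """Return `limit` columns from `start`, caching the WHOLE row on a miss."""
        if key in self.rows:
            self.hits += 1
            self.rows.move_to_end(key)  # mark as most recently used
            columns = self.rows[key][0]
        else:
            self.misses += 1
            columns = self.loader(key)  # the full row is read, not just the slice
            self._put(key, columns)
        return columns[start:start + limit]

    def _put(self, key, columns):
        size = self._size_of(columns)
        if size > self.capacity_bytes:
            return  # row can never fit; skip caching it
        # Evict least recently used rows until the new row fits.
        while self.used_bytes + size > self.capacity_bytes:
            _, (_, old_size) = self.rows.popitem(last=False)
            self.used_bytes -= old_size
            self.evictions += 1
        self.rows[key] = (columns, size)
        self.used_bytes += size


def demo():
    cache = WholeRowLRUCache(capacity_bytes=900_000, loader=load_row)
    for i in range(1000):
        cache.get_slice(f"small{i}", 0, 10)        # 1000 small rows, 160 bytes each
    wide_slice = cache.get_slice("wide", 0, 200)   # ask for only 200 columns
    return cache, wide_slice
```

In this model, a request for 200 columns of the wide row charges the
cache for all 50,000 columns, and making room for that one row evicts
several hundred of the small rows that were cached before it. The real
on-heap and off-heap providers differ in how they store and account for
rows, so treat the numbers as illustrative only.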
