Re: Row caching + Wide row column family == almost crashed?

Bill de hÓra Mon, 03 Dec 2012 15:55:38 -0800

A Cassandra JVM will generally not function well with caches and wide rows. Probably the most important thing to understand is Ed's point, that the row cache caches the entire row, not just the slice that was read out. What you've seen is almost exactly the observed behaviour I'd expect with enabling either cache provider over wide rows.

- the on-heap cache will result in evictions that crush the JVM trying to manage garbage. This is also the case if the rows have an uneven size distribution (as small rows can push out a single large row, large rows push out many small ones, etc).

- the off-heap cache will spend a lot of time serializing and deserializing wide rows, such that it can increase latency relative to just reading from disk and leveraging the filesystem's cache directly.
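
To make the eviction effect concrete, here is a minimal sketch of a size-bounded LRU cache that stores whole rows. It is a toy model, not Cassandra's cache implementation: the class name `SizeBoundedRowCache`, the `read_slice` method, and the abstract "size units" are all assumptions made for illustration. The point it shows is that reading a small slice of a wide row still charges the cache for the entire row, which can push out many small rows at once.

```python
from collections import OrderedDict


class SizeBoundedRowCache:
    """Toy whole-row LRU cache bounded by total size (hypothetical, not Cassandra code)."""

    def __init__(self, capacity):
        self.capacity = capacity    # total size units the cache may hold
        self.used = 0               # size units currently held
        self.rows = OrderedDict()   # row key -> row size, oldest entry first
        self.evictions = 0

    def read_slice(self, key, row_size):
        """Read a slice of a row. On a miss the ENTIRE row is cached.

        Returns True on a cache hit, False on a miss.
        """
        if key in self.rows:
            self.rows.move_to_end(key)  # mark as most recently used
            return True
        if row_size > self.capacity:
            return False                # row can never fit; skip caching
        # Evict least recently used rows until the whole new row fits.
        while self.used + row_size > self.capacity:
            _, evicted_size = self.rows.popitem(last=False)
            self.used -= evicted_size
            self.evictions += 1
        self.rows[key] = row_size
        self.used += row_size
        return False


def simulate():
    """Fill the cache with small rows, then read one slice of a wide row."""
    cache = SizeBoundedRowCache(capacity=100)
    for i in range(100):
        cache.read_slice(f"small-{i}", 1)   # 100 rows of size 1 fill the cache
    before = cache.evictions
    cache.read_slice("wide", 80)            # slice read, but all 80 units are cached
    return cache.evictions - before, len(cache.rows)


if __name__ == "__main__":
    evicted, remaining = simulate()
    print(f"rows evicted by one wide-row read: {evicted}, rows left: {remaining}")
```

In this model a single read against the wide row evicts 80 small rows. In a real on-heap cache every one of those evictions is garbage for the JVM to collect, and in an off-heap cache the wide row additionally has to be serialized and deserialized in full.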

The cache resizing behaviour does exist to preserve the server's memory, but it can also cause a death spiral in the on-heap case, because a relatively smaller cache may result in data being evicted more frequently. I've seen cases where sizing up the cache can stabilise a server's memory.

This isn't just a Cassandra thing, it simply happens to be very evident with that system - generally to get an effective benefit from a cache, the data should be contiguously sized and not too large to allow effective cache 'lining'.


Bill

On 02/12/12 21:36, Mike wrote:

Hello,

We recently hit an issue within our Cassandra based application.  We
have a relatively new Column Family with some very wide rows (10's of
thousands of columns, or more in some cases).  During a periodic
activity, we query the range of columns to retrieve various pieces of
information, a segment at a time.

We do these same queries frequently at various stages of the process,
and I thought the application could see a performance benefit from row
caching.  We have a small row cache (100MB per node) already enabled,
and I enabled row caching on the new column family.

The results were very negative.  When performing range queries with a
limit of 200 results, for a small minority of the rows in the new column
family, performance plummeted.  CPU utilization on the Cassandra node
went through the roof, and it started chewing up memory.  Some queries
to this column family hung completely.

According to the logs, we started getting frequent GCInspector
messages.  Cassandra started flushing the largest mem_tables due to
hitting the "flush_largest_memtables_at" of 75%, and scaling back the
key/row caches.  However, to Cassandra's credit, it did not die with an
OutOfMemory error.  Its emergency measures to conserve
memory worked, and the cluster stayed up and running.  No real errors
showed in the logs, except for messages getting dropped, which I believe
was caused by what was going on with CPU and memory.

Disabling row caching on this new column family has resolved the issue
for now, but, is there something fundamental about row caching that I am
missing?

We are running Cassandra 1.1.2 with a 6 node cluster, with a replication
factor of 3.

Thanks,
-Mike
