Yup, got it. Thanks Aaron.
On Tue, Dec 4, 2012 at 4:47 AM, aaron morton <aa...@thelastpickle.com> wrote:

> I responded on your other thread.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 4/12/2012, at 5:31 PM, Yiming Sun <yiming....@gmail.com> wrote:
>
> I ran into a different problem with the row cache recently and sent a
> message to the list, but it didn't get picked up. I am hoping someone can
> help me understand the issue. Our data also has rather wide rows -- not
> necessarily in the thousands of columns, but definitely in the upper
> hundreds. They are hosted on v1.1.1. I was doing a performance test and
> enabled an off-heap row cache of 1GB on each of our Cassandra nodes (each
> node has at least 16GB of memory). The test code requested a fixed set of
> 5000 rows from the cluster and ran a few times, but according to nodetool
> info the row cache hit rate was very low, and a few of the nodes had 0
> hits even though the row cache was full.
>
> So what I was trying to understand is: how can the row cache be full but
> have 0 hits?
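>
> For reference, this is roughly how the cache was enabled. The names below
> are the Cassandra 1.1 cassandra.yaml and cassandra-cli settings as I
> understand them; the values are only illustrative of the test setup, and
> "MyCF" is a placeholder for the actual column family:
>
>   # cassandra.yaml (per node)
>   row_cache_size_in_mb: 1024                    # global row cache capacity
>   row_cache_provider: SerializingCacheProvider  # serialized, off-heap cache
>
>   # cassandra-cli: enable row caching on the column family
>   update column family MyCF with caching = 'rows_only';
>
> The hit counts and capacity came from the "Row Cache" line of nodetool info.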
>
> On Mon, Dec 3, 2012 at 6:55 PM, Bill de hÓra <b...@dehora.net> wrote:
>
>> A Cassandra JVM will generally not function well with caches and wide
>> rows. Probably the most important thing to understand is Ed's point: the
>> row cache caches the entire row, not just the slice that was read out.
>> What you've seen is almost exactly the behaviour I'd expect from enabling
>> either cache provider over wide rows.
>>
>> - The on-heap cache will cause evictions that crush the JVM trying to
>> manage garbage. This is also the case if the rows have an uneven size
>> distribution (small rows can push out a single large row, large rows push
>> out many small ones, and so on).
>>
>> - The off-heap cache will spend a lot of time serializing and
>> deserializing wide rows, to the point that it can increase latency
>> relative to just reading from disk and leveraging the filesystem's cache
>> directly.
>>
>> The cache resizing behaviour exists to preserve the server's memory, but
>> it can also cause a death spiral in the on-heap case, because a smaller
>> cache may result in data being evicted more frequently. I've seen cases
>> where sizing up the cache can stabilise a server's memory.
>>
>> This isn't just a Cassandra thing -- it simply happens to be very evident
>> in that system. Generally, to get an effective benefit from a cache, the
>> data should be contiguously sized and not too large, to allow effective
>> cache 'lining'.
>>
>> Bill
>>
>> On 02/12/12 21:36, Mike wrote:
>>
>>> Hello,
>>>
>>> We recently hit an issue within our Cassandra-based application. We have
>>> a relatively new column family with some very wide rows (tens of
>>> thousands of columns, or more in some cases). During a periodic
>>> activity, we read ranges of columns to retrieve various pieces of
>>> information, a segment at a time.
>>>
>>> We run these same queries frequently at various stages of the process,
>>> and I thought the application could see a performance benefit from row
>>> caching. We already have a small row cache (100MB per node) enabled, and
>>> I enabled row caching on the new column family.
>>>
>>> The results were very negative. When performing range queries with a
>>> limit of 200 results, performance plummeted for a small minority of the
>>> rows in the new column family. CPU utilization on the Cassandra node
>>> went through the roof, and it started chewing up memory. Some queries to
>>> this column family hung completely.
>>>
>>> According to the logs, we started getting frequent GCInspector messages.
>>> Cassandra started flushing the largest memtables after hitting the
>>> "flush_largest_memtables_at" threshold of 75%, and scaling back the
>>> key/row caches. However, to Cassandra's credit, it did not die with an
>>> OutOfMemory error. Its emergency measures to conserve memory worked, and
>>> the cluster stayed up and running. No real errors showed in the logs,
>>> except for messages getting dropped, which I believe was caused by the
>>> pressure on CPU and memory.
>>>
>>> Disabling row caching on this new column family has resolved the issue
>>> for now, but is there something fundamental about row caching that I am
>>> missing?
>>>
>>> We are running Cassandra 1.1.2 on a 6 node cluster, with a replication
>>> factor of 3.
>>>
>>> Thanks,
>>> -Mike
>>>
>>
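>>
>> For reference, the emergency thresholds Mike is hitting live in
>> cassandra.yaml; roughly, and worth double-checking against your own file,
>> the 1.1 defaults are:
>>
>>   flush_largest_memtables_at: 0.75   # flush biggest memtables at 75% heap
>>   reduce_cache_sizes_at: 0.85        # start shrinking caches at 85% heap
>>   reduce_cache_capacity_to: 0.6      # shrink cache capacity to 60% of size
>>
>> And to take a wide-row column family out of the row cache while keeping
>> key caching ("MyWideCF" is a placeholder), via cassandra-cli:
>>
>>   update column family MyWideCF with caching = 'keys_only';
>>
>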