Hi,

Just taking a wild shot here, sorry if it doesn't help. Could the OOM be
thrown while reading an SSTable? If so, try to find the configuration
parameters that affect reads and tune those settings down a little. Also
check the chunk_length_kb compression option.

http://www.datastax.com/documentation/cql/3.1/webhelp/cql/cql_reference/cql_storage_options_c.html
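
For instance, something along these lines (table name is just a placeholder,
and I'm assuming the default SnappyCompressor) would shrink the chunk that
gets decompressed into memory on each read from the 64 KB default down to 16 KB:

    -- Smaller compression chunks mean less data decompressed per read.
    ALTER TABLE my_keyspace.my_table
      WITH compression = { 'sstable_compression' : 'SnappyCompressor',
                           'chunk_length_kb'     : 16 };

Note that existing SSTables keep the old chunk size until they get rewritten,
either by compaction or (I believe) by running nodetool upgradesstables.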

/Jason


On Fri, Dec 6, 2013 at 6:01 PM, Klaus Brunner <klaus.brun...@gmail.com> wrote:

> We're getting fairly reproducible OOMs on a 2-node cluster using
> Cassandra 1.2.11, typically in situations with a heavy read load. A
> sample of some stack traces is at
> https://gist.github.com/KlausBrunner/7820902 - they're all failing
> somewhere down from table.getRow(), though I don't know if that's part
> of query processing, compaction, or something else.
>
> - The main CFs contain some 100k rows, none of them particularly wide.
> - Heap dumps invariably show a single huge byte array (~1.6 GiB
> associated with the OOM'ing thread) hogging > 80% of the Java heap.
> The array seems to contain all/many rows of one CF.
> - We're moderately certain there's no "killer query" with a huge
> result set involved here, but we can't see exactly what triggers this.
> - We've tried to switch to LeveledCompaction, to no avail.
> - Xms/Xmx is set to some 4 GB.
> - The logs show the usual signs of panic ("flushing memtables") before
> actually OOMing. It seems that this scenario often or even always occurs
> after a compaction, but that's not quite conclusive.
>
> I'm somewhat worried that Cassandra would read so much data into a
> single contiguous byte[] at any point. Could this be related to
> compaction? Any ideas what we could do about this?
>
> Thanks
>
> Klaus
>
