I didn't get a response to this, so I'll give it another shot. I tweaked some 
parameters and cleaned up my schema.
My Hadoop/Cassandra job got further, but it still dies with an OOM error. This 
time, the heap dump shows a JMXConfigurableThreadPoolExecutor with a retained 
heap of 7.5G. I presume this means that the Hadoop job is writing to Cassandra 
faster than Cassandra can flush to disk. Is there anything I can do to throttle 
the job? The Cassandra cluster is set up with default configuration values 
except for a reduced memtable size.
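
For what it's worth, the only write-side knobs I've found so far are the batch 
threshold and queue size on ColumnFamilyOutputFormat. If I'm reading the 1.1 
source right, throttling would look something like this (the constant names are 
my reading of the API, and the values are just guesses, not tested settings):

    import org.apache.cassandra.hadoop.ColumnFamilyOutputFormat;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    Configuration conf = new Configuration();
    Job job = new Job(conf, "populate-table");

    // Cap how many mutations go into each batch_mutate call.
    job.getConfiguration().set(ColumnFamilyOutputFormat.BATCH_THRESHOLD, "32");

    // Shrink the queue feeding the record writer's client threads, so the
    // mappers block instead of queueing mutations without bound.
    job.getConfiguration().set(ColumnFamilyOutputFormat.QUEUE_SIZE, "64");

Whether that actually relieves pressure on the server side, I don't know.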

Forgot to mention: this is Cassandra 1.1.2.

Thanks in advance.

Brian

On Sep 12, 2012, at 7:52 AM, Brian Jeltema wrote:

> I'm a fairly novice Cassandra/Hadoop guy. I have written a Hadoop job (using 
> the Cassandra/Hadoop integration API)
> that performs a full table scan and attempts to populate a new table from the 
> results of the map/reduce. The read
> works fine and is fast, but the table insertion is failing with OOM errors 
> (in the Cassandra VM). The resulting heap dump from one node shows that
> 2.9G of the heap is consumed by a JMXConfigurableThreadPoolExecutor that 
> appears to be full of batch mutations.
> 
> I'm using a 6-node cluster, 32G per node, 8G heap, RF=3, if any of that 
> matters.
> 
> Any suggestions would be appreciated regarding configuration changes or 
> additional information I might
> capture to understand this problem.
> 
> Thanks
> 
> Brian J
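
P.S. In case the write path matters, the output side of the job is wired up 
roughly like this (the keyspace, column family, and contact address below are 
placeholders, not my real values):

    import org.apache.cassandra.hadoop.ColumnFamilyOutputFormat;
    import org.apache.cassandra.hadoop.ConfigHelper;

    job.setOutputFormatClass(ColumnFamilyOutputFormat.class);

    // Target keyspace and column family for the reducer's mutations.
    ConfigHelper.setOutputColumnFamily(job.getConfiguration(),
            "MyKeyspace", "NewCF");

    // Any live node serves as the initial contact point.
    ConfigHelper.setOutputInitialAddress(job.getConfiguration(), "10.0.0.1");
    ConfigHelper.setOutputPartitioner(job.getConfiguration(),
            "org.apache.cassandra.dht.RandomPartitioner");

The reducer then emits (ByteBuffer key, List<Mutation>) pairs, which the record 
writer batches and sends with batch_mutate.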
