I had been wondering about that. I figured the entries in /var/log/kern
meant the OS was killing java (versus an OOM inside the JVM).
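
For anyone wanting to check the same thing, here is a minimal sketch of how
the kernel log could be scanned for OOM-killer activity against java. The
function name and the message patterns are my own assumptions: the patterns
are modelled on typical Linux kernel log lines, and the exact wording varies
between kernel versions, so adjust them to whatever your log actually shows.

```python
import re

# Assumed patterns, based on messages the Linux OOM killer commonly logs.
# Wording differs across kernel versions; adjust to match your own log.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(?:ed)? process|Killed process \d+"
)


def find_oom_kills(lines, process_name="java"):
    """Return log lines that look like OOM-killer events naming process_name."""
    hits = []
    for line in lines:
        # Keep only lines that match an OOM pattern and mention the process.
        if OOM_PATTERN.search(line) and process_name in line:
            hits.append(line.rstrip("\n"))
    return hits
```

To use it, open the kernel log (whatever path your distribution uses) and
pass the file object in, e.g. find_oom_kills(open(path, errors="replace")).
An empty result would suggest looking inside the JVM instead.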

The nodetool repair is interesting.  My application never deletes, so I
didn't bother running it.  But, if that helps prevent OOMs as well, I'll add
it to the crontab....

(plan A is still upgrading to 0.8.0).

will

On Wed, Jun 22, 2011 at 8:53 AM, Sasha Dolgy <sdo...@gmail.com> wrote:

> Yes ... this is because it was the OS that killed the process, and
> wasn't related to Cassandra "crashing".  Reviewing our monitoring, we
> saw that memory utilization was pegged at 100% for days and days
> before it was finally killed because 'apt' was fighting for resource.
> At least, that's as far as I got in my investigation before giving up,
> moving to 0.8.0 and implementing 24hr nodetool repair on each node via
> cronjob....so far ... no problems.
>
> On Wed, Jun 22, 2011 at 2:49 PM, William Oberman
> <ober...@civicscience.com> wrote:
> > Well, I managed to run 50 days before an OOM, so any changes I make will
> > take a while to test ;-)  I've seen the GCInspector log lines appear
> > periodically in my logs, but I didn't see a correlation with the crash.
> > I'll read the instructions on how to properly do a rolling upgrade today,
> > practice on test, and try that on production first.
> > will
>



-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com
