Are you running with the default heap settings? What else is running on the
boxes?



On Wed, Jun 22, 2011 at 9:06 AM, William Oberman
<ober...@civicscience.com> wrote:

> I was wondering/I figured that /var/log/kern indicated the OS was killing
> java (versus an internal OOM).
>
> The nodetool repair is interesting.  My application never deletes, so I
> didn't bother running it.  But, if that helps prevent OOMs as well, I'll add
> it to the crontab....
>
> (plan A is still upgrading to 0.8.0).
>
> will
>
>
> On Wed, Jun 22, 2011 at 8:53 AM, Sasha Dolgy <sdo...@gmail.com> wrote:
>
>> Yes ... this is because it was the OS that killed the process, and
>> wasn't related to Cassandra "crashing".  Reviewing our monitoring, we
>> saw that memory utilization was pegged at 100% for days and days
>> before it was finally killed because 'apt' was fighting for resources.
>> At least, that's as far as I got in my investigation before giving up,
>> moving to 0.8.0 and implementing 24hr nodetool repair on each node via
>> cronjob....so far ... no problems.
>>
>> On Wed, Jun 22, 2011 at 2:49 PM, William Oberman
>> <ober...@civicscience.com> wrote:
>> > Well, I managed to run 50 days before an OOM, so any changes I make will
>> > take a while to test ;-)  I've seen the GCInspector log lines appear
>> > periodically in my logs, but I didn't see a correlation with the crash.
>> > I'll read the instructions on how to properly do a rolling upgrade today,
>> > practice on test, and try that on production first.
>> > will
>>
>
>
>
> --
> Will Oberman
> Civic Science, Inc.
> 3030 Penn Avenue., First Floor
> Pittsburgh, PA 15201
> (M) 412-480-7835
> (E) ober...@civicscience.com
>



-- 
http://twitter.com/tjake
