Hi Stan,

Put some monitoring on this.  The first thing I think of when I hear
"chewing up CPU" for Java apps is GC.  In SPM <http://sematext.com/spm/>
you can easily see individual JVM memory pools and see if any of them are
at (or close to) 100%.  You can typically correlate that with increased GC
times and counts.  I'd look at that before looking at strace and such.
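
If you want a quick look without any tooling, here is a rough sketch of
the same idea using the standard java.lang.management MXBeans.  The class
name GcSnapshot and the helper percentUsed are made up for illustration,
and as written it only reports on the JVM it runs in; for a Cassandra node
you would have to read the same beans remotely over JMX, which is not
shown here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Cassandra or SPM.
public class GcSnapshot {

    // Percentage of a pool in use; returns -1 when the pool reports no defined max.
    public static double percentUsed(long used, long max) {
        return max <= 0 ? -1.0 : 100.0 * used / max;
    }

    public static void main(String[] args) {
        // One line per memory pool (eden, survivor, old gen, ...) of this JVM.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            System.out.printf("%-30s %6.1f%%%n",
                    pool.getName(), percentUsed(usage.getUsed(), usage.getMax()));
        }
        // Cumulative collection count and time per collector since JVM start.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-30s count=%d time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Sampling this every few seconds and comparing the deltas is what shows
whether a pool pinned near 100% lines up with rising GC counts and times.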

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


On Tue, Nov 25, 2014 at 11:07 PM, Stan Lemon <sle...@salesforce.com> wrote:

> We are using v2.0.11 and have seen several instances in our 24-node
> cluster where a node becomes unresponsive. When we look into it, we find
> that there is a Cassandra process chewing up a lot of CPU. There are no
> other indications in the logs or anywhere else as to what might be
> happening. However, if we strace the process that is chewing up CPU, we
> see a segmentation fault:
>
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> rt_sigreturn(0x7fd61110f862)            = 30618997712
> futex(0x7fd614844054, FUTEX_WAIT_PRIVATE, 27333, NULL) = -1 EAGAIN
> (Resource temporarily unavailable)
> futex(0x7fd614844028, FUTEX_WAKE_PRIVATE, 1) = 0
> futex(0x7fd6148e2e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7fd6148e2e50,
> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x7fd6148e2e28, FUTEX_WAKE_PRIVATE, 1) = 1
> futex(0x7fd614844054, FUTEX_WAIT_PRIVATE, 27335, NULL) = 0
> futex(0x7fd614844028, FUTEX_WAKE_PRIVATE, 1) = 0
> futex(0x7fd6148e2e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7fd6148e2e50,
> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x7fd6148e2e28, FUTEX_WAKE_PRIVATE, 1) = 1
>
> And this happens over and over again while running strace.
>
> Has anyone seen this? Does anyone have any ideas what might be happening,
> or how we could debug it further?
>
> Thanks for your help,
>
> Stan
>
>
