Hi Stan,

Put some monitoring on this. The first thing I think of when I hear "chewing up CPU" for Java apps is GC. In SPM <http://sematext.com/spm/> you can easily see individual JVM memory pools and check whether any of them are at or close to 100%. You can typically correlate that with increased GC times and counts. I'd look at that before looking at strace and such.
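If you want a quick look at the same numbers without a monitoring tool, a minimal sketch along these lines prints per-pool usage and per-collector GC counts and times using the standard `java.lang.management` MXBeans. The class and method names (`GcSnapshot`, `usedPercent`, `report`) are made up for illustration, and as written it only inspects the JVM it runs in, so treat it as a starting point rather than a ready-made check for a Cassandra node.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of Cassandra or SPM): dumps memory pool
// usage and GC activity for the JVM this code is running in.
final class GcSnapshot {

    // Percent of the pool's maximum that is in use, or -1 if the pool
    // does not define a maximum (getMax() returns -1 in that case).
    public static double usedPercent(MemoryUsage usage) {
        long max = usage.getMax();
        if (max <= 0) {
            return -1.0;
        }
        return 100.0 * usage.getUsed() / max;
    }

    public static String report() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            double pct = usedPercent(usage);
            sb.append(String.format("pool %s: used=%d bytes, max=%d bytes, %s%n",
                    pool.getName(), usage.getUsed(), usage.getMax(),
                    pct < 0 ? "no max defined" : String.format("%.1f%% full", pct)));
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values are cumulative since JVM start; -1 means "undefined".
            sb.append(String.format("GC %s: collections=%d, total time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

To watch a live node you would read the same MXBeans remotely over JMX (for example with jconsole) and sample them a few times, since a pool sitting near 100% while the collection count and time keep climbing is the pattern to look for.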
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On Tue, Nov 25, 2014 at 11:07 PM, Stan Lemon <sle...@salesforce.com> wrote:

> We are using v2.0.11 and have seen several instances in our 24 node
> cluster where the node becomes unresponsive. When we look into it, we find
> that there is a cassandra process chewing up a lot of CPU. There are no
> other indications in logs or anything as to what might be happening;
> however, if we strace the process that is chewing up CPU we see a
> segmentation fault:
>
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> rt_sigreturn(0x7fd61110f862) = 30618997712
> futex(0x7fd614844054, FUTEX_WAIT_PRIVATE, 27333, NULL) = -1 EAGAIN (Resource temporarily unavailable)
> futex(0x7fd614844028, FUTEX_WAKE_PRIVATE, 1) = 0
> futex(0x7fd6148e2e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7fd6148e2e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x7fd6148e2e28, FUTEX_WAKE_PRIVATE, 1) = 1
> futex(0x7fd614844054, FUTEX_WAIT_PRIVATE, 27335, NULL) = 0
> futex(0x7fd614844028, FUTEX_WAKE_PRIVATE, 1) = 0
> futex(0x7fd6148e2e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7fd6148e2e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x7fd6148e2e28, FUTEX_WAKE_PRIVATE, 1) = 1
>
> And this happens over and over again while running strace.
>
> Has anyone seen this? Does anyone have any ideas what might be happening,
> or how we could debug it further?
>
> Thanks for your help,
>
> Stan