Hi,
   Do you see any hung processes, such as repairs, on those 3 nodes? What does
"nodetool netstats" show?

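On the vnode imbalance question, one rough check is to compare each node's
effective ownership (as I recall, the "Owns" column that "nodetool status"
prints when you pass it a keyspace) against the even share. With replication 2
on 84 nodes the even share should be about 2.38%. Below is a minimal Python
sketch, not a tested tool: the function names, the 25% tolerance and the sample
figures are all made up for illustration, and you would fill in the
percentages from your own cluster.

```python
def expected_ownership_pct(replication_factor, node_count):
    """Effective ownership per node if data were spread perfectly evenly."""
    return 100.0 * replication_factor / node_count


def flag_outliers(ownership_pct, expected, tolerance=0.25):
    """Return the nodes whose ownership differs from `expected` by more
    than `tolerance` (a fraction of `expected`), sorted by name."""
    limit = expected * tolerance
    return sorted(
        node
        for node, pct in ownership_pct.items()
        if abs(pct - expected) > limit
    )


if __name__ == "__main__":
    # Hypothetical sample values, NOT taken from the real cluster.
    sample = {"node-a": 2.4, "node-b": 3.5, "node-c": 2.3}
    even_share = expected_ownership_pct(replication_factor=2, node_count=84)
    print("even share: %.2f%%" % even_share)
    print("suspicious nodes:", flag_outliers(sample, even_share))
```

If the 3 bad nodes stand out here, that would point at token or rack
placement. If they look the same as the rest, the cause is probably elsewhere.
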
Thanks,
Sai

On Tue, Apr 19, 2016 at 8:24 AM, Erik Forsberg <forsb...@opera.com> wrote:

> Hi!
>
> I have a problem where 3 of my 84 nodes misbehave with excessively long GC
> times, which leads to them being marked as DN.
>
> This happens when I load data to them using CQL from a Hadoop job, so
> quite a lot of inserts at a time. The CQL loading job is using
> TokenAwarePolicy with fallback to DCAwareRoundRobinPolicy. Cassandra Java
> driver version 2.1.7.1 is in use.
>
> My other observation is that around the time the GC starts to work like
> crazy, there is a lot of outbound network traffic from the troublesome
> nodes. If a healthy node has around 25 Mbit/s in, 25 Mbit/s out, an
> unhealthy one sees 25 Mbit/s in, 200 Mbit/s out.
>
> So, something is iffy with these 3 nodes, but I have some trouble finding
> out exactly what makes them differ.
>
> This is Cassandra 2.0.13 (yes, old) using vnodes. Keyspace is using
> NetworkTopologyStrategy with replication 2, in one datacenter.
>
> One thing I know I'm doing wrong is that I have a slightly differing number
> of hosts in each of my 6 chassis (one of them has 15 nodes, one has
> 13, the remaining have 14). Could what I'm seeing here be the effect of
> that?
>
> Other ideas on what could be wrong? Some kind of vnode imbalance? How can
> I diagnose that? What metrics should I be looking at?
>
> Thanks,
> \EF
>
>
>
