I have a situation where a few nodes (3-4 out of 84) misbehave: very long GC pauses, dropping out of the cluster, etc.

This happens while loading data (via CQL). Looking at the metrics, these few nodes generate a lot of hints close to the time when they start to misbehave.

Since this is Cassandra 2.0.13, which has a less than optimal hints implementation, large numbers of hints are a GC troublemaker.

Again looking at the metrics, hints are being generated for a large number of nodes, so the destination nodes don't appear to be at fault. So, I'm confused.

Any Hints (pun intended) on what could cause a few nodes to generate more hints than the rest of the cluster?

Regards,
\EF
