Hi,

we have a cluster of 11 nodes running Cassandra 2.2.9 where we regularly
get READ messages dropped:
> READ messages were dropped in last 5000 ms: 974 for internal timeout
> and 0 for cross node timeout

Looking at the logs, some are logged at the same time as Old Gen GCs.
These GCs all take around 4 to 6s to run. To me, it's "normal" that
these could cause reads to be dropped. However, we also have reads
dropped without Old Gen GCs occurring, only Young Gen.

I'm wondering if anyone has a good way of determining what the _root_
cause could be. Up until now, the only way we managed to decrease load
on our cluster was by guessing some stuff, trying it out and being
lucky, essentially. I'd love a way to be sure what the problem is
before tackling it. Doing schema changes is not a problem, but changing
stuff blindly is not super efficient :)

What I do see in the logs is that these happen almost exclusively when
we do a lot of SELECTs. The times logged almost always correspond to
times when our scheduled SELECTs are happening. That narrows the scope
a little, but still.
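
In case it helps anyone reproduce what I'm looking at, here is a minimal
sketch of how the dropped-message lines could be tallied from the log. It
only assumes that the message text quoted above appears on a single line;
the names `parse_dropped` and `tally_dropped` are made up for illustration,
and the log prefix in the sample comment is an assumption, not copied from
our logs.

```python
import re
from collections import Counter

# Matches the message text quoted above, e.g.
#   "... READ messages were dropped in last 5000 ms: 974 for internal
#    timeout and 0 for cross node timeout"
# Whatever precedes it on the line (level, thread, timestamp) is ignored.
DROPPED_RE = re.compile(
    r"(?P<verb>[A-Z_]+) messages were dropped in last (?P<window_ms>\d+) ms: "
    r"(?P<internal>\d+) for internal timeout "
    r"and (?P<cross_node>\d+) for cross node timeout"
)


def parse_dropped(line):
    """Return the fields of one dropped-message line, or None if no match."""
    m = DROPPED_RE.search(line)
    if m is None:
        return None
    return {
        "verb": m.group("verb"),
        "window_ms": int(m.group("window_ms")),
        "internal": int(m.group("internal")),
        "cross_node": int(m.group("cross_node")),
    }


def tally_dropped(lines):
    """Sum dropped counts per (verb, timeout kind) over an iterable of lines."""
    totals = Counter()
    for line in lines:
        rec = parse_dropped(line)
        if rec is not None:
            totals[(rec["verb"], "internal")] += rec["internal"]
            totals[(rec["verb"], "cross_node")] += rec["cross_node"]
    return totals
```

Feeding it an open log file (`tally_dropped(open(path))`, with `path`
pointing at wherever the node writes its system log) gives totals per
message type, which can then be compared by hand against the GC entries
and the SELECT schedule.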

Anyway, I'd appreciate any information about troubleshooting this
scenario.

Thanks.
