Debugging high tail read latencies (internal timeout)

Nimi Wariboko Jr Wed, 06 Jul 2016 18:22:59 -0700

Hi,

I've begun experiencing very high tail latencies across my clusters. While
Cassandra's internal metrics report <1ms read latencies, round trips
measured from within the driver in my applications (query/execute frames)
show 90th-percentile times of up to a second for very basic queries
(SELECT a,b FROM table WHERE pk=x).


I've been studying the logs to try and get a handle on what could be going
wrong. I don't think there are GC issues, but the logs mention dropped
messages due to timeouts while the threadpools are nearly empty -

https://gist.github.com/nemothekid/28b2a8e8353b3e60d7bbf390ed17987c

Relevant line:
REQUEST_RESPONSE messages were dropped in last 5000 ms: 1 for internal
timeout and 0 for cross node timeout. Mean internal dropped latency: 54930
ms and Mean cross-node dropped latency: 0 ms
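
To track these drops over time, the numbers can be pulled out of each such
log line. Below is a minimal Python sketch, assuming the line always has the
shape quoted above. The function name parse_dropped and the regular
expression are illustrative only; they are inferred from this single line
and are not part of Cassandra or any of its tools.

```python
import re

# Assumed shape of the dropped-message summary line, inferred from the
# single example quoted above (not from any Cassandra specification).
DROPPED_RE = re.compile(
    r"(?P<verb>\w+) messages were dropped in last (?P<window_ms>\d+) ms: "
    r"(?P<internal>\d+) for internal timeout and "
    r"(?P<cross_node>\d+) for cross node timeout\. "
    r"Mean internal dropped latency: (?P<internal_ms>\d+) ms and "
    r"Mean cross-node dropped latency: (?P<cross_node_ms>\d+) ms"
)


def parse_dropped(line):
    """Return the fields of a dropped-message line as a dict, or None."""
    # Collapse whitespace so a line wrapped by a mail client still matches.
    normalized = " ".join(line.split())
    match = DROPPED_RE.search(normalized)
    if match is None:
        return None
    fields = match.groupdict()
    # Every field except the message verb is an integer.
    return {k: (v if k == "verb" else int(v)) for k, v in fields.items()}


sample = """REQUEST_RESPONSE messages were dropped in last 5000 ms: 1 for internal
timeout and 0 for cross node timeout. Mean internal dropped latency: 54930
ms and Mean cross-node dropped latency: 0 ms"""

print(parse_dropped(sample))
```

For the line above this gives one internal drop in a 5000 ms window with a
mean internal dropped latency of 54930 ms, i.e. roughly 55 seconds.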

Are there any tools I can use to start to understand what is causing these
issues?

Nimi
