Good catch. We ran repairs a few times but don't run them on a regular basis. However, I found no correlation between the count of DigestMismatchExceptions and latency spikes (see the attached graphs for an example). One important point I didn't mention in the original mail is that all requests (both reads and writes) use CL=LOCAL_QUORUM.
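
A minimal sketch of one way to check for such a dependency, assuming the DigestMismatchException counts and the p99 latencies have already been bucketed into two equal-length per-interval series. The function name and the bucketing are illustrative assumptions, not the method used for the attached graphs.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series.

    xs: hypothetical per-interval DigestMismatchException counts
    ys: hypothetical per-interval p99 latencies (same intervals as xs)
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant series carries no correlation signal.
        return 0.0
    return cov / sqrt(var_x * var_y)
```

A value near 0 would be consistent with no linear dependency between the two series; it would not rule out a delayed or non-linear effect.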

Wed, 12 Dec 2018 at 19:49, Nitan Kainth <nitankai...@gmail.com>:

> DigestMismatchExceptions --> could be due to data being out of sync. Are you
> running repairs?
>
> On Wed, Dec 12, 2018 at 11:39 AM Виталий Савкин <vitaliysav...@gmail.com>
> wrote:
>
>> Hi everyone!
>>
>> A few times a day I see spikes in request latencies on my Cassandra
>> clients. Usually the 99thPercentile is below 100ms, but at those times it
>> grows above 1 second.
>> The type of request doesn't matter: different services are affected, and I
>> found that three absolutely identical requests (to the same partition key,
>> issued within a three-second interval) completed in 1ms, 30ms and 1100ms.
>> Also, I found no correlation between spikes and load patterns. G1 GC does
>> not report any significant (>50ms) delays.
>> A few suspicious things:
>>
>> - nodetool shows that there are dropped READs
>> - there are DigestMismatchExceptions in the logs
>> - in tracing events I see that the event "Executing single-partition
>> query on *" sometimes happens right after "READ message received from
>> /*.*.*.*" (in less than 100 micros) and sometimes after hundreds of
>> milliseconds
>>
>> My cluster runs on six c5.2xlarge Amazon instances; data is stored on
>> EBS. The Cassandra version is 3.10.
>> Any help in explaining this behavior is appreciated. I'm glad to share
>> more details if needed.
>>
>> Thanks,
>> Vitaliy Savkin.
>>
>