Hello,

If your problem is in the read path, there are things you can check to see
what's wrong along the way:

- 'nodetool tablestats' (look at the most important tables first - biggest
volume/throughput). It gives a lot of information at the table level and
is very useful for troubleshooting. If you have any questions about
interpreting the output, just let us know :).
- 'nodetool tablehistograms' will give you information on the local reads.
Ideally latencies are low, around a millisecond (or maybe a few ms,
depending on disk speed, caches...). This should help you understand
whether the local reads themselves are performant, or whether the problem
is rather on the 'coordination' side. Also look at the number of SSTables
hit per read. It should be as low as possible, as each additional SSTable
hit may touch the disk, which is the slowest part of our infrastructure.
Compaction strategy and tuning can help here.
- 'nodetool tpstats' - This shows the thread pool stats. Here we are
especially interested in the pending/blocked/dropped tasks. It's sometimes
enlightening to use the following command during a peak in traffic when the
pressure is high: 'watch -d nodetool tpstats'
- 'nodetool compactionstats -H' - Make sure compactions are keeping up, in
particular if reads are hitting a lot of sstables.
- You can also trace some queries, using cqlsh for example, to see where
the time goes - if you manage to find out which queries are slow. See the
example commands right after this list.
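
To make this concrete, here is roughly what I would run first (just a
sketch - 'my_keyspace', 'my_table' and the query itself are placeholders,
use your own hottest tables and a query you know to be slow):

# Table-level stats, then per-read latency / SSTables-per-read histograms:
nodetool tablestats my_keyspace.my_table
nodetool tablehistograms my_keyspace my_table

# Thread pools and compactions, watched live during a traffic peak:
watch -d nodetool tpstats
nodetool compactionstats -H

# In cqlsh, trace a suspect query to see where the time is spent:
cqlsh> TRACING ON;
cqlsh> SELECT * FROM my_keyspace.my_table WHERE id = 42;
cqlsh> TRACING OFF;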

> Beyond that, debugging usually takes a heap dump and inspection with
> yourkit or MAT or similar
>

Yes. As for similar tools, you can also give sjk a try:
https://github.com/aragozin/jvm-tools

With commands like the following, you could start to understand what is
happening in the heap:

java -jar sjk-0.10.1.jar ttop -p <Cassandra_pid> -n 20 -o CPU
# On my mac/ccm test cluster I ran something like this:
java -jar sjk-0.10.1.jar ttop \
    -p $(ps u | grep cassandra | grep -v grep | awk '{print $2}' | head -n 1) \
    -n 25 -o CPU
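
If the heap itself is the main suspect rather than CPU, sjk also has a
class histogram command. This one is from memory, so double check the exact
options with the built-in help ('java -jar sjk-0.10.1.jar --commands'):

# Heap class histogram (similar to 'jmap -histo'), biggest classes first:
java -jar sjk-0.10.1.jar hh -p <Cassandra_pid> -n 20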


> Anything else I can do to conclude whether this is GC related or not ?
>

In most cases, it is possible to keep stop-the-world GC pauses below 3 - 5%
of the total time (i.e. the node is unavailable, doing GC, 3 to 5% of the
time), which leaves 95+% of the time for the user application, Apache
Cassandra in our case. If you want to share the gc.log files from a spike
in latencies, we could probably let you know how GC is performing.
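
If GC logging is not already enabled, the usual Java 8 flags below should
be enough to produce useful logs. They are normally already present,
commented out, in Cassandra's jvm.options or cassandra-env.sh depending on
the version and packaging, and the log path here is just a placeholder:

-Xloggc:/var/log/cassandra/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCApplicationStoppedTime
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=10M

Running the resulting gc.log through GCViewer or a similar analyzer gives
the percentage of time spent in stop-the-world pauses, which is the number
to compare against the 3 - 5% above.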

What hardware are you using?

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-07-01 8:26 GMT+01:00 Tunay Gür <tunay...@gmail.com>:

> Thanks for the recommendation Jeff, I'll try to get a heap dump next time
> this happens and try the other changes in the meantime.
>
> Also not sure, but this CASSANDRA-13900 looked like it might be related.
>
> On Sat, Jun 30, 2018 at 9:51 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>
>> The young gcs loom suspicious
>>
>> Without seeing the heap it’s hard to be sure, but be sure you’re
>> adjusting your memtable (size and flush threshold), and you may find moving
>> it offheap helps too
>>
>> Beyond that, debugging usually takes a heap dump and inspection with
>> yourkit or MAT or similar
>>
>> 3.0.14.10 reads like a Datastax version - I know there’s a few reports of
>> recyclers not working great in 3.11.x but haven’t seen many heap related
>> leak concerns with 3.0.14
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Jun 30, 2018, at 5:49 PM, Tunay Gür <tunay...@gmail.com> wrote:
>>
>> Dear Cassandra users,
>>
>> I'm observing high coordinator latencies (spikes going over 1sec for P99)
>> without corresponding keyspace read latencies. After researching this list
>> and public web, I focused my investigation around GC, but still couldn't
>> convince myself 100% (mainly because of my lack of experience in JVM GC and
>> Cassandra behavior). I'd appreciate it if you can help me out.
>>
>> *Setup:*
>> - 2DC 40 nodes each
>> - Cassandra Version: 3.0.14.10
>> - G1GC
>> - -Xms30500M -Xmx30500M
>> - Traffic mix:  20K continuous RPS  + 10K continuous WPS + 40K WPS daily
>> bulk ingestion (for 2 hours)
>> - Row cache disabled, Keycache 5GB capacity
>>
>> *Some observations:*
>> - I don't have clear repro steps, but I feel like high coordinator
>> latencies get triggered by some sudden change in traffic (i.e. bulk
>> ingestion or DC failover). For example, the last time it happened, bulk
>> ingestion triggered it and coordinator latencies kept spiraling up until I
>> drained some of the traffic:
>>
>> <Screen Shot 2018-06-30 at 5.15.31 PM.png>
>>
>> - I see corresponding increase in GC warning logs that looks similar to
>> this:
>>
>> G1 Young Generation GC in 3543ms. G1 Eden Space: 1535115264 -> 0; G1 Old
>> Gen: 14851011568 -> 14585937368; G1 Survivor Space: 58720256 -> 83886080;
>> - Also I see the following warnings every once in a while:
>>
>> Not marking nodes down due to local pause of 5169439644 > 5000000000
>>
>> - Looks like the cluster goes into this state after a while, maybe after
>> 10 days or so. Restarting the cluster helps. When things are working I've
>> seen this cluster handling 1M RPS without a problem.
>>
>> - I don't have root access on the machines but I can collect GC logs. I'm
>> not sure I'm interpreting them correctly, but one observation is that a
>> lot more young gen GC is happening, with less memory reclaimed, during
>> latency spikes.
>>
>> Anything else I can do to conclude whether this is GC related or not ?
>>
