CIL

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Saturday, December 17, 2016 5:18 AM
To: user@cassandra.apache.org
Subject: Re: High CPU on nodes

Hi,

What does 'nodetool netstats' looks like on those nodes?

Its not doing any streaming.

we have 30GB heap

How is the JVM / GC doing? Are you using G1GC or CMS? This setting would be bad 
for CMS.

G1. GC is doing fine. I don’t see any long pauses beyond 200 ms.

You can use this tool to understand were the CPU is being used 
https://github.com/aragozin/jvm-tools/blob/master/sjk-core/COMMANDS.md#ttop-command<https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Faragozin%2Fjvm-tools%2Fblob%2Fmaster%2Fsjk-core%2FCOMMANDS.md%23ttop-command&data=02%7C01%7CAnubhav.Kale%40microsoft.com%7Cab2c0fcf99a447694b0908d4267f3036%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636175775106811606&sdata=R%2FouOelExm1C3okjg9zEJsdlCiDRrhy8%2B9n3SIqC4fg%3D&reserved=0>.

I hope that helps,

C*heers,
-----------------------
Alain Rodriguez - @arodream - 
al...@thelastpickle.com<mailto:al...@thelastpickle.com>
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com<https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.thelastpickle.com&data=02%7C01%7CAnubhav.Kale%40microsoft.com%7Cab2c0fcf99a447694b0908d4267f3036%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636175775106811606&sdata=kZPi%2B43OyWGNr%2FAmJsLflVOWkSMI0V7oK4x%2Ff%2FR27BU%3D&reserved=0>



2016-12-17 0:10 GMT+01:00 Anubhav Kale 
<anubhav.k...@microsoft.com<mailto:anubhav.k...@microsoft.com>>:
Hello,

I am trying to fight a high CPU problem on some of our nodes. Thread dumps show 
that it’s not GC threads (we have 30GB heap), iostat %iowait confirms it’s not 
disk (ranges between 0.3 – 0.9%). One of the ways in which the problem 
manifests is that the nodes can’t compact SSTables and it happens randomly. We 
run Cassandra 2.1.13 on Azure Premium Storage (network attached SSDs).

One of the sample threads that was taking high CPU shows :

"pool-13-thread-1" 
#3352<https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsupport.datastax.com%2Fhc%2Frequests%2F3352&data=02%7C01%7CAnubhav.Kale%40microsoft.com%7Cab2c0fcf99a447694b0908d4267f3036%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636175775106811606&sdata=OP%2FepExQP5HyrBitvVlyjCj4cVXpB0zc8Oj5TWapduY%3D&reserved=0>
 prio=5 os_prio=0 tid=0x00007f2275340bb0 nid=0x1b0b runnable 
[0x00007f33ffaae000]
java.lang.Thread.State: RUNNABLE
at java.util.TimSort.gallopRight(TimSort.java:632)
at java.util.TimSort.mergeLo(TimSort.java:739)
at java.util.TimSort.mergeAt(TimSort.java:514)
at java.util.TimSort.mergeCollapse(TimSort.java:441)
at java.util.TimSort.sort(TimSort.java:245)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:175)
at 
org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:163)
at 
org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:200)
at 
org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152)
at 
org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1581)
at 
org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1739)

Looking at code, I can’t figure out why things like this would require a high 
CPU and I don’t find any JIRAs relating this as well. So, what can I do next to 
troubleshoot this ?

Thanks !

Reply via email to