Hello, I am trying to fight a high CPU problem on some of our nodes. Thread dumps show that it's not the GC threads (we have a 30 GB heap), and iostat %iowait confirms it's not disk (it ranges between 0.3% and 0.9%). One of the ways the problem manifests is that the nodes can't compact SSTables, and it happens randomly. We run Cassandra 2.1.13 on Azure Premium Storage (network-attached SSDs).
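In case it helps anyone reproduce the diagnosis, here is a minimal sketch of matching a hot thread to its thread-dump entry. It assumes per-thread CPU is read from something like `top -H -p <pid>`, which reports native thread ids in decimal, while the `nid` field in a HotSpot thread dump is the same id in hex. The helper name `nid_to_tid` is made up for illustration.

```python
import re


def nid_to_tid(dump_header: str) -> int:
    """Return the decimal native thread id from a thread-dump header line.

    The header is expected to contain a field such as ``nid=0x1b0b``.
    """
    match = re.search(r"nid=0x([0-9a-fA-F]+)", dump_header)
    if match is None:
        raise ValueError("no nid field found in thread-dump header")
    return int(match.group(1), 16)


# Header line of the sample thread from this post.
header = (
    '"pool-13-thread-1" #3352 prio=5 os_prio=0 '
    "tid=0x00007f2275340bb0 nid=0x1b0b runnable [0x00007f33ffaae000]"
)

# Decimal id to look for in the per-thread CPU listing.
print(nid_to_tid(header))  # prints 6923
```

Going the other way, `format(6923, "x")` gives `1b0b`, which can be searched for in the dump.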
One of the sample threads that was taking high CPU shows:

"pool-13-thread-1" #3352 prio=5 os_prio=0 tid=0x00007f2275340bb0 nid=0x1b0b runnable [0x00007f33ffaae000]
   java.lang.Thread.State: RUNNABLE
        at java.util.TimSort.gallopRight(TimSort.java:632)
        at java.util.TimSort.mergeLo(TimSort.java:739)
        at java.util.TimSort.mergeAt(TimSort.java:514)
        at java.util.TimSort.mergeCollapse(TimSort.java:441)
        at java.util.TimSort.sort(TimSort.java:245)
        at java.util.Arrays.sort(Arrays.java:1512)
        at java.util.ArrayList.sort(ArrayList.java:1454)
        at java.util.Collections.sort(Collections.java:175)
        at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:163)
        at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:200)
        at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152)
        at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1581)
        at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1739)

Looking at the code, I can't figure out why something like this would require so much CPU, and I can't find any JIRAs relating to it either. So, what can I do next to troubleshoot this? Thanks!