Hi,

What does 'nodetool netstats' look like on those nodes?

> we have 30GB heap

How is the JVM / GC doing? Are you using G1GC or CMS? A 30GB heap would be a bad setting for CMS.

You can use this tool to understand where the CPU is being used:
https://github.com/aragozin/jvm-tools/blob/master/sjk-core/COMMANDS.md#ttop-command

I hope that helps,

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-12-17 0:10 GMT+01:00 Anubhav Kale <anubhav.k...@microsoft.com>:

> Hello,
>
> I am trying to fight a high CPU problem on some of our nodes. Thread dumps
> show that it’s not GC threads (we have 30GB heap), and iostat %iowait confirms
> it’s not disk (ranges between 0.3 – 0.9%). One of the ways in which the
> problem manifests is that the nodes can’t compact SSTables, and it happens
> randomly. We run Cassandra 2.1.13 on Azure Premium Storage (network
> attached SSDs).
>
> One of the sample threads that was taking high CPU shows:
>
> "pool-13-thread-1" #3352 prio=5 os_prio=0 tid=0x00007f2275340bb0 nid=0x1b0b runnable [0x00007f33ffaae000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.TimSort.gallopRight(TimSort.java:632)
>         at java.util.TimSort.mergeLo(TimSort.java:739)
>         at java.util.TimSort.mergeAt(TimSort.java:514)
>         at java.util.TimSort.mergeCollapse(TimSort.java:441)
>         at java.util.TimSort.sort(TimSort.java:245)
>         at java.util.Arrays.sort(Arrays.java:1512)
>         at java.util.ArrayList.sort(ArrayList.java:1454)
>         at java.util.Collections.sort(Collections.java:175)
>         at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:163)
>         at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:200)
>         at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152)
>         at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1581)
>         at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1739)
>
> Looking at the code, I can’t figure out why things like this would require
> high CPU, and I can’t find any JIRAs relating to this either. So, what can I
> do next to troubleshoot this?
>
> Thanks!
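
P.S. In case it helps to see what a per-thread CPU view boils down to, here is a minimal sketch using the standard java.lang.management.ThreadMXBean API. It is not the linked tool's implementation, only an illustration of the general idea: read each thread's CPU time twice and rank threads by the difference. The class name ThreadCpuSampler and its sample method are made up for this example, and as written it inspects only the JVM it runs in. Looking at a Cassandra node this way would require obtaining the same MXBean over a JMX connection, which is an assumption not shown here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration: ranks threads of the current JVM by CPU used in an interval.
public class ThreadCpuSampler {

    /** One sampled thread: its name and the CPU time it consumed during the interval. */
    public static final class Usage {
        public final String threadName;
        public final long cpuNanos;

        Usage(String threadName, long cpuNanos) {
            this.threadName = threadName;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns up to topN threads, sorted by CPU time consumed over intervalMillis, busiest first. */
    public static List<Usage> sample(long intervalMillis, int topN) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported()) {
            return new ArrayList<>(); // this JVM cannot report per-thread CPU time
        }
        if (!mx.isThreadCpuTimeEnabled()) {
            mx.setThreadCpuTimeEnabled(true);
        }

        // First reading: CPU nanoseconds per thread id (-1 means the thread is gone).
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id));
        }

        try {
            Thread.sleep(intervalMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag and use what we have
        }

        // Second reading: keep only threads that were alive at both readings.
        List<Usage> usages = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            Long start = before.get(id);
            long end = mx.getThreadCpuTime(id);
            ThreadInfo info = mx.getThreadInfo(id);
            if (start == null || start < 0 || end < 0 || info == null) {
                continue;
            }
            usages.add(new Usage(info.getThreadName(), end - start));
        }

        usages.sort((a, b) -> Long.compare(b.cpuNanos, a.cpuNanos));
        return new ArrayList<>(usages.subList(0, Math.min(topN, usages.size())));
    }

    public static void main(String[] args) {
        for (Usage u : sample(1000, 10)) {
            System.out.printf("%-40s %8.1f ms%n", u.threadName, u.cpuNanos / 1e6);
        }
    }
}
```

Thread names near the top of such a ranking can then be matched against the names in a thread dump (for example "pool-13-thread-1" above) to see which stacks are actually burning CPU.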