https://issues.apache.org/jira/browse/CASSANDRA-6908
Disable DynamicSnitch by adding the following to cassandra.yaml (it is
not in the file by default):

dynamic_snitch: false

On Wed, Dec 21, 2016 at 8:40 AM, Anubhav Kale <[email protected]> wrote:

> CIL
>
> *From:* Alain RODRIGUEZ [mailto:[email protected]]
> *Sent:* Saturday, December 17, 2016 5:18 AM
> *To:* [email protected]
> *Subject:* Re: High CPU on nodes
>
> Hi,
>
> What does 'nodetool netstats' look like on those nodes?
>
> *It’s not doing any streaming.*
>
> we have 30GB heap
>
> How is the JVM / GC doing? Are you using G1GC or CMS? This setting would
> be bad for CMS.
>
> *G1. GC is doing fine. I don’t see any long pauses beyond 200 ms.*
>
> You can use this tool to understand where the CPU is being used:
> https://github.com/aragozin/jvm-tools/blob/master/sjk-core/COMMANDS.md#ttop-command
>
> I hope that helps,
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - [email protected]
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2016-12-17 0:10 GMT+01:00 Anubhav Kale <[email protected]>:
>
> Hello,
>
> I am trying to fight a high CPU problem on some of our nodes.
> Thread dumps show that it’s not GC threads (we have 30GB heap), and
> iostat %iowait confirms it’s not disk (ranges between 0.3 – 0.9%). One
> of the ways in which the problem manifests is that the nodes can’t
> compact SSTables, and it happens randomly. We run Cassandra 2.1.13 on
> Azure Premium Storage (network attached SSDs).
>
> One of the sample threads that was taking high CPU shows:
>
> "pool-13-thread-1" #3352 prio=5 os_prio=0 tid=0x00007f2275340bb0 nid=0x1b0b runnable [0x00007f33ffaae000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.TimSort.gallopRight(TimSort.java:632)
>         at java.util.TimSort.mergeLo(TimSort.java:739)
>         at java.util.TimSort.mergeAt(TimSort.java:514)
>         at java.util.TimSort.mergeCollapse(TimSort.java:441)
>         at java.util.TimSort.sort(TimSort.java:245)
>         at java.util.Arrays.sort(Arrays.java:1512)
>         at java.util.ArrayList.sort(ArrayList.java:1454)
>         at java.util.Collections.sort(Collections.java:175)
>         at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:163)
>         at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:200)
>         at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152)
>         at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1581)
>         at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1739)
>
> Looking at the code, I can’t figure out why things like this would
> require high CPU, and I can’t find any JIRAs relating to this either.
> So, what can I do next to troubleshoot this?
>
> Thanks!

--
-----------------
Nate McCall
Wellington, NZ
@zznate

CTO
Apache Cassandra Consulting
http://www.thelastpickle.com
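
For anyone following the same troubleshooting path: a rough sketch of how a busy OS thread can be matched to an entry in a thread dump like the one quoted above. It assumes a HotSpot JVM on Linux, where the `nid` field is the native thread ID in hex (the same ID that `top -H` shows in decimal). The helper name `find_thread_by_tid` and the sample dump text are illustrative only, not part of any Cassandra or JDK tooling.

```python
import re

def find_thread_by_tid(thread_dump: str, tid: int):
    """Return the header line of the thread whose nid equals the
    decimal OS thread ID `tid`, or None if no thread matches."""
    for line in thread_dump.splitlines():
        # Thread header lines carry the native ID as hex, e.g. nid=0x1b0b
        match = re.search(r"\bnid=(0x[0-9a-fA-F]+)", line)
        if match and int(match.group(1), 16) == tid:
            return line.strip()
    return None

# Sample text copied from the thread dump in this thread (hypothetical input).
SAMPLE_DUMP = '''
"pool-13-thread-1" #3352 prio=5 os_prio=0 tid=0x00007f2275340bb0 nid=0x1b0b runnable [0x00007f33ffaae000]
   java.lang.Thread.State: RUNNABLE
        at java.util.TimSort.gallopRight(TimSort.java:632)
'''

if __name__ == "__main__":
    # 0x1b0b is 6923 in decimal, the value top -H would report.
    print(find_thread_by_tid(SAMPLE_DUMP, 6923))
```

In practice the dump text would come from `jstack <pid>` and the thread ID from `top -H -p <pid>`; the sjk `ttop` command mentioned earlier in the thread reports per-thread CPU directly and may make this step unnecessary.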
