Jonathan Ellis <jbellis <at> gmail.com> writes:

> > Another thing that is odd is that even when the server nodes are quiescent
> > because compacting is complete, I am still seeing cpu usage stay at
> > about 40%. Even after several hours, no reading or writing to the database
> > and all compactions complete, the cpu usage is staying around 40%.
>
> Possibly this is Hinted Handoff scanning going on. You can rm
> data/system/Hint* (while the node is shut down) if you want to take a
> shot in the dark. Otherwise you'll want to follow
> http://publib.boulder.ibm.com/infocenter/javasdk/tools/index.jsp?topic=/com.ibm.java.doc.igaa/_1vg0001475cb4a-1190e2e0f74-8000_1007.html
> to figure out which thread is actually consuming the CPU.

Thank you for all of the helpful advice. We upgraded to 0.6.2 to see if any
problems would resolve, but we're still seeing the 40% CPU usage on certain
nodes several hours after any reads or writes have taken place.

We attached to one of these super busy Cassandra nodes and have isolated it to
a single thread, "Thread-32", that seems to be buzzing in a tight loop. The
loop is in IncomingStreamReader.java, line 62, a 3-line while loop.
bytesRead is not changing. pendingFile.getExpectedBytes() returns
7,161,538,639 but bytesRead is stuck at 2,147,483,647.

The stack is:

sun.nio.ch.Util.getTemporaryDirectBuffer() - line 50
sun.nio.ch.FileChannelImpl.transferFromArbitraryChannel() - line 556
sun.nio.ch.FileChannelImpl.transferFrom() - line 603
org.apache.cassandra.streaming.IncomingStreamReader.read() - line 62
org.apache.cassandra.net.IncomingTcpConnection.run() - line 66

This same 3-line loop in IncomingStreamReader was also getting stuck last
night with 0.6.1, so whatever it is is still happening in 0.6.2.

Thanks for your help,
Julie
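
A note on the numbers above: 2,147,483,647 is exactly Integer.MAX_VALUE (2^31 - 1) in Java, and the expected size of 7,161,538,639 is larger than that. The sketch below is not the Cassandra or JDK code. It is a minimal, hypothetical simulation (the class name and both helper methods are made up for illustration) of how a copy loop could spin forever if its progress were capped somewhere at the int range. Whether that is what actually happens in this stack is an assumption that has not been verified.

```java
public class StuckCounterSketch {
    // Values reported in the message above.
    static final long EXPECTED_BYTES = 7_161_538_639L;
    static final long STUCK_AT = 2_147_483_647L;

    // Hypothetical helper: saturates a byte count at the int range.
    static long clampToInt(long value) {
        return Math.min(value, Integer.MAX_VALUE);
    }

    // Hypothetical helper: simulates a copy loop whose progress is capped
    // at Integer.MAX_VALUE. Returns true if the loop reaches `expected`
    // within `maxIterations`, false if it is still short of it.
    static boolean loopFinishes(long expected, long chunk, int maxIterations) {
        long bytesRead = 0;
        for (int i = 0; i < maxIterations && bytesRead < expected; i++) {
            bytesRead = clampToInt(bytesRead + chunk);
        }
        return bytesRead >= expected;
    }

    public static void main(String[] args) {
        // The stuck value matches the largest int exactly.
        System.out.println(STUCK_AT == Integer.MAX_VALUE);
        // The expected size does not fit in an int.
        System.out.println(EXPECTED_BYTES > Integer.MAX_VALUE);
        // A capped counter never reaches the expected size.
        System.out.println(loopFinishes(EXPECTED_BYTES, 1L << 30, 100));
    }
}
```

If something like this is going on, files smaller than 2 GB should stream without getting stuck, which may be worth checking against the nodes that do not show the 40% CPU usage.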