A thread dump on one of the machines showing suspiciously high CPU might help figure out what is taking all that CPU.
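Something along these lines usually narrows it down (a rough sketch only -- the PID and the hex thread id are placeholders, and it assumes a JDK with jstack on the path):

    top -H -p <cassandra-pid>        # find the TID of the thread burning CPU
    printf '%x\n' <tid>              # convert that decimal TID to hex
    jstack <cassandra-pid> > /tmp/threads.txt
    grep -A 20 'nid=0x<hex-tid>' /tmp/threads.txt   # stack of the hot thread

If jstack won't attach because the node is struggling, kill -3 <cassandra-pid> writes the same thread dump to the JVM's stdout (wherever your startup script sends it).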
On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kwri...@nanigans.com> wrote:

> Some last-minute info on this to hopefully enlighten. We are doing ~200
> reads and writes across our 7-node SSD cluster right now (we can usually do
> closer to 20K reads at least) and seeing CPU load as follows for the nodes
> (with some ParNew numbers to give an idea of GC):
>
> 001 – 1200% (ParNew at 120 ms/sec)
> 002 – 6% (ParNew at 0)
> 003 – 600% (ParNew at 45 ms/sec)
> 004 – 900%
> 005 – 500%
> 006 – 10%
> 007 – 130%
>
> There are no compactions running on 001; however, I did see a broken pipe
> error in the logs there (see below). Netstats for 001 shows nothing
> pending. It appears that all of the load/latency is related to one column
> family. You can see cfstats & cfhistograms output below; note that we are
> using LCS. I have brought the odd cfhistograms behavior to the thread
> before and am not sure what's going on there. We are in a production-down
> situation right now, so any help would be much appreciated!
>
> Column Family: global_user
> SSTable count: 7546
> SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
> Space used (live): 83848742562
> Space used (total): 83848742562
> Number of Keys (estimate): 549792896
> Memtable Columns Count: 526746
> Memtable Data Size: 117408252
> Memtable Switch Count: 0
> Read Count: 11673
> Read Latency: 1950.062 ms.
> Write Count: 118588
> Write Latency: 0.080 ms.
> Pending Tasks: 0
> Bloom Filter False Positives: 4322
> Bloom Filter False Ratio: 0.84066
> Bloom Filter Space Used: 383507440
> Compacted row minimum size: 73
> Compacted row maximum size: 2816159
> Compacted row mean size: 324
>
> [kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
> users/global_user histograms
> Offset      SSTables  Write Latency  Read Latency    Row Size  Column Count
> 1               8866              0             0           0          3420
> 2               1001              0             0           0      99218975
> 3               1249              0             0           0     319713048
> 4               1074              0             0           0      25073893
> 5                132              0             0           0      15359199
> 6                  0              0             0           0      27794925
> 7                  0             12             0           0       7954974
> 8                  0             23             0           0       7733934
> 10                 0            184             0           0      13276275
> 12                 0            567             0           0       9077508
> 14                 0           1098             0           0       5879292
> 17                 0           2722             0           0       5693471
> 20                 0           4379             0           0       3204131
> 24                 0           8928             0           0       2614995
> 29                 0          13525             0           0       1824584
> 35                 0          16759             0           0       1265911
> 42                 0          17048             0           0        868075
> 50                 0          14162             5           0        596417
> 60                 0          11806            15           0        467747
> 72                 0           8569           108           0        354276
> 86                 0           7042           276         227        269987
> 103                0           5936           372        2972        218931
> 124                0           4538           577         157        181360
> 149                0           2981          1076     7388090        144298
> 179                0           1929          1529    90535838        116628
> 215                0           1081          1450   182701876         93378
> 258                0            499          1125   141393480         74052
> 310                0            124           756    18883224         58617
> 372                0             31           460    24599272         45453
> 446                0             25           247    23516772         34310
> 535                0             10           146    13987584         26168
> 642                0             20           194    12091458         19965
> 770                0              8           196     9269197         14649
> 924                0              9           340     8082898         11015
> 1109               0              9           225     4762865          8058
> 1331               0              9           154     3330110          5866
> 1597               0              8           144     2367615          4275
> 1916               0              1           188     1633608          3087
> 2299               0              4           216     1139820          2196
> 2759               0              5           201      819019          1456
> 3311               0              4           194      600522          1135
> 3973               0              6           181      454566           786
> 4768               0             13           136      353886           587
> 5722               0              6           152      280630           400
> 6866               0              5            80      225545           254
> 8239               0              6           112      183285           138
> 9887               0              0            68      149820           109
> 11864              0              5            99      121722            66
> 14237              0             57            86       98352            50
> 17084              0             18            99       79085            35
> 20501              0              1            93       62423            11
> 24601              0              0            61       49471             9
> 29521              0              0            69       37395             5
> 35425              0              4            56       28611             6
> 42510              0              0            57       21876             1
> 51012              0              9            60       16105             0
> 61214              0              0            52       11996             0
> 73457              0              0            50        8791             0
> 88148              0              0            38        6430             0
> 105778             0              0            25        4660             0
> 126934             0              0            15        3308             0
> 152321             0              0             2        2364             0
> 182785             0              0             0        1631             0
> 219342             0              0             0        1156             0
> 263210             0              0             0         887             0
> 315852             0              0             0         618             0
> 379022             0              0             0         427             0
> 454826             0              0             0         272             0
> 545791             0              0             0         168             0
> 654949             0              0             0         115             0
> 785939             0              0             0          61             0
> 943127             0              0             0          58             0
> 1131752            0              0             0          34             0
> 1358102            0              0             0          19             0
> 1629722            0              0             0           9             0
> 1955666            0              0             0           4             0
> 2346799            0              0             0           5             0
> 2816159            0              0             0           2             0
> 3379391            0              0             0           0             0
> 4055269            0              0             0           0             0
> 4866323            0              0             0           0             0
> 5839588            0              0             0           0             0
> 7007506            0              0             0           0             0
> 8409007            0              0             0           0             0
> 10090808           0              0             0           0             0
> 12108970           0              0             0           0             0
> 14530764           0              0             0           0             0
> 17436917           0              0             0           0             0
> 20924300           0              0             0           0             0
> 25109160           0              0             0           0             0
>
> ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
> java.lang.RuntimeException: java.io.IOException: Broken pipe
>     at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
>     at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
>     at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
>     at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
>     at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
>     at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
>     at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
>     at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
>     at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
>     at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
> Caused by: java.io.IOException: Broken pipe
>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
>     at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
>     at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
>     at java.nio.channels.Channels.writeFully(Channels.java:98)
>     at java.nio.channels.Channels.access$000(Channels.java:61)
>     at java.nio.channels.Channels$1.write(Channels.java:174)
>     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>     at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
>     at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
>     at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
>     at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
>     at java.io.DataOutputStream.write(DataOutputStream.java:107)
>     at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
>     at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
>     at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
>     ... 9 more
>
>
> From: Keith Wright <kwri...@nanigans.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Wednesday, August 21, 2013 2:35 AM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: Nodes get stuck
>
> Still looking for help! We have stopped almost ALL traffic to the cluster,
> and still some nodes are showing almost 1000% CPU for cassandra with no
> iostat activity.
> We were running cleanup on one of the nodes that was not showing load
> spikes; however, now that I have attempted to stop cleanup there via
> nodetool stop cleanup, the java task for stopping cleanup is itself at
> 1500% and has not returned after 2 minutes. This is VERY odd behavior.
> Any ideas? Hardware failure? Network? We are not seeing anything there
> but wanted to get ideas.
>
> Thanks
>
> From: Keith Wright <kwri...@nanigans.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Tuesday, August 20, 2013 8:32 PM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Nodes get stuck
>
> Hi all,
>
> We are using C* 1.2.4 with vnodes and SSDs. We have seen behavior recently
> where 3 of our nodes get locked up in high load in what appears to be a GC
> spiral while the rest of the cluster (7 nodes total) appears fine. When I
> run tpstats, I see the following (assuming tpstats returns at all), and top
> shows cassandra pegged at 2000%. Obviously we have a large number of
> blocked reads. In the past I could explain this by unexpectedly wide rows;
> however, we have handled that. When the cluster starts to melt down like
> this, it's hard to get visibility into what's going on and what triggered
> the issue, as everything starts to pile on. OpsCenter becomes unusable,
> and because the affected nodes are under GC pressure, getting any data via
> nodetool or JMX is also difficult. What do people do to handle these
> situations? We are going to start graphing reads/writes/sec/CF to Ganglia
> in the hopes that it helps.
>
> Thanks
>
> Pool Name               Active  Pending   Completed  Blocked  All time blocked
> ReadStage                  256      381  1245117434        0                 0
> RequestResponseStage         0        0  1161495947        0                 0
> MutationStage                8        8   481721887        0                 0
> ReadRepairStage              0        0    85770600        0                 0
> ReplicateOnWriteStage        0        0    21896804        0                 0
> GossipStage                  0        0     1546196        0                 0
> AntiEntropyStage             0        0        5009        0                 0
> MigrationStage               0        0        1082        0                 0
> MemtablePostFlusher          0        0       10178        0                 0
> FlushWriter                  0        0        6081        0              2075
> MiscStage                    0        0          57        0                 0
> commitlog_archiver           0        0           0        0                 0
> AntiEntropySessions          0        0           0        0                 0
> InternalResponseStage        0        0           6        0                 0
> HintedHandoff                1        1         246        0                 0
>
> Message type        Dropped
> RANGE_SLICE             482
> READ_REPAIR               0
> BINARY                    0
> READ                 515762
> MUTATION                 39
> _TRACE                    0
> REQUEST_RESPONSE         29
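On the GC-spiral question in the quoted mail above: when nodetool and JMX stop responding, jstat can usually still read the collector counters, since it goes through the JVM's on-disk perf data rather than JMX. A minimal sketch (run it as the user that owns the Cassandra process; the column names are from the JDK 6/7 -gcutil output):

    jstat -gcutil <cassandra-pid> 1000
    # S0/S1/E/O/P are survivor/eden/old/perm occupancy percentages;
    # YGC/YGCT and FGC/FGCT are young/full collection counts and cumulative times.
    # An FGC count that climbs on every sample while O stays near 100% is the
    # usual signature of a GC death spiral.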
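And for the per-CF reads/writes graphing mentioned above, until the Ganglia plumbing is in place you can get rough rates by sampling the cumulative counters that cfstats already prints. A sketch only -- the column family name comes from the output above, and the awk is tied to the 1.2-era cfstats format:

    CF=global_user                 # column family to watch
    while true; do
        ts=$(date +%H:%M:%S)
        nodetool cfstats | awk -v cf="$CF" -v ts="$ts" '
            $0 ~ "Column Family: " cf { hit = 1 }
            hit && /Read Count:/      { r = $3 }
            hit && /Write Count:/     { print ts, "reads=" r, "writes=" $3; exit }'
        sleep 10
    done
    # The counters are cumulative since node start, so the deltas between
    # samples give reads/writes per 10-second interval.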