So after 4096 messages get pushed onto the row-read-stage queue (or any other multiThreadedStage), the deserializer basically becomes a single-threaded blocking queue that prevents any other inter-node RPC from occurring? That sounds like a problem either way. If the row-read stage is what's backed up, why not have the messages stack up on that stage instead?
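For anyone who wants to see the JDK mechanics behind the CASSANDRA-1358 bug described below, here's a minimal standalone sketch (not Cassandra's actual code; the pool sizes and task counts are made up for illustration). A ThreadPoolExecutor only grows past its core size when its work queue rejects an offer, and an unbounded LinkedBlockingQueue never rejects one, so a core size of 1 pins the pool to a single thread no matter how large the maximum is:

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class UnboundedQueueDemo {
        public static void main(String[] args) throws InterruptedException {
            // Illustrative only: same shape as the configuration described in
            // CASSANDRA-1358 (core size 1, unbounded queue), not Cassandra's source.
            // A max size of 16 looks like it should fan work out to 16 threads.
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    1, 16, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());

            // Submit far more work than a single thread can keep up with.
            for (int i = 0; i < 10000; i++) {
                pool.execute(new Runnable() {
                    public void run() {
                        try { Thread.sleep(10); } catch (InterruptedException e) { }
                    }
                });
            }

            Thread.sleep(1000);
            // Prints "pool size: 1" with thousands of tasks still queued: the
            // executor only spawns threads beyond the core size when queue.offer()
            // fails, and an unbounded queue never fails an offer.
            System.out.println("pool size: " + pool.getPoolSize()
                    + ", queued: " + pool.getQueue().size());
            pool.shutdownNow();
        }
    }

The usual way around that quirk is to set the core size equal to the max size (or to use a bounded queue) so the pool actually grows when it falls behind.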
Mike

On Wed, Aug 4, 2010 at 11:46 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> No, MDP is backing up because Row-Read-Stage [the stage after MDP on
> reads] is full at 4096, meaning you're not able to process reads as
> quickly as the requests are coming in.
>
> On Wed, Aug 4, 2010 at 2:21 PM, Mike Malone <m...@simplegeo.com> wrote:
> > This may be your problem:
> > https://issues.apache.org/jira/browse/CASSANDRA-1358
> >
> > The message deserializer executor is being created with a core pool
> > size of 1. Since it uses a queue with unbounded capacity, new requests
> > are always queued and the thread pool never grows. So the message
> > deserializer becomes a single-threaded bottleneck through which all
> > traffic must pass, and your 16 cores are reduced to one core for
> > handling all inter-node communication (and any intra-node
> > communication that's being passed through the messaging service).
> >
> > Mike
> >
> > On Tue, Aug 3, 2010 at 10:02 PM, Dathan Pattishall
> > <datha...@gmail.com> wrote:
> >>
> >> The output of htop shows threads as procs, with a breakdown of how
> >> much cpu/etc. per thread (in ncurses color!). All of these Java
> >> "procs" are just Java threads of only one instance of Cassandra per
> >> server.
> >>
> >> On Sat, Jul 31, 2010 at 3:45 PM, Benjamin Black <b...@b3k.us> wrote:
> >>>
> >>> Sorry, I just noticed: are you running 14 instances of Cassandra on
> >>> a single physical machine, or are all those java processes something
> >>> else?
> >>>
> >>> On Mon, Jul 26, 2010 at 12:22 PM, Dathan Pattishall
> >>> <datha...@gmail.com> wrote:
> >>> > I have 4 nodes on enterprise-type hardware (lots of RAM: 12GB, 16
> >>> > i7 cores, RAID disks).
> >>> >
> >>> > ~# /opt/cassandra/bin/nodetool --host=localhost --port=8181 tpstats
> >>> > Pool Name                    Active  Pending  Completed
> >>> > STREAM-STAGE                      0        0          0
> >>> > RESPONSE-STAGE                    0        0     516280
> >>> > ROW-READ-STAGE                    8     4096    1164326
> >>> > LB-OPERATIONS                     0        0          0
> >>> > MESSAGE-DESERIALIZER-POOL         1   682008    1818682
> >>> > GMFD                              0        0       6467
> >>> > LB-TARGET                         0        0          0
> >>> > CONSISTENCY-MANAGER               0        0     661477
> >>> > ROW-MUTATION-STAGE                0        0     998780
> >>> > MESSAGE-STREAMING-POOL            0        0          0
> >>> > LOAD-BALANCER-STAGE               0        0          0
> >>> > FLUSH-SORTER-POOL                 0        0          0
> >>> > MEMTABLE-POST-FLUSHER             0        0          4
> >>> > FLUSH-WRITER-POOL                 0        0          4
> >>> > AE-SERVICE-STAGE                  0        0          0
> >>> > HINTED-HANDOFF-POOL               0        0          3
> >>> >
> >>> > EQX r...@cass04:~# vmstat -n 1
> >>> > procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
> >>> >  r  b  swpd    free   buff    cache  si  so    bi   bo    in    cs us sy id wa st
> >>> >  6 10  7096  121816  16244 10375492   0   0     1    3     0     0  5  1 94  0  0
> >>> >  2 10  7096  116484  16248 10381144   0   0  5636    4 21210  9820  2  1 79 18  0
> >>> >  1  9  7096  108920  16248 10387592   0   0  6216    0 21439  9878  2  1 81 16  0
> >>> >  0  9  7096  129108  16248 10364852   0   0  6024    0 23280  8753  2  1 80 17  0
> >>> >  2  9  7096  122460  16248 10370908   0   0  6072    0 20835  9461  2  1 83 14  0
> >>> >  2  8  7096  115740  16260 10375752   0   0  5168  292 21049  9511  3  1 77 20  0
> >>> >  1 10  7096  108424  16260 10382300   0   0  6244    0 21483  8981  2  1 75 22  0
> >>> >  3  8  7096  125028  16260 10364104   0   0  5584    0 21238  9436  2  1 81 16  0
> >>> >  3  9  7096  117928  16260 10370064   0   0  5988    0 21505 10225  2  1 77 19  0
> >>> >  1  8  7096  109544  16260 10376640   0   0  6340   28 20840  8602  2  1 80 18  0
> >>> >  0  9  7096  127028  16240 10357652   0   0  5984    0 20853  9158  2  1 79 18  0
> >>> >  9  0  7096  121472  16240 10363492   0   0  5716    0 20520  8489  1  1 82 16  0
> >>> >  3  9  7096  112668  16240 10369872   0   0  6404    0 21314  9459  2  1 84 13  0
> >>> >  1  9  7096  127300  16236 10353440   0   0  5684    0 38914 10068  2  1 76 21  0
> >>> >
> >>> > But the 16 cores are hardly utilized, which indicates to me there
> >>> > is some bad thread thrashing. But why?
> >>> >
> >>> >   1 [|||||         8.3%]   Tasks: 1070 total, 1 running
> >>> >   2 [              0.0%]   Load average: 8.34 9.05 8.82
> >>> >   3 [              0.0%]   Uptime: 192 days(!), 15:29:52
> >>> >   4 [|||||||||||  17.9%]
> >>> >   5 [|||||         5.7%]
> >>> >   6 [||            1.3%]
> >>> >   7 [||            2.6%]
> >>> >   8 [|             0.6%]
> >>> >   9 [|             0.6%]
> >>> >  10 [||            1.9%]
> >>> >  11 [||            1.9%]
> >>> >  12 [||            1.9%]
> >>> >  13 [||            1.3%]
> >>> >  14 [|             0.6%]
> >>> >  15 [||            1.3%]
> >>> >  16 [|             0.6%]
> >>> >  Mem[||||||||||||||||||||||||||||||||||||||||||||1791/12028MB]
> >>> >  Swp[|                                              6/1983MB]
> >>> >
> >>> >   PID USER PRI NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
> >>> > 30269 root  40  0 14100  2116   900 R  4.0  0.0  0:00.49 htop
> >>> > 24878 root  40  0 20.6G 8345M 6883M D  3.0 69.4  1:23.03 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24879 root  40  0 20.6G 8345M 6883M D  3.0 69.4  1:22.93 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24874 root  40  0 20.6G 8345M 6883M D  2.0 69.4  1:22.73 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24880 root  40  0 20.6G 8345M 6883M D  2.0 69.4  1:22.93 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24875 root  40  0 20.6G 8345M 6883M D  2.0 69.4  1:23.17 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24658 root  40  0 20.6G 8345M 6883M D  2.0 69.4  1:23.06 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24877 root  40  0 20.6G 8345M 6883M S  2.0 69.4  1:23.43 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24873 root  40  0 20.6G 8345M 6883M D  1.0 69.4  1:23.65 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24876 root  40  0 20.6G 8345M 6883M S  1.0 69.4  1:23.62 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24942 root  40  0 20.6G 8345M 6883M S  1.0 69.4  0:23.50 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24943 root  40  0 20.6G 8345M 6883M S  0.0 69.4  0:29.53 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24933 root  40  0 20.6G 8345M 6883M S  0.0 69.4  0:22.57 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 24939 root  40  0 20.6G 8345M 6883M S  0.0 69.4  0:12.73 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>> > 25280 root  40  0 20.6G 8345M 6883M S  0.0 69.4  0:00.10 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> >>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com