Ah, the weird thing is that I/O is assumed to be the limiting factor, but IOPS on the box were very low. Service time and wait time (await) were also very low, and data throughput was only about 6 MB a second. Given all of this, I'm inclined to believe the problem is somewhere else.
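For reference, this is roughly what I'm looking at when I say IOPS and service time are low (a sketch; the interval is arbitrary, and the interesting columns are noted in the comment):

  iostat -x -k 5
  # r/s + w/s = read/write IOPS, await/svctm = wait/service time in ms,
  # rkB/s / wkB/s = throughput, %util = how busy the device is

I've also put a couple of follow-up checks (cfstats and the row cache) at the bottom of this mail, below the quoted thread.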
Maybe there is a preferred Java version for Cassandra 0.6.3? I am not
running the latest 1.6 release in production.

On Tue, Jul 27, 2010 at 12:01 AM, Thorvaldsson Justus
<justus.thorvalds...@svenskaspel.se> wrote:

> AFAIK you could use more nodes and read from them in parallel, making your
> read rate go up. Not writing and reading to the same disk may also help
> some. It's not so much about "Cassandra's" read rate as what your hardware
> can manage.
>
> /Justus
>
> *From:* Dathan Pattishall [mailto:datha...@gmail.com]
> *Sent:* 27 July 2010 08:56
> *To:* user@cassandra.apache.org
> *Subject:* Re: what causes MESSAGE-DESERIALIZER-POOL to spike
>
> On Mon, Jul 26, 2010 at 8:30 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>
> MDP is backing up because RRS is full at 4096. This means you're not
> able to process reads as quickly as the requests are coming in. Make
> whatever is doing those reads be less aggressive.
>
> So, for Cassandra to function correctly I need to throttle my reads? What
> request rate is ideal? Hundreds of reads a second? Thousands? I would love
> to do hundreds of thousands of reads a second. Is Cassandra not suited for
> this?
>
> As to why the reads are slow in the first place, usually this means
> you are disk i/o bound. Posting your cfstats can help troubleshoot
> but is no substitute for thinking about your application workload.
>
> How should I think about my application workload? I use Cassandra as a
> distributed hash table, accessing it by individual keys (O(1)). I randomly
> hit a node through an F5 load balancer, using the CF definition from the
> sample storage-conf.xml. Each key is no more than 30 bytes, and the value
> is a timestamp. I store a total of 20 million keys and update 1.5 million
> keys a day. Is there anything else I should really think about? What are
> the limitations in Cassandra that would affect this workload?
>
> On Mon, Jul 26, 2010 at 12:32 PM, Anthony Molinaro
> <antho...@alumni.caltech.edu> wrote:
> > It's usually I/O which causes backup in MESSAGE-DESERIALIZER-POOL. You
> > should check iostat and see what it looks like. It may be that you
> > need more nodes in order to deal with the read/write rate. You can also
> > use JMX to get latency values on reads and writes and see if the backup
> > has a corresponding increase in latency. You may be able to get more
> > out of your hardware and memory with row caching, but that really
> > depends on your data set.
> >
> > -Anthony
> >
> > On Mon, Jul 26, 2010 at 12:22:46PM -0700, Dathan Pattishall wrote:
> >> I have 4 nodes on enterprise-type hardware (lots of RAM: 12GB, 16 i7
> >> cores, RAID disks).
> >>
> >> ~# /opt/cassandra/bin/nodetool --host=localhost --port=8181 tpstats
> >> Pool Name                    Active   Pending   Completed
> >> STREAM-STAGE                      0         0           0
> >> RESPONSE-STAGE                    0         0      516280
> >> ROW-READ-STAGE                    8      4096     1164326
> >> LB-OPERATIONS                     0         0           0
> >> *MESSAGE-DESERIALIZER-POOL        1    682008     1818682*
> >> GMFD                              0         0        6467
> >> LB-TARGET                         0         0           0
> >> CONSISTENCY-MANAGER               0         0      661477
> >> ROW-MUTATION-STAGE                0         0      998780
> >> MESSAGE-STREAMING-POOL            0         0           0
> >> LOAD-BALANCER-STAGE               0         0           0
> >> FLUSH-SORTER-POOL                 0         0           0
> >> MEMTABLE-POST-FLUSHER             0         0           4
> >> FLUSH-WRITER-POOL                 0         0           4
> >> AE-SERVICE-STAGE                  0         0           0
> >> HINTED-HANDOFF-POOL               0         0           3
> >>
> >> EQX r...@cass04:~# vmstat -n 1
> >>
> >> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
> >>  r  b   swpd   free   buff    cache   si   so    bi    bo    in    cs us sy id wa st
> >>  6 10   7096 121816  16244 10375492    0    0     1     3     0     0  5  1 94  0  0
> >>  2 10   7096 116484  16248 10381144    0    0  5636     4 21210  9820  2  1 79 18  0
> >>  1  9   7096 108920  16248 10387592    0    0  6216     0 21439  9878  2  1 81 16  0
> >>  0  9   7096 129108  16248 10364852    0    0  6024     0 23280  8753  2  1 80 17  0
> >>  2  9   7096 122460  16248 10370908    0    0  6072     0 20835  9461  2  1 83 14  0
> >>  2  8   7096 115740  16260 10375752    0    0  5168   292 21049  9511  3  1 77 20  0
> >>  1 10   7096 108424  16260 10382300    0    0  6244     0 21483  8981  2  1 75 22  0
> >>  3  8   7096 125028  16260 10364104    0    0  5584     0 21238  9436  2  1 81 16  0
> >>  3  9   7096 117928  16260 10370064    0    0  5988     0 21505 10225  2  1 77 19  0
> >>  1  8   7096 109544  16260 10376640    0    0  6340    28 20840  8602  2  1 80 18  0
> >>  0  9   7096 127028  16240 10357652    0    0  5984     0 20853  9158  2  1 79 18  0
> >>  9  0   7096 121472  16240 10363492    0    0  5716     0 20520  8489  1  1 82 16  0
> >>  3  9   7096 112668  16240 10369872    0    0  6404     0 21314  9459  2  1 84 13  0
> >>  1  9   7096 127300  16236 10353440    0    0  5684     0 38914 10068  2  1 76 21  0
> >>
> >> *But the 16 cores are hardly utilized, which indicates to me there is some
> >> bad thread thrashing. But why?*
* > >> > >> > >> > >> 1 [||||| 8.3%] > Tasks: > >> 1070 total, 1 running > >> 2 [ 0.0%] Load > >> average: 8.34 9.05 8.82 > >> 3 [ 0.0%] > Uptime: > >> 192 days(!), 15:29:52 > >> 4 [||||||||||| 17.9%] > >> 5 [||||| 5.7%] > >> 6 [|| 1.3%] > >> 7 [|| 2.6%] > >> 8 [| 0.6%] > >> 9 [| 0.6%] > >> 10 [|| 1.9%] > >> 11 [|| 1.9%] > >> 12 [|| 1.9%] > >> 13 [|| 1.3%] > >> 14 [| 0.6%] > >> 15 [|| 1.3%] > >> 16 [| 0.6%] > >> Mem[||||||||||||||||||||||||||||||||||||||||||||1791/12028MB] > >> Swp[| 6/1983MB] > >> > >> PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command > >> 30269 root 40 0 14100 2116 900 R 4.0 0.0 0:00.49 htop > >> 24878 root 40 0 20.6G 8345M 6883M D 3.0 69.4 1:23.03 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24879 root 40 0 20.6G 8345M 6883M D 3.0 69.4 1:22.93 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24874 root 40 0 20.6G 8345M 6883M D 2.0 69.4 1:22.73 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24880 root 40 0 20.6G 8345M 6883M D 2.0 69.4 1:22.93 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24875 root 40 0 20.6G 8345M 6883M D 2.0 69.4 1:23.17 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24658 root 40 0 20.6G 8345M 6883M D 2.0 69.4 1:23.06 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24877 root 40 0 20.6G 8345M 6883M S 2.0 69.4 1:23.43 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24873 root 40 0 20.6G 8345M 6883M D 1.0 69.4 1:23.65 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24876 root 40 0 20.6G 8345M 6883M S 1.0 69.4 1:23.62 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24942 root 40 0 20.6G 8345M 6883M S 1.0 69.4 0:23.50 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24943 root 40 0 20.6G 8345M 6883M S 0.0 69.4 0:29.53 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24933 root 40 0 20.6G 8345M 6883M S 0.0 69.4 0:22.57 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 24939 root 40 0 20.6G 8345M 6883M S 0.0 69.4 0:12.73 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > >> 25280 root 40 0 20.6G 8345M 6883M S 0.0 69.4 0:00.10 > >> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark > > > > -- > > ------------------------------------------------------------------------ > > Anthony Molinaro <antho...@alumni.caltech.edu> > > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of Riptano, the source for professional Cassandra support > http://riptano.com > > >