Along the same lines as the last experiment I did (the cluster is only being updated by a single-threaded batch process): all nodes are the same hardware & configuration. Why on earth would one node require disk IO and not the other 2 replicas?
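(For reference, the batch process does nothing fancier than the following. This is only a rough sketch of Hector's counter API written from memory; the cluster/keyspace/CF/key names are made up, and the real model nests the counters inside super columns rather than the plain counter columns shown here.)

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.mutation.Mutator;

    public class CounterBatchWriter {
        public static void main(String[] args) {
            // Hypothetical cluster/keyspace/CF names, just to show the call pattern.
            Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "node1:9160");
            Keyspace keyspace = HFactory.createKeyspace("Stats", cluster);
            StringSerializer ss = StringSerializer.get();

            Mutator<String> mutator = HFactory.createMutator(keyspace, ss);
            // Accumulate up to ~50 counter increments, then ship them as one batch.
            for (int i = 0; i < 50; i++) {
                mutator.addCounter("row-" + i, "Counters",
                        HFactory.createCounterColumn("hits", 1L, ss));
            }
            mutator.execute();

            HFactory.shutdownCluster(cluster);
        }
    }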
The primary replica shows some disk activity (iostat shows about 40% utilization):

    ----total-cpu-usage---- -dsk/total-
    usr sys idl wai hiq siq| read  writ
     67  10  19   2   0   3|4244k  364k

whereas the 2nd & 3rd replicas do not.

2nd:

    ----total-cpu-usage---- -dsk/total-
    usr sys idl wai hiq siq| read  writ
     42  13  41   0   0   3|   0     0
     47  15  34   0   0   4|4096B  185k
     49  14  35   0   0   3|   0  8192B
     47  16  33   0   0   4|   0  4096B
     44  13  41   0   0   3| 284k  112k

3rd:

     11   2  87   1   0   0|   0   136k
      0   0  99   0   0   0|   0     0
      9   1  90   0   0   0|4096B  128k
      2   2  96   0   0   0|   0     0
      0   0  99   0   0   0|   0     0
     11   1  87   0   0   0|   0   128k

Philippe

2011/12/21 Philippe <watche...@gmail.com>

> Hi Aaron,
>
>> How many rows are you asking for in the multiget_slice and what thread
>> pools are showing pending tasks ?
>
> I am querying in batches of 256 keys max. Each batch may slice between 1
> and 5 explicit super columns (I need all the columns in each super column;
> there are at the very most a couple dozen columns per SC).
>
> On the first replica, only ReadStage ever shows any pending. All the
> others have 1 to 10 pending from time to time only. Here's a typical "high
> pending count" reading on the first replica for the data hotspot:
>
>     ReadStage    13    5238    10374301128    0    0
>
> I've got a watch running every two seconds and I see the numbers vary
> every time, going from that high point to 0 active, 0 pending. The one
> thing I've noticed is that I hardly ever see the Active count stay up at
> the current 2s sampling rate.
> On the 2 other replicas, I hardly ever see any pendings on ReadStage, and
> Active hardly goes up to 1 or 2. But I do see a little PENDING on
> RequestResponseStage; it goes up into the tens or hundreds from time to
> time.
>
> If I'm flooding that one replica, shouldn't the ReadStage Active count be
> at maximum capacity ?
>
> I've already thought of CASSANDRA-2980 but I'm running 0.8.7 and 0.8.9.
>
>> Also, what happens when you reduce the number of rows in the request?
>
> I've reduced the requests to batches of 16. I've had to increase the
> number of threads from 30 to 90 in order to get the same key throughput,
> because the throughput I measure drastically goes down on a per-thread
> basis.
> What I see:
> - CPU utilization is lower on the first replica (why would that be if the
> batches are smaller ?)
> - Pending ReadStage on the first replica seems to be staying higher
> longer. Still goes down to 0 regularly.
> - Lowering to 60 client threads, I see non-zero active MutationStage and
> ReplicateOnWriteStage more often.
> For our use case, the higher the throughput per client thread, the less
> rework will be done in our processing.
>
> Another experiment: I stopped the process that does all the reading and a
> little of the writing. All that's left is a single-threaded process that
> sends counter updates as fast as it can in batches of up to 50 mutations.
> First replica: pending counts go up into the low hundreds and back to 0,
> active goes up to 3 or 5 and that's a max. Some MutationStage active &
> pendings => the process is indeed faster at updating the counters, so that
> doesn't surprise me given that a counter write requires a read.
> Second & third replicas: no ReadStage pendings at all. A little
> RequestResponseStage as earlier.
>
> Cheers
> Philippe
>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 21/12/2011, at 11:57 AM, Philippe wrote:
>>
>> Hello,
>> 5 nodes running 0.8.7/0.8.9, RF=3, BOP, counter columns inside super
>> columns.
>> Read queries are multiget_slices of super columns, inside of which I
>> read every column for processing (20-30 at most), using Hector with
>> default settings.
>> Watching tpstats on the 3 nodes holding the data most often queried, I
>> see the pending count increase only on the "main replica", and I see
>> heavy CPU load and network load only on that node. The other nodes seem
>> to be doing very little.
>>
>> Aren't counter read requests supposed to be round-robin across replicas ?
>> I'm confused as to why the nodes don't exhibit the same load.
>>
>> Thanks
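In case it helps, one read batch looks roughly like this. Again, only a sketch: the Hector method names are from memory, the cluster/keyspace/CF/key names are made up, and since the real column families hold counters the actual code has to go through Hector's counter-aware query types rather than the plain super-slice query shown here; this just illustrates the shape of the calls.

    import me.prettyprint.cassandra.serializers.LongSerializer;
    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.beans.SuperRow;
    import me.prettyprint.hector.api.beans.SuperRows;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.query.MultigetSuperSliceQuery;
    import me.prettyprint.hector.api.query.QueryResult;

    public class BatchReader {
        public static void main(String[] args) {
            // Hypothetical cluster/keyspace/CF names, just to show the call pattern.
            Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "node1:9160");
            Keyspace keyspace = HFactory.createKeyspace("Stats", cluster);
            StringSerializer ss = StringSerializer.get();

            // One batch: up to 256 row keys and 1 to 5 explicit super columns,
            // reading every column inside each of those super columns.
            MultigetSuperSliceQuery<String, String, String, Long> query =
                    HFactory.createMultigetSuperSliceQuery(keyspace, ss, ss, ss, LongSerializer.get());
            query.setColumnFamily("Counters");
            query.setKeys("key-1", "key-2", "key-3"); // ... up to 256 keys per batch
            query.setColumnNames("sc-1", "sc-2");     // the explicit super columns for this batch

            QueryResult<SuperRows<String, String, String, Long>> result = query.execute();
            for (SuperRow<String, String, String, Long> row : result.get()) {
                // row.getSuperSlice().getSuperColumns() -> the 20-30 columns we process per SC
            }

            HFactory.shutdownCluster(cluster);
        }
    }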