Hi Aaron,

> How many rows are you asking for in the multiget_slice and what thread
> pools are showing pending tasks?
I am querying in batches of 256 keys max. Each batch may slice between 1
and 5 explicit super columns (I need all the columns in each super column,
there are at the very most a couple dozen columns per SC).

On the first replica, only ReadStage shows a sustained pending count. All
the other stages have just 1 to 10 pending from time to time. Here's a
typical "high pending count" reading on the first replica for the data hotspot.
ReadStage                        13      5238    10374301128         0             0
I've got a watch running every two seconds and I see the numbers vary every
time, going from that high point to 0 active, 0 pending. The one thing I've
noticed is that I hardly ever see the Active count stay up at the current
2s sampling rate.
On the 2 other replicas, I hardly ever see any pending tasks on ReadStage and
Active hardly goes up to 1 or 2. But I do see a small pending count
on RequestResponseStage, which goes up into the tens or hundreds from time to time.
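
Since a 2s watch can miss short bursts of Active threads, one option is to parse the tpstats rows in a script and sample more often. Below is a minimal Python sketch that parses a row like the ReadStage reading above. The column order (Active, Pending, Completed, then two blocked counters) is an assumption based on that reading, and `PoolStats` and `parse_tpstats_line` are hypothetical names, not part of any Cassandra tool.

```python
from typing import NamedTuple


class PoolStats(NamedTuple):
    """One thread pool row; field order is assumed, not confirmed."""
    name: str
    active: int
    pending: int
    completed: int


def parse_tpstats_line(line: str) -> PoolStats:
    """Parse a whitespace-separated tpstats row into named fields.

    Assumes the first four fields are: pool name, active, pending, completed.
    Any trailing columns (e.g. blocked counters) are ignored.
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"not a thread pool row: {line!r}")
    name, active, pending, completed = fields[:4]
    return PoolStats(name, int(active), int(pending), int(completed))


# The ReadStage reading quoted above, joined onto a single line.
sample = "ReadStage                        13      5238    10374301128         0             0"
stats = parse_tpstats_line(sample)
print(stats.name, stats.active, stats.pending, stats.completed)
```

Feeding this the output of `nodetool -h <host> tpstats` in a tight loop would show whether Active really stays low between the 2s samples.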


If I'm flooding that one replica, shouldn't the ReadStage Active count be
at maximum capacity ?


I've already thought of CASSANDRA-2980 but I'm running 0.8.7 and 0.8.9.

> Also, what happens when you reduce the number of rows in the request?
I've reduced the requests to batches of 16. I've had to increase the
number of threads from 30 to 90 in order to get the same key throughput
because the throughput I measure per thread drops drastically.
What I see :
 - CPU utilization is lower on the first replica (why would that be if the
batches are smaller ?)
 - Pending ReadStage on first replica seems to be staying higher longer.
Still goes down to 0 regularly.
 - lowering to 60 client threads, I see non-zero active MutationStage and
ReplicateOnWriteStage more often
For our use-case, the higher the throughput per client thread, the less
rework will be done in our processing.
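
For reference, the two client configurations can be compared by the maximum number of keys outstanding at once. This is only a rough sketch: it assumes each client thread has exactly one batch in flight at a time and that every batch is full, neither of which is measured here. `max_keys_in_flight` is a hypothetical helper name.

```python
def max_keys_in_flight(client_threads: int, batch_size: int) -> int:
    """Upper bound on keys outstanding, assuming one full batch per thread."""
    return client_threads * batch_size


# 30 threads with batches of up to 256 keys (the original setup).
before = max_keys_in_flight(30, 256)
# 90 threads with batches of 16 keys (the reduced-batch experiment).
after = max_keys_in_flight(90, 16)

print(before, after)  # 7680 1440
```

Under those assumptions the smaller batches put fewer keys in flight at any instant even with three times the threads, which may or may not be related to the lower CPU utilization noted above.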

Another experiment : I stopped the process that does all the reading and a
little of the writing. All that's left is a single-threaded process that
is sending counter updates as fast as it can in batches of up to 50 mutations.
First replica : pending counts go up into the low hundreds and back to 0,
active up to 3 or 5 and that's a max. Some mutation stage active & pendings
=> the process is indeed faster at updating the counters so that doesn't
surprise me given that a counter write requires a read.
Second & third replicas : no read stage pendings at all. A
little RequestResponseStage as earlier.

Cheers
Philippe

>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 21/12/2011, at 11:57 AM, Philippe wrote:
>
> Hello,
> 5 nodes running 0.8.7/0.8.9, RF=3, BOP, counter columns inside super
> columns. Read queries are multigetslices of super columns inside of which I
> read every column for processing (20-30 at most), using Hector with default
> settings.
> Watching tpstats on the 3 nodes holding the data being most often queried,
> I see the pending count increase only on the "main replica" and I see heavy
> CPU load and network load only on that node. The other nodes seem to be
> doing very little.
>
> Aren't counter read requests supposed to be round-robin across replicas ?
> I'm confused as to why the nodes don't exhibit the same load.
>
> Thanks
>
>
>
