Sorry about the previous message; I've enabled keyboard shortcuts in
Gmail... *sigh*

Hello,
I'm trying to understand the network usage I am seeing in my cluster. Can
anyone shed some light?
It's an RF=3, 12-node, Cassandra 0.8.6 cluster. Repair is performed on each
node once a week, on a rolling schedule.
The nodes are p13,p14,p15...p24 and are consecutive in that order on the
ring. Each node runs nothing but Cassandra. I am hitting the cluster from
another server (p4).

p4 is running the following loop in 20 parallel threads:

   1. read a lot of data (some columns for hundreds to tens of thousands of
   keys, split into 512-key multigets)
   2. process the data
   3. write back a byte array to cassandra (average size is 400 bytes)
   4. go back to 1
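
To make step 1 concrete, here is a minimal sketch of how the keys could be
split into 512-key batches before each multiget. It is only an illustration:
chunk_keys and the sample keys are hypothetical names, and the actual
Cassandra client call is left out.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 512  # keys per multiget, as described in step 1


def chunk_keys(keys: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` keys (hypothetical helper)."""
    for start in range(0, len(keys), size):
        yield list(keys[start:start + size])


# Made-up keys, just to show the batch sizes that come out.
keys = [f"key{i}" for i in range(1300)]
batches = list(chunk_keys(keys))
print([len(b) for b in batches])  # [512, 512, 276]
```

Each batch would then be sent as one multiget to whichever node the client
picks as coordinator.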

According to my munin graphs, network usage is about as follows. I am not
surprised at the bias towards p13-p15 as p4 is getting & storing data
mainly for keys located on one of those nodes.

   - p4 : 1.5Mb/s in and out
   - p13-p15 : 15Mb/s in and 80Mb/s out
   - p16-p24 : 45Mb/s in and 5Mb/s out

What I don't understand is why p4 is only seeing 1.5Mb/s while I see 80Mb/s
out on each of p13-p15.

The way I understand this:

   - p4 sends a multiget to the cluster, using any node in the cluster as
   coordinator (IN traffic on that node to describe the query)
   - coordinator node replays the query on all 3 replicas (so 3 servers
   each get the IN traffic, mostly p13-p15)
   - each server replies to coordinator
   - coordinator chooses matching values and sends back data to p4

So if p13-p15 are each outputting 80Mb/s, why am I not seeing 80Mb/s coming
into p4, which is on the receiving end?
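
To put numbers on the mismatch, here is a rough tally of the munin figures
listed above. It is only a sketch: the rates are the approximate per-host
values from the list, the variable names are made up, and since the figures
are rounded the cluster-wide in and out totals are not expected to balance
exactly.

```python
# Approximate munin rates from the list above: host -> (in, out) in Mb/s.
rates = {"p4": (1.5, 1.5)}
for n in range(13, 16):   # p13-p15
    rates[f"p{n}"] = (15.0, 80.0)
for n in range(16, 25):   # p16-p24
    rates[f"p{n}"] = (45.0, 5.0)

total_in = sum(rate_in for rate_in, _ in rates.values())
total_out = sum(rate_out for _, rate_out in rates.values())

# Combined outbound of the three busy nodes vs. what p4 actually receives.
hot_out = sum(rates[f"p{n}"][1] for n in range(13, 16))
other_in = sum(rates[f"p{n}"][0] for n in range(16, 25))
p4_in = rates["p4"][0]

print(f"total in={total_in} Mb/s, total out={total_out} Mb/s")
print(f"p13-p15 out={hot_out} Mb/s, p16-p24 in={other_in} Mb/s, p4 in={p4_in} Mb/s")
print(f"p4 in as share of p13-p15 out: {p4_in / hot_out:.2%}")
```

If the graphs are right, p4 receives well under 1% of what p13-p15 send, so
nearly all of that outbound traffic has to be landing on other hosts in the
cluster rather than on p4.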

Thanks

2011/11/15 Philippe <watche...@gmail.com>

> Hello,
> I'm trying to understand the network usage I am seeing in my cluster, can
> anyone shed some light?
> It's an RF=3, 12-node, cassandra 0.8.6 cluster. The nodes are
> p13,p14,p15...p24 and are consecutive in that order on the ring.
> Each node is only a cassandra database. I am hitting the cluster from
> another server (p4).
>
> The pattern on p4 is to
>
>    1. read a lot of data (some columns for hundreds to tens of thousands
>    of keys, split into 512-key multigets)
>    2. process the data
>    3. write back a byte array to cassandra (average size is 400 bytes)
>
>
> p4 reads as
>
