Read perf investigation

Ian Danforth Thu, 03 Nov 2011 15:34:24 -0700

All,

I've done a bit more homework, and I continue to see long read times of
200ms to 300ms for some keys.


Test Setup

An EC2 m1.large client sending requests to a 5-node Cassandra cluster, also
in EC2 and also all m1.large. RF=3. Read consistency = ONE. I'm using pycassa
from Python for all communication.

Data Model

One column family with tens of millions of rows. The number of columns per
row varies between 0 and 1440 (per-minute records). The values are all
ints. All data is stored on EBS volumes. Total load per node is ~110GB.

According to vmstat, I'm not swapping at all.

Highest %util I see:

Device:  rrqm/s   wrqm/s    r/s     w/s   rsec/s    wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
xvdf       0.00  2788.00  17.00  267.50  1168.00  23020.00     85.02     32.37  107.73   1.22  34.60

A more average profile I see is:

Device:  rrqm/s   wrqm/s    r/s     w/s   rsec/s    wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
xvdf       0.00     0.00  21.00    0.00  1288.00      0.00     61.33      0.37   18.38   9.43  19.80

QUESTION

Where should I look next? I'd love to get a profile of exactly where
Cassandra is spending its time on a per-call basis.
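
For reference, the client-side half of a per-call profile can be roughed out
as below. This is only a minimal sketch, assuming Python 3; the names
time_reads, read_fn, slow_ms and clock are made up for illustration and are
not part of pycassa. It measures the full round trip as seen by the client
(network included), so it shows which keys are slow, not where the server
spends the time.

```python
import time


def time_reads(read_fn, keys, slow_ms=200.0, clock=time.perf_counter):
    """Time read_fn(key) for each key.

    read_fn is any callable that performs one read (hypothetical here).
    Returns (latencies_ms, slow_keys), where slow_keys lists the keys whose
    read took at least slow_ms milliseconds, slowest first.
    """
    latencies_ms = {}
    for key in keys:
        start = clock()
        read_fn(key)  # result is discarded; only the elapsed time matters
        latencies_ms[key] = (clock() - start) * 1000.0

    slow_keys = sorted(
        (k for k, ms in latencies_ms.items() if ms >= slow_ms),
        key=latencies_ms.get,
        reverse=True,
    )
    return latencies_ms, slow_keys
```

With pycassa, read_fn would presumably be the get method of the ColumnFamily
object already in use. Re-running the slow keys a second time should show
whether the same keys are consistently slow or whether the slowness moves
around.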

Thanks in advance,

Ian
