Look at your await time of *107.73*. From the iostat man page: "await: The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them."
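As a rough way to see that almost all of that await is queueing rather than actual service time, here is a quick Python sketch (column positions assumed from the "Highest %util" iostat -x sample quoted below):

    # Split an iostat -x device line into queue time vs. service time.
    # The sample line and column positions are taken from the output quoted below.
    line = "xvdf 0.00 2788.00 17.00 267.50 1168.00 23020.00 85.02 32.37 107.73 1.22 34.60"
    fields = line.split()
    device = fields[0]
    await_ms = float(fields[9])   # await: time in queue + time being serviced
    svctm_ms = float(fields[10])  # svctm: service time only
    print("%s: await=%.2f ms, of which ~%.2f ms is time spent queued"
          % (device, await_ms, await_ms - svctm_ms))

With await at 107.73 ms and svctm at 1.22 ms, roughly 106 ms of that sample's per-request latency is queueing.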
If the key you are reading from is not in Cassandra's key cache or row cache, Cassandra needs to do two disk seeks (http://www.datastax.com/dev/blog/maximizing-cache-benefit-with-cassandra). This means that some of your reads *must* take on average 215 ms, not even including network latency. Looks like EBS, or more generally disk saturation, is your problem. Perhaps consider RAID0 with ephemeral drives.

Dan

From: Ian Danforth [mailto:idanfo...@numenta.com]
Sent: November-03-11 18:34
To: user@cassandra.apache.org
Subject: Read perf investigation

All,

I've done a bit more homework, and I continue to see long 200 ms to 300 ms read times for some keys.

Test Setup

EC2 M1Large sending requests to a 5-node C* cluster, also in EC2, also all M1Large. RF=3. ReadConsistency = ONE. I'm using pycassa from Python for all communication.

Data Model

One column family with tens of millions of rows. The number of columns per row varies between 0 and 1440 (per-minute records). The values are all ints. All data is stored on EBS volumes. Total load per node is ~110 GB.

According to vmstat I'm not swapping at all.

Highest %util I see:

Device:  rrqm/s  wrqm/s   r/s    w/s     rsec/s   wsec/s    avgrq-sz  avgqu-sz  await   svctm  %util
xvdf     0.00    2788.00  17.00  267.50  1168.00  23020.00  85.02     32.37     107.73  1.22   34.60

A more average profile I see is:

Device:  rrqm/s  wrqm/s   r/s    w/s     rsec/s   wsec/s    avgrq-sz  avgqu-sz  await   svctm  %util
xvdf     0.00    0.00     21.00  0.00    1288.00  0.00      61.33     0.37      18.38   9.43   19.80

QUESTION

Where should I look next? I'd love to get a profile of exactly where Cassandra is spending its time on a per-call basis.

Thanks in advance,
Ian
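For the per-call profile Ian asks about, one client-side starting point is simply to time each pycassa get() and log the outliers. A minimal sketch (the keyspace, column family, and host names here are placeholders, not taken from the thread):

    # Time individual pycassa reads and flag the slow ones.
    # 'MyKeyspace', 'minute_records', and the host list are placeholders.
    import time
    import pycassa

    pool = pycassa.ConnectionPool('MyKeyspace', server_list=['node1:9160'])
    cf = pycassa.ColumnFamily(pool, 'minute_records')  # pycassa defaults to ConsistencyLevel.ONE

    def timed_get(key):
        start = time.time()
        result = cf.get(key)
        elapsed_ms = (time.time() - start) * 1000.0
        if elapsed_ms > 100:  # flag reads approaching the 200-300 ms range reported above
            print("slow read: key=%r took %.1f ms" % (key, elapsed_ms))
        return result

Comparing those client-side timings against the key/row cache hit rates that nodetool cfstats reports on each node (and, if your Cassandra version has it, the per-CF latency histograms from nodetool cfhistograms) should show whether the slow keys line up with cache misses.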