> So basically, I'm flooding the system, right? For example, 99303 means there
> are 99303 key reads pending, possibly from just a couple of MultiSlice gets?

Yes, and then some. Each row you ask for in a multiget turns into a single row
read request on the server. You are overloading the server.
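
If it's the multiget fan-out, cap it client side. Here's a minimal sketch,
assuming the Hector 0.7-era API (the class name, batch size and slice count
are illustrative, not a drop-in fix):

  import java.util.List;
  import me.prettyprint.cassandra.serializers.StringSerializer;
  import me.prettyprint.hector.api.Keyspace;
  import me.prettyprint.hector.api.beans.SuperRows;
  import me.prettyprint.hector.api.factory.HFactory;
  import me.prettyprint.hector.api.query.MultigetSuperSliceQuery;

  public class BatchedReads {
      // Each key in a multiget becomes one row read queued on the server,
      // so keep batches small and tune this number while watching tpstats.
      private static final int BATCH = 32;

      static void read(Keyspace ks, String cf, List<String> keys) {
          StringSerializer s = StringSerializer.get();
          for (int i = 0; i < keys.size(); i += BATCH) {
              List<String> chunk =
                  keys.subList(i, Math.min(i + BATCH, keys.size()));
              MultigetSuperSliceQuery<String, String, String, String> q =
                  HFactory.createMultigetSuperSliceQuery(ks, s, s, s, s);
              q.setColumnFamily(cf);
              q.setKeys(chunk.toArray(new String[0]));
              // empty start/finish = unbounded; cap supercolumns per row
              q.setRange("", "", false, 100);
              SuperRows<String, String, String, String> rows =
                  q.execute().get();
              // process rows before issuing the next batch
          }
      }
  }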

> - exactly how much data are you asking for ? how many rows and what sort of
> slice?
> According to some munin monitoring, the server is cranking out to the
> client, over the network, 10 Mbit/s = 1.25 MB/s

I was thinking in terms of rows, but that's irrelevant now. The answer is "a
lot".

> BTW, cassandra is running on an XFS filesystem over LVM

Others know more about this than I do.

> One question: when nodetool cfstats says the average read latency is 5ms, is
> that counted only once the query is actually executing, or does it include
> the time spent "pending"?

In the cfstats output, the latency displayed under the Keyspace is the total
latency for all CFs divided by the total read count. The latency displayed for
the individual CFs is the actual time taken getting the columns requested for
a row, so it does not include time spent waiting in the pending queue. Your
5ms is the time taken to read the data from disk and apply the filter.
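
For example (made-up numbers, just to show the arithmetic): if CF A served
1,000 reads for a total of 5,000ms and CF B served 3,000 reads for a total of
3,000ms, the Keyspace line shows (5,000 + 3,000) / (1,000 + 3,000) = 2ms,
while each CF line reports its own per-read figure.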

I'd check that you are reading the data you expect, then wind back the number
of requests and the rows/columns requested. Get to a stable baseline, then add
pressure to see when and how things go wrong.

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 8 Jun 2011, at 08:00, Philippe wrote:

> Aaron,
> 
> - what version are you on?
> 0.7.6-2
> 
> - what is the concurrent_reads config setting?
> concurrent_reads: 64
> concurrent_writes: 64
> 
> Given that I've got 4 cores and SSD drives, I doubled the recommended
> concurrent_writes.
> Given that I've RAID-0ed the SSD drives, I figured I could at least double
> the recommended value once for SSD and once more for RAID-0.
> Wrong assumptions?
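> 
> (For what it's worth, if I'm reading the stock cassandra.yaml comments
> right, the rule of thumb is about 16 * number_of_drives for concurrent_reads
> and 8 * number_of_cores for concurrent_writes, i.e. 16 * 2 = 32 and
> 8 * 4 = 32 here before doubling.)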
> 
> BTW, cassandra is running on an XFS filesystem over LVM over RAID-0
> 
> - what is nodetool tpstats showing during the slowdown?
> The only value that changes is the ReadStage line. Here are values from a
> once-a-second sample (sampling command below):
> Pool Name                    Active   Pending      Completed
> ReadStage                        64     99303      463085056
> ReadStage                        64     88430      463095929
> ReadStage                        64     91937      463107782
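> 
> (Sampled with something like, assuming nodetool talks to the local JMX port:
>   while true; do nodetool -h localhost tpstats | grep ReadStage; sleep 1; done)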
> 
> So basically, I'm flooding the system, right? For example, 99303 means there
> are 99303 key reads pending, possibly from just a couple of MultiSlice gets?
>  
> - exactly how much data are you asking for ? how many rows and what sort of
> slice?
> According to some munin monitoring, the server is cranking out to the
> client, over the network, 10 Mbit/s = 1.25 MB/s
> 
> The same munin monitoring shows me 200 MB/s read from the disks. This is
> what is worrying me...
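> That's roughly 200 / 1.25 = 160x more data read from disk than is sent out
> to clients.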
> 
> - has there been a lot of deletes or TTL columns used?
> No deletes, only updates; I don't know if those count as deletes though...
>  
> This is going to be a read-heavy, update-heavy cluster.
> No TTL columns, no counter columns
> 
> One question: when nodetool cfstats says the average read latency is 5ms, is
> that counted only once the query is actually executing, or does it include
> the time spent "pending"?
> 
> Thanks
> Philippe
> 
> 
> On 7 Jun 2011, at 10:09, Philippe wrote:
> 
>> Ok, here it goes again... No swapping at all...
>> 
>> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>>  1 63  32044  88736  37996 7116524    0    0 227156     0 18314 5607 30  5 11 53
>>  1 63  32044  90844  37996 7103904    0    0 233524   202 17418 4977 29  4  9 58
>>  0 42  32044  91304  37996 7123884    0    0 249736     0 16197 5433 19  6  3 72
>>  3 25  32044  89864  37996 7135980    0    0 223140    16 18135 7567 32  5 11 52
>>  1  1  32044  88664  37996 7150728    0    0 229416   128 19168 7554 36  4 10 51
>>  4  0  32044  89464  37996 7149428    0    0 213852    18 21041 8819 45  5 12 38
>>  4  0  32044  90372  37996 7149432    0    0 233086   142 19909 7041 43  5 10 41
>>  7  1  32044  89752  37996 7149520    0    0 206906     0 19350 6875 50  4 11 35
>> 
>> Lots and lots of disk activity
>> iostat -dmx 2
>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>> sda              52.50     0.00 7813.00    0.00   108.01     0.00    28.31   117.15   14.89   14.89    0.00   0.11  83.00
>> sdb              56.00     0.00 7755.50    0.00   108.51     0.00    28.66   118.67   15.18   15.18    0.00   0.11  82.80
>> md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>> md5               0.00     0.00 15796.50    0.00   219.21     0.00    28.42     0.00    0.00    0.00    0.00   0.00   0.00
>> dm-0              0.00     0.00 15796.50    0.00   219.21     0.00    28.42   273.42   17.03   17.03    0.00   0.05  83.40
>> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>> 
>> More info:
>> - the data directory containing the data I'm querying is 9.7GB, and this
>> is a server with 16GB of RAM
>> - I'm hitting the server with 6 concurrent multigetsuperslicequeries on
>> multiple keys; some of them can bring back quite a lot of data
>> - I'm reading all the keys for one column, pretty much sequentially
>> 
>> This is a query on a rollup table that was originally in MySQL, and it
>> doesn't look like querying by key is any faster. So I'm betting I'm doing
>> something wrong here... but what?
>> 
>> Any ideas ?
>> Thanks
>> 
>> 2011/6/6 Philippe <watche...@gmail.com>
>> Hum... no, it wasn't swapping. Cassandra was the only thing running on that
>> server, and I was querying the same keys over and over.
>> 
>> I restarted Cassandra and, doing the same thing, IO is now down to zero
>> while CPU is up, which doesn't surprise me as much.
>> 
>> I'll report if it happens again.
>> 
>> On 5 Jun 2011, at 16:55, "Jonathan Ellis" <jbel...@gmail.com> wrote:
>> 
>> > You may be swapping.
>> > 
>> > http://spyced.blogspot.com/2010/01/linux-performance-basics.html
>> > explains how to check this as well as how to see what threads are busy
>> > in the Java process.
>> > 
>> > On Sat, Jun 4, 2011 at 5:34 PM, Philippe <watche...@gmail.com> wrote:
>> >> Hello,
>> >> I am evaluating Cassandra and I'm running into some strange IO behavior
>> >> that I can't explain; I'd like some help/ideas to troubleshoot it.
>> >> I am running a 1-node cluster with a keyspace consisting of two column
>> >> families, one of which has dozens of supercolumns, each containing
>> >> dozens of columns.
>> >> All in all, this is a couple of gigabytes of data, 12GB on the hard
>> >> drive.
>> >> The hardware is pretty good: 16GB memory + RAID-0 SSD drives with LVM
>> >> and an i5 processor (4 cores).
>> >> Keyspace: xxxxxxxxxxxxxxxxxxx
>> >>         Read Count: 460754852
>> >>         Read Latency: 1.108205793092766 ms.
>> >>         Write Count: 30620665
>> >>         Write Latency: 0.01411020877567486 ms.
>> >>         Pending Tasks: 0
>> >>                 Column Family: xxxxxxxxxxxxxxxxxxxxxxxxxx
>> >>                 SSTable count: 5
>> >>                 Space used (live): 548700725
>> >>                 Space used (total): 548700725
>> >>                 Memtable Columns Count: 0
>> >>                 Memtable Data Size: 0
>> >>                 Memtable Switch Count: 11
>> >>                 Read Count: 2891192
>> >>                 Read Latency: NaN ms.
>> >>                 Write Count: 3157547
>> >>                 Write Latency: NaN ms.
>> >>                 Pending Tasks: 0
>> >>                 Key cache capacity: 367396
>> >>                 Key cache size: 367396
>> >>                 Key cache hit rate: NaN
>> >>                 Row cache capacity: 112683
>> >>                 Row cache size: 112683
>> >>                 Row cache hit rate: NaN
>> >>                 Compacted row minimum size: 125
>> >>                 Compacted row maximum size: 924
>> >>                 Compacted row mean size: 172
>> >>                 Column Family: yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
>> >>                 SSTable count: 7
>> >>                 Space used (live): 8707538781
>> >>                 Space used (total): 8707538781
>> >>                 Memtable Columns Count: 0
>> >>                 Memtable Data Size: 0
>> >>                 Memtable Switch Count: 30
>> >>                 Read Count: 457863660
>> >>                 Read Latency: 2.381 ms.
>> >>                 Write Count: 27463118
>> >>                 Write Latency: NaN ms.
>> >>                 Pending Tasks: 0
>> >>                 Key cache capacity: 4518387
>> >>                 Key cache size: 4518387
>> >>                 Key cache hit rate: 0.9247881700850826
>> >>                 Row cache capacity: 1349682
>> >>                 Row cache size: 1349682
>> >>                 Row cache hit rate: 0.39400533823415573
>> >>                 Compacted row minimum size: 125
>> >>                 Compacted row maximum size: 6866
>> >>                 Compacted row mean size: 165
>> >> My app makes a bunch of requests using a MultigetSuperSliceQuery for a set
>> >> of keys, typically a couple dozen at most. It also selects a subset of the
>> >> supercolumns. I am running 8 requests in parallel at most.
>> >>
>> >> Two days ago, I ran a 1.5-hour process that basically read every key.
>> >> The server had no IO waits and everything was humming along. However,
>> >> right at the end of the process, there was a huge spike in IO. I didn't
>> >> think much of it.
>> >> Today, after two days of inactivity, any query I run raises the IOs to
>> >> 80% utilization of the SSD drives, even though I'm running the same
>> >> query over and over (no cache??).
>> >> Any ideas on how to troubleshoot this or, better, how to solve it?
>> >> Thanks
>> >> Philippe
>> > 
>> > 
>> > 
>> > -- 
>> > Jonathan Ellis
>> > Project Chair, Apache Cassandra
>> > co-founder of DataStax, the source for professional Cassandra support
>> > http://www.datastax.com
>> 
> 
> 
