You may be swapping. http://spyced.blogspot.com/2010/01/linux-performance-basics.html explains how to check this as well as how to see what threads are busy in the Java process.
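As a rough sketch of one way to check for swapping on a Linux box (this is not taken from the linked post): the kernel keeps cumulative `pswpin` and `pswpout` page counters in /proc/vmstat, so sampling them twice and comparing should show whether pages moved to or from swap in between. The function names and the 5-second interval below are illustrative assumptions.

```python
import time


def parse_vmstat(text):
    """Extract the cumulative swap-in/swap-out page counters from
    /proc/vmstat-style text ("name value" per line)."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in and out between two samples."""
    return {k: after.get(k, 0) - before.get(k, 0) for k in ("pswpin", "pswpout")}


def is_swapping(delta):
    """True if any page moved to or from swap during the interval."""
    return any(v > 0 for v in delta.values())


def sample(path="/proc/vmstat"):
    """Read the current counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())


if __name__ == "__main__":
    first = sample()
    time.sleep(5)  # assumed interval; run it while the slow queries execute
    delta = swap_delta(first, sample())
    print(delta, "swapping" if is_swapping(delta) else "no swap activity")
```

Non-zero deltas while the queries are running would point at swapping; zeros suggest looking elsewhere, for example at which threads are busy in the Java process.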
On Sat, Jun 4, 2011 at 5:34 PM, Philippe <watche...@gmail.com> wrote:
> Hello,
> I am evaluating using cassandra and I'm running into some strange IO
> behavior that I can't explain, I'd like some help/ideas to troubleshoot it.
> I am running a 1 node cluster with a keyspace consisting of two columns
> families, one of which has dozens of supercolumns itself containing dozens
> of columns.
> All in all, this is a couple gigabytes of data, 12GB on the hard drive.
> The hardware is pretty good : 16GB memory + RAID-0 SSD drives with LVM and
> an i5 processor (4 cores).
>
> Keyspace: xxxxxxxxxxxxxxxxxxx
> Read Count: 460754852
> Read Latency: 1.108205793092766 ms.
> Write Count: 30620665
> Write Latency: 0.01411020877567486 ms.
> Pending Tasks: 0
>
> Column Family: xxxxxxxxxxxxxxxxxxxxxxxxxx
> SSTable count: 5
> Space used (live): 548700725
> Space used (total): 548700725
> Memtable Columns Count: 0
> Memtable Data Size: 0
> Memtable Switch Count: 11
> Read Count: 2891192
> Read Latency: NaN ms.
> Write Count: 3157547
> Write Latency: NaN ms.
> Pending Tasks: 0
> Key cache capacity: 367396
> Key cache size: 367396
> Key cache hit rate: NaN
> Row cache capacity: 112683
> Row cache size: 112683
> Row cache hit rate: NaN
> Compacted row minimum size: 125
> Compacted row maximum size: 924
> Compacted row mean size: 172
>
> Column Family: yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
> SSTable count: 7
> Space used (live): 8707538781
> Space used (total): 8707538781
> Memtable Columns Count: 0
> Memtable Data Size: 0
> Memtable Switch Count: 30
> Read Count: 457863660
> Read Latency: 2.381 ms.
> Write Count: 27463118
> Write Latency: NaN ms.
> Pending Tasks: 0
> Key cache capacity: 4518387
> Key cache size: 4518387
> Key cache hit rate: 0.9247881700850826
> Row cache capacity: 1349682
> Row cache size: 1349682
> Row cache hit rate: 0.39400533823415573
> Compacted row minimum size: 125
> Compacted row maximum size: 6866
> Compacted row mean size: 165
>
> My app makes a bunch of requests using a MultigetSuperSliceQuery for a set
> of keys, typically a couple dozen at most. It also selects a subset of the
> supercolumns. I am running 8 requests in parallel at most.
>
> Two days, I ran a 1.5 hour process that basically read every key. The server
> had no IOwaits and everything was humming along. However, right at the end
> of the process, there was a huge spike in IOs. I didn't think much of it.
> Today, after two days of inactivity, any query I run raises the IOs to 80%
> utilization of the SSD drives even though I'm running the same query over
> and over (no cache??)
> Any ideas on how to troubleshoot this, or better, how to solve this ?
> thanks
> Philippe

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com