>> and nodetool tpstats always shows pending tasks in the ReadStage.

Are clients reading a single row at a time or multiple rows? Each row requested in a multi get becomes a task in the read stage.
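
A rough sketch of that arithmetic follows. It assumes the 60 reads/s figure from the original post counts client requests; the batch sizes are made up for illustration, and the function name is hypothetical.

```python
def read_stage_tasks_per_second(requests_per_second, rows_per_request):
    """Each row in a multi get is one ReadStage task, so the task rate
    scales with the number of rows requested, not the number of requests."""
    return requests_per_second * rows_per_request


# 60 reads/s is from the original post; the rows-per-request values are hypothetical.
for rows_per_request in (1, 10, 100):
    tasks = read_stage_tasks_per_second(60, rows_per_request)
    print(f"{rows_per_request:>3} rows/request -> {tasks} ReadStage tasks/s")
```

So a modest request rate can still keep the ReadStage queue full if each request asks for many rows.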
Also look at the type of query you are sending. I talked a little about the performance of different query techniques at Cassandra SF: http://www.datastax.com/events/cassandrasummit2012/presentations

> 1. Consider Leveled compaction instead of Size Tiered. LCS improves
> read performance at the cost of more writes.

I would look at other options first. If you want to know how many SSTables a read is hitting, look at nodetool cfhistograms.

> 2. You said "skinny column family" which I took to mean not a lot of
> columns/row. See if you can organize your data into wider rows which
> allow reading fewer rows and thus fewer queries/disk seeks.

Wide rows take longer to read than narrow ones, so artificially widening rows may make reads slower.

> 4. Splitting your data from your MetaData could definitely help. I
> like separating my read heavy from write heavy CF's because generally
> speaking they benefit from different compaction methods. But don't go
> crazy creating 1000's of CF's either.

+1

25 ms read latency is high.

Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/10/2012, at 9:06 AM, Aaron Turner <synfina...@gmail.com> wrote:

> On Mon, Oct 22, 2012 at 11:05 AM, feedly team <feedly...@gmail.com> wrote:
>> Hi,
>> I have a small 2 node cassandra cluster that seems to be constrained by
>> read throughput. There are about 100 writes/s and 60 reads/s mostly against
>> a skinny column family. Here's the cfstats for that family:
>>
>> SSTable count: 13
>> Space used (live): 231920026568
>> Space used (total): 231920026568
>> Number of Keys (estimate): 356899200
>> Memtable Columns Count: 1385568
>> Memtable Data Size: 359155691
>> Memtable Switch Count: 26
>> Read Count: 40705879
>> Read Latency: 25.010 ms.
>> Write Count: 9680958
>> Write Latency: 0.036 ms.
>> Pending Tasks: 0
>> Bloom Filter False Postives: 28380
>> Bloom Filter False Ratio: 0.00360
>> Bloom Filter Space Used: 874173664
>> Compacted row minimum size: 61
>> Compacted row maximum size: 152321
>> Compacted row mean size: 1445
>>
>> iostat shows almost no write activity, here's a typical line:
>>
>> Device:  rrqm/s  wrqm/s     r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz   await  svctm  %util
>> sdb        0.00    0.00  312.87  0.00   6.61   0.00     43.27     23.35  105.06   2.28  71.19
>>
>> and nodetool tpstats always shows pending tasks in the ReadStage. The data
>> set has grown beyond physical memory (250GB/node w/64GB of RAM) so I know
>> disk access is required, but are there particular settings I should
>> experiment with that could help relieve some read i/o pressure? I already
>> put memcached in front of cassandra so the row cache probably won't help
>> much.
>>
>> Also this column family stores smallish documents (usually 1-100K) along
>> with metadata. The document is only occasionally accessed, usually only the
>> metadata is read/written. Would splitting out the document into a separate
>> column family help?
>>
>
> Some un-expert advice:
>
> 1. Consider Leveled compaction instead of Size Tiered. LCS improves
> read performance at the cost of more writes.
>
> 2. You said "skinny column family" which I took to mean not a lot of
> columns/row. See if you can organize your data into wider rows which
> allow reading fewer rows and thus fewer queries/disk seeks.
>
> 3. Enable compression if you haven't already.
>
> 4. Splitting your data from your MetaData could definitely help. I
> like separating my read heavy from write heavy CF's because generally
> speaking they benefit from different compaction methods. But don't go
> crazy creating 1000's of CF's either.
>
> Hope that gives you some ideas to investigate further!
>
>
> --
> Aaron Turner
> http://synfin.net/ Twitter: @synfinatic
> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
> Windows
> Those who would give up essential Liberty, to purchase a little temporary
> Safety, deserve neither Liberty nor Safety.
> -- Benjamin Franklin
> "carpe diem quam minimum credula postero"
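
The cfstats and iostat figures quoted in this thread can be cross-checked with a little arithmetic. The sketch below uses only numbers from the original post. It assumes iostat reports rMB/s in units of 1024 KB and avgrq-sz in 512-byte sectors (the usual sysstat convention), and it treats the key count as the rough estimate cfstats says it is.

```python
# Figures copied from the cfstats and iostat output in the original post.
space_used_live_bytes = 231_920_026_568   # Space used (live)
estimated_keys = 356_899_200              # Number of Keys (estimate)
reads_per_second = 312.87                 # iostat r/s
read_mb_per_second = 6.61                 # iostat rMB/s
avg_request_sectors = 43.27               # iostat avgrq-sz

SECTOR_BYTES = 512  # assumption: avgrq-sz is reported in 512-byte sectors

# Average on-disk bytes per estimated key.
bytes_per_key = space_used_live_bytes / estimated_keys

# Average size of one disk read, derived two independent ways.
kb_per_read_from_throughput = read_mb_per_second * 1024 / reads_per_second
kb_per_read_from_request_size = avg_request_sectors * SECTOR_BYTES / 1024

print(f"on-disk bytes per estimated key: {bytes_per_key:.0f}")
print(f"KB per disk read (rMB/s / r/s):  {kb_per_read_from_throughput:.1f}")
print(f"KB per disk read (avgrq-sz):     {kb_per_read_from_request_size:.1f}")
```

The two per-read sizes agree closely, which suggests the disk is doing a few hundred smallish random reads per second rather than large sequential scans.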