As the wiki suggests:
http://wiki.apache.org/cassandra/LargeDataSetConsiderations

  Adding nodes is a slow process if each node is responsible for a large
  amount of data. Plan for this; do not try to throw additional hardware
  at a cluster at the last minute.

I would really like to know the status of my cluster, and whether it is normal.

On Mon, Jul 25, 2011 at 8:59 PM, Yan Chunlu <springri...@gmail.com> wrote:
> I am using a normal SATA disk. Actually, I was worried about whether it
> is okay for Cassandra to use all the I/O resources every time.
> Furthermore, when is a good time to add more nodes, given that I am just
> using a normal SATA disk and it can reach 100 %util at about 100 r/s?
>
> How large should the data size on each node be?
>
> Below is my iostat -x 2 output during a node repair. I have to repair
> each column family separately, otherwise the load gets even crazier:
>
> Device:  rrqm/s  wrqm/s     r/s    w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz    await  r_await   w_await  svctm   %util
> sda        1.50    1.50  121.50  14.00   3.68   0.30     60.19    116.98  1569.46    59.49  14673.86   7.38  100.00
>
> On Sun, Jul 24, 2011 at 8:04 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> On Sat, Jul 23, 2011 at 4:16 PM, Francois Richard <frich...@xobni.com> wrote:
>>> My understanding is that during compaction Cassandra does a lot of
>>> non-sequential reads, then dumps the results with a big sequential write.
>>
>> Compaction reads and writes are both sequential, and 0.8 allows
>> setting a MB/s to cap compaction at.
>>
>> As to the original question "do I need to add more machines", I'd say
>> that depends more on whether your application's SLA is met than on what
>> % io util spikes to.
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
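
For anyone reading the iostat numbers quoted above, here is a minimal Python sketch that parses that one sample line and pulls out the figures relevant to the "is this normal?" question. The function names (parse_iostat_line, summarize) and the 90 %util threshold are my own illustrative assumptions, not anything from Cassandra or sysstat, and the column layout is assumed to match the header shown in this thread.

```python
# Sketch only: parse a single `iostat -x` device line using the header
# quoted in this thread. Column names vary between sysstat versions.

HEADER = ("Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz "
          "await r_await w_await svctm %util")
SAMPLE = ("sda 1.50 1.50 121.50 14.00 3.68 0.30 60.19 116.98 "
          "1569.46 59.49 14673.86 7.38 100.00")


def parse_iostat_line(header, line):
    """Map each column name in the header to the float value in the line."""
    names = header.split()[1:]          # drop the leading "Device:" label
    fields = line.split()
    device = fields[0]
    values = [float(v) for v in fields[1:]]
    if len(names) != len(values):
        raise ValueError("header and data line have different column counts")
    stats = dict(zip(names, values))
    stats["device"] = device
    return stats


def summarize(stats, util_threshold=90.0):
    """Condense the parsed stats; util_threshold is an arbitrary assumption."""
    return {
        "device": stats["device"],
        "iops": stats["r/s"] + stats["w/s"],
        "throughput_mb_s": stats["rMB/s"] + stats["wMB/s"],
        "avg_queue": stats["avgqu-sz"],
        "await_ms": stats["await"],
        "saturated": stats["%util"] >= util_threshold,
    }


if __name__ == "__main__":
    for key, value in summarize(parse_iostat_line(HEADER, SAMPLE)).items():
        print(key, value)
```

On the sample above this reports roughly 135 requests/s and under 4 MB/s combined throughput at 100 %util, with a deep queue and a long average wait. That is only a description of one two-second sample during a repair; as Jonathan notes, whether it is a problem depends on whether the application's SLA is still being met.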