Re: do I need to add more nodes? minor compaction eat all IO

aaron morton Mon, 25 Jul 2011 15:42:17 -0700

There are no hard and fast rules for when to add new nodes, but here are two guidelines:


1) The load on a single node is getting too high. As a rule of thumb, 300GB is 
probably too high. 
2) There are times when the cluster cannot keep up with throughput, for example 
the client is getting TimedOutExceptions or TPStats is showing consistently 
high (a multiple of the available threads) read or write pending queues. 
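
The two guidelines above can be sketched as a simple check. This is only a 
rough illustration: the names (PoolStats, reasons_to_add_nodes), the multiple 
of 2, and the idea of feeding in several pending-queue samples taken over time 
are assumptions of the sketch, not part of Cassandra or its tooling. You would 
collect the numbers yourself, for example from TPStats.

```python
from dataclasses import dataclass
from typing import List

# Assumed thresholds based on the rules of thumb above; tune for your cluster.
MAX_LOAD_GB = 300      # "300GB is probably too high"
PENDING_MULTIPLE = 2   # hypothetical choice for "a multiple of the available threads"


@dataclass
class PoolStats:
    """Hand-collected samples for one thread pool (hypothetical structure)."""
    name: str
    threads: int                 # threads available to the pool
    pending_samples: List[int]   # pending counts observed at several points in time


def consistently_backed_up(pool: PoolStats) -> bool:
    """True only if every sample is at or above the threshold, so one spike does not count."""
    threshold = PENDING_MULTIPLE * pool.threads
    return bool(pool.pending_samples) and all(
        pending >= threshold for pending in pool.pending_samples
    )


def reasons_to_add_nodes(load_gb: float, pools: List[PoolStats],
                         timeouts_seen: bool) -> List[str]:
    """Return the guidelines that are currently tripped (an empty list means none)."""
    reasons = []
    if load_gb > MAX_LOAD_GB:
        reasons.append(f"node load {load_gb:.0f}GB exceeds {MAX_LOAD_GB}GB")
    if timeouts_seen:
        reasons.append("clients are getting TimedOutExceptions")
    for pool in pools:
        if consistently_backed_up(pool):
            reasons.append(f"{pool.name} pending queue is consistently high")
    return reasons


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any real cluster.
    pools = [PoolStats("ReadStage", 32, [90, 120, 70])]
    for reason in reasons_to_add_nodes(350, pools, timeouts_seen=False):
        print(reason)
```

An empty result does not prove the cluster is healthy. It only means neither 
rule of thumb has been tripped by the numbers you supplied.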

What works for you will be what keeps your site running and keeps the ops/dev 
team sleeping at night.   

In your case, high IO during repair may be OK if the cluster can keep up with 
demands. Or it may mean you need to upgrade the IO capacity or add nodes. 

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 26 Jul 2011, at 01:17, Yan Chunlu wrote:

> as the wiki suggested:
> http://wiki.apache.org/cassandra/LargeDataSetConsiderations
> Adding nodes is a slow process if each node is responsible for a large
> amount of data. Plan for this; do not try to throw additional hardware
> at a cluster at the last minute.
> 
> 
> I would really like to know whether the state of my cluster is normal.
> 
> 
> On Mon, Jul 25, 2011 at 8:59 PM, Yan Chunlu <springri...@gmail.com> wrote:
>> I am using normal SATA disks. What I am actually worried about is whether
>> it is okay for Cassandra to use all of the IO resources every time.
>> Furthermore, when is a good time to add more nodes, given that I am just
>> using normal SATA disks, which can reach 100 %util at 100 r/s?
>> 
>> How large should the data size be on each node?
>> 
>> 
>> Below is my iostat -x 2 output during node repair. I have to repair each
>> column family separately, otherwise the load gets even crazier:
>> 
>> Device:  rrqm/s  wrqm/s     r/s    w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz    await  r_await   w_await  svctm   %util
>> sda        1.50    1.50  121.50  14.00   3.68   0.30     60.19    116.98  1569.46    59.49  14673.86   7.38  100.00
>> 
>> 
>> 
>> 
>> 
>> 
>> On Sun, Jul 24, 2011 at 8:04 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>> On Sat, Jul 23, 2011 at 4:16 PM, Francois Richard <frich...@xobni.com> 
>>> wrote:
>>>> My understanding is that during compaction Cassandra does a lot of 
>>>> non-sequential reads, then dumps the results with a big sequential write.
>>> 
>>> Compaction reads and writes are both sequential, and 0.8 allows
>>> setting a MB/s to cap compaction at.
>>> 
>>> As to the original question "do I need to add more machines" I'd say
>>> that depends more on whether your application's SLA is met, than what
>>> % io util spikes to.
>>> 
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>> 
>>
