Moreover, if you are using SSDs, keeping the data directories and the commitlog 
on separate disks won't provide much benefit.


As Nate said, relying on RAID with RF=1 is not a good design. Cassandra replicas 
provide greater fault tolerance and HA because they live on different nodes. 


Thanks

Anuj





From: "Nate McCall" <n...@thelastpickle.com>
Date: Sun, 19 Jul, 2015 at 1:20 am
Subject: Re: Unbalanced disk load

>
> I am currently benchmarking Cassandra with three machines, and on each 
> machine I am seeing an unbalanced distribution of data among the data 
> directories (1 per disk). 
> I am concerned that this affects my write performance. Is there anything 
> I can do to make the distribution more even? Would RAID0 be my best option?
>

Using LeveledCompactionStrategy should provide a much better balance. 

However, depending on your use case, this may not be the right choice for your 
workload, in which case RAID0 with a single data_dir will be the best option. 
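
For reference, switching a table to LeveledCompactionStrategy is done with a 
CQL ALTER TABLE statement. The sketch below only builds that statement as a 
string; the keyspace and table names ("bench", "records") are hypothetical 
placeholders, not names from this thread, and the statement is left with 
default LCS sub-options.

```python
def leveled_compaction_cql(keyspace: str, table: str) -> str:
    """Build a CQL statement that switches a table to LeveledCompactionStrategy.

    The keyspace and table arguments are placeholders supplied by the caller.
    """
    return (
        f"ALTER TABLE {keyspace}.{table} "
        "WITH compaction = {'class': 'LeveledCompactionStrategy'};"
    )


# Hypothetical names, for illustration only.
print(leveled_compaction_cql("bench", "records"))
```

The printed statement can be pasted into cqlsh. Existing SSTables are 
recompacted after the change, so expect extra disk I/O while that runs.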


 

> Total size of data is about 2TB, 14B records, all unique. Replication factor 
> of 1.


RF=1 means *no* redundancy, which is a bad idea to run in production (and sort 
of defeats the purpose of a system like Cassandra). It will also not give an 
accurate picture in a load test, because it eliminates a lot of the cross-node 
traffic you would see with a higher replication factor. 
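
To put rough numbers on this, here is a minimal sketch of the arithmetic, 
assuming the usual definition of QUORUM as floor(RF / 2) + 1. The function 
name and dictionary keys are made up for illustration; this models the counts 
only, not Cassandra's actual write path.

```python
def replica_stats(rf: int) -> dict:
    """Summarize per-write replica counts for a given replication factor."""
    quorum = rf // 2 + 1  # replicas that must acknowledge at QUORUM
    return {
        # every write is eventually sent to all RF replicas
        "replica_writes_per_client_write": rf,
        "quorum": quorum,
        # replicas that may be down while QUORUM still succeeds
        "replicas_that_can_fail_at_quorum": rf - quorum,
        # replicas that may be down while consistency level ONE still succeeds
        "replicas_that_can_fail_at_one": rf - 1,
    }


for rf in (1, 3):
    print(rf, replica_stats(rf))
```

With RF=1 each write lands on exactly one replica and no replica can be lost. 
With RF=3 each write is sent to three replicas, which is the extra replication 
traffic a benchmark at RF=1 never exercises.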


--
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
