>
> I am currently benchmarking Cassandra with three machines, and on each
> machine I am seeing an unbalanced distribution of data among the data
> directories (1 per disk).
> I am concerned that this affects my write performance. Is there anything
> I can do to make the distribution more even? Would RAID0 be my best
> option?
>

Using LeveledCompactionStrategy should provide a much better balance.

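However, LCS does considerably more compaction I/O than the size-tiered
default, so it may not be the right choice for a write-heavy workload. In
that case RAID0 with a single data directory (one entry under
data_file_directories in cassandra.yaml) will be the best option.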
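
Whichever route you take, it helps to put a number on the imbalance before
and after the change. Below is a minimal sketch, not a Cassandra tool: it
just walks each directory and reports its share of the total bytes. The
function names are my own, and the directory paths you pass in are whatever
your data_file_directories point to.

```python
import os
import sys


def dir_size_bytes(path):
    """Sum the sizes of all regular files found under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # SSTables can be compacted away mid-walk; skip files that vanish.
            try:
                total += os.path.getsize(full)
            except OSError:
                continue
    return total


def data_dir_shares(data_dirs):
    """Return {path: fraction of total bytes} for the given directories."""
    sizes = {d: dir_size_bytes(d) for d in data_dirs}
    total = sum(sizes.values())
    if total == 0:
        return {d: 0.0 for d in data_dirs}
    return {d: size / total for d, size in sizes.items()}


if __name__ == "__main__":
    # Pass the data directories as command-line arguments.
    for path, share in data_dir_shares(sys.argv[1:]).items():
        print(f"{path}: {share:.1%}")
```

With perfectly even placement, each of N directories would show a share
close to 1/N.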

> Total size of data is about 2TB, 14B records, all unique. Replication
> factor of 1.

RF=1 means *no* redundancy, which is a bad idea to run in production (and
sort of defeats the purpose of a system like Cassandra). It also will not
give you an accurate picture in a load test, as it eliminates a lot of the
cross-node traffic that you would see with a higher replication factor.
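
To make that concrete, here is some back-of-the-envelope arithmetic as a
small sketch. It assumes data is spread evenly across nodes and ignores
compression, compaction overhead, and read traffic; replica_load is a
hypothetical helper of mine, not part of any Cassandra tooling.

```python
def replica_load(records, data_tb, nodes, rf):
    """Estimate replica writes and per-node data for a replication factor.

    Assumes an even token distribution, so every node holds the same
    share of the replicated data.
    """
    if not 1 <= rf <= nodes:
        raise ValueError("rf must be between 1 and the number of nodes")
    # Every logical write is applied on rf replicas.
    replica_writes = records * rf
    # Replicated data volume divided evenly across the nodes.
    per_node_tb = data_tb * rf / nodes
    return replica_writes, per_node_tb


if __name__ == "__main__":
    # Figures from this thread: 14B unique records, about 2TB, 3 machines.
    for rf in (1, 3):
        writes, tb = replica_load(14_000_000_000, 2.0, 3, rf)
        print(f"RF={rf}: {writes:,} replica writes, {tb:.2f} TB per node")
```

With your numbers, going from RF=1 to RF=3 triples the replica writes to
42B, and each of the three nodes ends up holding roughly the full 2TB.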


--
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
