> I have a Cassandra installation where we plan to store 1Tb of data, split 
> between two 1Tb disks.
In general it's a good idea to limit per-node storage to 300GB to 400GB. 
This has more to do with operational issues than any particular issue with 
Cassandra. However, storing a very large number of keys on a single node can 
result in high memory usage while the server is idling, and reduced read 
performance. 
 
> I know that tiered compaction needs 50% free disk space for worst case 
> situation. 
Not really nowadays, but it's a good idea to treat 50% as a soft limit. 
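
As a rough illustration of the soft limit, here is a minimal sketch of a 
check you could run per disk. The function name and the 0.5 default are 
assumptions for this example; this is not anything Cassandra itself runs.

```python
def over_soft_limit(used_bytes: int, capacity_bytes: int,
                    soft_limit: float = 0.5) -> bool:
    """Return True when the disk is fuller than the soft limit.

    soft_limit is the fraction of capacity we are willing to fill
    (0.5 here, an assumed default matching the 50% rule of thumb).
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return used_bytes / capacity_bytes > soft_limit


GB = 1000 ** 3
# Hypothetical numbers: 600 GB used on a 1 TB disk is past the soft limit.
print(over_soft_limit(600 * GB, 1000 * GB))
```

With 500GB on a 1Tb disk the check sits exactly at the limit and returns 
False, so there is no headroom left before it trips.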

> How does this combine with the disk split? 
Whenever a new file is written to disk it will use the data directory with the 
most space. In general we recommend using a single data directory. 
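
To make the "most space" rule concrete, here is a minimal sketch of that 
selection. It is only an illustration of the idea, not Cassandra's actual 
code; the function names and the example paths are made up.

```python
import shutil
from typing import Callable, Iterable


def free_bytes(path: str) -> int:
    """Free space in bytes on the filesystem holding `path`."""
    return shutil.disk_usage(path).free


def pick_data_directory(directories: Iterable[str],
                        free_space: Callable[[str], int] = free_bytes) -> str:
    """Return the directory with the most free space.

    `free_space` is injectable so the choice can be tried out
    without touching real disks.
    """
    candidates = list(directories)
    if not candidates:
        raise ValueError("no data directories configured")
    return max(candidates, key=free_space)


# Hypothetical layout: two data directories with made-up free space in GB.
fake_free = {"/disk1/data": 500, "/disk2/data": 700}
print(pick_data_directory(fake_free, fake_free.__getitem__))
```

Note that in this sketch each new file goes to one directory as a whole, so 
the file has to fit in the free space of that one disk.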

Hope that helps. 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 21/06/2012, at 10:56 PM, Flavio Baronti wrote:

> Hi,
> 
> I have a Cassandra installation where we plan to store 1Tb of data, split 
> between two 1Tb disks.
> Tiered compaction should be better suited for our workload (append-only, 
> deletion of old data, few reads).
> I know that tiered compaction needs 50% free disk space for worst case 
> situation. How does this combine with the disk split? What happens if I have 
> 500Gb of data in one disk and 500Gb in the other? Won't compaction try to 
> build a single 1Tb file, failing since there are only 500Gb free on each disk?
> 
> Flavio
> 
