Right sizing Cassandra data nodes

Charulata Sharma (charshar) Mon, 19 Feb 2018 12:09:07 -0800

Hi All,

Looking for some insight into how application data archive and purge is carried 
out for a C* database. Are there standard guidelines for calculating the amount 
of space that can be used for storing data on a specific node?


Some pointers that I got while researching are:


- Allocate 50% space for compaction, e.g. if data size is 50GB then 
  allocate 25GB for compaction.

- Snapshot strategy. If old snapshots are present, then they occupy 
  disk space.

- Allocate some percentage of storage (????) for system tables and 
  OpsCenter tables?
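
To make the arithmetic behind these pointers concrete, here is a minimal sketch in Python. It assumes the compaction headroom is a fraction of the live data size (as in the 50GB/25GB example above), that snapshot space is a known figure in GB, and that the system/OpsCenter reserve is a fraction of the disk. The function name, parameter names and the default values are hypothetical, and the reserve percentage in particular is a placeholder, not a recommendation.

```python
def usable_data_capacity_gb(disk_gb, compaction_fraction=0.5,
                            snapshot_gb=0.0, system_fraction=0.05):
    """Estimate how much live data fits on one node.

    Solves: data + compaction_fraction * data + snapshot_gb
            + system_fraction * disk_gb <= disk_gb
    All defaults are illustrative assumptions.
    """
    if disk_gb <= 0:
        raise ValueError("disk_gb must be positive")
    if compaction_fraction < 0 or snapshot_gb < 0 or system_fraction < 0:
        raise ValueError("reserves must be non-negative")

    # Space left once snapshots and the system-table reserve are set aside.
    remaining_gb = disk_gb - snapshot_gb - system_fraction * disk_gb

    # Each GB of live data also needs compaction_fraction GB of headroom.
    return max(0.0, remaining_gb / (1.0 + compaction_fraction))


if __name__ == "__main__":
    # Hypothetical node: 1000 GB disk, 200 GB currently held by snapshots.
    print(usable_data_capacity_gb(1000, snapshot_gb=200))
```

With a 75GB disk, no snapshots and no system reserve, this gives 50GB of data, which matches the 50GB data plus 25GB compaction example above.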

We have a scenario where certain transaction data needs to be archived based on 
business rules and some purged, so before deciding on an archive and purge (A&P) 
strategy, I am trying to analyze how much transactional data can be stored given 
the current node capacity. I also found out that the space available metric 
shown in OpsCenter is not very reliable because it doesn’t show the snapshot 
space. In our case, we have a huge snapshot size. For some unexplained reason, 
we seem to be taking snapshots of our data every hour and purging them only 
after 7 days.
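
One rough way to see how much disk the snapshots really hold is to walk the data directory directly. Snapshots are hard links to SSTable files, so they only cost extra space once the live copy has been compacted away. The sketch below assumes the usual layout of keyspace/table/snapshots/<name>/ under the data directory and treats everything outside a "snapshots" directory as live; the function name and the example path are hypothetical. `nodetool listsnapshots` also reports snapshot sizes and is probably the better first check.

```python
import os


def snapshot_only_bytes(data_dir):
    """Sum the size of files that exist only under 'snapshots' directories.

    Files are identified by (device, inode), so a file hard-linked into
    several snapshots is counted once, and a file that still has a live
    copy is not counted at all.
    """
    live = set()        # inodes reachable outside any snapshots directory
    snapshot = {}       # inode -> size for files under a snapshots directory

    for root, _dirs, files in os.walk(data_dir):
        parts = os.path.relpath(root, data_dir).split(os.sep)
        in_snapshot = "snapshots" in parts
        for name in files:
            st = os.stat(os.path.join(root, name))
            key = (st.st_dev, st.st_ino)
            if in_snapshot:
                snapshot[key] = st.st_size
            else:
                live.add(key)

    return sum(size for key, size in snapshot.items() if key not in live)


if __name__ == "__main__":
    # Hypothetical path; use the data_file_directories value from your config.
    path = "/var/lib/cassandra/data"
    print(snapshot_only_bytes(path) / 1024 ** 3, "GiB held only by snapshots")
```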


Thanks,
Charu
Cisco Systems.
