Hi Aiman,

Can you please clarify whether the 800GB figure includes the replication factor (RF) or is the size before replication? If it does include replication, what is the RF?
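
To show why the RF question matters, here is a rough sketch of the per-node arithmetic for the 800GB and 15 nodes mentioned in this thread. It assumes data is spread evenly across nodes, which real clusters only approximate. The helper name `per_node_gb` and the RF of 3 in the second call are illustrative assumptions, not values from your cluster.

```python
def per_node_gb(total_gb, nodes, rf=1, size_includes_rf=True):
    """Approximate on-disk data per node, assuming an even spread across nodes."""
    if nodes <= 0 or rf <= 0:
        raise ValueError("nodes and rf must be positive")
    # If total_gb is the pre-replication size, every row is stored rf times.
    on_disk_total = total_gb if size_includes_rf else total_gb * rf
    return on_disk_total / nodes


# Case 1: the 800GB already includes replication.
print(round(per_node_gb(800, 15), 2))  # 53.33

# Case 2: the 800GB is pre-replication, with a hypothetical RF of 3.
print(round(per_node_gb(800, 15, rf=3, size_includes_rf=False), 2))  # 160.0
```

The two readings of the same figure differ by a factor equal to the RF, so the answer changes how much data each node actually holds.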
Also, how was the keyspace data size measured, e.g. size of the data directory, a nodetool command, etc.? It would also help to know the node configuration and cluster topology.

Based on the information we have, 800GB across 15 nodes works out to about 53.33GB of data per node, which is quite normal for a Cassandra cluster.

On data growth: what is the estimated rate at which the data will grow?

If you can clarify these points, it will be easier to discuss specific solutions.

Thanks,
Anup

On 20 April 2018 at 12:08, Aiman Parvaiz <ai...@steelhouse.com> wrote:

> Hi all
>
> I have been given a 15 nodes C* 2.2.8 cluster to manage which has a large
> size KS (~800GB). Given the size of the KS most of the management tasks
> like repair take a long time to complete and disk space management is
> becoming tricky from the systems perspective.
>
> This KS size is going to go up in future and we have a business
> requirement of long data retention here. I wanted to share this with all of
> you and ask what are my options here, what would be the best way to deal
> with a large size KS like this one. To make situation even trickier low IO
> latency is expected from this cluster as well.
>
> Thankful for any suggestions/advice in advance.