Good day everyone,

I am very happy with the (almost) linear scalability offered by C*. We had a lot of problems with RDBMS.
However, I have heard that C* has a limit on the number of column families (CFs) that can be created in a single cluster, the reason being that each CF takes up 1-2 MB on the JVM heap. In our use case we have 10000+ CFs, and we want to support multi-tenancy (i.e. 10000 * number of tenants). We are new to C* and come from an RDBMS background, so I would like your advice on how to tackle this scenario.

Our plan is to use the off-heap memtable approach:
http://www.datastax.com/dev/blog/off-heap-memtables-in-Cassandra-2-1

Each node in the cluster has the following configuration:

16 GB machine (8 GB Cassandra JVM + 2 GB system + 6 GB off-heap)

IMO, this should be able to support 1000 CFs with little or no impact on performance and startup time. We tackle multi-tenancy using different keyspaces (a solution I found on the web). Using this approach, we can have 10 clusters doing the job. (We are actually worried about the cost.)

Can you please help us evaluate this strategy? I want to hear the community's opinion on this. My major concerns are:

1. Is the off-heap strategy safe, and is my assumption that a 16 GB machine can support 1000 CFs right?
2. Can we use multiple keyspaces to solve multi-tenancy? IMO, the number of column families still increases even when we use multiple keyspaces.
3. I understand the complexity of using multiple clusters for a single application: the code base will get tightly coupled with the infrastructure. Is this the right approach?

Any suggestion is appreciated.

Thanks,
Arun
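
P.S. For reference, here is a rough sketch of the arithmetic behind the numbers above. It assumes the 1-2 MB per CF figure I heard is roughly right and that every node carries the per-CF heap cost for all CFs in its cluster; both are assumptions, not measurements. The function names and the tenant count in the example are made up for illustration.

```python
# Back-of-the-envelope estimate only. All inputs are the rough figures
# quoted in the post, not measured values.

MB_PER_CF_LOW = 1        # assumed lower bound of heap cost per CF (MB)
MB_PER_CF_HIGH = 2       # assumed upper bound of heap cost per CF (MB)
CFS_PER_TENANT = 10_000  # "10000+ CF" from the post
CFS_PER_CLUSTER = 1_000  # what one cluster is hoped to support
HEAP_GB = 8              # Cassandra JVM heap per node


def heap_overhead_gb(num_cfs, mb_per_cf):
    """Heap consumed on one node by per-CF overhead, in GB."""
    return num_cfs * mb_per_cf / 1024


def clusters_needed(total_cfs, cfs_per_cluster):
    """Number of clusters required, rounded up."""
    return -(-total_cfs // cfs_per_cluster)  # ceiling division


def summarize(tenants):
    """Print the estimate for a hypothetical number of tenants."""
    total_cfs = CFS_PER_TENANT * tenants
    low = heap_overhead_gb(CFS_PER_CLUSTER, MB_PER_CF_LOW)
    high = heap_overhead_gb(CFS_PER_CLUSTER, MB_PER_CF_HIGH)
    print(f"total CFs: {total_cfs}")
    print(f"clusters needed: {clusters_needed(total_cfs, CFS_PER_CLUSTER)}")
    print(f"per-node CF overhead: {low:.2f}-{high:.2f} GB of {HEAP_GB} GB heap")


if __name__ == "__main__":
    summarize(tenants=1)  # single tenant: the 10-cluster case from the post
```

With one tenant this reproduces the 10 clusters mentioned above. Calling summarize with a larger tenant count shows how quickly the cluster count grows if every tenant gets its own copy of the CFs.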