Since the actual problem is mostly related to the number of CFs in the system (and perhaps the number of columns), I still believe that exposing Cassandra ‘as-is’ to a tenant is doable and suitable, though it needs some fixes. That multi-tenancy model lets a tenant use Cassandra's programming model unchanged, which enables seamless migration of an existing Cassandra application into the cloud. Moreover, to support the different SLA requirements of different tenants, per-tenant configurability of keyspaces, CFs, etc. may be critical. However, there are trade-offs among usability, memory consumption, and performance. I believe it is important to consider the SLA requirements of different tenants when deciding on strategies for controlling resource consumption.
I like the idea of system-wide parameters for controlling resource usage, and I believe that tenant-specific parameters are equally important. A cluster has a fixed set of resources, and each tenant can claim a portion of them based on its SLA. For instance, if there is a threshold on the number of columns per node, it should be possible to decide how many columns a particular tenant can have. That also allows selecting a suitable Cassandra cluster for a tenant based on his or her SLA. I believe the ability to configure resource-control parameters per keyspace would be important for supporting a keyspace-per-tenant model.
Furthermore, to maximize resource sharing among tenants, a per-keyspace threshold on a resource should not be a hard limit. Rather, it should be allowed to float between a hard minimum and a maximum. For example, if a particular tenant needs more resources at a given time, he or she should be able to borrow from the others up to the maximum. The threshold itself is only considered when a tenant is assigned to a cluster: the remaining resources of the cluster should be equal to or higher than the resource limit of the tenant. It may be necessary to spread a single keyspace across multiple clusters, especially when there are not enough resources in a single cluster.
I believe that it would be better to have the flexibility to switch seamlessly between multi-tenancy implementation models such as Cassandra ‘as-is’, the keyspace-per-tenant model, a single keyspace for all tenants, and so on. Based on what I have learnt, each model requires adding a tenant id (namespace) to a keyspace name, a CF name, a row key, or a column name. Would it be better to have a kind of pluggable handler that can access those resources before the actual operation is performed, so that the required changes can be made? Maybe prior to authorization.
Thanks,
Indika
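P.S. To make the pluggable handler idea a bit more concrete, here is a rough sketch in Python (for illustration only; Cassandra itself is written in Java). All the names (`Request`, `TenantHandler`, `handle`, and the three model classes) and the `_` and `:` separators are made up by me. Nothing here is an existing Cassandra API, and a real handler would presumably also need to cover CF and column names.

```python
from dataclasses import dataclass, replace
from typing import Protocol


@dataclass(frozen=True)
class Request:
    """Hypothetical, simplified view of one client operation."""
    tenant_id: str
    keyspace: str
    column_family: str
    row_key: str


class TenantHandler(Protocol):
    """Hypothetical plug-in point: rewrite a request before it is executed."""

    def apply(self, request: Request) -> Request: ...


class KeyspacePerTenant:
    """Keyspace-per-tenant model: add the tenant id to the keyspace name."""

    def apply(self, request: Request) -> Request:
        return replace(request, keyspace=f"{request.tenant_id}_{request.keyspace}")


class SharedKeyspace:
    """One keyspace for all tenants: add the tenant id to the row key instead."""

    def apply(self, request: Request) -> Request:
        return replace(request, row_key=f"{request.tenant_id}:{request.row_key}")


class AsIs:
    """Cassandra 'as-is': names are passed through unchanged."""

    def apply(self, request: Request) -> Request:
        return request


def handle(request: Request, handler: TenantHandler) -> Request:
    # The handler runs first, so authorization and the actual operation
    # (both omitted in this sketch) would only see the rewritten names.
    return handler.apply(request)
```

With something like this, switching the multi-tenancy model would mean configuring a different handler, while the client keeps sending the same keyspace, CF, and key names.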