Hi,
We are interested in a multi-tenant environment that may span up to
hundreds of data centers. The current design requires cross-rack and
cross-DC replication. Specifically, each tenant's CFs will be replicated
six times: in three racks, with two copies per rack, and with the racks
located in at least two different DCs. Other replication policies will be
considered in the future. The application will decide where (which racks
and DCs) to place each tenant's replicas, and one rack may hold more than
one tenant.
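
To make the policy concrete, here is a minimal Python sketch of the check
the application would run on a proposed placement. All names (Replica,
is_valid_placement, the DC / rack / server labels) are hypothetical and
only illustrate the rule above; none of them come from Cassandra. The
sketch also assumes that the two copies in a rack sit on different
servers, which the policy implies but does not state.

```python
from collections import Counter
from typing import NamedTuple


class Replica(NamedTuple):
    dc: str      # data center label (hypothetical)
    rack: str    # rack label, assumed unique only within its DC
    server: str  # server label, assumed unique only within its rack


def is_valid_placement(replicas, racks=3, per_rack=2, min_dcs=2):
    """Check the current policy: 3 racks x 2 copies, spread over >= 2 DCs."""
    # Count copies per (dc, rack) pair, since rack names may repeat across DCs.
    copies_per_rack = Counter((r.dc, r.rack) for r in replicas)
    dcs = {r.dc for r in replicas}
    servers = set(replicas)
    return (
        len(replicas) == racks * per_rack
        and len(servers) == len(replicas)  # assumption: one copy per server
        and len(copies_per_rack) == racks
        and all(n == per_rack for n in copies_per_rack.values())
        and len(dcs) >= min_dcs
    )


# Example: two racks in dc1 and one rack in dc2, two copies in each rack.
placement = [
    Replica("dc1", "r1", "s1"), Replica("dc1", "r1", "s2"),
    Replica("dc1", "r2", "s1"), Replica("dc1", "r2", "s2"),
    Replica("dc2", "r1", "s1"), Replica("dc2", "r1", "s2"),
]
print(is_valid_placement(placement))
```

The defaults encode today's policy; a future policy would only change the
racks, per_rack and min_dcs arguments.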

Putting each tenant in a separate keyspace, as suggested in a previous
mail thread on this subject, seems to be a good approach (assuming the
memtable problem is solved somehow). However, we are concerned about the
cluster size. Here are my questions:
1) Given the above, should I define one Cassandra cluster that holds all
the DCs? That does not sound reasonable given hundreds of DCs, tens of
servers in each DC, etc. Where is the bottleneck here: keep-alive messages,
gossip, request routing? What is the largest number of servers a cluster
can bear?
2) Assuming I can create the per-tenant keyspace only on the servers in
the three racks where the replicas are held, does such a definition reduce
the message traffic among the other servers? Does Cassandra optimize
message transfer in such a case?
3) Another possible solution is to create a separate cluster for each
tenant. But that can lead to a situation where one server has to run two
or more Cassandra clusters. Can we run more than one cluster in parallel?
Does that mean two Cassandra daemons / instances on one server? What would
the overhead be? Do you have a link that explains how to deal with it?
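
To illustrate question 2: the per-tenant definition I have in mind boils
down to a replica count per DC, similar in shape to the per-DC replication
factors that NetworkTopologyStrategy accepts. Below is a minimal Python
sketch with hypothetical names and labels. It only derives the counts; it
does not talk to Cassandra, and whether replicas can be pinned to specific
racks this way is part of what I am asking.

```python
from collections import Counter


def replicas_per_dc(placement):
    """Map each DC label to the number of copies placed in it."""
    return dict(Counter(dc for dc, _rack in placement))


# Hypothetical tenant placement: one (dc, rack) pair per copy.
tenant_placement = [
    ("dc1", "r1"), ("dc1", "r1"),
    ("dc1", "r2"), ("dc1", "r2"),
    ("dc2", "r1"), ("dc2", "r1"),
]

options = replicas_per_dc(tenant_placement)
print(options)
```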

Could you please help me decide which of these solutions can work? You are
also welcome to suggest something else.
Thanks a lot,
Mimi
