Re: Cassandra table limitation

Jack Krupansky Wed, 06 Apr 2016 06:35:12 -0700

So, best case, with 50 tables per tenant, you could support fewer than ten
tenants ("ten ants" - they have small data, ha ha!) per cluster.
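
As a rough sketch of that arithmetic (the table budgets of 300 and 400 are
assumptions picked to stand in for the "few hundred tables" guideline quoted
further down this thread; they are not measured limits):

```python
def max_tenants(table_budget: int, tables_per_tenant: int) -> int:
    """Whole tenants that fit under a cluster-wide table budget."""
    if table_budget < 0 or tables_per_tenant <= 0:
        raise ValueError("budget must be >= 0 and tables_per_tenant > 0")
    return table_budget // tables_per_tenant


# Assumed budgets; the thread only gives a loose "few hundred" guideline.
for budget in (300, 400):
    at_50 = max_tenants(budget, 50)    # tenants at 50 tables each
    at_100 = max_tenants(budget, 100)  # tenants at 100 tables each
    print(budget, at_50, at_100)
```

Under these assumed budgets the count stays in the single digits at 50 tables
per tenant and halves at 100 tables per tenant.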


Out of curiosity, how much data might a single tenant have?

If the tenants shared a data model then you could separate their data using
a tenant ID in the partition key, so the question is whether each tenant
has a completely distinct data model or just some customization to a core
data model. You can always do limited customization within a single table
using a map field with arbitrary key values. You could also use an
application layer to separate the client apps from the actual Cassandra
database.
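
A minimal sketch of the shared-table idea, assuming a hypothetical
app.accounts table (every table and column name here is made up for
illustration). The tenant ID is part of the partition key and a map column
holds per-tenant extras. The CQL is kept as plain strings and the %s
placeholders assume the DataStax Python driver's style for non-prepared
statements; nothing here connects to a cluster:

```python
# Hypothetical shared table: one table for all tenants instead of one per tenant.
CREATE_ACCOUNTS = """
CREATE TABLE app.accounts (
    tenant_id  text,
    account_id text,
    name       text,
    custom     map<text, text>,   -- per-tenant extra attributes
    PRIMARY KEY ((tenant_id, account_id))
)
"""

# Illustrative insert statement; placeholder style is an assumption.
INSERT_ACCOUNT = (
    "INSERT INTO app.accounts (tenant_id, account_id, name, custom) "
    "VALUES (%s, %s, %s, %s)"
)


def account_params(tenant_id, account_id, name, **custom):
    """Build the bind values for INSERT_ACCOUNT.

    Extra keyword arguments are stringified and stored in the map column.
    """
    if not tenant_id:
        raise ValueError("tenant_id is required: it is part of the partition key")
    return (tenant_id, account_id, name, {k: str(v) for k, v in custom.items()})


# Two tenants with different customizations share the same table.
print(account_params("acme", "a1", "Alice", plan="gold", seats=5))
print(account_params("globex", "a1", "Bob", region="eu"))
```

Every read and write then carries the tenant ID, so one tenant's rows never
share a partition with another's, and the table count stays fixed no matter
how many tenants are added.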

Unless these tenants share the same data model, there is no reason for
multi-tenancy. I mean, if you were performing analytics across tenants,
that would be one thing, but otherwise, no tenant will gain any benefit
from being on the same cluster as the other tenants.

Were there any other specific reasons for choosing Cassandra other than
pursuing multi-tenancy? Out of curiosity, what source of information
pointed you in the direction of multi-tenancy?


-- Jack Krupansky

On Wed, Apr 6, 2016 at 1:17 AM, Kai Wang <dep...@gmail.com> wrote:

> With small data size and unknown access pattern, any particular reason to
> choose C*? It sounds like a relational database fits better.
>
> On Tue, Apr 5, 2016 at 11:40 PM, jason zhao yang <
> zhaoyangsingap...@gmail.com> wrote:
>
>> Hi Jack,
>>
>> Thanks for the reply.
>>
>> Each tenant will have around 50-100 tables for their applications,
>> probably log collection and account tables; it's not fixed and depends
>> on each tenant's needs.
>>
>> There will be a team in charge of helping tenants with data modeling and
>> access patterns. Tenants will not administer the cluster directly; we will
>> take care of that.
>>
>> Yes, multi-cluster is a solution. But the cost will be quite high,
>> because each tenant's data is far less than the capacity of a 3-node
>> cluster. So I want to put multiple tenants into one cluster.
>>
>>
>>
>> On Wed, Apr 6, 2016 at 10:41 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>>
>>> What is the nature of these tenants? Are they each creating their own
>>> data models? Is there one central authority that will approve of all data
>>> models and who can adjust the cluster configuration to support those models?
>>>
>>> Generally speaking, multi-tenancy is an anti-pattern for Cassandra and
>>> for most servers. The proper way to do multi-tenancy is to not do it at all,
>>> and to use separate machines or at least separate virtual machines.
>>>
>>> In particular, there needs to be a central authority managing a
>>> Cassandra cluster to assure its smooth operation. If each tenant is going
>>> in their own directions, then nobody will be in charge and capable of
>>> assuring that everybody is on the same page.
>>>
>>> Again, it depends on the nature of these tenants and how much control
>>> the cluster administrator has over them.
>>>
>>> Think of a Cassandra cluster as managing the data for either a single
>>> application or a collection of applications which share the same data. If
>>> there are multiple applications that don't share the same data, then they
>>> absolutely should be on separate clusters.
>>>
>>>
>>> -- Jack Krupansky
>>>
>>> On Tue, Apr 5, 2016 at 5:40 PM, Kai Wang <dep...@gmail.com> wrote:
>>>
>>>> Once in a while the question about table count comes up on this list.
>>>> The most recent is
>>>> https://groups.google.com/forum/#!topic/nosql-databases/IblAhiLUXdk
>>>>
>>>> In short, C* is not designed to scale with the table count. For one, each
>>>> table/CF has some fixed memory footprint on *ALL* nodes. The consensus is
>>>> that you shouldn't have more than "a few hundred" tables.
>>>>
>>>> On Mon, Apr 4, 2016 at 10:17 AM, jason zhao yang <
>>>> zhaoyangsingap...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> This is Jason.
>>>>>
>>>>> I am currently using C* 2.1.10, and I want to ask: what is the optimal
>>>>> number of tables to create in one cluster?
>>>>>
>>>>> My use case is that I will prepare a keyspace for each of my tenants,
>>>>> and every tenant will create the tables they need. Assume each tenant
>>>>> creates 50 tables with a normal workload (half read, half write). How
>>>>> many tenants can I support in one cluster?
>>>>>
>>>>> I know there are a few issues related to a large number of tables:
>>>>> * frequent GC
>>>>> * frequent flushes due to insufficient memory
>>>>> * high latency when modifying table schema
>>>>> * a large number of tombstones when creating tables
>>>>>
>>>>> Are there any other issues with a large number of tables? Using a 32GB
>>>>> instance, I can easily create 4000 tables with off-heap memtables.
>>>>>
>>>>> BTW, is this table limitation solved in 3.X?
>>>>>
>>>>> Thank you very much.
>>>>>
>>>>>
>>>>
>>>
>
