Hi Fernando, I used to have a cluster with ~300 tables (1 keyspace) on C* 2.0, and it was a real pain in terms of operations. Repairs were terribly slow, Cassandra startup slowed down, and in general tracking table metrics became more work. Why do you need such a high number of tables?
Tommaso

On Tue, Mar 1, 2016 at 9:16 AM, Fernando Jimenez <fernando.jime...@wealth-port.com> wrote:

> Hi Jack
>
> By entry I mean row.
>
> Apologies for the "obsolete terminology". When I first looked at Cassandra it was still on CQL2, and now that I'm looking at it again I've defaulted to the terms I already knew. I will bear it in mind and call them tables from now on.
>
> Is there any documentation about this limit? For example, I'd be keen to know how much memory is consumed per table, and I'm also curious about the reasons for keeping this in memory. I'm trying to understand the limitations here, rather than challenge them.
>
> So far I have found nothing in my search, which is why I had to resort to some "load testing" to see what happens when you push the table count high.
>
> Thanks
> FJ
>
> On 01 Mar 2016, at 06:23, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>
>> 3,000 entries? What's an "entry"? Do you mean row, column, or... what?
>>
>> You are using the obsolete terminology of CQL2 and Thrift: column family. With CQL3 you should be creating "tables". The practical recommendation of an upper limit of a few hundred tables across all keyspaces remains.
>>
>> Technically you can go higher, and technically you can reduce the overhead per table (an undocumented Jira, intentionally undocumented since it is strongly not recommended), but it is unlikely that you will be happy with the results.
>>
>> What is the nature of the use case?
>>
>> You basically have two choices: an additional clustering column to distinguish categories of table, or separate clusters for each few hundred tables.
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 29, 2016 at 12:30 PM, Fernando Jimenez <fernando.jime...@wealth-port.com> wrote:
>>
>>> Hi all
>>>
>>> I have a use case for Cassandra that would require creating a large number of column families. I have found references to early versions of Cassandra where each column family would require a fixed amount of memory on all nodes, effectively imposing an upper limit on the total number of CFs. I have also seen rumblings that this may have been fixed in later versions.
>>>
>>> To put the question to rest, I have set up a DSE sandbox and created some code to generate column families populated with 3,000 entries each.
>>>
>>> Unfortunately I have now hit this issue:
>>> https://issues.apache.org/jira/browse/CASSANDRA-9291
>>>
>>> So I will have to retest against Cassandra 3.0 instead.
>>>
>>> However, I would like to understand the limitations regarding creation of column families:
>>>
>>> * Is there a practical upper limit?
>>> * Is this a fixed limit, or does it scale as more nodes are added to the cluster?
>>> * Is there a difference between one keyspace with thousands of column families, vs thousands of keyspaces with only a few column families each?
>>>
>>> I haven't found any hard evidence/documentation to help me here, but if you can point me in the right direction, I will oblige and RTFM away.
>>>
>>> Many thanks for your help!
>>>
>>> Cheers
>>> FJ
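For what it's worth, the first option Jack mentions (folding many logical tables into one physical table, with an extra key column distinguishing the former per-category tables) might look roughly like the following in CQL. This is only a sketch; the keyspace, table, and column names (`shared_data`, `category`, `id`, `payload`) are made up for illustration, not taken from Fernando's schema:

```sql
-- Instead of one physical table per category:
--   CREATE TABLE data_category_a (id text PRIMARY KEY, payload text);
--   CREATE TABLE data_category_b (id text PRIMARY KEY, payload text);
--   ... (thousands of tables)
-- use a single table with the category folded into the key:
CREATE TABLE shared_data (
    category text,   -- hypothetical column replacing the per-category tables
    id       text,
    payload  text,
    PRIMARY KEY ((category, id))
);

-- Queries then always supply the category along with the id:
SELECT payload FROM shared_data
 WHERE category = 'category_a' AND id = 'some_row';
```

Putting `category` inside the composite partition key, as above, keeps partitions small; making it the sole partition key (`PRIMARY KEY ((category), id)`) would instead allow range scans within a category, at the cost of one potentially very large partition per category. Which trade-off is right depends on the access pattern, which the thread doesn't spell out.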