Hi Jack,
> you can reduce the overhead per table (an undocumented Jira)
Can you please point to this Jira number?

> it is strongly not recommended
What are the consequences of this (besides performance degradation, if any)?
Thanks.


    On Tuesday, March 1, 2016 7:23 AM, Jack Krupansky 
<jack.krupan...@gmail.com> wrote:
 

 3,000 entries? What's an "entry"? Do you mean row, column, or... what?

You are using the obsolete terminology of CQL2 and Thrift - column family. With
CQL3 you should be creating "tables". The practical recommendation of an upper
limit of a few hundred tables across all keyspaces remains.

Technically you can go higher and technically you can reduce the overhead per
table (an undocumented Jira - intentionally undocumented since it is strongly
not recommended), but... it is unlikely that you will be happy with the results.

What is the nature of the use case?

You basically have two choices: an additional cluster column to distinguish
categories of table, or a separate cluster for every few hundred tables.
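
The first choice above (one shared table with an extra key column that records
the category, instead of one table per category) could look roughly like the
sketch below. It is only an illustration under assumptions: the keyspace name
"demo", the table name "entries", and the column names are hypothetical and do
not come from this thread, and putting the category column in the partition key
with entry_id as the clustering column is one possible placement; the right
placement depends on the queries you need. The Python only builds the CQL text
and does not talk to a cluster.

```python
# Hypothetical names throughout: keyspace "demo", table "entries", and the
# column names are illustrative and are not taken from the thread.
KEYSPACE = "demo"
TABLE = "entries"


def create_table_cql(keyspace: str = KEYSPACE, table: str = TABLE) -> str:
    """Return CQL for one shared table that stands in for many per-category tables."""
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} (\n"
        "    category text,   -- discriminator: which logical 'table' a row belongs to\n"
        "    entry_id text,   -- identifies the entry within its category\n"
        "    payload text,\n"
        "    PRIMARY KEY ((category), entry_id)\n"
        ");"
    )


def select_category_cql(keyspace: str = KEYSPACE, table: str = TABLE) -> str:
    """Return a parameterised query that reads every entry of one category."""
    return f"SELECT entry_id, payload FROM {keyspace}.{table} WHERE category = ?;"


if __name__ == "__main__":
    print(create_table_cql())
    print(select_category_cql())
```

With this layout, adding a new category means writing rows with a new category
value rather than creating another table, so the table count stays constant.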

-- Jack Krupansky
On Mon, Feb 29, 2016 at 12:30 PM, Fernando Jimenez 
<fernando.jime...@wealth-port.com> wrote:

Hi all
I have a use case for Cassandra that would require creating a large number of 
column families. I have found references to early versions of Cassandra where 
each column family would require a fixed amount of memory on all nodes, 
effectively imposing an upper limit on the total number of CFs. I have also 
seen rumblings that this may have been fixed in later versions.
To put the question to rest, I have set up a DSE sandbox and written some code
to generate column families populated with 3,000 entries each.
Unfortunately I have now hit this issue:
https://issues.apache.org/jira/browse/CASSANDRA-9291
So I will have to retest against Cassandra 3.0 instead.
However, I would like to understand the limitations regarding the creation of
column families.
 * Is there a practical upper limit?
 * Is this a fixed limit, or does it scale as more nodes are added into the
   cluster?
 * Is there a difference between one keyspace with thousands of column
   families, vs thousands of keyspaces with only a few column families each?
I haven’t found any hard evidence/documentation to help me here, but if you can 
point me in the right direction, I will oblige and RTFM away.
Many thanks for your help!
Cheers
FJ






  
