Hi Fernando, I used to have a cluster with ~300 tables (1 keyspace) on C* 2.0, and it was a real pain in terms of operations. Repairs were terribly slow, Cassandra startup slowed down, and in general tracking table metrics became more work. Why do you need such a high number of tables?
Tommaso

On Tue, Mar 1, 2016 at 9:16 AM, Fernando Jimenez <fernando.jime...@wealth-port.com> wrote:

> Hi Jack
>
> By entry I mean row.
>
> Apologies for the "obsolete terminology". When I first looked at Cassandra it was still on CQL2, and now that I'm looking at it again I've defaulted to the terms I already knew. I will bear it in mind and call them tables from now on.
>
> Is there any documentation about this limit? For example, I'd be keen to know how much memory is consumed per table, and I'm also curious about the reasons for keeping this in memory. I'm trying to understand the limitations here, rather than challenge them.
>
> So far I have found nothing in my search, which is why I had to resort to some "load testing" to see what happens when you push the table count high.
>
> Thanks
> FJ
>
> On 01 Mar 2016, at 06:23, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>
>> 3,000 entries? What's an "entry"? Do you mean row, column, or... what?
>>
>> You are using the obsolete terminology of CQL2 and Thrift: column family. With CQL3 you should be creating "tables". The practical recommendation of an upper limit of a few hundred tables across all keyspaces remains.
>>
>> Technically you can go higher, and technically you can reduce the overhead per table (an undocumented Jira, intentionally undocumented since it is strongly not recommended), but it is unlikely that you will be happy with the results.
>>
>> What is the nature of the use case?
>>
>> You basically have two choices: an additional clustering column to distinguish categories of table, or separate clusters for each few hundred tables.
>>
>> -- Jack Krupansky
>>
>> On Mon, Feb 29, 2016 at 12:30 PM, Fernando Jimenez <fernando.jime...@wealth-port.com> wrote:
>>
>>> Hi all
>>>
>>> I have a use case for Cassandra that would require creating a large number of column families. I have found references to early versions of Cassandra where each column family would require a fixed amount of memory on all nodes, effectively imposing an upper limit on the total number of CFs. I have also seen rumblings that this may have been fixed in later versions.
>>>
>>> To put the question to rest, I have set up a DSE sandbox and created some code to generate column families populated with 3,000 entries each.
>>>
>>> Unfortunately I have now hit this issue:
>>> https://issues.apache.org/jira/browse/CASSANDRA-9291
>>>
>>> So I will have to retest against Cassandra 3.0 instead.
>>>
>>> However, I would like to understand the limitations regarding creation of column families:
>>>
>>> * Is there a practical upper limit?
>>> * Is this a fixed limit, or does it scale as more nodes are added to the cluster?
>>> * Is there a difference between one keyspace with thousands of column families, vs thousands of keyspaces with only a few column families each?
>>>
>>> I haven't found any hard evidence/documentation to help me here, but if you can point me in the right direction, I will oblige and RTFM away.
>>>
>>> Many thanks for your help!
>>>
>>> Cheers
>>> FJ
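For what it's worth, the first option Jack mentions (folding many logical tables into one physical table, with an extra key column distinguishing the former per-category tables) might look roughly like the following in CQL. This is only a sketch; the keyspace, table, and column names (`shared_data`, `category`, `id`, `payload`) are made up for illustration, not taken from Fernando's schema:

```sql
-- Instead of one physical table per category:
--   CREATE TABLE data_category_a (id text PRIMARY KEY, payload text);
--   CREATE TABLE data_category_b (id text PRIMARY KEY, payload text);
--   ... (thousands of tables)
-- use a single table with the category folded into the key:
CREATE TABLE shared_data (
    category text,   -- hypothetical column replacing the per-category tables
    id       text,
    payload  text,
    PRIMARY KEY ((category, id))
);

-- Queries then always supply the category along with the id:
SELECT payload FROM shared_data
 WHERE category = 'category_a' AND id = 'some_row';
```

Putting `category` inside the composite partition key, as above, keeps partitions small; making it the sole partition key (`PRIMARY KEY ((category), id)`) would instead allow range scans within a category, at the cost of one potentially very large partition per category. Which trade-off is right depends on the access pattern, which the thread doesn't spell out.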