If your 1K tables might grow to 5 or 10K, then doesn’t that mean you would be trying to add columns, later, after you’ve populated your data? If so, that would argue for using one or more map columns, to accommodate the dynamic addition of pseudo-columns.
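
A minimal sketch of the map-column idea, assuming a hypothetical table `items` with a map column `attrs` (neither name comes from this thread). It only builds CQL statement strings, to contrast the schema change a real column needs with the plain write a map entry needs; it does not talk to a cluster.

```python
# Hypothetical table and column names, for illustration only.
CREATE_TABLE = """
CREATE TABLE items (
    id uuid PRIMARY KEY,
    name text,
    attrs map<text, text>
)
"""


def add_fixed_column(table: str, column: str, cql_type: str) -> str:
    """CQL schema change needed to add a real column after the fact."""
    return f"ALTER TABLE {table} ADD {column} {cql_type}"


def set_pseudo_column(table: str, map_column: str) -> str:
    """CQL upsert of one map entry; key, value and id are bind markers."""
    return f"UPDATE {table} SET {map_column}[?] = ? WHERE id = ?"


if __name__ == "__main__":
    print(add_fixed_column("items", "color", "text"))
    print(set_pseudo_column("items", "attrs"))
```

With the map, a new "pseudo-column" is just a new key written at update time, so no `ALTER TABLE` is needed as the set of attributes grows. Whether that trade-off suits your data still depends on the queries discussed in this thread.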
Once again, look at your queries (as they are today and as they will be in the future as you expand the data), since they will be your ultimate guide as to how to model your data. And drill deeper into how you will be inserting and updating the data in “groups” – that will guide the data modeling as well. What will the typical update use cases look like? By all means, start simple, but also be careful not to paint yourself into a corner. Alternatively, be prepared to throw away entire implementations as your conceptualization of the data evolves.

-- Jack Krupansky

From: tommaso barbugli
Sent: Saturday, July 12, 2014 3:12 PM
To: user@cassandra.apache.org
Subject: Re: keyspace with hundreds of columnfamilies

hi Jack,
thank you for your clear answer!

On Saturday, 12 July 2014, Jack Krupansky <j...@basetechnology.com> wrote:

> 1. What does your data look like – 100 small integers or short strings and dates, or... 100 massive blobs?

it will be only small short strings/varints, no blobs or nested data

> 2. What operations are you doing on those rows – reading and updating individual columns, or mostly full-row upserts?

mostly reading and writing groups of columns (previously i had those sets of columns in different CFs)

> 3. 100 columns in a CQL row is not so unreasonable, per se.
>
> 4. The ultimate answer to any “how will it perform” question is to do a “proof of concept” implementation since it really all depends on your actual data and hardware setup, such as memory, cpu, I/O, and networking – IOW, all the non-Cassandra factors can easily dwarf Cassandra itself.
>
> 5. As far as 1K tables with 10 columns vs. 100 tables with 100 columns – it should primarily be your queries (and updates) that drive the decision. Do fewer tables and more columns make your queries (and updates) a lot simpler and cleaner?

yes, code-wise it does; i am just scared that i will get into some bad situation when the 1k CFs grow to 5 or 10k

-- Jack Krupansky

From: tommaso barbugli
Sent: Saturday, July 12, 2014 7:58 AM
To: user@cassandra.apache.org
Subject: Re: keyspace with hundreds of columnfamilies

hi,
how is a table with hundreds of columns going to perform?
i am moving from 1k column families each with 10 columns to 100 CFs each with 100 columns.

thank you
tommaso

On Friday, 11 July 2014, Sourabh Agrawal <iitr.sour...@gmail.com> wrote:

Yes, what about CQL style columns? Please clarify

On Sat, Jul 5, 2014 at 12:32 PM, tommaso barbugli <tbarbu...@gmail.com> wrote:

Yes, my question was about CQL-style columns.

2014-07-04 12:40 GMT+02:00 Jens Rantil <jens.ran...@tink.se>:

Just so you guys aren't misunderstanding each other; Tommaso, you were not referring to CQL-style columns, right?

/J

On Fri, Jul 4, 2014 at 10:18 AM, Romain HARDOUIN <romain.hardo...@urssaf.fr> wrote:

Cassandra can handle many more columns (e.g. time series).
So 100 columns is OK.

Best,
Romain

tommaso barbugli <tbarbu...@gmail.com> wrote on 03/07/2014 21:55:18:

> From: tommaso barbugli <tbarbu...@gmail.com>
> To: user@cassandra.apache.org,
> Date: 03/07/2014 21:55
> Subject: Re: keyspace with hundreds of columnfamilies
>
> thank you for the replies; I am rethinking the schema design, one
> possible solution is to "implode" one dimension and get N times fewer CFs.
> With this approach I would come up with (cql) tables with up to 100
> columns; would that be a problem?
>
> Thank You,
> Tommaso

--
Sourabh Agrawal
Bangalore

--
sent from iphone (sorry for the typos)