Hi, sorry for re-posting, but it would be very helpful to get some input on my previous post, so I know which direction to take. If any of the more experienced users here can help, it would be greatly appreciated.
Thank you
Osi

---------- Forwarded message ----------
From: osishkin osishkin <osish...@gmail.com>
Date: Wed, Sep 7, 2011 at 2:02 PM
Subject: Re: CQL and schema-less column family
To: user@cassandra.apache.org, eev...@acunu.com

Thank you very much Eric for your response.
Some follow-up questions come to mind:

1. What will be the performance hit for querying a column name not predefined in a schema? If it's not indexed, then I guess Cassandra will have to iterate over all rows, which will impose a huge overhead.

2. Assuming my guess from the previous question is correct, then in order to get decent performance I need to index a column. Can you tell me if indexing a column name (not predefined in a schema) has any performance impact?

I'm not yet sure whether CQL/secondary indexes are the right direction for me, as opposed to manually maintained indexes. My application also requires range predicates on columns with potentially high numbers of unique values. From what I gather, both (range predicates, high-cardinality values) are very inefficient with CQL/secondary indexes. But I'd like to get the whole picture before deciding.

In my system each row may contain a lot of columns, each common to only some of the rows. If I understand the documentation correctly, every index is actually implemented as a new "hidden" column family. This means that in my case, if I use a secondary index for every column name, I can quickly end up with a LOT of column families just to hold the secondary indexes for all my rows. My intuition says that updating dozens of column families on every insert would probably be very bad performance-wise, compared with manually updating a single "global" index column family of my own (with multiple inserts). Is this true?

Thank you

p.s. Since I don't know whether a secondary index for a column already exists, I have to check for it every time and create it if it doesn't.
Things seem to get even worse from my point of view... :)

On Wed, Sep 7, 2011 at 12:34 PM, Eric Evans <eev...@acunu.com> wrote:
> On Tue, Sep 6, 2011 at 12:22 PM, osishkin osishkin <osish...@gmail.com> wrote:
>> Sorry for the newbie question but I failed to find a clear answer.
>> Can CQL be used to query a schema-less column family? can they be indexed?
>> That is, query for column names that do not necessarily exist in all
>> rows, and were not defined in advance when the column family was
>> created.
>
> Absolutely, yes.
>
> If you don't create schema for columns, then their type will simply be
> the default for that column family.
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
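
For readers weighing the two options in the follow-up above, here is a minimal in-memory sketch of the manually maintained "global" index idea. It is plain Python, not Cassandra client code: the class `ManualIndex` and its methods are hypothetical names made up for illustration, and a dict of sorted lists stands in for a single index column family with one index row per column name. It only shows the bookkeeping (one extra index write per column on insert, stale entries removed on update) and how a sorted index can serve a range predicate. It says nothing about real Cassandra performance.

```python
from bisect import bisect_left, insort
from collections import defaultdict


class ManualIndex:
    """Hypothetical in-memory model of one hand-maintained index column family."""

    def __init__(self):
        # Data rows: row key -> {column name: value}.
        self.rows = defaultdict(dict)
        # Index rows: column name -> sorted list of (value, row key) entries.
        # Assumption: values stored under one column name are mutually comparable.
        self.index = defaultdict(list)

    def insert(self, row_key, columns):
        """Write the columns to the data row and mirror each one into the index."""
        for name, value in columns.items():
            old = self.rows[row_key].get(name)
            if old is not None:
                # Remove the stale index entry before adding the new one.
                entries = self.index[name]
                pos = bisect_left(entries, (old, row_key))
                if pos < len(entries) and entries[pos] == (old, row_key):
                    entries.pop(pos)
            self.rows[row_key][name] = value
            insort(self.index[name], (value, row_key))

    def query_range(self, name, low, high):
        """Return row keys whose column `name` satisfies low <= value <= high."""
        entries = self.index.get(name, [])
        # (low,) sorts before every (low, row_key) entry, so this finds the
        # first entry whose value is >= low.
        start = bisect_left(entries, (low,))
        result = []
        for value, row_key in entries[start:]:
            if value > high:
                break
            result.append(row_key)
        return result


# Usage: column names do not need to be declared in advance.
idx = ManualIndex()
idx.insert("row1", {"age": 30, "city": "Haifa"})
idx.insert("row2", {"age": 42})
idx.insert("row3", {"age": 35, "height": 180})
matches = idx.query_range("age", 30, 40)
```

Note that in this model a new column name needs no "does the index exist yet?" check, since the index row is created on first write. Whether that carries over to a real deployment depends on how the index column family is laid out, which this sketch does not cover.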