Dear Jean-Yves,

You can take a different approach to the problem. You need, on one side, a relational database (MySQL, PostgreSQL) or Solr (as a very efficient index) and, on the other side, Cassandra. The relational database or Solr must contain the minimum amount of information possible: a date and only the data relevant to your queries. This enabled me to keep a simple model for Cassandra.

Cassandra acts as a "vault" where you keep all the data, and you then dispatch the data from Cassandra to the relational database or Solr. When you want to query, you ask Solr or the relational database for the key / column / supercolumn, and then you use it to retrieve the complete data from Cassandra. The hard part is keeping the query side and the Cassandra side consistent with each other.

I speak from personal experience: it was very hard for me to use only Cassandra to do everything my (small amateur) website needed. I have now found an alternative that I use: Cassandra (data vault) + Redis (sessions and other volatile data) + Solr (search engine) + PostgreSQL (for relational queries).
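As a rough illustration of the vault + index split described above, here is a minimal Python sketch. It is only a sketch under stated assumptions, not the actual code of any site: a plain dict stands in for Cassandra, an in-memory SQLite table stands in for the relational database, and the names (vault, article_index, store, find_by_author, rebuild_index) are all hypothetical.

```python
import json
import sqlite3

# Stand-in for Cassandra: a plain dict mapping row key -> full record.
# (Assumption: a real setup would go through a Cassandra client instead.)
vault = {}

# Stand-in for the relational side: an in-memory SQLite table that holds
# only the row key, a date, and the single field we want to query on.
index = sqlite3.connect(":memory:")
index.execute(
    "CREATE TABLE article_index ("
    "row_key TEXT PRIMARY KEY, created TEXT, author TEXT)"
)


def store(row_key, record):
    """Write the full record to the vault, then dispatch a slim row to the index."""
    vault[row_key] = json.dumps(record)
    index.execute(
        "INSERT OR REPLACE INTO article_index VALUES (?, ?, ?)",
        (row_key, record["created"], record["author"]),
    )
    index.commit()


def find_by_author(author):
    """Ask the index for matching keys, then fetch the complete data from the vault."""
    rows = index.execute(
        "SELECT row_key FROM article_index WHERE author = ? ORDER BY created",
        (author,),
    ).fetchall()
    return [json.loads(vault[key]) for (key,) in rows]


def rebuild_index():
    """Re-derive the whole index from the vault, e.g. after the two sides drift apart."""
    index.execute("DELETE FROM article_index")
    for row_key, raw in vault.items():
        record = json.loads(raw)
        index.execute(
            "INSERT INTO article_index VALUES (?, ?, ?)",
            (row_key, record["created"], record["author"]),
        )
    index.commit()
```

Calling store("a1", {"created": "2011-04-13", "author": "victor", "body": "..."}) writes the full record to the vault and a three-column row to the index; find_by_author("victor") then returns the complete records. Because the index holds nothing that cannot be derived from the vault, a rebuild like rebuild_index() is one simple (if blunt) way to restore consistency between the two sides.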
Best regards,
Victor Kabdebon
http://www.voxnucleus.fr

2011/4/13 Edward Capriolo <edlinuxg...@gmail.com>

> On Wed, Apr 13, 2011 at 10:39 AM, Jean-Yves LEBLEU <jleb...@gmail.com>
> wrote:
> > Hi all,
> >
> > Just some thoughts and question I have about cassandra data modeling.
> >
> > If I understand well, cassandra is better on writing than on reading.
> > So you have to think about your queries to design cassandra schema. We
> > are doing incremental design, and already have our system in
> > production and we have to develop new queries.
> > How do you usualy do when you have new queries, do you write a
> > specific job to update data in the database to match the new query you
> > are writing ?
> >
> > Thanks for your help.
> >
> > Jean-Yves
> >
>
> Good point, Generally you will need to write some type of range
> scanning/map reduce application to process and back fill your data.
>