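Hi Vodnok,

For tag searches I would use a search engine such as Solr (built on Lucene), since I think it would be more flexible to query. You can update the index as new data comes in and use it for queries #1, #2 and #4 in your list (docs for one tag, docs for multiple tags, and docs for a tag with a read/shared ratio greater than x).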
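To make the tag queries a bit more concrete, here is a rough Python sketch that only builds Lucene/Solr-style query strings for queries #1, #2 and #4. It does not talk to Solr. The field name `tags` is an assumption on my part, and `n_ratio_readed_shared` is borrowed from your t_stats design; the real names depend on how you define your index schema.

```python
def tag_query(tags, min_ratio=None):
    """Build a Lucene/Solr-style query string for docs carrying ALL given tags.

    `tags` and `n_ratio_readed_shared` are hypothetical index field names.
    """
    if not tags:
        raise ValueError("at least one tag is required")

    clauses = []
    for tag in tags:
        # Escape backslashes and double quotes so the tag stays one phrase.
        safe = tag.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append('tags:"%s"' % safe)
    query = " AND ".join(clauses)

    if min_ratio is not None:
        # Curly braces make the range exclusive: ratio strictly greater than x.
        query += " AND n_ratio_readed_shared:{%s TO *}" % min_ratio
    return query


q1 = tag_query(["fire"])                 # query #1: one tag
q2 = tag_query(["fire", "belgium"])      # query #2: multiple tags
q4 = tag_query(["fire"], min_ratio=0.1)  # query #4: tag plus ratio filter
```

The resulting strings would go into the `q` parameter of a search request; the search engine does the tag intersection for you, which is the part that is awkward to do by hand in Cassandra.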
For the "All doc of type='BOT' and c_bot_code='ABC'" query, I would create the CF below.

doc_types
{
    'BOT:ABC':
    {
        <docid>: <creation_date?>
    }
}

As the column value for each docid you can store whatever you are going to need after querying (the creation date, for example). The problem with this schema is that if there are not many type:c_bot_code combinations, there will be many columns under each key in this CF. If one combination has far more columns than the others, a hot spot problem may arise.

On Tue, Mar 1, 2011 at 11:39 PM, Vodnok <vod...@gmail.com> wrote:

> Hi,
>
> Totaly newbie on Cassandra (with phpcassa) with big background on
> relationned database, i'm would like to use Cassandra for a trivial case. So
> i'm on it since 3 days. Sorry for my stupid question. I'm pretty sure i'm
> wrong but i want to learn so i'm here
>
> I would like your advise on a design for cassandra.
>
> Case:
>
> - Users created Docs and can share docs with friends
> - Users can read and share docs of their friends with other friends
> - Docs can be of different type [text;picture;video;etc]
> - Docs can be taggued
>
> Typical queries :
>
> - Doc relative to tag
> - Doc relative to mutiple tags
> - Doc readed by user x
> - Doc relative to tag and ratio readed_shared greater than x (see design)
> - All doc of type='IMG' favorized by my friend
> - All doc of type='BOT' and c_bot_code='ABC'
> - All doc of type='BOT' favorized by my friend relative (tag) with 'fire'
> and 'belgium' ?
>
> Design :
>
> docs // all docs
> {
>     ‘123456’: //id_docs
>     {
>         ‘t_info’:
>         {
>             'c_type':'BOT'
>             'b_del':'y'
>             'b_publish':'y'
>         }
>         't_info_type':
>         {
>             'l_title':'Hello World!'
>             'c_bot_code':'ABC'
>         }
>         't_read_user' : //read by user x
>         {
>             //time + id_user
>             '123456789_123':'123'
>             '123456789_155':'155'
>         }
>         't_shared_user' : //share by user x
>         {
>             //time + id_user
>             '123456789_123':'123'
>             '123456789_155':'155'
>         }
>         't_tags'
>         {
>             'fire':'fire'
>             'belgium':'belgium'
>         }
>         't_stats'
>         {
>             'n_readed':'60'
>             'n_shared':'6'
>             'n_ratio_readed_shared':'0.1'
>         }
>     }
> }
>
> tags_docs // all tag linked to docs
> {
>     'fire'://tag
>     {
>         //creation_time + id_docs
>         '456789_123456':
>         {
>             'id_doc':'123456'
>             'time':'456789'
>         }
>         '456789_223456':'223456':
>         {
>             'id_doc':'123456'
>             'time':'456789'
>         }
>         '456789_323456':'223456':
>         {
>             'id_doc':'123456'
>             'time':'456789'
>         }
>     }
>     'belgium':
>     {
>         ...
>     }
> }
>
> users // all users
> {
>     ‘123’: //id_user
>     {
>         ‘t_info’:
>         {
>             l_name:'Boris'
>             c_lang='fr'
>         }
>         't_readed_docs':
>         {
>             //time + id_doc
>             '123456789_123456':'123456'
>             '123458789_136456':'136456'
>         }
>         't_shared_docs':
>         {
>             //time + id_doc
>             '123456789_123456':'123456'
>             '123458789_136456':'136456'
>         }
>     }
> }
>
> users_docs // all action by users on docs
> {
>     ‘123_123456’: // id_user + id_doc
>     {
>         'id_doc':'123456'
>         'id_user':'123'
>         'd_readed':'20110301'
>         'd_shared':'20110301'
>     }
> }
>
> user_friends_act // all activity of user friends
> {
>     ‘123’:// id_user
>     {
>         't_readed_docs': //all docs readed by my friends
>         {
>             '223456_224_123456': // time + id_friend + id_docs
>             {
>                 'id_friend':'224'
>                 'id_docs':'123456'
>                 'time':'223456'
>                 'c_type='BOT'
>             }
>         }
>         't_shared_docs': //all docs shared by my friends
>         {
>             '223456_224_123456': // time + id_friend + id_docs
>             {
>                 'id_friend':'224'
>                 'id_docs':'123456'
>                 'time':'223456'
>                 'c_type='BOT'
>             }
>         }
>     }
> }
>
> I know that certain queries are not possible for now like : - All doc of
> type='BOT' favorized by my friend relative (tag) with 'fire' and 'belgium' ?
>
> What do you think ?
>
> Thank you,
>
> Vodnok,
>
> (Please remember i'm on cassandra since 3 days)
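
P.S. Coming back to the doc_types CF suggested at the top of this reply, here is a small in-memory Python sketch of how I picture it. It uses plain dicts rather than a real Cassandra client, so the helper names (`row_key`, `index_doc`, `docs_for`, `row_widths`) are purely illustrative and the dates are made-up sample values. It is only meant to show the "type:c_bot_code" row key and how the number of columns per row can become uneven.

```python
from collections import defaultdict

# In-memory stand-in for the doc_types column family:
# row key "type:c_bot_code" -> {doc id: creation date}
doc_types = defaultdict(dict)


def row_key(doc_type, code):
    """Compose the row key from the doc type and its c_bot_code."""
    return "%s:%s" % (doc_type, code)


def index_doc(doc_id, doc_type, code, created):
    """Add one column (doc id -> creation date) under the matching row."""
    doc_types[row_key(doc_type, code)][doc_id] = created


def docs_for(doc_type, code):
    """Return the sorted doc ids for one type/code combination."""
    return sorted(doc_types.get(row_key(doc_type, code), {}))


def row_widths():
    """Columns per row key; one very wide row hints at a possible hot spot."""
    return {key: len(cols) for key, cols in doc_types.items()}


index_doc("123456", "BOT", "ABC", "20110301")
index_doc("223456", "BOT", "ABC", "20110302")
index_doc("323456", "BOT", "XYZ", "20110302")
```

With this layout, "All doc of type='BOT' and c_bot_code='ABC'" is a single row read, `docs_for("BOT", "ABC")`. Checking `row_widths()` from time to time is a cheap way to see whether one combination is growing much faster than the others.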