Hi Vodnok,

For tag searches I would use a search engine like Solr (built on Lucene), as
I think it would be more flexible to query. You can update the index as new
data comes in and use it to answer the tag-based queries #1, #2 and #4.
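
To make the tag part concrete, here is a minimal in-memory sketch of the kind
of inverted index a search engine maintains for you. It is plain Python and
not Solr's API: the names TagIndex, add_doc and search are made up for
illustration, and the ratio argument is assumed to stand in for your
n_ratio_readed_shared value.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index: tag -> set of doc ids (a stand-in for Solr, not its API)."""

    def __init__(self):
        self.docs_by_tag = defaultdict(set)
        self.ratio_by_doc = {}

    def add_doc(self, doc_id, tags, ratio=0.0):
        # Re-indexing a doc overwrites its ratio and adds its tags.
        self.ratio_by_doc[doc_id] = ratio
        for tag in tags:
            self.docs_by_tag[tag].add(doc_id)

    def search(self, tags, min_ratio=None):
        # Intersect the posting sets of all requested tags (queries #1 and #2).
        tags = list(tags)
        if not tags:
            return []
        result = set(self.docs_by_tag.get(tags[0], set()))
        for tag in tags[1:]:
            result &= self.docs_by_tag.get(tag, set())
        # Optional filter on the read/shared ratio (query #4).
        if min_ratio is not None:
            result = {d for d in result if self.ratio_by_doc[d] > min_ratio}
        return sorted(result)


index = TagIndex()
index.add_doc("123456", ["fire", "belgium"], ratio=0.1)
index.add_doc("223456", ["fire"], ratio=0.5)

fire_docs = index.search(["fire"])
fire_belgium_docs = index.search(["fire", "belgium"])
popular_fire_docs = index.search(["fire"], min_ratio=0.2)
```

A real search engine adds persistence, ranking and range queries on top of
this idea, which is why I would not rebuild it by hand in Cassandra.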

For the "All doc of type='BOT' and c_bot_code='ABC'" query, I would create
the column family (CF) below.

doc_types
{
  'BOT:ABC':
  {
    <docid>: <creation_date?>
  }
}

As the value of each docid column you can store whatever you will need after
querying (the creation date, for example). The problem with this schema is
that if there are not many type:c_bot_code combinations, there will be many
columns under each key in this CF. If one combination has far more columns
than the others, a hot spot problem may arise, because all columns of a row
are stored on the same replica nodes.
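
A rough sketch of how this CF behaves, modelled with plain Python dicts rather
than a real client such as phpcassa. The helper names index_doc, docs_for and
row_sizes are hypothetical, and the 'XYZ' code and the second date are sample
values I made up.

```python
from collections import defaultdict

# Row key "type:c_bot_code" -> {column name (docid): column value (creation date)}
doc_types = defaultdict(dict)


def index_doc(doc_id, c_type, c_bot_code, creation_date):
    # One column per doc under the row for its type/code combination.
    row_key = f"{c_type}:{c_bot_code}"
    doc_types[row_key][doc_id] = creation_date


def docs_for(c_type, c_bot_code):
    # Reading one row answers the query; no scan over all docs is needed.
    return dict(doc_types.get(f"{c_type}:{c_bot_code}", {}))


def row_sizes():
    # Column count per row; one row far larger than the rest hints at a hot spot.
    return {key: len(cols) for key, cols in doc_types.items()}


index_doc("123456", "BOT", "ABC", "20110301")
index_doc("223456", "BOT", "ABC", "20110302")
index_doc("323456", "BOT", "XYZ", "20110302")

abc_docs = docs_for("BOT", "ABC")
sizes = row_sizes()
```

Watching something like row_sizes() over time would tell you whether one
combination is growing much faster than the others.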



On Tue, Mar 1, 2011 at 11:39 PM, Vodnok <vod...@gmail.com> wrote:

> Hi,
>
> Total newbie on Cassandra (with phpcassa), with a big background in
> relational databases. I would like to use Cassandra for a trivial case, and
> I have been on it for 3 days. Sorry if my question is stupid. I'm pretty
> sure I'm wrong, but I want to learn, so here I am.
>
>
> I would like your advice on a design for Cassandra.
>
>
> Case:
>
> - Users create docs and can share docs with friends
> - Users can read and share docs of their friends with other friends
> - Docs can be of different types [text;picture;video;etc]
> - Docs can be tagged
>
>
>
> Typical queries :
>
>
> - Doc related to a tag
> - Doc related to multiple tags
> - Doc read by user x
> - Doc related to a tag with ratio readed_shared greater than x (see design)
> - All doc of type='IMG' favorited by my friend
> - All doc of type='BOT' and c_bot_code='ABC'
> - All doc of type='BOT' favorited by my friend related (by tag) to 'fire'
> and 'belgium' ?
>
>
>
> Design :
>
>
> docs // all docs
> {
>     '123456': // id_docs
>     {
>         't_info':
>         {
>             'c_type':'BOT'
>             'b_del':'y'
>             'b_publish':'y'
>         }
>         't_info_type':
>         {
>             'l_title':'Hello World!'
>             'c_bot_code':'ABC'
>         }
>         't_read_user': // read by user x
>         {
>             // time + id_user
>             '123456789_123':'123'
>             '123456789_155':'155'
>         }
>         't_shared_user': // shared by user x
>         {
>             // time + id_user
>             '123456789_123':'123'
>             '123456789_155':'155'
>         }
>         't_tags':
>         {
>             'fire':'fire'
>             'belgium':'belgium'
>         }
>         't_stats':
>         {
>             'n_readed':'60'
>             'n_shared':'6'
>             'n_ratio_readed_shared':'0.1'
>         }
>     }
> }
>
>
> tags_docs // all tags linked to docs
> {
>     'fire': // tag
>     {
>         // creation_time + id_docs
>         '456789_123456':
>         {
>             'id_doc':'123456'
>             'time':'456789'
>         }
>         '456789_223456':
>         {
>             'id_doc':'223456'
>             'time':'456789'
>         }
>         '456789_323456':
>         {
>             'id_doc':'323456'
>             'time':'456789'
>         }
>     }
>     'belgium':
>     {
>         ...
>     }
> }
>
>
> users // all users
> {
>     '123': // id_user
>     {
>         't_info':
>         {
>             'l_name':'Boris'
>             'c_lang':'fr'
>         }
>         't_readed_docs':
>         {
>             // time + id_doc
>             '123456789_123456':'123456'
>             '123458789_136456':'136456'
>         }
>         't_shared_docs':
>         {
>             // time + id_doc
>             '123456789_123456':'123456'
>             '123458789_136456':'136456'
>         }
>     }
> }
>
>
> users_docs // all actions by users on docs
> {
>     '123_123456': // id_user + id_doc
>     {
>         'id_doc':'123456'
>         'id_user':'123'
>         'd_readed':'20110301'
>         'd_shared':'20110301'
>     }
> }
>
>
> user_friends_act // all activity of a user's friends
> {
>     '123': // id_user
>     {
>         't_readed_docs': // all docs read by my friends
>         {
>             '223456_224_123456': // time + id_friend + id_docs
>             {
>                 'id_friend':'224'
>                 'id_docs':'123456'
>                 'time':'223456'
>                 'c_type':'BOT'
>             }
>         }
>         't_shared_docs': // all docs shared by my friends
>         {
>             '223456_224_123456': // time + id_friend + id_docs
>             {
>                 'id_friend':'224'
>                 'id_docs':'123456'
>                 'time':'223456'
>                 'c_type':'BOT'
>             }
>         }
>     }
> }
>
>
>
> I know that certain queries are not possible for now, like: All doc of
> type='BOT' favorited by my friend related (by tag) to 'fire' and 'belgium'.
>
>
>
> What do you think?
>
>
> Thank you,
>
>
> Vodnok,
>
>
> (Please remember I have been on Cassandra for only 3 days)
>
