OK, it seems that I'll use Solr (with a dedicated Cassandra) for search.

I've read this article on RP vs OPP:
http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/
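To make the RP vs OPP point concrete, here is a minimal Python sketch. It assumes (as far as I understand it) that RandomPartitioner places rows by a token derived from the MD5 hash of the row key, not by the key itself. `rp_token` and the sample keys are hypothetical names for illustration only, not part of any Cassandra client API.

```python
import hashlib


def rp_token(row_key: str) -> int:
    """Approximate a RandomPartitioner-style token (assumption: MD5 of the
    key, read as a signed big-endian integer, then made non-negative)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


# Hypothetical row keys of the form 'time:id_user:id_doc'.
keys = ["1000:123:a", "1001:123:b", "1002:123:c", "1003:123:d"]

by_key = sorted(keys)                  # the order an OPP range scan would follow
by_token = sorted(keys, key=rp_token)  # the order rows get under hashed tokens

print("key order:  ", by_key)
print("token order:", by_token)
```

The two printed orders will generally differ, which is why a range scan over row keys is not time-ordered under RP.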
Here is my case:

docs_shared { // docs shared by users, ordered by time
    'time:id_user:id_doc' {
        'time':'123456'   // index on it
        'id_user':'123'   // index on it
        'c_type':'BOT'    // index on it
        'id_doc':'123'    // index on it
    }
}

So I can list all docs shared by id_user = 123 and type = 'BOT', ordered by time... Well, that is what I wanted, until I discovered the RP vs OPP issue. I'm on the default, so RP, and so row ids are not ordered! And as it's recommended, I would like to stay on RP.

So another possibility is adding a dimension with a super column, since columns are ordered under RP:

index {
    docs_shared { // docs shared by users, ordered by time
        'time:id_user:id_doc' {
            'time':'123456'   // index on it
            'id_user':'123'   // index on it
            'c_type':'BOT'    // index on it
            'id_doc':'123'
        }
    }
}

BUT... a secondary index is not possible on SC -> C.

So the next possibility is:

index {
    docs_shared_time_c_type_id_user { // docs shared by users, ordered by time:c_type:id_user
        'time:c_type:id_user:id_doc' : 'id_doc'
    }
    docs_shared_c_type_time_id_user { // docs shared by users, ordered by c_type:time:id_user
        'c_type:time:id_user:id_doc' : 'id_doc'
    }
    ... (there are 6 combinations of time, c_type, id_user)
}

That way I can list with keystart and keyend, plus filters.

Example:

No filter: index -> time:c_type:id_user
Filter on c_type: index -> c_type:time:id_user
Filter on id_user: index -> id_user:time:c_type
Filter on c_type and id_user: index -> id_user:c_type:time

Fortunately Cassandra likes writes! (Irony inside.)

So I have a question: I've read that secondary indexes on SC -> C may arrive in upcoming releases... Is this information true? And is it already planned?

Thank you,

Sébastien

2011/3/2 Burc Sade <burcs...@gmail.com>

> You can use PHP Solr Extension. It is a fully featured and light-weight
> client.
>
> http://www.php.net/manual/en/book.solr.php
>
> Without the secondary indexes on columns in CFs within SCFs, the best
> approach is to create query-specific CFs at the moment.
> In the end it all comes down to how simple you can make your queries, so
> as to have a minimum CF count.
>
> Regards,
> Burc
>
> On Wed, Mar 2, 2011 at 9:06 AM, Vodnok <vod...@gmail.com> wrote:
>
>> I also think it'll be easier via Solr. Just need to google it. (If you
>> have links about Solr in PHP...)
>>
>> I realize that I have to remove some dimensions from my CF...
>>
>> I thought it was possible to have SCF -> CF -> SC -> C:value with a
>> secondary index on C, but as I understood, a secondary index on C in a
>> super column is not possible for now (but maybe will be in the next version).
>> As I understand it, it's better to have more, less complex CFs than fewer,
>> more complex CFs.
>>
>> Thank you for your reply,
>>
>>
>>
>> 2011/3/2 Burc Sade <burcs...@gmail.com>
>>
>> Hi Vodnok,
>>>
>>> For tag searches I would use a search engine like Solr (Lucene), as I
>>> think it would be more flexible to query. You can update the index as new
>>> data comes in and query it for queries #1, #2 and #4.
>>>
>>> For the "All doc of type='BOT' and c_bot_code='ABC'" query, I would create
>>> the CF below.
>>>
>>> doc_types
>>> {
>>>     'BOT:ABC':
>>>     {
>>>         <docid>: <creation_date?>
>>>     }
>>> }
>>>
>>> You can assign to the docid a value you are going to need after querying.
>>> The problem with this schema is that if there are not many
>>> type:c_bot_code combinations, there will be many columns under each key in
>>> this CF. If a combination has many more columns than the others, a hot spot
>>> problem may arise.
>>>
>>>
>>>
>>> On Tue, Mar 1, 2011 at 11:39 PM, Vodnok <vod...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Total newbie on Cassandra (with phpcassa), with a big background in
>>>> relational databases. I would like to use Cassandra for a trivial case,
>>>> so I've been on it for 3 days. Sorry for my stupid question. I'm pretty
>>>> sure I'm wrong, but I want to learn, so I'm here.
>>>>
>>>>
>>>> I would like your advice on a design for Cassandra.
>>>>
>>>>
>>>> Case:
>>>>
>>>>    - Users create docs and can share docs with friends
>>>>    - Users can read and share docs of their friends with other friends
>>>>    - Docs can be of different types [text;picture;video;etc]
>>>>    - Docs can be tagged
>>>>
>>>>
>>>>
>>>> Typical queries:
>>>>
>>>>
>>>>    - Docs related to a tag
>>>>    - Docs related to multiple tags
>>>>    - Docs read by user x
>>>>    - Docs related to a tag with ratio readed_shared greater than x (see
>>>>    design)
>>>>    - All docs of type='IMG' favorited by my friend
>>>>    - All docs of type='BOT' and c_bot_code='ABC'
>>>>    - All docs of type='BOT' favorited by my friend and related (tag) to
>>>>    'fire' and 'belgium' ?
>>>>
>>>>
>>>>
>>>> Design:
>>>>
>>>>
>>>> docs // all docs
>>>> {
>>>>     '123456': // id_docs
>>>>     {
>>>>         't_info':
>>>>         {
>>>>             'c_type':'BOT'
>>>>             'b_del':'y'
>>>>             'b_publish':'y'
>>>>         }
>>>>         't_info_type':
>>>>         {
>>>>             'l_title':'Hello World!'
>>>>             'c_bot_code':'ABC'
>>>>         }
>>>>         't_read_user': // read by user x
>>>>         {
>>>>             // time + id_user
>>>>             '123456789_123':'123'
>>>>             '123456789_155':'155'
>>>>         }
>>>>         't_shared_user': // shared by user x
>>>>         {
>>>>             // time + id_user
>>>>             '123456789_123':'123'
>>>>             '123456789_155':'155'
>>>>         }
>>>>         't_tags':
>>>>         {
>>>>             'fire':'fire'
>>>>             'belgium':'belgium'
>>>>         }
>>>>         't_stats':
>>>>         {
>>>>             'n_readed':'60'
>>>>             'n_shared':'6'
>>>>             'n_ratio_readed_shared':'0.1'
>>>>         }
>>>>     }
>>>> }
>>>>
>>>>
>>>> tags_docs // all tags linked to docs
>>>> {
>>>>     'fire': // tag
>>>>     {
>>>>         // creation_time + id_docs
>>>>         '456789_123456':
>>>>         {
>>>>             'id_doc':'123456'
>>>>             'time':'456789'
>>>>         }
>>>>         '456789_223456':
>>>>         {
>>>>             'id_doc':'223456'
>>>>             'time':'456789'
>>>>         }
>>>>         '456789_323456':
>>>>         {
>>>>             'id_doc':'323456'
>>>>             'time':'456789'
>>>>         }
>>>>     }
>>>>     'belgium':
>>>>     {
>>>>         ...
>>>>     }
>>>> }
>>>>
>>>>
>>>> users // all users
>>>> {
>>>>     '123': // id_user
>>>>     {
>>>>         't_info':
>>>>         {
>>>>             'l_name':'Boris'
>>>>             'c_lang':'fr'
>>>>         }
>>>>         't_readed_docs':
>>>>         {
>>>>             // time + id_doc
>>>>             '123456789_123456':'123456'
>>>>             '123458789_136456':'136456'
>>>>         }
>>>>         't_shared_docs':
>>>>         {
>>>>             // time + id_doc
>>>>             '123456789_123456':'123456'
>>>>             '123458789_136456':'136456'
>>>>         }
>>>>     }
>>>> }
>>>>
>>>>
>>>> users_docs // all actions by users on docs
>>>> {
>>>>     '123_123456': // id_user + id_doc
>>>>     {
>>>>         'id_doc':'123456'
>>>>         'id_user':'123'
>>>>         'd_readed':'20110301'
>>>>         'd_shared':'20110301'
>>>>     }
>>>> }
>>>>
>>>>
>>>> user_friends_act // all activity of a user's friends
>>>> {
>>>>     '123': // id_user
>>>>     {
>>>>         't_readed_docs': // all docs read by my friends
>>>>         {
>>>>             '223456_224_123456': // time + id_friend + id_docs
>>>>             {
>>>>                 'id_friend':'224'
>>>>                 'id_docs':'123456'
>>>>                 'time':'223456'
>>>>                 'c_type':'BOT'
>>>>             }
>>>>         }
>>>>         't_shared_docs': // all docs shared by my friends
>>>>         {
>>>>             '223456_224_123456': // time + id_friend + id_docs
>>>>             {
>>>>                 'id_friend':'224'
>>>>                 'id_docs':'123456'
>>>>                 'time':'223456'
>>>>                 'c_type':'BOT'
>>>>             }
>>>>         }
>>>>     }
>>>> }
>>>>
>>>>
>>>>
>>>> I know that certain queries are not possible for now, like: All docs of
>>>> type='BOT' favorited by my friend and related (tag) to 'fire' and
>>>> 'belgium' ?
>>>>
>>>>
>>>>
>>>> What do you think?
>>>>
>>>>
>>>> Thank you,
>>>>
>>>>
>>>> Vodnok,
>>>>
>>>>
>>>> (Please remember I've been on Cassandra for only 3 days)
>>>>
>>>
>>>
>>
>
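
P.S. A rough sketch of the query-specific index rows described at the top of this message, in Python for brevity. This is only an in-memory stand-in: `SortedRow`, `share_doc` and `list_docs` are hypothetical names, not phpcassa or Cassandra API. It assumes that column names within a row are kept sorted as strings, that times are zero-padded so string order matches numeric order, and that no field contains ':'.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict
from itertools import permutations

FIELDS = ("time", "c_type", "id_user")


def index_name(order):
    """Row key of the index for one field ordering, e.g. docs_shared_c_type_time_id_user."""
    return "docs_shared_" + "_".join(order)


class SortedRow:
    """In-memory stand-in for one row whose column names stay sorted."""

    def __init__(self):
        self._names = []
        self._values = {}

    def insert(self, name, value):
        if name not in self._values:
            insort(self._names, name)
        self._values[name] = value

    def slice(self, start, finish):
        """Return (name, value) pairs with start <= name <= finish, in order."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, finish)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


def share_doc(index, time, c_type, id_user, id_doc):
    """Write one column per field ordering (6 writes for 3 fields)."""
    values = {"time": f"{time:010d}", "c_type": c_type, "id_user": id_user}
    for order in permutations(FIELDS):
        name = ":".join([values[f] for f in order] + [id_doc])
        index[index_name(order)].insert(name, id_doc)


def list_docs(index, c_type=None, id_user=None):
    """Pick the index row matching the filter, then slice it by prefix."""
    if c_type and id_user:
        order, prefix = ("id_user", "c_type", "time"), f"{id_user}:{c_type}:"
    elif c_type:
        order, prefix = ("c_type", "time", "id_user"), f"{c_type}:"
    elif id_user:
        order, prefix = ("id_user", "time", "c_type"), f"{id_user}:"
    else:
        order, prefix = FIELDS, ""
    row = index[index_name(order)]
    return [doc for _, doc in row.slice(prefix, prefix + "\uffff")]


index = defaultdict(SortedRow)
share_doc(index, 100, "BOT", "123", "d1")
share_doc(index, 200, "IMG", "123", "d2")
share_doc(index, 300, "BOT", "155", "d3")
share_doc(index, 400, "BOT", "123", "d4")

print(list_docs(index))                              # ['d1', 'd2', 'd3', 'd4']
print(list_docs(index, c_type="BOT"))                # ['d1', 'd3', 'd4']
print(list_docs(index, c_type="BOT", id_user="123")) # ['d1', 'd4']
```

Each share costs six column writes here, which is the trade-off mentioned above: more writes in exchange for reads that are a single ordered slice.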