Using cassandra collections CREATE TABLE videos ( videoid uuid primary key, title text, actor list<text>, producer list<text>, release_date timestamp, description text, music text, etc... );
When using collection you need to take care of its length. Collections are designed to store <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>only a small amount of data <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>. 5/10 actors per movie is ok. Eduardo Alonso Vía de las dos Castillas, 33, Ática 4, 3ª Planta 28224 Pozuelo de Alarcón, Madrid Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd <https://twitter.com/StratioBD>* 2017-06-12 11:54 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>: > So In short we have to go with one single table as videos and put primary > key as videoid uuid. > But then how can we able to handle multiple actor name and producer name. > ? > > On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <eduardoalo...@stratio.com > > wrote: > >> Yes, you are right. >> >> Table denormalization is useful just when you have unique primary keys, >> not your case. >> Denormalized tables are only different in its primary key, every >> denormalized table contains all the data (it just change how it is >> structured). So, if you need to index it, do it with just one table (the >> one you showed us with videoid as the primary key is ok). >> >> Solr, Elastic and cassandra-lucene-index are both based on Lucene and all >> of them fulfill all your needs. >> >> Solr (in DSE) and cassandra-lucene-index >> <https://github.com/stratio/cassandra-lucene-index> are very well >> integrated with cassandra using its secondary index interface. If you >> choose elastic search you will need to code the integration (write mutex, >> both cluster synchronization (imagine something written in cassandra but >> failed to write in elastic)) >> >> I know i am not the most suitable to recommend you to use our product >> cassandra-lucene-index >> <https://github.com/stratio/cassandra-lucene-index> but it is open >> source, just take a look. >> >> Eduardo Alonso >> Vía de las dos Castillas, 33, Ática 4, 3ª Planta >> 28224 Pozuelo de Alarcón, Madrid >> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // >> *@stratiobd >> <https://twitter.com/StratioBD>* >> >> 2017-06-12 11:18 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>: >> >>> Hi Eduardo, >>> >>> And As we are trying to build an advanced search functionality in which >>> we can able to do partial search based on actor, producer, director, etc. >>> columns. >>> So if we do denormalization of tables then we have to create tables such >>> as below :- >>> video_by_actor >>> video_by_producer >>> video_by_director >>> video_by_date >>> etc.. >>> By using denormalized, Cassandra only allows us to do equality search, >>> but for implementing Partial search we need to implement solr on all above >>> tables. >>> >>> This is my thinking, but I think this will be not correct way to >>> implement Apache Solr on all tables. >>> >>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <nandanpriyadarshi...@gmail.co >>> m> wrote: >>> >>>> Hi Edurado, >>>> >>>> As you mentioned queries 1-6 , >>>> In this condition, we have to proceed with a table like as below :- >>>> create table videos ( >>>> videoid uuid primary key, >>>> title text, >>>> actor text, >>>> producer text, >>>> release_date timestamp, >>>> description text, >>>> music text, >>>> etc... >>>> ); >>>> This table will help to store video datas based on PK videoid and will >>>> give uniqeness due to uuid. >>>> But as we know , in one movie there are multiple actor, multiple >>>> producer, multiple music worked, So how can we store all these.. Only one >>>> option will left as to use collection type columns. >>>> >>>> >>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso < >>>> eduardoalo...@stratio.com> wrote: >>>> >>>>> TLDR shouldBe *PD >>>>> >>>>> Eduardo Alonso >>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta >>>>> 28224 Pozuelo de Alarcón, Madrid >>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // >>>>> *@stratiobd >>>>> <https://twitter.com/StratioBD>* >>>>> >>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <eduardoalo...@stratio.com>: >>>>> >>>>>> Hi Nandan: >>>>>> >>>>>> So, your system must provide these queries: >>>>>> >>>>>> 1 - SELECT video FROM ... WHERE actor = '...'; >>>>>> 2 - SELECT video FROM ... WHERE producer = '...'; >>>>>> 3 - SELECT video FROM ... WHERE music = '...'; >>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...'; >>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...'; >>>>>> 6 - SELECT video WHERE title CONTAINS 'Harry'; >>>>>> >>>>>> >>>>>> For queries 1-5 you can get them with just cassandra, denormalizing >>>>>> tables just the way your mentioned but without solr, just cassandra >>>>>> (Indeed, just for equality clauses) >>>>>> >>>>>> video_by_actor; >>>>>> video_by_producer; >>>>>> video_by_music; >>>>>> video_by_actor_and_producer; >>>>>> video_by_actor_and_music; >>>>>> >>>>>> For queries number 6 you need a search engine. >>>>>> >>>>>> SOL >>>>>> ElasticSearch >>>>>> cassandra-lucene-index >>>>>> <https://github.com/stratio/cassandra-lucene-index> >>>>>> SASI >>>>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html> >>>>>> >>>>>> I think, just for your query, the easiest way to get it is to build >>>>>> a SASI index. >>>>>> TLDR: I work for stratio in cassandra-lucene-index but for your basic >>>>>> query (only one dimension), SASI indexes will work for you. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Eduardo Alonso >>>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta >>>>>> 28224 Pozuelo de Alarcón, Madrid >>>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com >>>>>> // *@stratiobd <https://twitter.com/StratioBD>* >>>>>> >>>>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>: >>>>>> >>>>>>> But Condition is , I am working with Apache Cassandra Database in >>>>>>> which I have to store my data into Cassandra and then have to implement >>>>>>> partial search capability. >>>>>>> If we need to search based on full search primary key, then it >>>>>>> really best and easy to work with Cassandra , but in case of flexible >>>>>>> search , I am getting confused. >>>>>>> >>>>>>> >>>>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin < >>>>>>> oskar.kjel...@gmail.com> wrote: >>>>>>> >>>>>>>> I haven't run solr with Cassandra myself. I just meant to run >>>>>>>> elasticsearch as a completely separate service and write there as well. >>>>>>>> >>>>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <nandanpriyadarshi...@gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>> Do you mean to use Elastic Search with Cassandra? >>>>>>>> Even I am thinking to use Apache Solr With Cassandra. >>>>>>>> In that case I have to create distributed tables such as:- >>>>>>>> 1) video_by_title, video_by_actor, video_by_year etc.. >>>>>>>> 2) After creating Tables , will have to configure solr core on all >>>>>>>> tables. >>>>>>>> >>>>>>>> Is it like this ? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin < >>>>>>>> oskar.kjel...@gmail.com> wrote: >>>>>>>> >>>>>>>>> Why not elasticsearch for this use case? It will make your life >>>>>>>>> much simpler >>>>>>>>> >>>>>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ < >>>>>>>>> nandanpriyadarshi...@gmail.com> wrote: >>>>>>>>> > >>>>>>>>> > Hi, >>>>>>>>> > >>>>>>>>> > Currently, I am working on data modeling for Video Company in >>>>>>>>> which we have different types of users as well as different user >>>>>>>>> functionality. >>>>>>>>> > But currently, my concern is about Search video module based on >>>>>>>>> different fields. >>>>>>>>> > >>>>>>>>> > Query patterns are as below:- >>>>>>>>> > 1) Select video by actor. >>>>>>>>> > 2) select video by producer. >>>>>>>>> > 3) select video by music. >>>>>>>>> > 4) select video by actor and producer. >>>>>>>>> > 5) select video by actor and music. >>>>>>>>> > >>>>>>>>> > Note: - In short, We want to establish an advanced search module >>>>>>>>> by which we can search by anyway and get the desired results. >>>>>>>>> > >>>>>>>>> > During a search , we need partial search also such that if any >>>>>>>>> user can search "Harry" title, then we are able to give them result >>>>>>>>> as all >>>>>>>>> videos whose >>>>>>>>> > title contains "Harry" at any location. >>>>>>>>> > >>>>>>>>> > As per my ideas, I have to create separate tables such as >>>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on >>>>>>>>> all >>>>>>>>> tables. Otherwise, >>>>>>>>> > is there any others way by which we can implement this search >>>>>>>>> module effectively. >>>>>>>>> > >>>>>>>>> > Please suggest. >>>>>>>>> > >>>>>>>>> > Best regards, >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >