Re: Index and schema review

aaron morton Wed, 08 Feb 2012 11:58:01 -0800

> 1.       Are the indexes local? I.e., if node 1 holds, say, 10 keys, will it 
> only have indexes for these 10 keys? In short, I am interested in knowing how 
> the index is partitioned.
Yes, nodes only hold the secondary indexes for the rows they are a replica for. 
This means a node's own token range plus the token ranges of the other nodes 
whose data it replicates.
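
To make the locality point concrete, here is a minimal Python sketch of a toy ring. It is a simplified model and not Cassandra's own code: the MD5-based token function, the evenly spaced node tokens, the replication factor of 3 and the helper names (token_for, replicas_for, build_local_indexes) are all assumptions made for illustration. Replica placement here simply walks the ring clockwise.

```python
import bisect
import hashlib
from collections import defaultdict

RING_SIZE = 2 ** 127  # toy token space (assumption for this sketch)


def token_for(key):
    """Map a row key to a token by hashing it (MD5 used as a stand-in)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % RING_SIZE


def replicas_for(key, node_tokens, rf):
    """Return the positions of the rf nodes that hold a replica of `key`.

    The first replica is the first node whose token is >= the key's token
    (wrapping around the ring); the others are the next nodes clockwise.
    """
    start = bisect.bisect_left(node_tokens, token_for(key)) % len(node_tokens)
    return [(start + i) % len(node_tokens) for i in range(rf)]


def build_local_indexes(rows, node_tokens, rf):
    """Build one {indexed_value: set(row_keys)} index per node.

    A node indexes a row only if it is a replica for that row.
    """
    indexes = [defaultdict(set) for _ in node_tokens]
    for key, indexed_value in rows.items():
        for node in replicas_for(key, node_tokens, rf):
            indexes[node][indexed_value].add(key)
    return indexes


node_count, rf = 10, 3
node_tokens = [i * RING_SIZE // node_count for i in range(node_count)]
rows = {"key%d" % i: "2012-02-08" for i in range(100)}  # key -> indexed date
indexes = build_local_indexes(rows, node_tokens, rf)

for node, index in enumerate(indexes):
    print("node", node, "indexes", len(index["2012-02-08"]), "rows")
```

With 100 keys and a replication factor of 3, the ten local indexes hold 300 entries in total, so no node indexes a row it does not store. An index query therefore has to reach enough nodes to cover every token range.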


IMHO you should try to model the well-known requests (the queries you know in 
advance) without using secondary indexes. This is not always possible, but it 
will give the best performance.

A lot depends on the shape of the data, but I would think about:

- Partitioning time series data 
  http://www.slideshare.net/mattdennis/cassandra-data-modeling
- Using composite columns to store all the B's and C's in the same row as the A. 
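
As a rough illustration of both ideas, the Python sketch below models two column families with plain dictionaries. Everything in it is an assumption for illustration: the one-row-per-day bucket, the (b_id, c_id) composite column name with an empty c_id marking the B itself, and the names a_by_day, a_tree, store_a, store_b, store_c, preload and get_tree. A real schema would depend on data volume and the client library in use.

```python
from datetime import date, timedelta

# Toy in-memory stand-ins for two column families (hypothetical names).
a_by_day = {}  # row key: day bucket "YYYYMMDD" -> {a_id: serialized A}
a_tree = {}    # row key: a_id -> {(b_id, c_id): serialized B or C}


def day_bucket(d):
    """Row key for the time-series column family: one row per day."""
    return d.strftime("%Y%m%d")


def store_a(a_id, day, blob):
    a_by_day.setdefault(day_bucket(day), {})[a_id] = blob


def store_b(a_id, b_id, blob):
    # Composite column name (b_id, ""): the empty second component sorts
    # before any real c_id, so each B comes just before its own C's.
    a_tree.setdefault(a_id, {})[(b_id, "")] = blob


def store_c(a_id, b_id, c_id, blob):
    a_tree.setdefault(a_id, {})[(b_id, c_id)] = blob


def preload(today, days=3):
    """Read one row per day bucket instead of querying a date index."""
    loaded = {}
    for offset in range(days):
        bucket = day_bucket(today - timedelta(days=offset))
        loaded.update(a_by_day.get(bucket, {}))
    return loaded


def get_tree(a_id):
    """Return all B's and C's under one A, in composite-name order."""
    return sorted(a_tree.get(a_id, {}).items())


today = date(2012, 2, 8)
store_a("a1", today, b"A1")
store_a("a2", today - timedelta(days=5), b"A2")  # outside the 3-day window
store_c("a1", "b1", "c1", b"C1")
store_b("a1", "b1", b"B1")
store_b("a1", "b2", b"B2")

print(preload(today))
print(get_tree("a1"))
```

Both use cases then become reads of known row keys: the preload reads three day rows, and fetching the tree for an A reads a single row, with no secondary index involved.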

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 8/02/2012, at 10:11 PM, Tiwari, Dushyant wrote:

> Hi Cassandra Users,
>  
> I am considering Cassandra as the main data store. Here is a quick description 
> of the entity structure that we want to design a schema for. We have 3 
> entities: A, B and C. A has a one-to-many relationship with B, and the same 
> is true for B and C. This gives us a tree-like structure with A as the root. 
> To store this structure in Cassandra I am looking at the following column 
> family schema design.
>  
>  
> Col Family : For storing A
> Id of A - Key
> Date – Indexed
> Byte format of A object
>  
> Col Family : For storing B
> Id of B - Key
> Date – Indexed field
> Id of A to which B belongs – Indexed
> Byte format of B object
>  
> Col Family : For storing C
> Id of C - Key
> Date – Indexed field
> Id of A to which C belongs – Indexed
> Id of B to which C belongs – Indexed
> Byte format of C object
>  
>  
> I am maintaining an index on date because caches are supposed to be preloaded 
> with, say, 3 days' worth of data. Now the questions are:
> 1.       Are the indexes local? I.e., if node 1 holds, say, 10 keys, will it 
> only have indexes for these 10 keys? In short, I am interested in knowing how 
> the index is partitioned.
> -  To illustrate the concern, consider the case where we receive 100 keys 
> and we have 10 Cassandra nodes. Assuming an even distribution of 10 keys per 
> node, will there be 10 partitions of the index on 10 different nodes, each 
> covering only the keys that node owns?
>  
> 2.       Opinions on the schema design above. To list the use cases:
> -  Preload the processes on start-up with 3 days of data. (The data store 
> should hold data dating back as far as 3 years.)
> -  Given the Id of A, get all the B's and C's of the tree.
>  
>  
> Hoping to hear soon.
>  
> Thanks and Regards,
> Dushyant
