I would go with the solution that lets you serve your reads with a single 
request, so consider the super CF approach. 

There are some downsides to super columns (see 
http://wiki.apache.org/cassandra/CassandraLimitations), and they tend to have a 
love-them-or-hate-them reputation.

One thing to consider is that you do not need to model every attribute of your 
entity as a column in Cassandra, especially if you are always going to pull 
back all the attributes. So you could get the layout of your super CF approach 
with a standard CF: pack each entry's attributes into some sort of structure, 
such as JSON, and store it as a blob. 
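
As a rough sketch of the blob idea (not tied to any particular client 
library): the dict below is only an in-memory stand-in for a standard CF, and 
the names sectors, add_entry and read_sector are made up for illustration. 

```python
import json

# Hypothetical in-memory stand-in for a standard column family:
# row key (sector) -> {column name (entry uid) -> column value (bytes)}
sectors = {}

def add_entry(sector, uid, text, user):
    """Store one entry as a single column whose value is a JSON blob."""
    blob = json.dumps({"text": text, "user": user}).encode("utf-8")
    sectors.setdefault(sector, {})[uid] = blob

def read_sector(sector):
    """Reading one row returns every entry in the sector."""
    row = sectors.get(sector, {})
    return {uid: json.loads(blob.decode("utf-8")) for uid, blob in row.items()}

add_entry("sector1", "uid1", "a text", "joop")
add_entry("sector1", "uid2", "more text", "piet")
entries = read_sector("sector1")
```

The trade-off is that changing a single attribute means rewriting the whole 
blob for that entry. 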

Alternatively, encode the entry ID in the column names of a standard CF, e.g. 
uuid1.text and uuid2.text. 
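
A minimal sketch of such a naming scheme, again using a plain dict in place of 
a real row. The helpers to_columns and from_columns are hypothetical, and it 
assumes attribute names never contain a dot. 

```python
def to_columns(uid, attrs):
    """Flatten one entry into columns named '<uid>.<attribute>'."""
    return {f"{uid}.{name}": value for name, value in attrs.items()}

def from_columns(row):
    """Group the flat columns of one row back into entries keyed by uid."""
    entries = {}
    for column_name, value in row.items():
        # Split on the last dot: everything before it is the entry uid.
        uid, _, attr = column_name.rpartition(".")
        entries.setdefault(uid, {})[attr] = value
    return entries

# One row per sector; each entry contributes one column per attribute.
row = {}
row.update(to_columns("uuid1", {"text": "a text", "user": "joop"}))
row.update(to_columns("uuid2", {"text": "more text", "user": "piet"}))
grouped = from_columns(row)
```

Unlike the blob, this keeps each attribute in its own column, so a single 
attribute can be rewritten on its own. 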

Hope that helps. 
Aaron

On 30 Mar 2011, at 01:05, T Akhayo wrote:

> Good afternoon,
> 
> I'm building my data model for Cassandra from scratch, which means I can 
> tune and fine-tune it for performance.
> 
> At the moment I'm having trouble choosing between 2 column families and 1 
> super column family. I will illustrate with an example.
> 
> Sector: defines a place; it has one or two properties.
> Entry: an entry that is bound to a sector; it is simply some text and a few 
> properties.
> 
> I can model this with a super column family:
> 
> sectors{ //super column family
> sector1{
> uid1{
> text: a text
> user: joop
> }
> uid2{
> text: more text
> user: piet
> }
> }
> sector2{
> uid10{
> text: even more text
> user: marie
> }
> }
> }
> 
> But I can also model this with 2 column families:
> 
> sectors{ // column family
> sector1{
> textid1: null
> textid2: null
> }
> sector2{
> textid4: null
> }
> }
> 
> texts{ //column family
> textid1{
> text: a text
> user: joop
> }
> textid2{
> text: more text
> user: piet
> }
> }
> 
> With the super column family I can retrieve a list of texts for a specific 
> sector with only 1 request to Cassandra.
> 
> With the 2 column families I need to send 2 requests to Cassandra:
> 1. give me all textids from sector x. (returns x, y, z)
> 2. give me all texts that have id x, y, z.
> 
> In my final application it is likely that there will be somewhat more writes 
> than reads.
> 
> I was wondering what the best approach is when it comes to performance. I 
> suspect that super column families are slower than standard column 
> families, but are they still slower when the alternative is 2 column families 
> and 2 requests to Cassandra instead of 1 (with a super column family)?
> 
> Kind regards,
> T. Akhayo
