Data modelling question

Per Olesen Mon, 14 Jun 2010 06:09:56 -0700

Hi,
I have a question about how best to model data. I have some fairly 
simple tabular data, which I need to show to a large number of users, and the 
users need to be able to search on some of the columns.


Given this tabular data:

Company        | Amount    |...many more columns here
------------------------------------------------------
Ajax A/S       | 12345     |
Dude A/S       | 54321     |
Ajax A/S       |  5436     |
...many more rows here...

If I need to store this in Cassandra, but also be able to search quite fast on 
"Company" and on "Amount", how might I go about storing this? My current plan 
for modelling it in Cassandra is to use one CF "Dashboard" for the 
tabular data itself and one CF for each "index" I would like to be able to 
retrieve it by.

Like this:

Super-CF "Dashboard":
--------------------
uuid-1 -> { company:"Ajax A/S", Amount:12345 }
uuid-2 -> { company:"Dude A/S", Amount:54321 }
uuid-3 -> { company:"Ajax A/S", Amount:5436 }

Where the SC value is simply a unique identifier.

Super-CF "DashboardCompanyIndex":
--------------------------------
"Ajax A/S" -> { uuid-1:"", uuid-3:"" }
"Dude A/S" -> { uuid-2:"" }


Super-CF "DashboardAmountIndex":
--------------------------------
"12345" -> { uuid-1:"" }
"54321" -> { uuid-2:"" }
"5436" -> { uuid-3:"" }
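
To make the write side of this layout concrete, here is a minimal sketch in Python. It uses plain in-memory dicts as stand-ins for the three CFs, so it is not Cassandra client code, and the helper name insert_row is just something I made up for illustration. The point is only that every row written to "Dashboard" also needs one entry in each index.

```python
import uuid
from collections import defaultdict

# In-memory stand-ins for the three CFs (assumption: plain dicts, no Cassandra client).
dashboard = {}                      # row id -> columns of the tabular row
company_index = defaultdict(dict)   # company name -> {row id: ""}
amount_index = defaultdict(dict)    # amount as string -> {row id: ""}


def insert_row(company, amount):
    """Hypothetical helper: store one row and add it to both indexes."""
    row_id = str(uuid.uuid4())
    dashboard[row_id] = {"company": company, "Amount": amount}
    # The index columns carry no value; the column name (the row id) is the data.
    company_index[company][row_id] = ""
    amount_index[str(amount)][row_id] = ""
    return row_id


ids = [
    insert_row(company, amount)
    for company, amount in [("Ajax A/S", 12345), ("Dude A/S", 54321), ("Ajax A/S", 5436)]
]
```

In a real cluster these would be three separate writes, so the indexes and the data could get out of step if one of them fails.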

So, in my use case, when searching on e.g. company, I can access the 
"DashboardCompanyIndex" with a slice on its SC, grab all the uuids from 
the columns, and then make a lookup in the Dashboard CF for each uuid 
found in the index.
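
The two-step lookup described above could be sketched roughly like this, again in Python with plain dicts standing in for the CFs rather than real Cassandra calls. The function name find_by_company is hypothetical.

```python
# In-memory stand-ins for two of the CFs (assumption: plain dicts, no Cassandra client).
dashboard = {
    "uuid-1": {"company": "Ajax A/S", "Amount": 12345},
    "uuid-2": {"company": "Dude A/S", "Amount": 54321},
    "uuid-3": {"company": "Ajax A/S", "Amount": 5436},
}
company_index = {
    "Ajax A/S": {"uuid-1": "", "uuid-3": ""},
    "Dude A/S": {"uuid-2": ""},
}


def find_by_company(company):
    """Hypothetical helper: read the index row, then fetch each referenced row."""
    # Step 1: one read on the index gives the uuids as column names.
    row_ids = company_index.get(company, {})
    # Step 2: one lookup in the Dashboard data per uuid found in the index.
    return [dashboard[row_id] for row_id in row_ids]


rows = find_by_company("Ajax A/S")
```

So a search costs one index read plus one data read per matching row, which is the part I am unsure about when a company has many rows.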

Is this the preferred way of doing this in Cassandra?
Or am I trying to apply relational algebra modelling to something that is 
supposed to be used differently?

/Per
