Re: Recommandation on how to organize CF

aaron morton Thu, 19 May 2011 20:15:06 -0700

I'm a bit confused by your examples. I think you are saying...

- Standard CF called Message using UTF8Type for column comparison, used to 
store the individual messages. Row key is the message UUID. Not sure what the 
columns are. 
- Standard CF called MessageTime using TimeUUIDType for column comparison, used 
to store collections of messages. Row key is "messagelist:<message_list_uuid>" 
for a message list, and "messagebox:<user_name>:<mbox_name>" for a message box. 
Not sure what the columns are.


The best model is going to be the one that supports your read requests and the 
volume of data you are expecting. 

One way to go is to denormalise to support very fast read paths. You could 
store the entire message in one column, using something like JSON to serialise 
it. Then

- MessageIndexes standard CF to store the full messages in context. There are 
three different types of rows:
        * keys with <user_name> store all messages for a user, column name is 
the message TimeUUID and value is the message structure
        * keys with <user_name>/<mbox_name> store the messages for a single 
message box. Columns as above. 
        * keys with <user_name>/<mbox_name>/<mlist_name> store the messages in 
a single message list. Columns as above. 

- MessageFolders CF to store the message boxes and message lists, two approaches:
        1) <user_name> as key and each column is a message box, message lists 
are stored in a single column as JSON
        2) <user_name> row for the top level, with a column for each 
message box. <user_name>/<message_box> row for the next level, with a column 
for each message list.
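
A rough sketch of the layout above, in Python, with plain dicts standing in 
for the CFs (no Cassandra client involved). The function names, the 
message_indexes / message_folders variables and the shape of the message dict 
are made up for illustration; the row key formats and the TimeUUID column 
names follow the description above, and the folder part uses approach 1.

```python
import json
import uuid
from collections import defaultdict

# In-memory stand-ins for the two CFs: row key -> {column name -> value}.
message_indexes = defaultdict(dict)
message_folders = defaultdict(dict)


def index_row_keys(user_name, mbox_name, mlist_name):
    """Return the three row keys a message is written under."""
    return [
        user_name,
        f"{user_name}/{mbox_name}",
        f"{user_name}/{mbox_name}/{mlist_name}",
    ]


def store_message(user_name, mbox_name, mlist_name, message):
    """Write the JSON-serialised message under all three row keys."""
    column_name = uuid.uuid1()  # version 1 (time-based) UUID, i.e. a TimeUUID
    value = json.dumps(message)
    for row_key in index_row_keys(user_name, mbox_name, mlist_name):
        message_indexes[row_key][column_name] = value
    return column_name


def read_row(row_key):
    """Return the messages in one row, ordered by the TimeUUID timestamp."""
    row = message_indexes[row_key]
    return [json.loads(row[name]) for name in sorted(row, key=lambda u: u.time)]


def add_message_list(user_name, mbox_name, mlist_name):
    """Approach 1: one column per message box, message lists kept as JSON."""
    lists = json.loads(message_folders[user_name].get(mbox_name, "[]"))
    if mlist_name not in lists:
        lists.append(mlist_name)
    message_folders[user_name][mbox_name] = json.dumps(lists)
```

Each read is a single row lookup, e.g. read_row("openvictor/nameofbox"), at 
the cost of writing every message three times.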

Or if space is a concern just store the UUID of the message in the index CF and 
add a CF to store the messages. 
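
As a sketch of that space-saving variant, again in Python with dicts standing 
in for the CFs: the index columns hold only the message UUID and the body 
lives once in a separate CF. The names message_index, messages, store_message 
and read_row are hypothetical, and using a random UUID as the message key is 
an assumption.

```python
import json
import uuid
from collections import defaultdict

# Index CF: row key -> {TimeUUID column name -> message UUID (as a string)}.
message_index = defaultdict(dict)
# Message CF: message UUID (as a string) -> JSON-serialised message.
messages = {}


def store_message(row_keys, message):
    """Store the body once, then reference it from every index row."""
    message_id = str(uuid.uuid4())  # assumed: any UUID works as the message key
    messages[message_id] = json.dumps(message)
    column_name = uuid.uuid1()  # time-based UUID used as the column name
    for row_key in row_keys:
        message_index[row_key][column_name] = message_id
    return message_id


def read_row(row_key):
    """Read the index row first, then fetch each message body by UUID."""
    row = message_index[row_key]
    ids = [row[name] for name in sorted(row, key=lambda u: u.time)]
    return [json.loads(messages[message_id]) for message_id in ids]
```

The index rows stay small, but every read now needs a second lookup per 
message.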

It is also going to depend on the management features, e.g. can you rename a 
message box / list? Move messages around? If so, the denormalised pattern may 
not be the best, as every copy of a message has to be rewritten and those 
operations will take longer. 
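
To illustrate why, here is a small Python sketch of renaming a message box 
when the box name is embedded in the row keys. The rows dict and 
rename_message_box are hypothetical; the dict scan is only for illustration, 
and in a real cluster the affected row keys would probably have to come from 
something like the folder CF rather than a prefix scan.

```python
def rename_message_box(rows, user_name, old_name, new_name):
    """Move every row whose key embeds the old box name to a new key.

    rows maps row key -> {column name -> value}. Returns the number of
    columns that had to be rewritten.
    """
    old_prefix = f"{user_name}/{old_name}"
    new_prefix = f"{user_name}/{new_name}"
    rewritten = 0
    for row_key in list(rows):
        # Match the box row itself and every message list row beneath it.
        if row_key == old_prefix or row_key.startswith(old_prefix + "/"):
            columns = rows.pop(row_key)
            rows[new_prefix + row_key[len(old_prefix):]] = columns
            rewritten += len(columns)
    return rewritten
```

The work grows with the number of messages in the box, whereas with UUID-only 
keys a rename would touch just the folder entry.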

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 19 May 2011, at 05:44, openvictor Open wrote:

> Hello all,
> 
> I know organization is a broad topic and everybody may have an idea on how to 
> do it, but I would really like some advice and opinions, and I think it 
> could be interesting to discuss this matter.
> 
> Here is my problem: I am designing a messaging system internal to a website. 
> There are 3 big structures: Message, MessageList and MessageBox. A 
> message/messagelist is identified only by a UUID; a MessageBox is identified 
> by a name (utf8 string). A messagebox has a set of MessageLists in it and a 
> messagelist has a set of messages in it, all of them being UUIDs.
> Currently I have only two CFs: message and message_time. message is a 
> UTF8Type (cassandra 0.6.11, soon moving to 0.8) and message_time is a 
> TimeUUIDType.
> 
> For example, if I want to request all messages in a certain messagelist I do: 
> message_time['messagelist:uuid(messagelist)']
> If I want the information for a message I do message['message:uuid(message)']
> If I want all messagelists for a certain messagebox (called nameofbox for 
> user openvictor in this example) I do: 
> message_time['messagebox:openvictor:nameofbox']
> 
> My question to Cassandra users is: is it a good idea to group all those 
> things into two CFs? What are the advantages / drawbacks of these two CFs, and 
> should I change my organization for the long term?
> 
> Thank you,
> Victor
