Thanks Aaron, sorry I didn't see your message sooner.

So the CF Messages, which uses UTF8Type, holds information such as who has the right to read a list, whether it is possible to reply to it, etc. There are two kinds of keys:

- "message:uuid": a column is, for example, "sender" or "rawtext".
- "messagelist:uuid": a column is, for example, "creator" or "participants".

MessagesTime (message_time) is the sorting mechanism, meaning that when I query message_time I get messages or messagelists in the order they were sent. It also has two kinds of keys:

- "messagebox:someone": each column has for its name the UUID of the last message in a messagelist, and for its value the UUID of that list inside someone's messagebox. It gives me a sorting mechanism based on the last message received.
- "messagelist:uuid": each column has for its name the UUID of a message; the value is not used.

About your suggestion: it is a very good solution, but there is one thing I don't really like about serialization: it "blocks" evolution. Let's say I want to add one field to a message. I am obliged to write a tool that deserializes, adds the information, reserializes all the fields and inserts them again. Even if I serialize with JSON, it looks like evolution (which is why I chose Cassandra) is a little bit broken. If I am wrong, please tell me so.

However, I will explore this very interesting possibility for another project with "tags", which is not really subject to dramatic evolution.

At the moment I am not really complaining about speed, and it is not really time critical (after all, who cares if the messagebox loads in 250 ms instead of 200 ms). I currently get the messages with two batch Cassandra calls, so I think this is satisfactory.

Thanks again, the JSON serialization looks like a very interesting possibility.

Victor

2011/5/19 aaron morton <aa...@thelastpickle.com>

> I'm a bit confused by your examples.
> I think you are saying...
>
> - Standard CF called Message using the UTF8Type for column comparisons, used
>   to store the individual messages. Row key is the message UUID. Not sure what
>   the columns are.
> - Standard CF called MessageTime using TimeUUIDType for column comparisons,
>   used to store collections of messages. Row key is
>   "messagelist:<message_list_uuid>" for a message list, and
>   "messagebox:<user_name>:<mbox_name>" for a message box. Not sure what the
>   columns are.
>
> The best model is going to be the one that supports your read requests and
> the volume of data you are expecting.
>
> One way to go is to denormalise to support very fast read paths. You could
> store the entire message in one column using something like JSON to
> serialise it. Then
>
> - MessageIndexes standard CF to store the full messages in context, there
>   are three different types of rows:
>     * keys with <user_name> store all messages for a user, column name
>       is the message TimeUUID and value is the message structure
>     * keys with <user_name>/<mbox_name> store the messages for a single
>       message box. Columns same as above.
>     * keys with <user_name>/<mbox_name>/<mlist_name> store the messages
>       in a single message list. Columns as above.
>
> - MessageFolders CF to store the message box and message lists, two
>   approaches:
>     1) <user_name> as key and each column is a message box, message
>        lists are stored in a single column as JSON
>     2) <user_name> row for the top level message box, column for each
>        message box. <user_name>/<message_box> for the next level,
>
> Or if space is a concern just store the UUID of the message in the index CF
> and add a CF to store the messages.
>
> It is also going to depend on the management features, e.g. can you rename a
> message box / list? Move messages around? If so the denormalised pattern
> may not be the best as those operations will take longer.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19 May 2011, at 05:44, openvictor Open wrote:
>
> > Hello all,
> >
> > I know organization is a broad topic and everybody may have an idea on
> > how to do it, but I would really like some advice and opinions, and I
> > think it could be interesting to discuss this matter.
> >
> > Here is my problem: I am designing a messaging system internal to a
> > website. There are 3 big structures: Message, MessageList and
> > MessageBox. A message/messagelist is identified only by a UUID; a
> > MessageBox is identified by a name (UTF-8 string). A messagebox has a set of
> > MessageLists in it and a messagelist has a set of messages in it, all of them
> > being UUIDs.
> > Currently I have only two CFs: message and message_time. message is a
> > UTF8Type (Cassandra 0.6.11, soon going to 0.8) and message_time is a
> > TimeUUIDType.
> >
> > For example, if I want to request all messages in a certain messagelist I
> > do: message_time['messagelist:uuid(messagelist)']
> > If I want the information of a message I do: message['message:uuid(message)']
> > If I want all messagelists for a certain messagebox (called nameofbox for
> > user openvictor in this example) I do:
> > message_time['messagebox:openvictor:nameofbox']
> >
> > My question to Cassandra users is: is it a good idea to group all
> > those things into two CFs? Are there advantages / drawbacks to these two
> > CFs, and for the long term should I change my organization?
> >
> > Thank you,
> > Victor
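
For anyone following the thread, here is a minimal sketch of the layout described in the reply at the top, with plain Python dictionaries standing in for the two column families. It is not Cassandra client code. The function names (send, read_list, lists_in_box), the choice to delete a list's previous messagebox column when a new message arrives, and sorting on the UUID timestamp alone (ignoring how the real TimeUUIDType comparator breaks ties) are all assumptions made for illustration.

```python
import uuid

# Plain dicts stand in for the two column families (assumption: no real client).
# "message" CF, UTF8Type column names:          row key -> {column name: value}
message = {}
# "message_time" CF, TimeUUIDType column names: row key -> {TimeUUID: value}
message_time = {}


def send(box_key, list_id, sender, rawtext):
    """Store one message and update the two index rows (hypothetical helper)."""
    msg_id = uuid.uuid1()  # version-1 UUID, carries a timestamp

    # Message row: one column per field.
    message["message:%s" % msg_id] = {"sender": sender, "rawtext": rawtext}

    # messagelist row: column name is the message UUID, value is unused.
    message_time.setdefault("messagelist:%s" % list_id, {})[msg_id] = ""

    # messagebox row: column name is the UUID of the list's last message,
    # column value is the list UUID.
    box = message_time.setdefault(box_key, {})
    # Assumption: the older column for this list is removed so that each
    # list appears only once in the box.
    for old in [name for name, value in box.items() if value == list_id]:
        del box[old]
    box[msg_id] = list_id
    return msg_id


def read_list(list_id):
    """Two steps, mirroring the two batch calls mentioned in the reply."""
    # Step 1: slice the index row, ordered by the time in each column name.
    row = message_time.get("messagelist:%s" % list_id, {})
    ordered_ids = sorted(row, key=lambda u: u.time)
    # Step 2: fetch the message rows for those ids.
    return [message["message:%s" % i] for i in ordered_ids]


def lists_in_box(box_key):
    """List UUIDs in a message box, most recently active list first."""
    row = message_time.get(box_key, {})
    newest_first = sorted(row, key=lambda u: u.time, reverse=True)
    return [row[name] for name in newest_first]
```

With one column per field, adding a field later is one extra column on the rows that need it (for example a hypothetical "subject" column), and existing rows are left untouched, which is the evolution argument made in the reply.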