Thanks for the writeup...good stuff! Any lessons learnt you'd like to share or challenges that persist?
Simon Reavely

On Sep 20, 2010, at 6:37 AM, Juho Mäkinen <juho.maki...@gmail.com> wrote:

> We have built a facebook style "messenger" into our web site which
> uses cassandra as storage backend with two column families:
> TalkMessages and TalkLastMessages. I've uploaded a screenshot showing
> the feature in action to
> http://img138.imageshack.us/img138/3807/talkexample.jpg
>
> TalkMessages contains each message between two participants. The key
> is a string built from the two users' uids "$smaller_uid:$bigger_uid".
> Each column inside this CF contains a single message. The column name
> is the message timestamp in microseconds since epoch stored as
> LongType. The column value is a JSON encoded string containing the
> following fields: sender_uid, target_uid, msg.
>
> This results in the following structure inside the column family.
>
> "2249:9111" => [
>   12345678 : { sender_uid : 2249, target_uid : 9111, msg : "Hello, how are you?" },
>   12345679 : { sender_uid : 9111, target_uid : 2249, msg : "I'm fine, thanks" }
> ]
>
> TalkLastMessages is used to quickly fetch a user's talk partners, the
> last message which was sent between the peers and other similar data.
> This allows us to fetch all the data needed to display a "main view"
> for all online friends with just one query to cassandra. This column
> family uses the user uid as its key. Each column represents a talk
> partner whom the user has been talking to and it uses the talk partner
> uid as the column name. Column value is a json packed structure which
> contains the following fields:
> - last message timestamp : microseconds since epoch when a message was
>   last sent between these two users.
> - unread timestamp : microseconds since epoch when the first unread
>   message was sent between these two users.
> - unread : counter how many unread messages there are.
> - last message : last message between these two users.
>
> This results in the following structure inside the column family for
> these two example users: 2249 and 9111.
>
> "2249" => [
>   9111 : { last_message_timestamp : 12345679, unread_timestamp : 12345679, unread : 1, last_message: "I'm fine, thanks" }
> ],
> "9111" => [
>   2249 : { last_message_timestamp : 12345679, unread_timestamp : 12345679, unread : 0, last_message: "I'm fine, thanks" }
> ]
>
> Displaying chat (this happens on every page load, needs to be fast)
> 1) Fetch all columns from TalkLastMessages for the user
>
> Display messages history between two participants:
> 1) Fetch last n columns from TalkMessages for the relevant
>    "$smaller_uid:$bigger_uid" row.
>
> Mark all sent messages from another participant as read (when you read
> the messages)
> 1) Get column $sender_uid from row $reader_uid from TalkLastMessages
> 2) Update the JSON payload and insert the column back
>
> Sending message involves the following operations:
> 1) Insert new column to TalkMessages
> 2) Fetch relevant column from TalkLastMessages from $target_uid row
>    with $sender_uid column
> 3) Update the column json payload and insert it back to TalkLastMessages
> 4) Fetch relevant column from TalkLastMessages from $sender_uid row
>    with $target_uid column
> 5) Update the column json payload and insert it back to TalkLastMessages
>
> There are also other operations and the actual payload is a bit more complex.
>
> I'm happy to answer questions if somebody is interested :)
>
> - Juho Mäkinen
>
> On Mon, Sep 20, 2010 at 12:57 PM, Morten Wegelbye Nissen <m...@monit.dk> wrote:
>> Hello List,
>>
>> No matter where you read, you read almost everywhere that the noSQL
>> data schema is completely different from the relational way - and after a
>> little insight in cassandra everyone can 2nd that.
>>
>> But I miss seeing some real-life examples of how a real system can be
>> modelled. Let's take the example of a system where users can send messages
>> to each other.
>> ( Completely imaginary, no one would use cassandra for a
>> mailsystem :) )
>>
>> If one should create such a system, what CF's would be used? And how would
>> you, for example, find all unread messages?
>>
>> ./Morten
>>
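
For anyone following along, here is a rough sketch of how the two column families and the operations Juho lists could fit together. It is not his code: plain Python dicts stand in for Cassandra (row key -> column name -> JSON string), every function name is made up for illustration, and the rule for when unread_timestamp gets set is my own guess, since the post does not spell it out.

```python
import json
import time

# In-memory stand-ins for the two column families (assumption: plain dicts,
# not a real Cassandra client). Layout: row key -> {column name -> JSON string}.
talk_messages = {}
talk_last_messages = {}


def conversation_key(uid_a, uid_b):
    """Row key for TalkMessages: "$smaller_uid:$bigger_uid"."""
    low, high = sorted((uid_a, uid_b))
    return f"{low}:{high}"


def _load_summary(owner_uid, partner_uid):
    """Fetch one TalkLastMessages column, or an empty summary if absent."""
    raw = talk_last_messages.get(str(owner_uid), {}).get(partner_uid)
    if raw is None:
        return {"last_message_timestamp": 0, "unread_timestamp": 0,
                "unread": 0, "last_message": ""}
    return json.loads(raw)


def _store_summary(owner_uid, partner_uid, summary):
    """Insert the updated JSON payload back into the owner's row."""
    row = talk_last_messages.setdefault(str(owner_uid), {})
    row[partner_uid] = json.dumps(summary)


def send_message(sender_uid, target_uid, msg, timestamp=None):
    """The five send steps from the post, as fetch/update/insert operations."""
    if timestamp is None:
        timestamp = int(time.time() * 1_000_000)  # microseconds since epoch
    # 1) Insert new column to TalkMessages.
    row = talk_messages.setdefault(conversation_key(sender_uid, target_uid), {})
    row[timestamp] = json.dumps(
        {"sender_uid": sender_uid, "target_uid": target_uid, "msg": msg})
    # 2-3) Target's row: bump the unread counter (assumed bookkeeping rule:
    # unread_timestamp is set when the first unread message arrives).
    summary = _load_summary(target_uid, sender_uid)
    if summary["unread"] == 0:
        summary["unread_timestamp"] = timestamp
    summary["unread"] += 1
    summary["last_message_timestamp"] = timestamp
    summary["last_message"] = msg
    _store_summary(target_uid, sender_uid, summary)
    # 4-5) Sender's row: only the last-message fields change.
    summary = _load_summary(sender_uid, target_uid)
    summary["last_message_timestamp"] = timestamp
    summary["last_message"] = msg
    _store_summary(sender_uid, target_uid, summary)
    return timestamp


def mark_read(reader_uid, sender_uid):
    """Get column $sender_uid from row $reader_uid, reset it, insert it back."""
    summary = _load_summary(reader_uid, sender_uid)
    summary["unread"] = 0
    _store_summary(reader_uid, sender_uid, summary)


def unread_partners(uid):
    """One row read answers "who has sent me messages I have not read?"."""
    result = {}
    for partner_uid, raw in talk_last_messages.get(str(uid), {}).items():
        unread = json.loads(raw)["unread"]
        if unread > 0:
            result[partner_uid] = unread
    return result


def history(uid_a, uid_b, n):
    """Last n messages of a conversation, oldest first."""
    row = talk_messages.get(conversation_key(uid_a, uid_b), {})
    newest = sorted(row)[-n:] if n > 0 else []
    return [json.loads(row[ts]) for ts in newest]


# Replay the two example messages from the post with fixed timestamps.
send_message(2249, 9111, "Hello, how are you?", 12345678)
mark_read(9111, 2249)
send_message(9111, 2249, "I'm fine, thanks", 12345679)
print(unread_partners(2249))  # {9111: 1}
```

Regarding Morten's question about finding unread messages: in this layout the unread counter lives in the per-user TalkLastMessages row, so the lookup is a scan over one row rather than over every conversation. The cost is that each send does two extra fetch-and-insert round trips to keep those rows current.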