Re: Schema question

Edward Capriolo Sun, 03 Oct 2010 12:29:14 -0700
On Sun, Oct 3, 2010 at 11:02 AM, Simon Reavely <simon.reav...@gmail.com> wrote:
> Two questions:
> 1. Is this compaction challenge a CPU issue or a disk I/O issue in your
> case?
> 2. In other places people have recommended adjustments from the defaults to 
> control compaction overhead...did you adjust or experiment with how to 
> control compaction?
>
>
>
>
> Simon Reavely
>
>
> On Sep 21, 2010, at 8:48 AM, Juho Mäkinen <juho.maki...@gmail.com> wrote:
>
>> Not really. The schema has worked without any problems. We're running
>> a five-node Cassandra cluster behind the system (it also has other uses
>> besides this particular application: it stores all our blog content and
>> a bunch of other data). There are about 120 000 new messages each day,
>> and the chat window is displayed about 1200 times per second.
>>
>> It's worth noting that, for best availability, a Cassandra cluster
>> can't be just one or two machines, because once a node starts
>> compacting, its performance suffers severely. You therefore need
>> enough nodes that the probability of all of them compacting at the
>> same time is low enough. I've implemented a PHP wrapper around the
>> Thrift API which retries an operation on another server until a result
>> is obtained. The library is available on GitHub:
>> http://github.com/dynamoid/cassandra-utilities
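
The retry idea described above can be sketched roughly as follows. This is a minimal Python illustration, not the code from the cassandra-utilities library (which is PHP); the names `call_with_failover` and `prob_all_busy` are hypothetical, the shuffle is my own assumption, and the probability helper assumes nodes compact independently of each other, which a real cluster may not satisfy.

```python
import random


def call_with_failover(servers, operation):
    """Try operation(server) on one server after another until one succeeds."""
    candidates = list(servers)
    random.shuffle(candidates)  # assumption: spread load across nodes
    last_error = None
    for server in candidates:
        try:
            return operation(server)
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc  # node is down or too slow; try the next one
    raise RuntimeError("all servers failed") from last_error


def prob_all_busy(p_node_busy, node_count):
    """Chance that every node is compacting at once, assuming independence."""
    return p_node_busy ** node_count
```

The `operation` argument is any callable that takes a server address and performs the request, so the same helper can wrap reads and writes alike. Under the independence assumption, adding nodes shrinks the chance that every node is busy at the same moment very quickly.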
>>
>> We have monitored and stored performance data for each Cassandra
>> request we have made. As the results are interesting, I'll post
>> them to this mailing list in a separate message shortly.
>>
>> - Juho Mäkinen
>>
>>
>>
>>
>> On Tue, Sep 21, 2010 at 1:20 PM, Simon Reavely <simon.reav...@gmail.com> 
>> wrote:
>>> Thanks for the writeup...good stuff!
>>> Any lessons learnt you'd like to share or challenges that persist?
>>>
>>>
>>> Simon Reavely
>>>
>>>
>>> On Sep 20, 2010, at 6:37 AM, Juho Mäkinen <juho.maki...@gmail.com> wrote:
>>>
>>>> We have built a Facebook-style "messenger" into our web site which
>>>> uses Cassandra as the storage backend with two column families:
>>>> TalkMessages and TalkLastMessages. I've uploaded a screenshot showing
>>>> the feature in action to
>>>> http://img138.imageshack.us/img138/3807/talkexample.jpg
>>>>
>>>> TalkMessages contains each message between two participants. The key
>>>> is a string built from the two users' uids: "$smaller_uid:$bigger_uid".
>>>> Each column inside this CF contains a single message. The column name
>>>> is the message timestamp in microseconds since epoch, stored as
>>>> LongType. The column value is a JSON-encoded string containing the
>>>> following fields: sender_uid, target_uid, msg.
>>>>
>>>> This results in the following structure inside the column family.
>>>>
>>>> "2249:9111" => [
>>>>  12345678 : { sender_uid : 2249, target_uid : 9111, msg : "Hello, how are you?" },
>>>>  12345679 : { sender_uid : 9111, target_uid : 2249, msg : "I'm fine, thanks" }
>>>> ]
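
The row key and column layout described above could be built like this. It is only a sketch in Python: the helper names `talk_row_key` and `make_message_column` are made up, a plain dict stands in for the TalkMessages column family, and comparing the uids as integers is my assumption about what "smaller" and "bigger" mean.

```python
import json
import time


def talk_row_key(uid_a, uid_b):
    """Build the "$smaller_uid:$bigger_uid" row key (uids compared as integers)."""
    smaller, bigger = sorted((int(uid_a), int(uid_b)))
    return f"{smaller}:{bigger}"


def make_message_column(sender_uid, target_uid, msg, timestamp_us=None):
    """Return (column_name, column_value) for a single message.

    The name is microseconds since epoch; the value is a JSON string.
    """
    if timestamp_us is None:
        timestamp_us = int(time.time() * 1_000_000)
    value = json.dumps(
        {"sender_uid": sender_uid, "target_uid": target_uid, "msg": msg}
    )
    return timestamp_us, value


# Hypothetical in-memory stand-in for the TalkMessages column family.
talk_messages = {}
name, value = make_message_column(
    2249, 9111, "Hello, how are you?", timestamp_us=12345678
)
talk_messages.setdefault(talk_row_key(9111, 2249), {})[name] = value
```

Because the key is ordered, both participants resolve to the same row regardless of who sends the message.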
>>>>
>>>> TalkLastMessages is used to quickly fetch a user's talk partners, the
>>>> last message sent between the peers, and other similar data. This
>>>> lets us fetch everything needed to display a "main view" for all
>>>> online friends with just one query to Cassandra. This column family
>>>> uses the user uid as its key. Each column represents a talk partner
>>>> whom the user has been talking to, and it uses the talk partner's uid
>>>> as the column name. The column value is a JSON-encoded structure
>>>> which contains the following fields:
>>>> - last message timestamp: microseconds since epoch when a message was
>>>> last sent between these two users.
>>>> - unread timestamp: microseconds since epoch when the first unread
>>>> message was sent between these two users.
>>>> - unread: count of unread messages.
>>>> - last message: the last message between these two users.
>>>>
>>>> This results in the following structure inside the column family for these
>>>> two example users: 2249 and 9111.
>>>>
>>>> "2249" => [
>>>>  9111 : { last_message_timestamp : 12345679, unread_timestamp :
>>>> 12345679, unread : 1, last_message: "I'm fine, thanks" }
>>>> ],
>>>> "9111" => [
>>>>  2249 : { last_message_timestamp : 12345679, unread_timestamp :
>>>> 12345679, unread : 0, last_message: "I'm fine, thanks" }
>>>> ]
>>>>
>>>> Displaying chat (this happens on every page load, so it needs to be fast)
>>>> 1) Fetch all columns from TalkLastMessages for the user
>>>>
>>>> Displaying message history between two participants:
>>>> 1) Fetch last n columns from TalkMessages for the relevant
>>>> "$smaller_uid:$bigger_uid" row.
>>>>
>>>> Mark all sent messages from another participant as read (when you read
>>>> the messages)
>>>> 1) Get column $sender_uid from row $reader_uid from TalkLastMessages
>>>> 2) Update the JSON payload and insert the column back
>>>>
>>>> Sending a message involves the following operations:
>>>> 1) Insert new column to TalkMessages
>>>> 2) Fetch relevant column from TalkLastMessages from $target_uid row
>>>> with $sender_uid column
>>>> 3) Update the column json payload and insert it back to TalkLastMessages
>>>> 4) Fetch relevant column from TalkLastMessages from $sender_uid row
>>>> with $target_uid column
>>>> 5) Update the column json payload and insert it back to TalkLastMessages
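
The operations listed above can be walked through with plain dicts standing in for the two column families. This is a hedged sketch, not the production code: all function names are hypothetical, and the unread bookkeeping (when the counter and unread timestamp change) is my guess at the intent, since the actual payload is said to be more complex.

```python
import json

# Hypothetical in-memory stand-ins for the two column families.
talk_messages = {}       # "smaller:bigger" -> {timestamp_us: json string}
talk_last_messages = {}  # uid -> {partner_uid: json string}


def _row_key(uid_a, uid_b):
    smaller, bigger = sorted((uid_a, uid_b))
    return f"{smaller}:{bigger}"


def _update_last(owner_uid, partner_uid, timestamp_us, msg, is_unread):
    """Fetch the partner column from the owner's row, update it, write it back."""
    row = talk_last_messages.setdefault(owner_uid, {})
    if partner_uid in row:
        payload = json.loads(row[partner_uid])
    else:
        payload = {"last_message_timestamp": 0, "unread_timestamp": 0,
                   "unread": 0, "last_message": ""}
    payload["last_message_timestamp"] = timestamp_us
    payload["last_message"] = msg
    if is_unread:
        if payload["unread"] == 0:
            payload["unread_timestamp"] = timestamp_us  # first unread message
        payload["unread"] += 1
    row[partner_uid] = json.dumps(payload)


def send_message(sender_uid, target_uid, msg, timestamp_us):
    # 1) Insert a new column into TalkMessages.
    value = json.dumps({"sender_uid": sender_uid, "target_uid": target_uid, "msg": msg})
    talk_messages.setdefault(_row_key(sender_uid, target_uid), {})[timestamp_us] = value
    # 2-3) Target's row, sender's column: the message is unread for the target.
    _update_last(target_uid, sender_uid, timestamp_us, msg, is_unread=True)
    # 4-5) Sender's row, target's column: only the last message changes.
    _update_last(sender_uid, target_uid, timestamp_us, msg, is_unread=False)


def mark_read(reader_uid, sender_uid):
    """Get column sender_uid from row reader_uid, reset the counter, write back."""
    row = talk_last_messages.get(reader_uid, {})
    if sender_uid in row:
        payload = json.loads(row[sender_uid])
        payload["unread"] = 0
        row[sender_uid] = json.dumps(payload)


def unread_partners(uid):
    """Partners with unread messages, found from a single row read."""
    result = {}
    for partner_uid, raw in talk_last_messages.get(uid, {}).items():
        unread = json.loads(raw)["unread"]
        if unread > 0:
            result[partner_uid] = unread
    return result


def last_messages(uid_a, uid_b, n):
    """Last n messages (n > 0) between two users, oldest first."""
    row = talk_messages.get(_row_key(uid_a, uid_b), {})
    return [json.loads(row[ts]) for ts in sorted(row)[-n:]]


send_message(2249, 9111, "Hello, how are you?", 12345678)
send_message(9111, 2249, "I'm fine, thanks", 12345679)
```

Each fetch-update-insert pair is a read-modify-write sequence, so two concurrent writers could overwrite each other's counter; the sketch ignores that.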
>>>>
>>>> There are also other operations and the actual payload is a bit more 
>>>> complex.
>>>>
>>>> I'm happy to answer questions if somebody is interested :)
>>>>
>>>> - Juho Mäkinen
>>>>
>>>>
>>>>
>>>> On Mon, Sep 20, 2010 at 12:57 PM, Morten Wegelbye Nissen <m...@monit.dk> 
>>>> wrote:
>>>>>  Hello List,
>>>>>
>>>>> No matter where you look, almost everywhere you read that the NoSQL
>>>>> data schema is completely different from the relational way, and after a
>>>>> little insight into Cassandra everyone can second that.
>>>>>
>>>>> But I have yet to see real-life examples of how a real system can be
>>>>> modelled. Let's take the example of a system where users can send messages
>>>>> to each other. (Completely imaginary; no one would use Cassandra for a
>>>>> mail system :) )
>>>>>
>>>>> If one were to create such a system, what CFs would be used? And how would
>>>>> you, for example, find all unread messages?
>>>>>
>>>>> ./Morten
>>>>>
>>>
>
You can adjust the compaction thread priority, or increase memtable sizes,
but if you have a high write volume, "Judgement day is inevitable",
I mean "compaction is inevitable". If you increase your nodes from 2 to 4 and
the other settings stay the same, you will get less intensive
compaction, as the data per node is lower.
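
As a back-of-the-envelope sketch of the data-per-node point (the function name and the numbers are made up, and it assumes the data is spread evenly and the replication factor stays the same):

```python
def data_per_node_gb(total_data_gb, replication_factor, node_count):
    """Rough average data held per node, assuming an evenly balanced ring."""
    return total_data_gb * replication_factor / node_count


# Hypothetical figures: 400 GB of data, replication factor 2.
two_nodes = data_per_node_gb(400, 2, 2)
four_nodes = data_per_node_gb(400, 2, 4)
```

Under these assumptions, going from 2 to 4 nodes halves what each node holds, and therefore what each node has to compact.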