On Wed, May 11, 2011 at 12:25 PM, Jared Morrow <ja...@basho.com> wrote:
> It seems like what you are needing is a lot what the Yammer guys needed for
> their streamie application. They have a video
> here: http://vimeo.com/21598799 about how they modeled their data. It
> might be pretty helpful for your application. If not, no harm done, you
> still get to watch a video from some pretty smart people!

There are some differences between what Alexey is describing and what we built at Yammer. The key difference is per-item read/unread state. That being said, I don't think it's a deal breaker. And I don't think you need search.

At Yammer we have a notion of streams (notifications is one of our streams). Each stream has a list of stream items. For instance, "Bob liked your message" or "Jenny replied to your message" or "Charlie mentioned you in a thread". Each stream item has a uniquely generated, monotonically increasing ID. That's great, because it gives us something to sort and dedupe on.

We store the stream items for a user in a single key/value. Each stream type has its own bucket. To get to my notifications, I would fetch /riak/notifications/ryan. To keep things simple (and bounded) we only store the most recent 1,000 or so stream items for each user. Older notifications age out of the system as newer ones replace them. That's fine: for nearly all of our users, 1,000 notifications would represent a significant amount of calendar time, more than they could be expected to page back through.

In addition, we support the notion of a cursor. A cursor is simply a pointer into a stream. We use the cursor to indicate the last seen stream item. We have a single bucket for cursors. To get my default cursor, I would fetch /riak/cursors/ryan-default. The value of that key is the ID of an item in my notifications stream.

This is where your requirements and ours diverge a bit: we don't have per-item seen/unseen state. That being said, you could take the basics of our design and add per-item seen/unseen state. Ditch our cursors and add a "seen" field to each stream item. The one problem you're going to have is the eventual consistency model, especially if you want to support the ability for users to once again mark something as unseen/unread. In that case, if you ever encounter sibling values, you may not be able to reliably merge them.
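
To make the stream-and-cursor layout above a bit more concrete, here is a minimal Python sketch. It is not our production code. It assumes integer IDs and stream items represented as dicts with an "id" field, and the function names (merge_streams, unseen_items) are made up for illustration. It shows how monotonically increasing IDs let you dedupe and sort sibling values, keep the list bounded, and work out what sits past a cursor.

```python
from typing import Dict, Iterable, List

# Assumption: mirrors "the most recent 1,000 or so" items kept per user.
MAX_ITEMS = 1000


def merge_streams(siblings: Iterable[List[Dict]], limit: int = MAX_ITEMS) -> List[Dict]:
    """Merge sibling copies of one user's stream into a single bounded list.

    Items are deduped by their unique ID, sorted newest first (highest ID),
    and truncated so older items age out.
    """
    by_id: Dict[int, Dict] = {}
    for sibling in siblings:
        for item in sibling:
            # Keep the first copy seen for each ID; duplicates are dropped.
            by_id.setdefault(item["id"], item)
    newest_first = sorted(by_id.values(), key=lambda item: item["id"], reverse=True)
    return newest_first[:limit]


def unseen_items(stream: List[Dict], cursor_id: int) -> List[Dict]:
    """Return items newer than the cursor, i.e. past the last seen item."""
    return [item for item in stream if item["id"] > cursor_id]


if __name__ == "__main__":
    sibling_a = [{"id": 3, "text": "Charlie mentioned you in a thread"},
                 {"id": 1, "text": "Bob liked your message"}]
    sibling_b = [{"id": 2, "text": "Jenny replied to your message"},
                 {"id": 3, "text": "Charlie mentioned you in a thread"}]
    merged = merge_streams([sibling_a, sibling_b])
    print([item["id"] for item in merged])
    print([item["id"] for item in unseen_items(merged, cursor_id=1)])
```

Because the cursor is just a single ID, updating it means writing one small value. The stream itself is only rewritten when new items arrive.
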
If one sibling value shows a notification as read and another shows it as unread, you can't tell which was the last action taken by the user. Not allowing users to mark something as unread again should simplify that problem: if either sibling value is read, then the notification is read, since you can't go back.

Alternatively, consider using a cursor like we do. The write is much smaller and you don't have to read first to perform the update.

Hopefully that sheds a little more light on what we're doing at Yammer. As Jared pointed out, the video from our talk is online. @coda even cracks a few good jokes.

Good luck!

Ryan Kennedy
Infrastructure Engineer @ Yammer