Ryan, I'm resurrecting this topic because I have a similar requirement: storing in Riak streams of items coming from sources with unique IDs. In your post you said:
Ryan Kennedy wrote:
> At Yammer we have a notion of streams (notifications is one of our
> streams). Each stream has a list of stream items. For instance, "Bob
> liked your message" or "Jenny replied to your message" or "Charlie
> mentioned you in a thread". Each stream item has a uniquely generated,
> monotonically increasing ID. That's great, that gives us something to
> sort and dedupe on. We store the stream items for a user in a single
> key/value. Each stream type has its own bucket. To get to my
> notifications, I would fetch /riak/notifications/ryan. To keep things
> simple (and bounded) we only store the most recent 1,000 or so stream
> items for each user. Older notifications age out of the system as
> newer ones replace them. That's fine… for nearly all of our users 1,000
> notifications would represent a significant amount of calendar time.
> More than they could be expected to page back through.

What confuses me is the phrase "We store the stream items for a user in a single key/value". Does this mean all the items are stored together under a single key? If so, when a new item arrives you need to read the key, update it, and re-write it. Doesn't this affect performance? In my case I need to maintain a very high write throughput, so I would prefer not to update in place.

Would it be efficient to store/retrieve items under a bucket of the form /source/period, using the timestamp as the key, where the period is configurable per application and will probably be on the order of minutes? That way all items from a source for a given period would land in the same bucket. However, this would lead to millions of buckets very quickly.

Another option would be to batch the items (which are very short) and store each batch as a single object under /source/period, as has been discussed in this thread: http://riak-users.197444.n3.nabble.com/High-volume-data-series-storage-and-queries-td3236378.html

Thanks in advance
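To make the trade-off concrete, here is a minimal sketch (not from the original thread) contrasting the two approaches, assuming the official Basho Python riak client; the bucket names, the per-item 'id' field, and the period size are hypothetical:

    import time
    import riak  # Basho's Python client; API as in its 2.x era

    client = riak.RiakClient()  # defaults to a local node

    # Approach 1 (Yammer-style): one key per user holding a bounded list.
    # Every new item is a read-modify-write of that single object.
    def append_stream_item(user, item, limit=1000):
        bucket = client.bucket('notifications')       # hypothetical bucket name
        obj = bucket.get(user)                         # fetch current list (may not exist yet)
        items = obj.data or []
        items.append(item)
        items.sort(key=lambda i: i['id'])              # sort on the monotonically increasing ID
        obj.data = items[-limit:]                      # keep only the most recent ~1,000
        obj.content_type = 'application/json'
        obj.store()

    # Approach 2 (write-optimized): one object per item, bucketed by
    # source + time period and keyed by timestamp. No read before write,
    # at the cost of many more buckets and a multi-key read later.
    def write_stream_item(source, item, period_seconds=60):
        now = time.time()
        period = int(now // period_seconds)
        bucket = client.bucket('%s-%d' % (source, period))  # e.g. "mysource-27312345"
        bucket.new(str(now), data=item).store()

The second form trades read-time fan-out (and the bucket explosion you mention) for write-time simplicity; which one wins presumably depends on your read patterns and on how costly the extra buckets turn out to be in practice.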