Re: Storage of time-series data

2010-05-18 Thread Daniel Einspanjer
I do a lot of temporal aggregate statistics in the Mozilla Socorro project using HBase. The problem is much easier there because a rowkey can use the timestamp as a prefix, which makes range queries easy, and HBase also has an atomic increment function that can be
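
As a rough illustration of the rowkey idea in the snippet above, here is a minimal sketch in Python. It does not use HBase or any HBase client; `make_rowkey` and `SortedCounterTable` are hypothetical names for an in-memory stand-in that only mimics two properties mentioned in the message: keys sorted lexicographically with a timestamp prefix, and a per-key counter increment. The minute-level key format is also an assumption.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timezone


def make_rowkey(ts: datetime, suffix: str) -> str:
    """Build a key whose lexicographic order matches time order.

    The zero-padded YYYYMMDDHHMM prefix is an assumed format, chosen
    so that string comparison agrees with chronological comparison.
    """
    return ts.strftime("%Y%m%d%H%M") + ":" + suffix


class SortedCounterTable:
    """In-memory stand-in for a store with sorted row keys and counters.

    Hypothetical helper for illustration only. A real store would make
    the increment atomic across clients; this single-process version
    does not need to.
    """

    def __init__(self):
        self._counts = defaultdict(int)
        self._keys = []  # row keys, kept in sorted order

    def increment(self, rowkey: str, amount: int = 1) -> int:
        """Add `amount` to the counter at `rowkey` and return the new value."""
        if rowkey not in self._counts:
            bisect.insort(self._keys, rowkey)
        self._counts[rowkey] += amount
        return self._counts[rowkey]

    def scan(self, start: str, stop: str):
        """Return (key, count) pairs with start <= key < stop."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._counts[k]) for k in self._keys[lo:hi]]


table = SortedCounterTable()
key = make_rowkey(datetime(2010, 5, 18, 9, 30, tzinfo=timezone.utc), "crash")
table.increment(key)

# All counters for 2010-05-18: a range over the timestamp prefix.
one_day = table.scan("201005180000", "201005190000")
```

Because the timestamp leads the key, a query for a time window becomes a contiguous slice of the sorted keys instead of a filter over every row.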

Is it inefficient to map over a small bucket when you have millions of other buckets?

2010-07-11 Thread Daniel Einspanjer
I'm thinking about the pros and cons of Riak vs HBase for Mozilla's Weave (now Firefox Sync) 2.0 engine. https://wiki.mozilla.org/Labs/Weave/Sync/2.0/API The primary use case is that when a user's client performs a sync, it needs to retrieve all the new items since the last time it synced for

Re: Expected vs Actual Bucket Behavior

2010-07-20 Thread Daniel Einspanjer
On 7/20/10 6:00 PM, Eric Filson wrote: On Tue, Jul 20, 2010 at 3:02 PM, Justin Sheehy wrote: Hi, Eric! Thanks for your thoughts. On Tue, Jul 20, 2010 at 12:39 PM, Eric Filson wrote: > I would think that this requirement, >

Re: sane way to edit json map reduce functions?

2010-09-08 Thread Daniel Einspanjer
A friend and I hacked up this template for editing queries. It is only a rough prototype, but it would be pretty easy to fold it into something that would allow you to save the query in an MR bucket or even post it directly: https://svn.mozilla.org/metrics/testpilot/riak/mapreduce/validate_m