Hi Shawn,

tl;dr use Riak and Redis. Could you do it without Redis? Probably. Would I want 
to? No.

I'll take a stab at this. It goes without saying that there are many ways to do 
this and no "right" way. Each solution will have its own positives and 
negatives. It all depends on what you and your team are comfortable with and 
the needs of your app. 

For those who follow my ramblings, you can guess what I'm gonna say. I would 
put forward a solution of Riak and... Redis! Why Redis? Data structures (Riak 
doesn't have them... at the moment... or ever? Don't try to make it have them). 
You want them. Things like sorted sets, lists and hashes (which compress) are 
great for basically everything.
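To make that concrete, here's a rough sketch of how a sorted set scored by unix 
timestamp lines up with the queries in your list. This assumes the redis-py 
client (newer versions take a mapping for zadd); the key names and JSON payload 
shape are made up purely for illustration:

    import json
    import time

    import redis  # redis-py, assumed available

    r = redis.Redis()

    def record_result(customer_id, check_id, result):
        """Push one check result onto a per-check sorted set, scored by time."""
        ts = int(time.time())
        member = json.dumps(dict(result, ts=ts))
        r.zadd("results:%s:%s" % (customer_id, check_id), {member: ts})

    def last_result(customer_id, check_id):
        """Query 1: the most recent result (is the server up or down?)."""
        hits = r.zrevrange("results:%s:%s" % (customer_id, check_id), 0, 0)
        return json.loads(hits[0]) if hits else None

    def last_n(customer_id, check_id, n=5):
        """Query 3: full detail of the last n results."""
        return [json.loads(m) for m in
                r.zrevrange("results:%s:%s" % (customer_id, check_id), 0, n - 1)]

    def last_24h(customer_id, check_id):
        """Query 4: everything from the last 24 hours, for graphing."""
        cutoff = int(time.time()) - 86400
        return [json.loads(m) for m in
                r.zrangebyscore("results:%s:%s" % (customer_id, check_id),
                                cutoff, "+inf")]

Query 2 is just walking zrevrange until the status flips, and query 5 is 
covered by the deterministic Riak keys discussed below.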

Things in your favor:
-immutable data
-trivially shardable 
-constrained data set (data is not UGC with unbounded size)
-predictable growth rates
-deterministic keys (think %iso8601 date or unix epoch int%_%customerid%; see the sketch after this list)
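
By deterministic I mean you can rebuild the key from the clock alone, no lookup 
needed. A minimal sketch of one way to build such a key (pure Python; the 
per-minute granularity is an assumption, pick whatever interval matches your 
checks):

    from datetime import datetime

    def result_key(customer_id, when=None):
        """Build a per-minute key like '2012-08-16T18:20_1234'.

        The ISO 8601 prefix keeps keys sortable and reconstructible from a
        timestamp; if a customer has more than one check, fold a check id
        in as well.
        """
        when = when or datetime.utcnow()
        return "%s_%s" % (when.strftime("%Y-%m-%dT%H:%M"), customer_id)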

Keep 48 hours of live data in Redis. Run a culling process that dumps older 
data to Riak; the culling process keeps your Redis memory footprint within 
known limits. You could run it every minute to minimize data loss from downed 
Redis servers (outside of master/slave replication and so on). This is also 
where you could make do without Redis: if your app is only holding on to, 
writing or requesting data once a minute, you could write straight to Riak and 
have worker processes roll those minutes up into hours/days as necessary.
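
Here's roughly what that culling worker could look like, assuming the redis-py 
client and Basho's Python Riak client (constructor arguments, bucket names and 
the key scheme are all illustrative, and per-version API details vary):

    import json
    import time

    import redis
    import riak  # Basho's Python client, assumed available

    FORTY_EIGHT_HOURS = 48 * 3600

    r = redis.Redis()
    riak_client = riak.RiakClient()          # connection details omitted
    bucket = riak_client.bucket("results")   # bucket name is illustrative

    def cull(customer_id, check_id):
        """Move results older than 48h out of Redis and into Riak."""
        zkey = "results:%s:%s" % (customer_id, check_id)
        cutoff = int(time.time()) - FORTY_EIGHT_HOURS
        old = r.zrangebyscore(zkey, "-inf", cutoff, withscores=True)
        for member, ts in old:
            result = json.loads(member)
            # deterministic key: minute bucket + customer id, as above
            key = "%s_%s" % (time.strftime("%Y-%m-%dT%H:%M", time.gmtime(ts)),
                             customer_id)
            bucket.new(key, data=result).store()
        # only drop from Redis after the Riak writes have succeeded
        r.zremrangebyscore(zkey, "-inf", cutoff)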

With deterministic keys you may not even need search, secondary indexes or key 
filters, but with them you can cover basically any permutation you could come 
up with. Your application handles fetching the correct key(s) in a 
deterministic fashion simply by manipulating date offsets, as sketched below.
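
So query 5 (say, all of July) is just key generation plus straight GETs. A 
hedged sketch using the same made-up key scheme as above (attribute names like 
.exists/.data vary between Riak client versions):

    from datetime import timedelta

    def keys_for_range(customer_id, start, end, step_minutes=1):
        """Enumerate the deterministic keys between two datetimes."""
        keys = []
        cursor = start
        while cursor <= end:
            keys.append("%s_%s" % (cursor.strftime("%Y-%m-%dT%H:%M"),
                                   customer_id))
            cursor += timedelta(minutes=step_minutes)
        return keys

    def fetch_range(bucket, customer_id, start, end):
        """Pull a date range straight out of Riak: no search or 2i needed."""
        results = []
        for key in keys_for_range(customer_id, start, end):
            obj = bucket.get(key)
            if obj.exists:
                results.append(obj.data)
        return results

A month of per-minute data is roughly 44k GETs, which is exactly why you'd have 
those workers roll minutes up into hourly/daily objects as mentioned above.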

Whatever you do, you do not want a situation where you write half-baked keys 
into Riak. Frequently updated keys will incur file compaction, which will make 
you want to cry and punch babies just to make the pain stop.

Best,
-Alexander Sicular

@siculars
http://siculars.posterous.com

On Aug 16, 2012, at 6:20 PM, Shawn Parrish wrote:

> Howdy Riak folk,
> 
> We're looking for a possible datastore replacement for our server
> monitoring check results.  Maybe some of you can offer feedback if
> Riak is a possible good solution.
> 
> Each ping, http request, etc has a result with various metadata that
> we store.  We're looking at about 250 million results a month and that
> number continues to grow.
> 
> We query this data for:
> 1. last result (is the server up or down?)
> 2. if it's up, when was the last 'down' and inversely when it's down,
> when was the last up?
> 3. Full detail of the last 5 results (to show recent results)
> 4. Last 24 hours results (usually ~1440 results) to graph
> 5. Results in a date range (example: all results from July 1 through
> July 31)... this can be very large.
> 
> We currently use bigcouch (Couchdb) but the views and built in
> _all_docs slow down with so many results and especially when we call
> them with 'include_docs', cause we need the details of the results as
> well.
> 
> We're trying to trim down the total results stored by summarizing
> older data and deleting it but that slows down Couchdb views even
> farther.
> 
> Questions:
> 1. Is Riak a possible datastore for this use case?  Can I get so many
> results, including all the details quickly enough?
> 2. Do you know of another datastore that might be better?
> 
> Thanks,
> Shawn


_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
