Hello, I have a question about the throughput of MapReduce queries -- in other words, how many per second I can reasonably expect my Riak cluster to handle. I have a decent-sized data set: about 2 million keys, totaling about 240GB of disk usage on a 6-node Riak cluster (version 1.0.1). On top of that data sits a Java application talking to Riak via protocol buffers, and it would be nice to be able to throw a large volume of MapReduce queries at those keys. The basic map function looks like this, and it gets called against a single key and "sub key" per query:
    function(object, subKey) { return [ Riak.mapValuesJson(object)[0][subKey] ]; }

Is it reasonable to ask a 6-node Riak cluster on 4-core virtual servers with 8GB of RAM to do 1000 of those per second, with a sub-100ms 99th-percentile latency? In testing with 25 JavaScript VMs it looks good at 160 RPS, but under more realistic load I'm seeing it melt the Riak cluster, and I'm wondering whether this is something I can tune my way out of, or whether I'm asking Riak to do the impossible. By way of comparison, the same volume of simple GETs against those keys runs smoothly.

Thanks in advance,
Will
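For reference, here is a minimal local sketch of what that map phase computes, runnable in plain Node.js. Inside Riak's JavaScript VMs the Riak.* helpers are provided for you; here mapValuesJson is stubbed just to show the function's behavior, and the sample value and the sub-key "price" are hypothetical, not from the original post.

```javascript
// Stub of the helper Riak normally provides to its JS map/reduce VMs:
// parse each of the object's stored values as JSON.
var Riak = {
  mapValuesJson: function (object) {
    return object.values.map(function (v) { return JSON.parse(v.data); });
  }
};

// The map function from the post: pick one field out of the first value.
function map(object, subKey) {
  return [ Riak.mapValuesJson(object)[0][subKey] ];
}

// A fake Riak object holding a single JSON value (hypothetical data).
var fakeObject = { values: [ { data: JSON.stringify({ price: 42, qty: 3 }) } ] };

console.log(map(fakeObject, "price")); // prints [ 42 ]
```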
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com