Hi John, thanks for the response.

In my case this is a nightly process which will be farming older data out
of Riak for analysis elsewhere. I can certainly run a few tests with the
M/R as it is, and with a simpler version which does sequential gets.
Which metrics do you think I should be looking at when measuring this?
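
As a starting point for that comparison, here is a minimal sketch of a
timing harness in Python. It only records client-side wall-clock time per
run. The two fetch functions are hypothetical placeholders, not Riak client
calls, and would need to be replaced with the real M/R query and the
sequential-gets version. Load on the cluster itself would have to be
observed separately.

```python
import statistics
import time


def time_runs(fetch, runs=5):
    """Call fetch() `runs` times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return durations


def summarise(durations):
    """Reduce a list of durations to a few summary figures."""
    return {
        "runs": len(durations),
        "mean_s": statistics.mean(durations),
        "median_s": statistics.median(durations),
        "max_s": max(durations),
    }


# Hypothetical stand-ins (assumptions, not Riak client APIs): replace the
# bodies with the existing M/R query and with the sequential-gets version.
def fetch_with_mapreduce():
    time.sleep(0.01)


def fetch_with_sequential_gets():
    time.sleep(0.01)


if __name__ == "__main__":
    candidates = [
        ("map/reduce", fetch_with_mapreduce),
        ("sequential gets", fetch_with_sequential_gets),
    ]
    for name, fetch in candidates:
        print(name, summarise(time_runs(fetch)))
```

Running each variant several times, rather than once, gives a rough idea
of how much the timings vary from run to run.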


On 26 March 2013 15:35, John Caprice <jcapr...@basho.com> wrote:

> Matt,
> (I'll finish this response this time!)
> How often is this MapReduce query being run?  Is the execution of this
> MapReduce query done in a controlled manner (for instance, not initiated by
> users of your application)?
> The actual use case of MapReduce queries is important in determining
> whether using MapReduce is sensible.  A MapReduce query that has the potential
> to be executed at an untested frequency can cause unintended load on the
> cluster.  If the frequency of the MapReduce query is known, it can be
> tested under production load to determine the overall effect it has on the
> cluster.
> You can also test this against your second suggestion, combining multiple
> MapReduce queries with successive / concurrent GETs to determine which
> method is more efficient in your situation.
> Thanks,
> John Caprice
> On Mon, Mar 25, 2013 at 9:28 PM, John Caprice <jcapr...@basho.com> wrote:
>> Matt,
>> How often is this MapReduce query being run?  Is the execution of this
>> MapReduce query done in a controlled manner (for instance, not initiated by
>> users of your application)?
>> The actual use case of MapReduce queries is important in determining
>> Thanks,
>> John Caprice
>> On Mon, Mar 25, 2013 at 9:17 PM, Matt Black <matt.bl...@jbadigital.com> wrote:
>>> Hi list,
>>> I have a non-trivial map reduce query which traverses links between
>>> objects across several map phases, constructing a composite object en route.
>>> One of our links is one-to-many, so I use a reduce phase to flatten my set
>>> of objects at the end.
>>> Would a query of this type be better done as a smaller M/R query to get
>>> the base set of objects, and then multiple concurrent gets for the related
>>> objects? Is it sensible to do map reduce jobs with many map and reduce
>>> phases?
>>> I appreciate this question is somewhat open ended, but any input would
>>> be welcome.
>>> _______________________________________________
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com