Hi John, thanks for the response. In my case this is a nightly process which farms older data out of Riak for analysis elsewhere. I can certainly run a few tests with the M/R query as it is, and with a simpler version which does sequential gets. Which metrics do you think I should look at when measuring this?
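For the comparison itself, this is a rough sketch of the kind of timing harness I have in mind. It only measures client-side wall-clock time and the number of objects returned per run. `time_strategy`, `fake_mapreduce` and `fake_sequential_gets` are hypothetical names, and the two `fake_*` functions are stand-ins that make no Riak calls; the real versions would issue the M/R query and the individual gets through whichever client library is in use.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence


def time_strategy(fetch: Callable[[], Sequence], runs: int = 5) -> Dict[str, float]:
    """Call fetch() `runs` times and summarise wall-clock latency in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    durations: List[float] = []
    objects = 0
    for _ in range(runs):
        start = time.perf_counter()
        result = fetch()
        durations.append(time.perf_counter() - start)
        objects = len(result)  # size of the last result set
    return {
        "runs": runs,
        "objects": objects,
        "median_s": statistics.median(durations),
        "max_s": max(durations),
        "total_s": sum(durations),
    }


def fake_mapreduce() -> List[dict]:
    """Stand-in (assumption): one multi-phase M/R query returning composite objects."""
    time.sleep(0.01)  # pretend the cluster does all the work in one round trip
    return [{"id": i, "related": [i + 1000]} for i in range(100)]


def fake_sequential_gets() -> List[dict]:
    """Stand-in (assumption): small base query, then one get per related object."""
    base = [{"id": i} for i in range(100)]
    for obj in base:
        time.sleep(0.0001)  # pretend each get is a separate round trip
        obj["related"] = [obj["id"] + 1000]
    return base


if __name__ == "__main__":
    for name, fetch in [("mapreduce", fake_mapreduce),
                        ("sequential gets", fake_sequential_gets)]:
        print(name, time_strategy(fetch, runs=3))
```

This only covers the client's view of each run. The effect on the cluster under production load would still need to be watched separately while each variant is running.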
Thanks
Matt

On 26 March 2013 15:35, John Caprice <jcapr...@basho.com> wrote:

> Matt,
>
> (I'll finish this response this time!)
>
> How often is this MapReduce query being run? Is the execution of this
> MapReduce query done in a controlled manner (for instance, not initiated by
> users of your application)?
>
> The actual use case of a MapReduce query is important in determining the
> sensibility of using MapReduce. A MapReduce query that has the potential
> to be executed at an untested frequency can cause unintended load on the
> cluster. If the frequency of the MapReduce query is known, it can be
> tested under production load to determine the overall effect it has on the
> cluster.
>
> You can also test this against your second suggestion, combining multiple
> MapReduce queries with successive / concurrent GETs to determine which
> method is more efficient in your situation.
>
> Thanks,
>
> John Caprice
>
>
> On Mon, Mar 25, 2013 at 9:28 PM, John Caprice <jcapr...@basho.com> wrote:
>
>> Matt,
>>
>>
>> How often is this MapReduce query being run? Is the execution of this
>> MapReduce query done in a controlled manner (for instance, not initiated by
>> users of your application)?
>>
>> The actual use case of a MapReduce query is important in determining
>>
>> Thanks,
>>
>> John Caprice
>>
>>
>> On Mon, Mar 25, 2013 at 9:17 PM, Matt Black <matt.bl...@jbadigital.com> wrote:
>>
>>> Hi list,
>>>
>>> I have a non-trivial map reduce query which traverses links between
>>> objects across several map phases, constructing a composite object en route.
>>> One of our links is one-to-many, so I use a reduce phase to flatten my set
>>> of objects at the end.
>>>
>>> Would a query of this type be better done as a smaller M/R query to get
>>> the base set of objects, and then multiple concurrent gets for the related
>>> objects? Is it sensible to do map reduce jobs with many map and reduce
>>> phases?
>>>
>>> I appreciate this question is somewhat open ended, but any input would
>>> be welcome.
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>