Rob,

Performing GET requests, either serially or concurrently, is more efficient than using MapReduce to query for values. MapReduce has additional overhead that GET requests do not have. For example, a GET is sent only to the nodes in the preference list for a given key, while a MapReduce query is sent to all nodes.
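As a rough illustration of the concurrent-GET approach, here is a minimal sketch using only the Python standard library. The name `fetch_one` is a hypothetical callable, not part of any Riak client: it takes one key and returns its value, and in a real application it would wrap whatever single-key get your client provides. The worker count of 8 is likewise an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def multi_get(fetch_one, keys, max_workers=8):
    """Fetch each key with its own GET, up to max_workers at a time.

    fetch_one is a hypothetical callable (an assumption of this sketch):
    it takes a key and returns the stored value, and would wrap the
    client's single-key get. Results are returned in the order of keys.
    """
    keys = list(keys)
    if not keys:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(keys))) as pool:
        # pool.map preserves input order and re-raises any fetch error.
        return list(pool.map(fetch_one, keys))


if __name__ == "__main__":
    # A plain dict stands in for a bucket so the sketch runs without a cluster.
    fake_bucket = {"a": {"n": 1}, "b": {"n": 2}}
    print(multi_get(fake_bucket.get, ["a", "b", "missing"]))
```

Because each key is still an ordinary GET, the amount of work stays bounded by the number of keys and the size of the thread pool.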
There are appropriate uses of MapReduce. Running MapReduce in a controlled manner outside of your peak production hours can minimize its performance impact: for example, running it nightly to perform maintenance or build reports. It is important to ensure that MapReduce queries remain bounded. Replacing serial or concurrent GETs in your application with MapReduce queries opens the door to unbounded use, which can have severe performance consequences.

Making separate requests, either serially or concurrently, is the optimal way to query data in Riak. To an application developer this might not look as elegant, but it is much more efficient for Riak.

Thanks,
John

On Mon, Mar 25, 2013 at 3:07 PM, Rob Speer <r...@luminoso.com> wrote:

> I've looked at the archives of this mailing list to find a way to
> implement a "multi-get" using Riak, for the very common case where there
> are multiple keys to look up. Making a separate round-trip to the server
> for each key seems inefficient, after all.
>
> I came across the suggestion to use MapReduce, so I tried implementing it
> this way (using riak-python-client):
>
>     def multi_get(self, bucket_name, ids):
>         if len(ids) == 0:
>             return []
>         mr = RiakMapReduce(self.riak)
>         for uid in ids:
>             mr.add(bucket_name, uid)
>         query = mr.map_values_json()
>         return query.run()
>
> After this I noticed significant load on the Riak servers, and the client
> code would sometimes stall for a long time, even on a multi_get that was
> only returning 6 documents. Is this actually an inappropriate use of
> MapReduce? (And are there appropriate uses of MapReduce in NoSQL databases
> besides stress-testing them?)
>
> Is it better to make a separate request for each ID, to use MapReduce, or
> to use some other method I haven't thought of?
> -- Rob
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com