Getting multiple values: is iterating or MapReduce preferred?

Rob Speer Mon, 25 Mar 2013 15:09:19 -0700

I've looked at the archives of this mailing list to find a way to implement
a "multi-get" using Riak, for the very common case where there are multiple
keys to look up. Making a separate round-trip to the server for each key
seems inefficient, after all.


I came across the suggestion to use MapReduce, so I tried implementing it
this way (using riak-python-client):

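    # Assumed import (not shown in the original): from riak import RiakMapReduce

    def multi_get(self, bucket_name, ids):
        # Nothing to look up, so skip the MapReduce job entirely.
        if len(ids) == 0:
            return []
        mr = RiakMapReduce(self.riak)
        # Add each (bucket, key) pair as an explicit input to the job.
        for uid in ids:
            mr.add(bucket_name, uid)
        # Map phase: return each object's value parsed as JSON.
        query = mr.map_values_json()
        return query.run()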

After this I noticed significant load on the Riak servers, and the client
code would sometimes stall for a long time, even on a multi_get that was
only returning 6 documents. Is this actually an inappropriate use of
MapReduce? (And are there appropriate uses of MapReduce in NoSQL databases
besides stress-testing them?)

Is it better to make a separate request for each ID, to use MapReduce, or
to use some other method I haven't thought of?
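
For comparison, the "separate request for each ID" option does not have to
be serial. Below is a minimal sketch, not a tested Riak integration. It
assumes a hypothetical fetch_one(key) callable that performs one GET and
returns None for a missing key (for instance, a small wrapper around a
riak-python-client bucket lookup), and it assumes the underlying client can
safely be shared across threads, which would need checking. The names
multi_get, fetch_one, max_workers, and fake_store are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor


def multi_get(fetch_one, ids, max_workers=8):
    """Fetch every key in `ids` by calling fetch_one(key) on a thread pool.

    Results come back in the same order as `ids`; keys for which
    fetch_one returns None (treated as "not found") are dropped.
    """
    ids = list(ids)
    if not ids:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(ids))) as pool:
        # pool.map preserves input order and re-raises any fetch error.
        results = list(pool.map(fetch_one, ids))
    return [value for value in results if value is not None]


# Stand-in for a real per-key GET, so the sketch runs without a server.
fake_store = {"a": {"n": 1}, "b": {"n": 2}}
print(multi_get(fake_store.get, ["a", "missing", "b"]))
# prints [{'n': 1}, {'n': 2}]
```

With this shape the round-trips overlap instead of adding up, and no
MapReduce job is scheduled on the cluster. Whether that is actually cheaper
than MapReduce for a given workload would still have to be measured.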
-- Rob

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
