Hi Mattias,

MapReduce in Riak executes against the data in a single partition and, for efficiency reasons, does not perform a quorum read. Skipping the quorum greatly reduces the required amount of network traffic. As Riak is eventually consistent, the partitions holding a copy may not all hold exactly the same data, or the same version of it, at any given point in time. What you are seeing could very well be the result of some replicas not yet being in sync across all partitions holding a copy.
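To make the difference concrete, here is a minimal toy model in Python. It is not Riak code: plain dictionaries stand in for replicas, and `quorum_read` uses a simplified rule (the value is returned if at least `r` replicas hold it) rather than Riak's actual vector-clock based resolution. All names are hypothetical.

```python
# Toy model only: plain dictionaries stand in for Riak replicas.
from collections import Counter


def single_replica_read(replica, key):
    """Roughly what a MapReduce input sees: one replica, no cross-checking."""
    return replica.get(key)


def quorum_read(replicas, key, r):
    """Simplified quorum: return the most common value if >= r replicas hold it."""
    values = [replica[key] for replica in replicas if key in replica]
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    return value if count >= r else None


# Three replicas of the same data; the third has not yet received "k2".
replicas = [
    {"k1": "a", "k2": "b"},
    {"k1": "a", "k2": "b"},
    {"k1": "a"},
]

print(single_replica_read(replicas[2], "k2"))  # the lagging replica misses "k2"
print(quorum_read(replicas, "k2", r=2))        # two replicas hold "k2"
```

If the job happens to read from the lagging replica, the key is simply absent from the result, which would match the "subset" behaviour described in the question.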
This would, however, be corrected either through read-repair or AAE (Active Anti-Entropy), if you have it enabled. If you were to perform a GET on a key that is missing, triggering read-repair, I would expect it to consistently show up in the results from that point on, at least until it is updated again.

Best regards,

Christian

On 12 Apr 2013, at 08:13, Mattias Sjölinder <matt...@sjolinder.se> wrote:

> Hi
>
> I am struggling to get a grip on MapReduce and why it sometimes returns
> only a subset of what is expected. Is it that the nodes processing the map
> phase return, after a specific time, the matches found so far? I would rather
> have it return a timeout than a subset of the actual matches.
>
> An example is this simple MapReduce:
>
> {
>   "inputs":{
>     "bucket":"som-bucket",
>     "index":"userid_bin",
>     "key":"18481123123"
>   },
>   "query":[
>     {
>       "map":{
>         "language":"javascript",
>         "name":"Riak.mapValuesJson",
>         "keep":true
>       }
>     }
>   ]
> }
>
>
> Any thoughts?
>
> Regards
> Mattias
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
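As a rough illustration of the read-repair suggestion in the reply above, the following Python sketch issues a plain GET for each key that is expected to match but sometimes goes missing. It assumes Riak's HTTP interface is reachable at a base URL such as `http://127.0.0.1:8098` and that the application already knows the affected keys. The helper names (`key_url`, `http_status`, `touch_keys`) are hypothetical and not part of any Riak client.

```python
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen


def key_url(base_url, bucket, key):
    """Build the HTTP URL for fetching one object (base_url is an assumption)."""
    return "{}/buckets/{}/keys/{}".format(
        base_url.rstrip("/"), quote(bucket, safe=""), quote(key, safe="")
    )


def http_status(url):
    """Issue a GET and return the HTTP status code, including non-2xx codes."""
    try:
        with urlopen(url) as response:
            return response.status
    except HTTPError as err:
        # urllib raises for 3xx/4xx/5xx responses; report the code instead.
        return err.code


def touch_keys(base_url, bucket, keys, fetch=http_status):
    """GET every key once and return a {key: status} mapping."""
    return {key: fetch(key_url(base_url, bucket, key)) for key in keys}
```

A call such as `touch_keys("http://127.0.0.1:8098", "som-bucket", known_keys)` would then read each object once, where `known_keys` is whatever list of keys the application expects the query to return. The `fetch` parameter is injectable only so the helper can be exercised without a running cluster.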