Hello,

My understanding of M/R may well be wrong, but I am seeing a drastic performance difference when running a secondary index query over the PB (Protocol Buffers) interface, depending on the order of the map and reduce phases. If my understanding is correct, a reduce phase with riak_kv_mapreduce.reduce_identity is needed for a secondary index query. I added one map phase to return the value instead of the key.
However, if I send the reduce phase before the map phase, as in the first MapReduce JSON payload below, the values are returned much faster than with the opposite order. In my test it was 251 ms vs. 700 ms. Can anyone explain this behavior?

Reduce before map (faster)
--------------------------
{"inputs":{"index":"PERFTEST_INDEX_NAME_bin","bucket":"_ITEST_SI_BUCKET","key":"PERFTEST_INDEX_VALUE"},"query":[{"reduce":{"arg":"{reduce_phase_only_1, true}","module":"riak_kv_mapreduce","language":"erlang","keep":false,"function":"reduce_identity"}},{"map":{"source":"function(value,keyData,arg){ return [value.values[0].data]; }","language":"javascript","keep":true}}]}

Map before reduce (slower)
--------------------------
{"inputs":{"index":"PERFTEST_INDEX_NAME_bin","bucket":"_ITEST_SI_BUCKET","key":"PERFTEST_INDEX_VALUE"},"query":[{"map":{"source":"function(value,keyData,arg){ return [value.values[0].data]; }","language":"javascript","keep":true}},{"reduce":{"arg":"{reduce_phase_only_1, true}","module":"riak_kv_mapreduce","language":"erlang","keep":false,"function":"reduce_identity"}}]}
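In case it helps anyone reproduce the comparison, here is a rough sketch of how the two payloads could be generated from the same phase definitions and timed over repeated runs, so that a single 251 ms vs. 700 ms sample is not the only data point. The helper names (build_job, median_ms) and the repeat count are my own assumptions, and the callable passed to median_ms is a placeholder for whatever client call actually submits the job.

```python
import json
import statistics
import time

# Inputs and phases copied from the two payloads above.
INPUTS = {
    "index": "PERFTEST_INDEX_NAME_bin",
    "bucket": "_ITEST_SI_BUCKET",
    "key": "PERFTEST_INDEX_VALUE",
}

REDUCE_PHASE = {
    "reduce": {
        "language": "erlang",
        "module": "riak_kv_mapreduce",
        "function": "reduce_identity",
        "arg": "{reduce_phase_only_1, true}",
        "keep": False,
    }
}

MAP_PHASE = {
    "map": {
        "language": "javascript",
        "source": "function(value,keyData,arg){ return [value.values[0].data]; }",
        "keep": True,
    }
}


def build_job(phases):
    """Serialize a MapReduce job with the given phase order (hypothetical helper)."""
    return json.dumps({"inputs": INPUTS, "query": list(phases)})


def median_ms(run, repeats=20):
    """Call run() `repeats` times and return the median wall-clock time in ms.

    `run` is a placeholder: any zero-argument callable that submits one job.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# The two orderings differ only in the order of the "query" list.
REDUCE_THEN_MAP = build_job([REDUCE_PHASE, MAP_PHASE])
MAP_THEN_REDUCE = build_job([MAP_PHASE, REDUCE_PHASE])
```

To use it, wrap the client call that sends REDUCE_THEN_MAP (or MAP_THEN_REDUCE) in a function and pass that function to median_ms. Comparing medians rather than single runs should show whether the gap is consistent.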