Hello
My understanding of M/R may be wrong, but I am seeing a drastic 
performance difference when running a secondary index query over PB, depending 
on the order of the map and reduce phases.
As far as I understand, a reduce phase with 
riak_kv_mapreduce.reduce_identity is needed for a secondary index query. I added 
one map phase to return the value instead of the key.

However, if I send the reduce before the map, as shown in the MapReduce JSON 
payloads below, the values are returned much faster than the other way around. 
In my test it was 251 ms vs 700 ms. Can anyone explain this behavior?

Reduce before map (Faster)
-------
{"inputs":{"index":"PERFTEST_INDEX_NAME_bin","bucket":"_ITEST_SI_BUCKET","key":"PERFTEST_INDEX_VALUE"},"query":[{"reduce":{"arg":"{reduce_phase_only_1, true}","module":"riak_kv_mapreduce","language":"erlang","keep":false,"function":"reduce_identity"}},{"map":{"source":"function(value,keyData,arg){ return [value.values[0].data]; }","language":"javascript","keep":true}}]}

Map before reduce (Slower)
--------------
{"inputs":{"index":"PERFTEST_INDEX_NAME_bin","bucket":"_ITEST_SI_BUCKET","key":"PERFTEST_INDEX_VALUE"},"query":[{"map":{"source":"function(value,keyData,arg){ return [value.values[0].data]; }","language":"javascript","keep":true}},{"reduce":{"arg":"{reduce_phase_only_1, true}","module":"riak_kv_mapreduce","language":"erlang","keep":false,"function":"reduce_identity"}}]}
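
To make the comparison easier to reproduce, here is a minimal Python sketch that builds the two payloads from shared pieces, so that phase order is the only difference, and averages the timing over several runs. The helper names (build_payload, time_query) and the `send` callable are hypothetical: `send` stands in for whatever client call actually submits the job, and it is not part of any Riak client API.

```python
import json
import time

# Same inputs as in the payloads above.
INPUTS = {
    "index": "PERFTEST_INDEX_NAME_bin",
    "bucket": "_ITEST_SI_BUCKET",
    "key": "PERFTEST_INDEX_VALUE",
}

MAP_PHASE = {"map": {
    "source": "function(value,keyData,arg){ return [value.values[0].data]; }",
    "language": "javascript",
    "keep": True,
}}

REDUCE_PHASE = {"reduce": {
    "arg": "{reduce_phase_only_1, true}",
    "module": "riak_kv_mapreduce",
    "language": "erlang",
    "keep": False,
    "function": "reduce_identity",
}}


def build_payload(reduce_first):
    """Return the job as a JSON string; only the phase order varies."""
    if reduce_first:
        phases = [REDUCE_PHASE, MAP_PHASE]
    else:
        phases = [MAP_PHASE, REDUCE_PHASE]
    return json.dumps({"inputs": INPUTS, "query": phases})


def time_query(send, payload, runs=10):
    """Average wall-clock milliseconds over `runs` calls of send(payload).

    `send` is a hypothetical callable supplied by the caller.
    """
    start = time.perf_counter()
    for _ in range(runs):
        send(payload)
    return (time.perf_counter() - start) * 1000.0 / runs
```

Averaging over several runs should help rule out one-off effects such as a cold cache when comparing the two orders.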

