On Wed, Apr 4, 2012 at 11:28 AM, Michael Radford <m...@blorf.com> wrote:
> Aha, I just noticed that the native erlang client is still using
> luke_flow to implement its map-reduce, rather than riak_pipe. On some
> level, this must be the reason for the differing behavior...either a
> bug in riak_pipe, or a bug in the usage of riak_pipe somewhere in the
> chain?

Maybe not "bug", but "naïveté", I think, is a pretty good bet.

https://github.com/basho/riak_kv/blob/master/src/riak_kv_mrc_pipe.erl#L551-L593

This is a behavior that we changed for list_keys just before 1.0 and for 2i in 1.1. The implementation is naïve: the query sends all of its results to this one process, which then enqueues them in the pipe one at a time.

Switching to a model where this queueing is done in parallel (as in riak_kv_pipe_listkeys and …_index) reduces run time dramatically, because there's no need to hold up enqueuing an input on node X while an input is being enqueued on node Y. The system is fluid enough that such serialization often means that each stage of the pipeline is processing ~1 input at a time, in aggregate, instead of ~($Partitions) inputs at a time.

This *might* be the wrong intuition for Search, since there is funneling happening to process the query anyway, but it's likely a good place to start.

-Bryan
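
To put rough numbers on the serialization point, here is a toy model in Python. It is not riak_pipe code: the function names, the fixed per-input enqueue cost, and the 8-partition workload are all assumptions made up for illustration, and it ignores network latency and backpressure entirely.

```python
def serialized_enqueue_time(inputs_per_partition, enqueue_cost=1):
    """One coordinating process enqueues every input, one at a time,
    so the total time grows with the sum of all inputs."""
    return sum(inputs_per_partition.values()) * enqueue_cost


def parallel_enqueue_time(inputs_per_partition, enqueue_cost=1):
    """Each partition enqueues its own inputs without waiting on the
    others, so the total time is set by the busiest partition."""
    return max(inputs_per_partition.values(), default=0) * enqueue_cost


# Hypothetical workload: 8 partitions with 100 inputs each.
workload = {partition: 100 for partition in range(8)}

print(serialized_enqueue_time(workload))  # 800
print(parallel_enqueue_time(workload))    # 100
```

Under these assumptions the gap is a factor of the partition count, which matches the "~1 input at a time" versus "~($Partitions) inputs at a time" estimate. A real cluster would land somewhere in between, depending on how evenly the inputs are spread across partitions.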