On Wed, Nov 30, 2011 at 8:36 AM, Kresten Krab Thorup <k...@trifork.com> wrote:> 1. How do I create the initial inputs? i.e. the list of all {Index, Node} pairs that go into the riak_kv_pipe_listkeys fitting. Does this fitting need a special chashfun to send it to the right vnode? The easiest way to get the "go" message to all vnodes is to use the riak_pipe_qcover_* modules, as is done here: https://github.com/basho/riak_kv/blob/master/src/riak_kv_pipe_listkeys.erl#L135 Most of your arguments will be the same as is seen there: [{raw, %% used send (!) for done, not gen_*/luke/etc. calls ReqId, %% a unique integer for the request self()}, %% send done and errors to this process [LKP, %% the pipe with listkeys as its head fitting Bucket, %% the name of the bucket to list NVal]] %% the N to use in coverage calculation, 1 in your case It would be difficult to do the same with a special chashfun, since the input that listkeys expects does not contain anything naming the vnode for your chashfun to work with (every vnode gets the same input). > 2. Given such a pipeline-based thingie installed as .beam files with Riak, is > there a way to "invoke" it via the HTTP M/R API? It would be great if I > don't have to poke a new whole through tcp/ip to exploit pipe. Sadly, no, there is not a way to use this special setup over the HTTP MR API at this time. I have been working on some designs for external pipe control, but they are not yet ready for implementation. So, yes, you'll have to poke a new whole through for now. > 3. Does riak_pipe run a fitting instance per node, or per vnode? Pipe runs a fitting worker per vnode. If you have ideas for how to make that clearer in riak_pipe's README.org, please pass them along! > FYI, ... I'm considering doing a disk-based version of riak_pipe_w_reduce, > which keeps the intermediate results in a local K/V store, in order to > support large keysets. This could just be a bitcask w/merge disabled. > Re-implementing the reducer would also allow us to evaluate the N>R condition > in the reducer, and emit results as early as possible. Sounds awesome and right on track, I must say. -Bryan
_______________________________________________ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com