Hi Matt,

If you have a complicated mapreduce job with multiple phases implemented 
in JavaScript, you will most likely see a lot of contention for the JavaScript 
VMs, which can slow phases down and lead to timeouts. While you can tune the 
configuration [1], you may find that you need a very large pool size to 
properly support your job, especially for map phases, as these run in parallel.

The best way to speed up the mapreduce job and get around the VM pool 
contention is to implement the mapreduce functions in Erlang.
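
As a rough illustration of what that change looks like from the client side, 
here is a minimal Python sketch that builds the JSON body for a job posted to 
the HTTP /mapred endpoint, with every phase pointing at an Erlang function 
instead of JavaScript source. The helper name build_mapred_job, the key 
"example-key" and the timeout value are assumptions made for the example. 
riak_kv_mapreduce:map_object_value is one of the functions that ships with 
Riak; custom Erlang functions would need to be compiled and made loadable on 
every node first. Note that the optional "timeout" field applies to the whole 
job (in milliseconds), not to an individual phase.

```python
import json


def build_mapred_job(inputs, erlang_phases, timeout_ms=None):
    """Return the JSON body for a MapReduce job using Erlang phases only.

    inputs        -- a bucket name, or a list of [bucket, key] pairs
    erlang_phases -- list of (phase_type, module, function, keep) tuples,
                     where phase_type is "map" or "reduce"
    timeout_ms    -- optional job-level timeout in milliseconds
    (This helper is hypothetical; it only assembles the request body.)
    """
    query = []
    for phase_type, module, function, keep in erlang_phases:
        if phase_type not in ("map", "reduce"):
            raise ValueError("unsupported phase type: %r" % (phase_type,))
        query.append({
            phase_type: {
                "language": "erlang",  # run in the Erlang VM, no JS VM needed
                "module": module,
                "function": function,
                "keep": keep,          # whether this phase's output is returned
            }
        })
    job = {"inputs": inputs, "query": query}
    if timeout_ms is not None:
        job["timeout"] = timeout_ms
    return json.dumps(job)


# Example: one map phase that returns the stored value of a single object.
# The key is a placeholder; the 120 second timeout is an arbitrary choice.
body = build_mapred_job(
    inputs=[["cart-products", "example-key"]],
    erlang_phases=[("map", "riak_kv_mapreduce", "map_object_value", True)],
    timeout_ms=120000,
)
```

The resulting string can be sent with any HTTP client as an application/json 
POST to /mapred on a Riak node.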

Best regards,

Christian 

[1] http://docs.basho.com/riak/1.2.0/references/appendices/MapReduce-Implementation/#Configuration-Tuning-for-Javascript



--------------------
Christian Dahlqvist
Client Services Engineer
Basho Technologies
EMEA Office
E-mail: christ...@basho.com
Skype: c.dahlqvist
Mobile: +44 7890 590 910

On 8 Apr 2013, at 08:20, Matt Black <matt.bl...@jbadigital.com> wrote:

> Thanks for the reply, Christian.
> 
> I didn't explain well enough in my first post - the map reduce operation 
> merely loads a bunch of objects, and the Python script which makes the 
> connection to Riak then writes these objects to disk. (It's probably 
> obvious, but I'm using JavaScript and the Riak Python client.)
> 
> The query itself has many map phases where a composite object is built up 
> from related objects spread across many buckets.
> 
> I was hoping there may be some kind of timeout I could adjust on a per-map 
> phase basis - clutching at straws really.
> 
> Cheers
> Matt
> 
> 
> On 8 April 2013 17:14, Christian Dahlqvist <christ...@basho.com> wrote:
> Hi,
> 
> Without having access to the mapreduce functions you are running, I would 
> assume that a mapreduce job that both writes data to disk and deletes the 
> written record from Riak might be quite slow. This is not really a use case 
> mapreduce was designed for, and when a mapreduce job crashes or times out it 
> is difficult to know how far the processing of the individual records got.
> 
> I would therefore recommend considering running this type of archiving and 
> delete job as an external batch process instead as it will give you better 
> control over the execution and avoid timeout problems.
> 
> Best regards,
> 
> Christian
> 
> 
> 
> On 8 Apr 2013, at 00:49, Matt Black <matt.bl...@jbadigital.com> wrote:
> 
> > Dear list,
> >
> > I'm currently getting a timeout during a single phase of a multi-phase map 
> > reduce query. Is there anything I can do to assist this in running?
> >
> > Its purpose is to back up and remove objects from Riak, so it will run 
> > periodically during quiet times moving old data out of Riak into file 
> > storage.
> >
> > Traceback (most recent call last):
> >   File "./tools/rolling_backup.py", line 185, in <module>
> >     main()
> >   File "./tools/rolling_backup.py", line 181, in main
> >     args.func(**kwargs)
> >   File "/srv/backup/tools/mapreduce.py", line 295, in do_map_reduce
> >     raise e
> > Exception: 
> > {"phase":2,"error":"timeout","input":"[<<\"cart-products\">>,<<\"cd67d7f6e2688bc2089e6fa79506ac05-2\">>,{struct,[{<<\"uid\">>,<<\"cd67d7f6e2688bc2089e6fa79506ac05\">>},{<<\"cart\">>,{struct,[{<<\"expired_ts\">>,<<\"2013-03-05T19:12:23.906228\">>},{<<\"last_updated\">>,<<\"2013-03-05T19:12:23.906242\">>},{<<\"tags\">>,{struct,[{<<\"type\">>,<<\"AB\">>}]}},{<<\"completed\">>,false},{<<\"created\">>,<<\"2013-03-04T02:10:18.638413\">>},{<<\"products\">>,[{struct,[{<<\"cost\">>,0},{<<\"bundleName\">>,<<\"Product\">>},...]},...]},...]}},...]}]","type":"exit","stack":"[{riak_kv_w_reduce,'-js_runner/1-fun-0-',3,[{file,\"src/riak_kv_w_reduce.erl\"},{line,283}]},{riak_kv_w_reduce,reduce,3,[{file,\"src/riak_kv_w_reduce.erl\"},{line,206}]},{riak_kv_w_reduce,maybe_reduce,2,[{file,\"src/riak_kv_w_reduce.erl\"},{line,157}]},{riak_pipe_vnode_worker,process_input,3,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,444}]},{riak_pipe_vnode_worker,wait_for_input,2,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,376}]},{gen_fsm,handle_msg,7,[{file,\"gen_fsm.erl\"},{line,494}]},{proc_lib,...}]"}
> >
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
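
For the external batch process recommended earlier in this thread, a minimal 
Python sketch of the archive-then-delete loop might look like the following. 
It is deliberately independent of any particular Riak client: fetch and delete 
are hypothetical callables that you would wire up to your client's get and 
delete operations, and the JSON-lines archive format is just an assumption 
made for the example. The point it illustrates is the ordering: a record is 
only deleted after it has been flushed to disk, so a crash or timeout leaves 
you knowing exactly which records were processed.

```python
import json
import os


def archive_then_delete(keys, fetch, delete, archive_path):
    """Append each record to archive_path, then delete it from the store.

    fetch(key)  -- hypothetical callable returning a JSON-serialisable value,
                   or None if the key does not exist
    delete(key) -- hypothetical callable removing the key from the store
    Returns (archived_keys, missing_keys).
    """
    archived, missing = [], []
    with open(archive_path, "a", encoding="utf-8") as out:
        for key in keys:
            value = fetch(key)
            if value is None:
                missing.append(key)
                continue
            out.write(json.dumps({"key": key, "value": value}) + "\n")
            # Make sure the record is on disk before removing the original.
            out.flush()
            os.fsync(out.fileno())
            delete(key)
            archived.append(key)
    return archived, missing
```

Because every key is handled on its own, the script can be restarted after a 
failure and will simply report already-removed keys as missing.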
