Erlang: R13B04
Riak: 0.14.2

I am having the same issue as Jeremy.

I just ran 208 MapReduce jobs using anonymous JavaScript functions in the map and reduce phases. I am sending the MapReduce jobs to a single node, riak01. Out of the 208 jobs, two failed with "mapexec_error" {error,timeout} on riak02.

I read on the Basho wiki that the default timeout is 60 seconds:

http://wiki.basho.com/Loading-Data-and-Running-MapReduce-Queries.html

"Map/Reduce queries have a default timeout of 60000 milliseconds (60 seconds)."

I have found that if a MapReduce job does not complete within 10 seconds, it is likely having issues. Most MapReduce jobs complete in one to two seconds. I can try increasing the MapReduce timeout to 120 seconds, but I doubt that this will help. I have also found that if there are several timeouts, the beam process can terminate.

Any help would be appreciated.

The following is from the sasl-error.log on riak01.

=ERROR REPORT==== 21-Jun-2011::16:29:11 ===
** State machine <0.11130.0> terminating
** Last event in was {mapexec_error,{<<"46">>,'riak@10.0.60.209'},
                      {error,timeout}}
** When State == executing
** Data == {state,0,riak_kv_map_phase,
            {state,true,
             {javascript,
              {map,
               {jsanon, .................

=ERROR REPORT==== 21-Jun-2011::16:29:11 ===
** State machine <0.11127.0> terminating
** Last message in was {'EXIT',<0.11130.0>,{error,timeout}}
** When State == executing
** Data == {state,41465578,
            [<0.11130.0>,[<0.11129.0>,<0.11128.0>]],
            <0.10971.0>,66000,
            {1308688220159363,#Ref<0.0.0.198634>},
            #Fun<riak_kv_mapred_json.jsonify_not_found.1>,[],[]}
** Reason for termination =
** {error,{phase_error,{error,timeout}}}

=CRASH REPORT==== 21-Jun-2011::16:29:11 ===
  crasher:
    initial call: luke_flow:init/1
    pid: <0.11127.0>
    registered_name: []
    exception exit: {error,{phase_error,{error,timeout}}}
      in function gen_fsm:terminate/7
      in call from proc_lib:init_p_do_apply/3
    ancestors: [luke_flow_sup,luke_sup,<0.91.0>]
    messages: []
    links: [<0.11128.0>,<0.11129.0>,<0.93.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 233
    stack_size: 24
    reductions: 23099
  neighbours:
    neighbour: [{pid,<0.11129.0>},{registered_name,[]},{initial_call,{luke_phase,init,[Argument__1]}},{current_function,{gen_fsm,loop,7}},{ancestors,[luke_phase_sup,luke_sup,<0.91.0>]},{messages,[]},{links,[<0.11127.0>,<0.11128.0>,<0.94.0>]},{dictionary,[]},{trap_exit,false},{status,waiting},{heap_size,6765},{stack_size,10},{reductions,4926}]
    neighbour: [{pid,<0.11128.0>},{registered_name,[]},{initial_call,{luke_phase,init,[Argument__1]}},{current_function,{gen_fsm,loop,7}},{ancestors,[luke_phase_sup,luke_sup,<0.91.0>]},{messages,[]},{links,[<0.11127.0>,<0.11129.0>,<0.94.0>]},{dictionary,[]},{trap_exit,false},{status,waiting},{heap_size,4181},{stack_size,10},{reductions,4905}]

The second timeout error:

=ERROR REPORT==== 21-Jun-2011::16:31:10 ===
** State machine <0.15144.0> terminating
** Last message in was flow_timeout
** When State == executing
** Data == {state,78575179,
            [<0.15147.0>,[<0.15146.0>,<0.15145.0>]],
            <0.15118.0>,66000,
            {1308688285727293,#Ref<0.0.1.11874>},
            #Fun<riak_kv_mapred_json.jsonify_not_found.1>,[],[]}
** Reason for termination =
** {error,flow_timeout}

=CRASH REPORT==== 21-Jun-2011::16:31:10 ===
  crasher:
    initial call: luke_flow:init/1
    pid: <0.15144.0>
    registered_name: []
    exception exit: {error,flow_timeout}
      in function gen_fsm:terminate/7
      in call from proc_lib:init_p_do_apply/3
    ancestors: [luke_flow_sup,luke_sup,<0.91.0>]
    messages: []
    links: [<0.15145.0>,<0.15147.0>,<0.15146.0>,<0.93.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 233
    stack_size: 24
    reductions: 20791
  neighbours:
    neighbour: [{pid,<0.15146.0>},{registered_name,[]},{initial_call,{luke_phase,init,[Argument__1]}},{current_function,{gen_fsm,loop,7}},{ancestors,[luke_phase_sup,luke_sup,<0.91.0>]},{messages,[]},{links,[<0.15144.0>,<0.15145.0>,<0.94.0>]},{dictionary,[]},{trap_exit,false},{status,waiting},{heap_size,4181},{stack_size,10},{reductions,6554}]
    neighbour: [{pid,<0.15145.0>},{registered_name,[]},{initial_call,{luke_phase,init,[Argument__1]}},{current_function,{gen_fsm,loop,7}},{ancestors,[luke_phase_sup,luke_sup,<0.91.0>]},{messages,[]},{links,[<0.15144.0>,<0.15146.0>,<0.94.0>]},{dictionary,[]},{trap_exit,false},{status,waiting},{heap_size,4181},{stack_size,10},{reductions,6274}]

=SUPERVISOR REPORT==== 21-Jun-2011::16:31:10 ===
     Supervisor: {local,luke_flow_sup}
     Context:    child_terminated
     Reason:     {error,flow_timeout}
     Offender:   [{pid,<0.15144.0>},{name,undefined},{mfa,{luke_flow,start_link,[<0.15118.0>,78575179,[{riak_kv_map_phase,[],[{javascript,{map,{jsanon,<<"function(value,keyData, ................

David
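
For reference, here is a minimal sketch of how the 120-second timeout mentioned above could be set on a job. It assumes the job is submitted as a JSON document and that the timeout is given in milliseconds in a top-level "timeout" field. The bucket name and both function bodies are hypothetical placeholders, not the ones from the failing jobs, and the sketch only builds the request body without sending it.

```python
import json


def build_mapreduce_request(bucket, map_source, reduce_source, timeout_ms=60000):
    """Return a JSON MapReduce job body with an explicit timeout.

    Assumption: the timeout is expressed in milliseconds in a top-level
    "timeout" field of the job document.
    """
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be positive")
    job = {
        "inputs": bucket,
        "query": [
            {"map": {"language": "javascript", "source": map_source}},
            {"reduce": {"language": "javascript", "source": reduce_source}},
        ],
        "timeout": timeout_ms,
    }
    return json.dumps(job)


# Hypothetical anonymous functions: emit 1 per object, then sum the ones.
MAP_SOURCE = "function(value, keyData, arg) { return [1]; }"
REDUCE_SOURCE = (
    "function(values, arg) {"
    " return [values.reduce(function(a, b) { return a + b; }, 0)]; }"
)

# 120 seconds expressed in milliseconds.
body = build_mapreduce_request(
    "example_bucket", MAP_SOURCE, REDUCE_SOURCE, timeout_ms=120 * 1000
)
print(body)
```

Since most jobs here finish in one to two seconds, a longer timeout would probably only delay the error rather than prevent it, but setting it explicitly makes it easier to tell which limit is actually being hit.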
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com