Erlang: R13B04
Riak: 0.14.2

I am having the same issue as Jeremy.

I just ran 208 MapReduce jobs using anonymous JavaScript functions in the map 
and reduce phases.  I sent all of the jobs to a single node, riak01.  
Out of the 208 jobs, two failed with a "mapexec_error" {error,timeout} on riak02.

I read on the Basho wiki that the default timeout is 60 seconds:
http://wiki.basho.com/Loading-Data-and-Running-MapReduce-Queries.html
"Map/Reduce queries have a default timeout of 60000 milliseconds (60 seconds)."

I have found that if a MapReduce job does not complete within 10 seconds, 
it is likely having problems.  Most MapReduce jobs complete in one to two 
seconds.  I can try increasing the MapReduce timeout to 120 seconds, but I 
doubt that this will help.
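
For reference, a minimal sketch of how a 120-second timeout might be set on a 
single job.  It assumes the job is submitted as a JSON document and that a 
top-level "timeout" field, in milliseconds, overrides the 60000 ms default.  
The bucket name and the function bodies are placeholders, not the jobs that 
produced the errors below.

```python
import json


def build_mapred_job(inputs, map_source, reduce_source, timeout_ms=120000):
    """Return the JSON body for a two-phase JavaScript MapReduce job.

    Assumption: a top-level "timeout" field (milliseconds) overrides the
    default of 60000 ms for this job only.
    """
    job = {
        "inputs": inputs,
        "query": [
            {"map": {"language": "javascript", "source": map_source}},
            {"reduce": {"language": "javascript", "source": reduce_source}},
        ],
        "timeout": timeout_ms,
    }
    return json.dumps(job)


# Placeholder bucket and anonymous functions, for illustration only.
body = build_mapred_job(
    "example_bucket",
    "function(value, keyData, arg) { return [1]; }",
    "function(values, arg) {"
    " return [values.reduce(function(a, b) { return a + b; }, 0)]; }",
)
print(body)
```

The resulting string is what would be sent as the request body of the job.  
If most jobs finish in one to two seconds, a longer timeout only changes how 
long a stuck job waits before it fails.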

I have also found that after several timeouts, the beam process can 
terminate.

Any help would be appreciated.

The following is from the sasl-error.log on riak01.

=ERROR REPORT==== 21-Jun-2011::16:29:11 ===
** State machine <0.11130.0> terminating
** Last event in was {mapexec_error,{<<"46">>,'riak@10.0.60.209'},
                                    {error,timeout}}
** When State == executing
**      Data  == {state,0,riak_kv_map_phase,
                  {state,true,
                   {javascript,
                    {map,
                     {jsanon,
.................

=ERROR REPORT==== 21-Jun-2011::16:29:11 ===
** State machine <0.11127.0> terminating
** Last message in was {'EXIT',<0.11130.0>,{error,timeout}}
** When State == executing
**      Data  == {state,41465578,
                        [<0.11130.0>,[<0.11129.0>,<0.11128.0>]],
                        <0.10971.0>,66000,
                        {1308688220159363,#Ref<0.0.0.198634>},
                        #Fun<riak_kv_mapred_json.jsonify_not_found.1>,[],[]}
** Reason for termination =
** {error,{phase_error,{error,timeout}}}

=CRASH REPORT==== 21-Jun-2011::16:29:11 ===
  crasher:
    initial call: luke_flow:init/1
    pid: <0.11127.0>
    registered_name: []
    exception exit: {error,{phase_error,{error,timeout}}}
      in function  gen_fsm:terminate/7
      in call from proc_lib:init_p_do_apply/3
    ancestors: [luke_flow_sup,luke_sup,<0.91.0>]
    messages: []
    links: [<0.11128.0>,<0.11129.0>,<0.93.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 233
    stack_size: 24
    reductions: 23099
  neighbours:
    neighbour: 
[{pid,<0.11129.0>},{registered_name,[]},{initial_call,{luke_phase,init,[Argument__1]}},{current_function,{gen_fsm,loop,7}},{ancestors,[luke_phase_sup,luke_sup,<0.91.0>]},{messages,[]},{links,[<0.11127.0>,<0.11128.0>,<0.94.0>]},{dictionary,[]},{trap_exit,false},{status,waiting},{heap_size,6765},{stack_size,10},{reductions,4926}]
    neighbour: 
[{pid,<0.11128.0>},{registered_name,[]},{initial_call,{luke_phase,init,[Argument__1]}},{current_function,{gen_fsm,loop,7}},{ancestors,[luke_phase_sup,luke_sup,<0.91.0>]},{messages,[]},{links,[<0.11127.0>,<0.11129.0>,<0.94.0>]},{dictionary,[]},{trap_exit,false},{status,waiting},{heap_size,4181},{stack_size,10},{reductions,4905}]

The second timeout error:

=ERROR REPORT==== 21-Jun-2011::16:31:10 ===
** State machine <0.15144.0> terminating
** Last message in was flow_timeout
** When State == executing
**      Data  == {state,78575179,
                        [<0.15147.0>,[<0.15146.0>,<0.15145.0>]],
                        <0.15118.0>,66000,
                        {1308688285727293,#Ref<0.0.1.11874>},
                        #Fun<riak_kv_mapred_json.jsonify_not_found.1>,[],[]}
** Reason for termination =
** {error,flow_timeout}

=CRASH REPORT==== 21-Jun-2011::16:31:10 ===
  crasher:
    initial call: luke_flow:init/1
    pid: <0.15144.0>
    registered_name: []
    exception exit: {error,flow_timeout}
      in function  gen_fsm:terminate/7
      in call from proc_lib:init_p_do_apply/3
    ancestors: [luke_flow_sup,luke_sup,<0.91.0>]
    messages: []
    links: [<0.15145.0>,<0.15147.0>,<0.15146.0>,<0.93.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 233
    stack_size: 24
    reductions: 20791
  neighbours:
    neighbour: 
[{pid,<0.15146.0>},{registered_name,[]},{initial_call,{luke_phase,init,[Argument__1]}},{current_function,{gen_fsm,loop,7}},{ancestors,[luke_phase_sup,luke_sup,<0.91.0>]},{messages,[]},{links,[<0.15144.0>,<0.15145.0>,<0.94.0>]},{dictionary,[]},{trap_exit,false},{status,waiting},{heap_size,4181},{stack_size,10},{reductions,6554}]
    neighbour: 
[{pid,<0.15145.0>},{registered_name,[]},{initial_call,{luke_phase,init,[Argument__1]}},{current_function,{gen_fsm,loop,7}},{ancestors,[luke_phase_sup,luke_sup,<0.91.0>]},{messages,[]},{links,[<0.15144.0>,<0.15146.0>,<0.94.0>]},{dictionary,[]},{trap_exit,false},{status,waiting},{heap_size,4181},{stack_size,10},{reductions,6274}]

=SUPERVISOR REPORT==== 21-Jun-2011::16:31:10 ===
     Supervisor: {local,luke_flow_sup}
     Context:    child_terminated
     Reason:     {error,flow_timeout}
     Offender:   
[{pid,<0.15144.0>},{name,undefined},{mfa,{luke_flow,start_link,[<0.15118.0>,78575179,[{riak_kv_map_phase,[],[{javascript,{map,{jsanon,<<"function(value,keyData,
................
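
To see how often these reports occur across a whole log file, something like 
the following sketch might help.  It only assumes that each report starts with 
a header line of the form shown above ("=ERROR REPORT==== date::time ==="); 
the sample text is a shortened stand-in for a real log, and the function name 
is made up for this example.

```python
import re
from collections import Counter

# Matches header lines such as "=ERROR REPORT==== 21-Jun-2011::16:29:11 ===".
HEADER = re.compile(r"^=([A-Z]+) REPORT==== (\S+) ===\s*$", re.MULTILINE)


def count_reports(log_text):
    """Count reports per (type, timestamp) pair found in the log text."""
    return Counter(HEADER.findall(log_text))


# Shortened stand-in for a real log; only the header lines matter here.
sample = """\
=ERROR REPORT==== 21-Jun-2011::16:29:11 ===
** State machine <0.11130.0> terminating
=ERROR REPORT==== 21-Jun-2011::16:29:11 ===
=CRASH REPORT==== 21-Jun-2011::16:29:11 ===
=ERROR REPORT==== 21-Jun-2011::16:31:10 ===
=CRASH REPORT==== 21-Jun-2011::16:31:10 ===
=SUPERVISOR REPORT==== 21-Jun-2011::16:31:10 ===
"""

counts = count_reports(sample)
for (kind, stamp), n in sorted(counts.items()):
    print(kind, stamp, n)
```

Grouping by timestamp makes it easier to line up a burst of reports with the 
MapReduce job that was running at that moment.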

David

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com