Hi,

We have been using Riak to gather our test data and analyze the results after the tests complete. Recently we have observed Riak crashes in the Riak console logs. When this happens, our tests fail to record data to Riak and bail out :-(
The crash logs are as follows:

2016-02-19 16:25:26.255 [error] <0.2160.0> gen_fsm <0.2160.0> in state active terminated with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195
2016-02-19 16:25:26.260 [error] <0.2160.0> CRASH REPORT Process <0.2160.0> with 2 neighbours exited with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in gen_fsm:terminate/7 line 622
2016-02-19 16:25:26.260 [error] <0.172.0> Supervisor riak_core_vnode_sup had child undefined started with {riak_core_vnode,start_link,undefined} at <0.2160.0> exit with reason no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in context child_terminated
2016-02-19 16:25:26.261 [error] <0.4319.0> gen_fsm <0.4319.0> in state ready terminated with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195
2016-02-19 16:25:26.275 [error] <0.4319.0> CRASH REPORT Process <0.4319.0> with 10 neighbours exited with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in gen_fsm:terminate/7 line 622
2016-02-19 16:25:26.278 [error] <0.4320.0> Supervisor {<0.4320.0>,poolboy_sup} had child riak_core_vnode_worker started with riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[268322566228720457638957762256505085639956365312,...]},...]) at undefined exit with reason no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in context shutdown_error
2016-02-19 16:25:26.278 [error] <0.4320.0> gen_server <0.4320.0> terminated with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195
2016-02-19 16:25:26.278 [error] <0.4320.0> CRASH REPORT Process <0.4320.0> with 0 neighbours exited with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in gen_server:terminate/6 line 744
2016-02-19 16:25:26.806 [error] <0.2157.0> gen_fsm <0.2157.0> in state active terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
2016-02-19 16:25:26.808 [error] <0.2157.0> CRASH REPORT Process <0.2157.0> with 2 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_fsm:terminate/7 line 600
2016-02-19 16:25:26.809 [error] <0.5450.0> gen_fsm <0.5450.0> in state ready terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
2016-02-19 16:25:26.809 [error] <0.172.0> Supervisor riak_core_vnode_sup had child undefined started with {riak_core_vnode,start_link,undefined} at <0.2157.0> exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} in context child_terminated
2016-02-19 16:25:26.809 [error] <0.5450.0> CRASH REPORT Process <0.5450.0> with 10 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_fsm:terminate/7 line 622
2016-02-19 16:25:26.809 [error] <0.5451.0> Supervisor {<0.5451.0>,poolboy_sup} had child riak_core_vnode_worker started with riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[211232658520482062396626323478525280184646500352,...]},...]) at undefined exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} in context shutdown_error
2016-02-19 16:25:26.809 [error] <0.5451.0> gen_server <0.5451.0> terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
2016-02-19 16:25:26.809 [error] <0.5451.0> CRASH REPORT Process <0.5451.0> with 0 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_server:terminate/6 line 744

Our setup is as follows. We have a Riak cluster with 10 nodes, and each node is configured as follows:

RAM: 48GB
Disk: 80GB (/), 504GB (separate Riak partition)
Riak Version: 2.1.3-1 (2.1.3)
Data in Riak: after observing the crash, total data in the Riak partition was ~50GB

The Riak config is as follows:

riak.conf: [Attached with this email]

advanced.config:

[
 {riak_kv, [{add_paths, ["/usr/local/lib/scale_riak/ebin"]}]},
 {webmachine, [{backlog, 511}, {nodelay, true}]},
 {yokozuna, [{solr_request_timeout, 120000}]}
].

We have observed this a few times now. After each crash, latency increases and our application starts timing out. We would really like to understand what might be causing this crash, and if it is due to a missing setting on our nodes, we would like to fix it.

Thanks for your help in advance :-)

Regards,
Raviraj