There are no other errors in any of the logs at exactly the same time, but error.log and console.log contain periodic errors of the following form, occurring within seconds before and after the crash:
2014-10-02 12:07:57.623 [error] <0.30125.18> CRASH REPORT Process <0.30125.18> with 10 neighbours exited with reason: no case clause matching "immediate" in riak_kv_vnode:do_delete/3 line 1321 in gen_fsm:terminate/7 line 611
2014-10-02 12:07:57.623 [error] <0.30127.18> Supervisor {<0.30127.18>,poolboy_sup} had child riak_core_vnode_worker started with riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[228359630832953580969325755111919221821239459840,...]},...]) at undefined exit with reason no case clause matching "immediate" in riak_kv_vnode:do_delete/3 line 1321 in context shutdown_error
2014-10-02 12:07:57.623 [error] <0.30127.18> gen_server <0.30127.18> terminated with reason: no case clause matching "immediate" in riak_kv_vnode:do_delete/3 line 1321

crash.log contains:

2014-10-02 12:07:57 =ERROR REPORT====
** State machine <0.30125.18> terminating
** Last message in was {'EXIT',<0.30124.18>,{{case_clause,"immediate"},[{riak_kv_vnode,do_delete,3,[{file,"src/riak_kv_vnode.erl"},{line,1321}]},{riak_core_vnode,vnode_command,3,[{file,"src/riak_core_vnode.erl"},{line,299}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,494}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}}
** When State == ready
** Data == {state,<0.30127.18>,{[<0.30129.18>,<0.30128.18>,<0.30138.18>,<0.30137.18>],[<0.30130.18>,<0.30131.18>,<0.30132.18>,<0.30134.18>,<0.30135.18>,<0.30136.18>]},{[],[]},295960692,10,0,0}
** Reason for termination =
** {{case_clause,"immediate"},[{riak_kv_vnode,do_delete,3,[{file,"src/riak_kv_vnode.erl"},{line,1321}]},{riak_core_vnode,vnode_command,3,[{file,"src/riak_core_vnode.erl"},{line,299}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,494}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}

2014-10-02 12:07:57 =CRASH REPORT====
  crasher:
    initial call: poolboy:init/1
    pid: <0.30125.18>
    registered_name: []
    exception exit: {{{case_clause,"immediate"},[{riak_kv_vnode,do_delete,3,[{file,"src/riak_kv_vnode.erl"},{line,1321}]},{riak_core_vnode,vnode_command,3,[{file,"src/riak_core_vnode.erl"},{line,299}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,494}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]},[{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,611}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
    ancestors: [<0.30124.18>,<0.30117.18>,riak_core_vnode_sup,riak_core_sup,<0.157.0>]
    messages: []
    links: [<0.30134.18>,<0.30136.18>,<0.30137.18>,<0.30138.18>,<0.30135.18>,<0.30129.18>,<0.30131.18>,<0.30132.18>,<0.30130.18>,<0.30127.18>,<0.30128.18>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 987
    stack_size: 24
    reductions: 403
  neighbours:
    neighbour: [{pid,<0.30128.18>},{registered_name,[]},{initial_call,{riak_core_vnode_worker,init,['Argument__1']}},{current_function,{gen_server,loop,6}},{ancestors,[<0.30127.18>,<0.30125.18>,<0.30124.18>,<0.30117.18>,riak_core_vnode_sup,riak_core_sup,<0.157.0>]},{messages,[]},{links,[<0.30127.18>,<0.30125.18>]},{dictionary,[]},{trap_exit,false},{status,waiting},{heap_size,610},{stack_size,9},{reductions,526}]
    neighbour: [{pid,<0.30130.18>},{registered_name,[]},{initial_call,{riak_core_vnode_worker,init,['Argument__1']}},{current_function,{gen_server,loop,6}},{ancestors,[<0.30127.18>,<0.30125.18>,<0.30124.18>,<0.30117.18>,riak_core_vnode_sup,riak_core_sup,<0.157.0>]},{messages,[]},{links,[<0.30127.18>,<0.30125.18>]},{dictionary,[]},{trap_exit,false},{status,waiting},{heap_size,233},{stack_size,9},{reductions,27}]
    ...

2014-10-02 12:07:57 =SUPERVISOR REPORT====
     Supervisor: {<0.30127.18>,poolboy_sup}
     Context:    shutdown_error
     Reason:     {{case_clause,"immediate"},[{riak_kv_vnode,do_delete,3,[{file,"src/riak_kv_vnode.erl"},{line,1321}]},{riak_core_vnode,vnode_command,3,[{file,"src/riak_core_vnode.erl"},{line,299}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,494}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
     Offender:   [{nb_children,10},{name,riak_core_vnode_worker},{mfargs,{riak_core_vnode_worker,start_link,[[{worker_module,riak_core_vnode_worker},{worker_args,[228359630832953580969325755111919221821239459840,[],worker_props,<0.30124.18>]},{worker_callback_mod,riak_kv_worker},{size,10},{max_overflow,0}]]}},{restart_type,temporary},{shutdown,5000},{child_type,worker}]

2014-10-02 12:07:57 =ERROR REPORT====
** Generic server <0.30127.18> terminating
** Last message in was {'EXIT',<0.30125.18>,{{case_clause,"immediate"},[{riak_kv_vnode,do_delete,3,[{file,"src/riak_kv_vnode.erl"},{line,1321}]},{riak_core_vnode,vnode_command,3,[{file,"src/riak_core_vnode.erl"},{line,299}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,494}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}}
** When Server state == {state,{<0.30127.18>,poolboy_sup},simple_one_for_one,[{child,undefined,riak_core_vnode_worker,{riak_core_vnode_worker,start_link,[[{worker_module,riak_core_vnode_worker},{worker_args,[228359630832953580969325755111919221821239459840,[],worker_props,<0.30124.18>]},{worker_callback_mod,riak_kv_worker},{size,10},{max_overflow,0}]]},temporary,5000,worker,[riak_core_vnode_worker]}],{set,10,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[<0.30131.18>],[<0.30132.18>],[],[<0.30134.18>],[<0.30135.18>],[<0.30136.18>],[<0.30137.18>],[<0.30138.18>],[],[],[],[],[],[<0.30128.18>],[<0.30129.18>],[<0.30130.18>]}}},0,1,[],poolboy_sup,{riak_core_vnode_worker,[{worker_module,riak_core_vnode_worker},{worker_args,[228359630832953580969325755111919221821239459840,[],worker_props,<0.30124.18>]},{worker_callback_mod,riak_kv_worker},{size,10},{max_overflow,0}]}}
** Reason for termination ==
** {{case_clause,"immediate"},[{riak_kv_vnode,do_delete,3,[{file,"src/riak_kv_vnode.erl"},{line,1321}]},{riak_core_vnode,vnode_command,3,[{file,"src/riak_core_vnode.erl"},{line,299}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,494}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}

2014-10-02 12:07:57 =CRASH REPORT====
  crasher:
    initial call: supervisor:poolboy_sup/1
    pid: <0.30127.18>
    registered_name: []
    exception exit: {{{case_clause,"immediate"},[{riak_kv_vnode,do_delete,3,[{file,"src/riak_kv_vnode.erl"},{line,1321}]},{riak_core_vnode,vnode_command,3,[{file,"src/riak_core_vnode.erl"},{line,299}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,494}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]},[{gen_server,terminate,6,[{file,"gen_server.erl"},{line,747}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
    ancestors: [<0.30125.18>,<0.30124.18>,<0.30117.18>,riak_core_vnode_sup,riak_core_sup,<0.157.0>]
    messages: []
    links: []
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 4181
    stack_size: 24
    reductions: 1530
  neighbours:

On Thu, Oct 2, 2014 at 12:20 PM, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
> Thanks. Are there entries in any of the other logs? (like the crash dump).
>
> Can you also provide more info on the nodes themselves. What size AWS
> instances are you running? Is the delete timeout happening while load
> testing?
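For what it's worth, `{case_clause,"immediate"}` in the traces above means riak_kv_vnode:do_delete/3 received the Erlang string "immediate" where it expects an atom. One plausible cause (an assumption on my part, not something these logs prove) is a quoted delete_mode value in app.config, since a double-quoted value in Erlang config is a string, not an atom. A sketch of the suspect setting and its fix:

```erlang
%% Hypothetical app.config excerpt (riak_kv section).
%% A double-quoted value is an Erlang string (a list of characters),
%% which matches none of the atom clauses in riak_kv_vnode:do_delete/3:
%%
%%   {delete_mode, "immediate"}   %% crashes with {case_clause,"immediate"}
%%
%% The documented settings are an atom or a delay tuple:
{riak_kv, [
    {delete_mode, immediate}      %% or keep, or {delay, 3000} (the default)
]}
```

If a quoted value like that is present, correcting it and restarting the node should stop the vnode worker crashes; the 503 "request timed out" responses on DELETE would then be the client-visible symptom of those crashes.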
>
> On Thu, Oct 2, 2014 at 12:11 PM, Igor Senderovich <isenderov...@esperdyne.com> wrote:
>
>> Thanks for your help, Dmitri,
>>
>> I get the following in error.log:
>> 2014-10-02 12:05:45.037 [error] <0.6359.19> Webmachine error at path "/buckets/imc/keys/5134a18660494ea5553d2c90ef9eea2f" : "Service Unavailable"
>>
>> And no, there is no load balancer on our cluster.
>> Thank you
>>
>> On Thu, Oct 2, 2014 at 11:52 AM, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
>>
>>> One other question - are you using a load balancer for your cluster
>>> (like HAProxy or the like). In which case, take a look at its logs, also.
>>>
>>> On Thu, Oct 2, 2014 at 11:51 AM, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
>>>
>>>> Igor,
>>>> Can you look in the riak log directory, in the error.log (and console
>>>> log and crash dump file) to see if there's any entries, around the time
>>>> of the delete operation? And post them here?
>>>>
>>>> On Thu, Oct 2, 2014 at 11:45 AM, Igor Senderovich <isenderov...@esperdyne.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I get a timeout when deleting a key, reproducible in about 1 in 10 times:
>>>>> $ curl -i -vvv http://myhost:8098/buckets/imc/keys/5134a18660494ea5553d2c90ef9eea2f
>>>>>
>>>>> * About to connect() to dp1.prod6.ec2.cmg.net port 8098
>>>>> *   Trying 10.12.239.90... connected
>>>>> * Connected to dp1.prod6.ec2.cmg.net (10.12.239.90) port 8098
>>>>> > DELETE /buckets/imc/keys/5134a18660494ea5553d2c90ef9eea2f HTTP/1.1
>>>>> > User-Agent: curl/7.15.5 (x86_64-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5
>>>>> > Host: dp1.prod6.ec2.cmg.net:8098
>>>>> > Accept: */*
>>>>> >
>>>>> < HTTP/1.1 503 Service Unavailable
>>>>> HTTP/1.1 503 Service Unavailable
>>>>> < Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
>>>>> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
>>>>> < Date: Wed, 01 Oct 2014 16:11:41 GMT
>>>>> Date: Wed, 01 Oct 2014 16:11:41 GMT
>>>>> < Content-Type: text/plain
>>>>> Content-Type: text/plain
>>>>> < Content-Length: 18
>>>>> Content-Length: 18
>>>>>
>>>>> request timed out
>>>>> * Connection #0 to host dp1.prod6.ec2.cmg.net left intact
>>>>> * Closing connection #0
>>>>>
>>>>> This is on Riak 1.4 on a 5 node cluster with an n-value of 3.
>>>>> Thank you for your help
>>>>>
>>>>> _______________________________________________
>>>>> riak-users mailing list
>>>>> riak-users@lists.basho.com
>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>
>>>
>>
>
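As an aside for readers less familiar with Erlang: the root mismatch behind the vnode crash (and hence the intermittent 503 "request timed out" above) is simply that a double-quoted string never matches an atom clause. A minimal, illustrative shell transcript:

```erlang
%% Erlang shell (illustrative only). "immediate" is a list of
%% characters, so it falls through every atom clause and raises a
%% case_clause error, the same failure riak_kv_vnode:do_delete/3
%% reports in the logs above.
1> case "immediate" of immediate -> matched end.
** exception error: no case clause matching "immediate"
2> case immediate of immediate -> matched end.
matched
```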