It looks like the root of the problem is incorrect handling of the bad_crc backend error. This error was mentioned here: https://github.com/basho/riak_kv/pull/385 Could anybody advise how to deal with this error in Riak 1.2.1-1?
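In case it helps anyone narrow this down, here is a minimal sketch (plain Python, not part of Riak) that pulls the partition index out of every bad_crc error in a log. It assumes the errors are logged in the same `{error,bad_crc,{state,#Ref<...>,"<partition>",...` shape as in the quoted message; the function name `bad_crc_partitions` and the log path in the comment are my own placeholders, not anything Riak provides.

```python
import re

# Matches the bad_crc term in the shape seen in the quoted logs:
#   {error,bad_crc,{state,#Ref<...>,"<partition index>",...
# The pattern is an assumption based on those lines only.
BAD_CRC = re.compile(r'\{error,bad_crc,\{state,#Ref<[^>]*>,"(\d+)"')


def bad_crc_partitions(lines):
    """Return {partition index: number of bad_crc occurrences} for the given log lines."""
    counts = {}
    for line in lines:
        for partition in BAD_CRC.findall(line):
            counts[partition] = counts.get(partition, 0) + 1
    return counts


# Example use (the path is an assumption; adjust to your install):
#   with open("/var/log/riak/error.log", errors="replace") as log:
#       print(bad_crc_partitions(log))
```

If only one partition index shows up, that would at least suggest the bad data is confined to a single vnode's data rather than spread across the whole node.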
2013/8/1 Daniil Churikov <ddo...@gmail.com>

> Hello dear list.
> Recently we had an issue: 1 of our 4 riak nodes went down
> ('riak@10.0.1.192'). As expected, the 3 other nodes took over its work and
> everything was ok.
> After some time we were able to recover this node and put it back into the
> cluster.
>
> But now I see some strange things in the logs:
>
> # riak-admin transfers
> Attempting to restart script through sudo -H -u riak
> 'riak@10.0.1.190' waiting to handoff 1 partitions
>
>
> From 'riak@10.0.1.190' logs:
>
> [error] hinted_handoff transfer of riak_kv_vnode from 'riak@10.0.1.190'
> 1244559988039597016282825365359959758925755056128 to 'riak@10.0.1.192'
> 1244559988039597016282825365359959758925755056128 failed because of
> error:{case_clause,{error,closed}}
> [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,174}]}]
>
>
> And from 'riak@10.0.1.192' (recovered node) logs:
> 2013-08-01 08:20:00.222 UTC [error] <0.31615.12> Supervisor poolboy_sup had
> child riak_core_vnode_worker started with
> riak_core_vnode_worker:start_link([{worker_args,[1244559988039597016282825365359959758925755056128,[],worker_props]},{worker_callback_mod,...},...])
> at undefined exit with reason no case clause matching
> {error,bad_crc,{state,#Ref<0.0.11.259591>,"1244559988039597016282825365359959758925755056128",[{async_folds,true},{vnode_vclocks,true},{included_applications,[]},{allow_strfun,false},{storage_backend,riak_kv_bitcask_backend},{legacy_keylisting,false},{reduce_js_vm_count,0},{js_thread_stack,16},{riak_kv_stat,true},{map_js_vm_count,0},{mapred_system,pipe},{js_max_vm_mem,8},{legacy_stats,true},{mapred_name,"mapred"},{stats_urlpath,"stats"},{http_url_encoding,...},...],...}}
> in riak_kv_vnode:prepare_put/3 line 698 in context shutdown_error
> 2013-08-01 08:20:00.231 UTC [error] <0.31615.12> gen_server <0.31615.12>
> terminated with reason: no case clause matching
> {error,bad_crc,{state,#Ref<0.0.11.259591>,"1244559988039597016282825365359959758925755056128",[{async_folds,true},{vnode_vclocks,true},{included_applications,[]},{allow_strfun,false},{storage_backend,riak_kv_bitcask_backend},{legacy_keylisting,false},{reduce_js_vm_count,0},{js_thread_stack,16},{riak_kv_stat,true},{map_js_vm_count,0},{mapred_system,pipe},{js_max_vm_mem,8},{legacy_stats,true},{mapred_name,"mapred"},{stats_urlpath,"stats"},{http_url_encoding,...},...],...}}
> in riak_kv_vnode:prepare_put/3 line 698
> 2013-08-01 08:20:00.255 UTC [error] <0.31615.12> CRASH REPORT Process
> <0.31615.12> with 0 neighbours exited with reason: no case clause matching
> {error,bad_crc,{state,#Ref<0.0.11.259591>,"1244559988039597016282825365359959758925755056128",[{async_folds,true},{vnode_vclocks,true},{included_applications,[]},{allow_strfun,false},{storage_backend,riak_kv_bitcask_backend},{legacy_keylisting,false},{reduce_js_vm_count,0},{js_thread_stack,16},{riak_kv_stat,true},{map_js_vm_count,0},{mapred_system,pipe},{js_max_vm_mem,8},{legacy_stats,true},{mapred_name,"mapred"},{stats_urlpath,"stats"},{http_url_encoding,...},...],...}}
> in riak_kv_vnode:prepare_put/3 line 698 in gen_server:terminate/6 line 747
>
>
> It seems that for some reason the 'riak@10.0.1.192' node aborts the handoff
> every time 'riak@10.0.1.190' tries to give its data back.
>
> Riak version is 1.2.1-1
>
>
>
> --
> View this message in context:
> http://riak-users.197444.n3.nabble.com/riak-hinted-handoff-freezed-transfer-tp4028652.html
> Sent from the Riak Users mailing list archive at Nabble.com.
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>