There were also errors during the initial handoff. Here is the full
console.log for that day:

I actually replaced two nodes that day. The first replacement went
smoothly, as it should. The second, which I did a few hours later,
resulted in the situation described above.

On 4 November 2014 20:44, Oleksiy Krivoshey <> wrote:

> Hi,
> I'm running a 5 node cluster (Riak 2.0.0) and I had to replace hardware on
> one of the servers. So I did a 'cluster leave', waited till the node
> exited, checked the ring status and members status, all was ok, with no
> pending changes. Then later after about 5 minutes every client connection
> to any of the 4 remaining nodes started to fail with
> [Error: {error,mailbox_overload}
> I have restarted one node after another and the error has gone. However I
> was still experiencing connectivity issues (timeouts) and riak error log is
> full of various errors even after I joined the 5th node back.
> The errors look like this:
> Failed to merge
> {["/var/lib/riak/bitcask_expire_1d/685078892498860742907977265335757665463718379520/"]
> gen_fsm <0.818.0> in state active terminated with reason: bad record state
> in riak_kv_vnode:set_vnode_forwarding/2 line 991
> @riak_pipe_vnode:new_worker:826 Pipe worker startup failed:
> msg,7,[{file,"gen_fsm.erl"},{line,505}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]
> 2014-11-04 16:07:57.124 [error]
> <0.11128.0>@riak_core_handoff_sender:start_fold:279 hinted_handoff transfer
> of riak_kv_vnode from 'riak@'
> 353957427791078050502454920423474793822921162752 to 'riak@
>' 353957427791078050502454920423474793822921162752 failed because
> of error:undef
> [{riak_core_format,human_size_fmt,["~.2f",588],[]},{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,246}]}]
> The full error log file is available here:
> There was no significant load on Riak so I would like to understand what
> caused so many errors?
> --
> Oleksiy

Oleksiy Krivoshey
riak-users mailing list
