Handoff stalled on 1.0.2 riak cluster

John Axel Eriksson Sun, 03 Jun 2012 02:07:04 -0700

Hi.

We had an issue where one of the riak servers died (it had to be
force-removed from the cluster). After we did that, things got really bad
and most data was unreachable for hours. At one point I also added a new
node to replace the old one; it never received any data, and about a day
later it still hasn't.
The issue now seems to be that a few nodes are waiting on handoff of 1
partition. When I look at ring_status I see this:


Attempting to restart script through sudo -u riak
================================== Claimant ===================================
Claimant:  'riak@r-001.x.x.x
Status:     up
Ring Ready: true

============================== Ownership Handoff ==============================
Owner:      riak@r-004.x.x.x
Next Owner: riak@r-003.x.x.x

Index: 930565495644285842450002452081070828921550798848
  Waiting on: []
  Complete:   [riak_kv_vnode,riak_pipe_vnode,riak_search_vnode]

-------------------------------------------------------------------------------

============================== Unreachable Nodes ==============================
All nodes are up and reachable
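
For anyone checking several nodes for the same symptom, here is a minimal
sketch that scans ring_status text for transfers whose "Waiting on" list is
empty, which is the pattern in the output above. It only parses the text
layout shown in this message; the function name find_stalled_handoffs and
the SAMPLE string are made up for illustration, and treating an empty
"Waiting on" list as "stalled" is an assumption based on this one report,
not something Riak itself reports.

```python
import re


def find_stalled_handoffs(ring_status_text):
    """Return pending transfers whose 'Waiting on' list is empty.

    Assumes the plain-text layout shown in this message (Owner, Next Owner,
    Index, Waiting on); other Riak versions may print something different.
    """
    stalled = []
    owner = next_owner = index = None
    for raw in ring_status_text.splitlines():
        line = raw.strip()
        if line.startswith("Owner:"):
            owner = line.split(":", 1)[1].strip()
        elif line.startswith("Next Owner:"):
            next_owner = line.split(":", 1)[1].strip()
        elif line.startswith("Index:"):
            index = int(line.split(":", 1)[1].strip())
        elif line.startswith("Waiting on:"):
            # Collect the module names inside the [...] list, if any.
            waiting = re.findall(r"\w+", line.split(":", 1)[1])
            if not waiting and index is not None:
                stalled.append(
                    {"index": index, "owner": owner, "next_owner": next_owner}
                )
    return stalled


# Hypothetical sample copied from the output above.
SAMPLE = """
Owner:      riak@r-004.x.x.x
Next Owner: riak@r-003.x.x.x

Index: 930565495644285842450002452081070828921550798848
  Waiting on: []
  Complete:   [riak_kv_vnode,riak_pipe_vnode,riak_search_vnode]
"""

if __name__ == "__main__":
    for entry in find_stalled_handoffs(SAMPLE):
        print(entry)
```

The script does not talk to the cluster; it expects the saved ring_status
output as a string, so it can be run against text captured from each node.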


Ok, so this looks like the problem described in the Release Notes for 1.0.2:
https://github.com/basho/riak/blob/1.0.2-release/RELEASE-NOTES.org
Unfortunately, I've run the code from those notes (through riak attach)
with no result.

It's been in this state for 12 hours now I think. What can we do to fix our
cluster?

I upgraded to 1.0.3 hoping it would fix our problems but that didn't help.
I cannot upgrade to 1.1.x because we mainly use Luwak for large object
support and that's discontinued in 1.1.x as far as I know.

Thanks for your help,
John

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
