Ivaylo,

It appears that handoff is stuck. This should not happen, and I have
filed a bug report to track this issue:
https://issues.basho.com/show_bug.cgi?id=1317

I believe I know the cause, and I believe it is already fixed in master
ahead of our next major release. I'll update the ticket with my findings
once I have had a chance to investigate further, so feel free to follow
that ticket and add any additional details you have.

Now, to actually fix your cluster, I believe both of the following
options should work.

Option 1:
Restart 'r...@xxx.xxx.xxx.xxx'. In other words, run 'riak stop' followed
by 'riak start' on that node.

Option 2:
Connect to the Erlang console of 'r...@xxx.xxx.xxx.xxx' and manually
force handoff:
1. Connect to the console: riak attach
2. Copy/paste:
   riak_core_ring_manager:force_update().
3. Copy/paste:
   riak_core_vnode_manager:force_handoffs().
4. Exit the console without shutting down Riak: CTRL-D
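
To check whether either option worked, one rough approach is to sum the
partitions that 'riak-admin transfers' still reports as waiting. This is
only a minimal sketch: it assumes the line format matches the output you
posted, and 'pending_partitions' is a hypothetical helper name, not part
of Riak.

```shell
# Sum the partitions still waiting for handoff, reading the output of
# `riak-admin transfers` on stdin. Assumes lines of the form
#   '<node>' waiting to handoff <N> partitions
# as in the output quoted below; other lines are ignored.
pending_partitions() {
  awk '/waiting to handoff/ { total += $(NF-1) } END { print total + 0 }'
}

# Usage (on any cluster node):
#   riak-admin transfers | pending_partitions
```

If the number keeps dropping and reaches 0, handoff is moving again.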

Let me know if you still have trouble after trying these solutions.

Regards,
Joe

On Tue, Jan 10, 2012 at 9:13 AM, Ivaylo Panitchkov
<ipanitch...@hibernum.com> wrote:
>
> Hello All,
>
> We have a cluster of three machines (Debian 6.0, 4GB RAM,
> riak_1.0.2-1_amd64.deb, n_val: 3) that serves an application for a while. As
> we go to production soon added a fourth machine to the cluster (exactly the
> same as the first three) yesterday. The partition handoff began in the late
> afternoon and I had an impression that the transition will not take too long
> as there are only few hundred IMPORTANT records in the storage for the
> moment. Today in the morning checked the situation again and realized the
> partition handoff still runs (or get stuck). The Ownership Handoff is still
> the same since yesterday (at least 19 hours till now). Any suggestions to
> fix the problem are welcome :-)
>
> REMARK: Replaced the IP addresses for security sake
>
>
> # riak-admin ringready
> Attempting to restart script through sudo -u riak
> TRUE All nodes agree on the ring
> ['r...@yyy.yyy.yyy.yyy','r...@xxx.xxx.xxx.xxx','r...@aaa.aaa.aaa.aaa','r...@bbb.bbb.bbb.bbb']
>
>
> # riak-admin transfers
> Attempting to restart script through sudo -u riak
> 'r...@bbb.bbb.bbb.bbb' waiting to handoff 2 partitions
> 'r...@aaa.aaa.aaa.aaa' waiting to handoff 2 partitions
> 'r...@yyy.yyy.yyy.yyy' waiting to handoff 2 partitions
>
>
> # riak-admin ring_status
> Attempting to restart script through sudo -u riak
> ================================== Claimant ===================================
> Claimant: 'r...@xxx.xxx.xxx.xxx'
> Status: up
> Ring Ready: true
>
> ============================== Ownership Handoff ==============================
> Owner: r...@xxx.xxx.xxx.xxx
> Next Owner: r...@yyy.yyy.yyy.yyy
>
> Index: 548063113999088594326381812268606132370974703616
> Waiting on: [riak_kv_vnode]
> Complete: [riak_pipe_vnode]
>
> Index: 1370157784997721485815954530671515330927436759040
> Waiting on: [riak_kv_vnode]
> Complete: [riak_pipe_vnode]
>
> -------------------------------------------------------------------------------
>
> ============================== Unreachable Nodes ==============================
> All nodes are up and reachable
>
>
> # riak-admin member_status
> Attempting to restart script through sudo -u riak
> ================================= Membership ==================================
> Status     Ring    Pending    Node
> -------------------------------------------------------------------------------
> valid      21.9%     25.0%    'r...@yyy.yyy.yyy.yyy'
> valid      28.1%     25.0%    'r...@xxx.xxx.xxx.xxx'
> valid      25.0%     25.0%    'r...@aaa.aaa.aaa.aaa'
> valid      25.0%     25.0%    'r...@bbb.bbb.bbb.bbb'
> -------------------------------------------------------------------------------
> Valid:4 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
>
>
> --
> Ivaylo Panitchkov
> Software developer
> Hibernum Creations Inc.
>
> Ce courriel est confidentiel et peut aussi être protégé par la loi.Si vous
> avez reçu ce courriel par erreur, veuillez nous en aviser immédiatement en y
> répondant, puis supprimer ce message de votre système. Veuillez ne pas le
> copier, l’utiliser pour quelque raison que ce soit ni divulguer son contenu
> à quiconque.
> This email is confidential and may also be legally privileged. If you have
> received this email in error, please notify us immediately by reply email
> and then delete this message from your system. Please do not copy it or use
> it for any purpose or disclose its content.
>
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



-- 
Joseph Blomstedt <j...@basho.com>
Software Engineer
Basho Technologies, Inc.
http://www.basho.com/
