Seconded, force-remove.

@siculars
http://siculars.posterous.com

Sent from my rotary phone.
On Sep 7, 2011 9:14 PM, "Nicholas Campbell" <nicholas.j.campb...@gmail.com>
wrote:
> +1 to force-remove. Clear is better than less typing (to a point).
>
> This sounds great!
>
> - Nick Campbell
>
> http://digitaltumbleweed.com
>
>
> On Wed, Sep 7, 2011 at 8:12 PM, Joseph Blomstedt <j...@basho.com> wrote:
>
>> Given that 1.0 prerelease packages are now available, I wanted to
>> mention some changes to Riak's clustering capabilities in 1.0. In
>> particular, there are some subtle semantic differences in the
>> riak-admin commands. More complete docs will follow in the near
>> future, but I hope a quick email suffices for now.
>>
>> [nodeB/riak-admin join nodeA] is now strictly one-way. It joins nodeB
>> to the cluster that nodeA is a member of. This is semantically
>> different from pre-1.0 Riak, in which join essentially joined clusters
>> together rather than joining a node to a cluster. As part of this
>> change, the joining node (nodeB in this case) must be a singleton
>> (1-node) cluster.
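>>
>> A minimal sketch, assuming Erlang-style node names such as riak@hostA
>> and riak@hostB (hypothetical): on nodeB, run
>>
>>   riak-admin join riak@hostA
>>
>> Note that the command is issued on the joining node (nodeB), naming a
>> member of the target cluster (nodeA).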
>>
>> In pre-1.0, leave and remove were essentially the same operation, with
>> leave just being an alias for 'remove this-node'. This has changed.
>> Leave and remove are now very different operations.
>>
>> [nodeB/riak-admin leave] is the only safe way to have a node leave the
>> cluster, and it must be executed on the node that is leaving.
>> In this case, nodeB will start leaving the cluster, and will not leave
>> the cluster until after it has handed off all its data. Even if nodeB
>> is restarted (crashed/shutdown/whatever), it will remain in the leave
>> state and continue handing off partitions until done. After handoff,
>> it will leave the cluster and eventually shut down.
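>>
>> A minimal sketch: on the node that should leave (nodeB here), run
>>
>>   riak-admin leave
>>
>> and then watch its handoff progress from any node with the
>> member_status and ring_status commands described below.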
>>
>> [nodeA/riak-admin remove nodeB] immediately removes nodeB from the
>> cluster, without handing off its data. All replicas held by nodeB are
>> therefore lost, and will need to be re-generated through read-repair.
>> Use this command carefully. It's intended for nodes that are
>> permanently unrecoverable, for which handoff therefore doesn't make
>> sense. By the final 1.0 release, this command may be renamed
>> "force-remove" just to make the distinction clear.
>>
>> There are now two new commands that provide additional insight into
>> the cluster: [riak-admin member_status] and [riak-admin ring_status].
>>
>> Underneath, the clustering protocol has been largely rewritten. The
>> new approach has the following advantages:
>> 1. It is no longer necessary to wait on [riak-admin ringready] in
>> between adding/removing nodes from the cluster, and adding/removing is
>> also much more sound/graceful. Starting up 16 nodes and issuing
>> [nodeX: riak-admin join node1] for X=2:16 should just work (see the
>> sketch after this list).
>>
>> 2. Data is first transferred to new partition owners before handing
>> over partition ownership. This change fixes numerous bugs, such as
>> 404s/not_founds during ownership changes. The Ring/Pending columns in
>> [riak-admin member_status] visualize this at a high level, and the
>> full transfer status in [riak-admin ring_status] provides additional
>> insight.
>>
>> 3. All partition ownership decisions are now made by a single node in
>> the cluster (the claimant). Any node can be the claimant, and the duty
>> is automatically taken over if the previous claimant is removed from
>> the cluster. [riak-admin member_status] will list the current
>> claimant.
>>
>> 4. Handoff related to ownership changes can now occur under load;
>> hinted handoff still only occurs when a vnode is inactive. This change
>> allows a cluster to scale up/down under load, although this needs to
>> be further benchmarked and tuned before 1.0.
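>>
>> As a concrete sketch of (1) and (2), assuming 16 nodes named
>> riak@node1 through riak@node16 (hypothetical names): on each of node2
>> through node16, run
>>
>>   riak-admin join riak@node1
>>
>> without waiting on ringready in between, and then watch the resulting
>> ownership transfers from any node with
>>
>>   riak-admin member_status
>>   riak-admin ring_status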
>>
>> To support all of the above, a new limitation has been introduced.
>> Cluster changes (member addition/removal, ring rebalance, etc.) can
>> only occur when all nodes are up and reachable. [riak-admin
>> ring_status] will complain when this is not the case. If a node is
>> down, you must issue [riak-admin down <node>] to mark the node as
>> down, and the remaining nodes will then proceed to converge as usual.
>> Once the down node comes back online, it will automatically
>> re-integrate into the cluster. However, there is nothing preventing
>> client requests from being served by a down node before it
>> re-integrates.
>> Before issuing [down <node>], make sure to update your load balancers
>> / connection pools to not include this node. Future releases of Riak
>> may make offlining a node an automatic operation, but it's a
>> user-initiated action in 1.0.
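>>
>> A minimal sketch, assuming the unreachable node is named riak@hostC
>> (hypothetical) and has already been pulled from your load balancer:
>> on any healthy node, run
>>
>>   riak-admin down riak@hostC
>>
>> and then confirm with [riak-admin ring_status] that the remaining
>> nodes converge.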
>>
>> -Joe
>>
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
