Are there any ideas to fix this issue?

Amao
On Fri, Aug 7, 2015 at 6:55 AM, changmao wang <wang.chang...@gmail.com> wrote:
> Dmitri,
>
> Thanks for your quick reply. My questions are as below:
> 1. What is the current status of the whole cluster? Is it still doing data balancing?
> 2. There are many errors in the error log of one of the nodes. How should I handle them?
>
> 2015-08-05 01:38:59.717 [error] <0.23000.298>@riak_core_handoff_sender:start_fold:262 ownership_transfer transfer of riak_kv_vnode from 'riak@10.21.136.81' 525227150915793236229449236757414210188850757632 to 'riak@10.21.136.94' 525227150915793236229449236757414210188850757632 failed because of enotconn
> 2015-08-05 01:38:59.718 [error] <0.195.0>@riak_core_handoff_manager:handle_info:289 An outbound handoff of partition riak_kv_vnode 525227150915793236229449236757414210188850757632 was terminated for reason: {shutdown,{error,enotconn}}
>
> During the last 5 days, there have been no changes in the "riak-admin member-status" output.
> 3. How can I accelerate the data balance?
>
> On Fri, Aug 7, 2015 at 6:41 AM, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
>
>> Ok, I think I understand so far. So what's the question?
>>
>> On Thursday, August 6, 2015, Changmao.Wang <changmao.w...@datayes.com> wrote:
>>
>>> Hi Riak users,
>>>
>>> Before adding the new nodes, the cluster had only five nodes. The members were:
>>> 10.21.136.66, 10.21.136.71, 10.21.136.76, 10.21.136.81, 10.21.136.86.
>>> We did not set up an HTTP proxy for the cluster; only one node provides the HTTP service, so the CPU load is always high on that node.
>>>
>>> After that, I added four nodes (10.21.136.[91-94]) to the cluster. During the ring/data balancing process, each new node failed (riak stopped) because a disk became 100% full.
>>> I had used a multi-disk path for the "data_root" parameter in '/etc/riak/app.config', and each disk is only 580MB in size.
>>> As you know, the bitcask storage engine does not support a multi-disk path: once one of the disks is 100% full, it cannot switch to the next idle disk, so the "riak" service goes down.
>>>
>>> After that, I removed the four newly added nodes from the active nodes with "riak-admin cluster leave riak@'10.21.136.91'",
>>> then stopped the "riak" service on the other new nodes and reformatted them with LVM disk management (binding the 6 disks into one volume group).
>>> I replaced the "data_root" parameter with a single folder and started the "riak" service again. After that, the cluster began the data balance again.
>>> That's the whole story.
>>>
>>> Amao
>>>
>>> ------------------------------
>>> From: "Dmitri Zagidulin" <dzagidu...@basho.com>
>>> To: "Changmao.Wang" <changmao.w...@datayes.com>
>>> Sent: Thursday, August 6, 2015 10:46:59 PM
>>> Subject: Re: why leaving riak cluster so slowly and how to accelerate the speed
>>>
>>> Hi Amao,
>>>
>>> Can you explain a bit more which steps you've taken, and what the problem is?
>>>
>>> Which nodes have been added, and which nodes are leaving the cluster?
>>>
>>> On Tue, Jul 28, 2015 at 11:03 PM, Changmao.Wang <changmao.w...@datayes.com> wrote:
>>>
>>>> Hi Riak user group,
>>>>
>>>> I'm using riak and riak-cs 1.4.2. Last weekend, I added four nodes to a cluster of 5 nodes. However, the expansion failed with one of the disks 100% full. As you know, the bitcask storage engine cannot support multiple folders.
>>>>
>>>> After that, I restarted "riak" and left the cluster with the commands "riak-admin cluster leave" and "riak-admin cluster plan", and then committed the plan.
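For reference, the removal workflow described above maps onto the staged riak-admin cluster commands; a minimal sketch (node names taken from this thread, commands assume Riak 1.4.x, exact output will vary):

    # stage a leave for each of the four new nodes (run from any cluster member)
    riak-admin cluster leave riak@10.21.136.91
    riak-admin cluster leave riak@10.21.136.92
    riak-admin cluster leave riak@10.21.136.93
    riak-admin cluster leave riak@10.21.136.94

    # review the staged plan, then commit it
    riak-admin cluster plan
    riak-admin cluster commit

    # afterwards, watch the Ring/Pending percentages converge
    riak-admin member-status
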
>>>> However, riak has been doing KV balancing ever since I submitted the leave command. I guess it is still working through the join-cluster process.
>>>>
>>>> Could you show us how to accelerate the leaving process? I have already tuned the "transfer-limit" parameter on all 9 nodes.
>>>>
>>>> Below is the output of some commands:
>>>>
>>>> riak-admin member-status
>>>> ================================= Membership ==================================
>>>> Status     Ring    Pending    Node
>>>> -------------------------------------------------------------------------------
>>>> leaving     6.3%     10.9%    'riak@10.21.136.91'
>>>> leaving     9.4%     10.9%    'riak@10.21.136.92'
>>>> leaving     6.3%     10.9%    'riak@10.21.136.93'
>>>> leaving     6.3%     10.9%    'riak@10.21.136.94'
>>>> valid      10.9%     10.9%    'riak@10.21.136.66'
>>>> valid      12.5%     10.9%    'riak@10.21.136.71'
>>>> valid      18.8%     10.9%    'riak@10.21.136.76'
>>>> valid      18.8%     12.5%    'riak@10.21.136.81'
>>>> valid      10.9%     10.9%    'riak@10.21.136.86'
>>>>
>>>> riak-admin transfer_limit
>>>> =============================== Transfer Limit ================================
>>>> Limit      Node
>>>> -------------------------------------------------------------------------------
>>>>   200      'riak@10.21.136.66'
>>>>   200      'riak@10.21.136.71'
>>>>   100      'riak@10.21.136.76'
>>>>   100      'riak@10.21.136.81'
>>>>   200      'riak@10.21.136.86'
>>>>   500      'riak@10.21.136.91'
>>>>   500      'riak@10.21.136.92'
>>>>   500      'riak@10.21.136.93'
>>>>   500      'riak@10.21.136.94'
>>>>
>>>> Do you need any more details to diagnose the problem?
>>>>
>>>> Amao
>
> --
> Amao Wang
> Best & Regards

--
Amao Wang
Best & Regards
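On the "how to accelerate" question, a minimal sketch of the commands normally used to watch and throttle handoff while nodes are leaving (assuming Riak 1.4.x; the limit values below are only illustrative, and the enotconn handoff errors quoted above are usually worth chasing as an inter-node connectivity problem before raising any limits):

    # show which partition transfers are active and which are still waiting
    riak-admin transfers

    # check ring readiness and whether every node is reachable
    riak-admin ring-status

    # display the current per-node handoff concurrency
    riak-admin transfer-limit

    # raise (or lower) the limit on a single node...
    riak-admin transfer-limit riak@10.21.136.91 8

    # ...or across the whole cluster
    riak-admin transfer-limit 8
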
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com