Actually, I tried every approach I could find in the Ceph docs and on the mailing lists, but none of them had any effect. As a last resort I changed pg/pgp.
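For reference, the change was along these lines (a sketch; "rbd" here is just a placeholder for my pool name):

  ceph osd pool set rbd pg_num 2048    # create the additional placement groups
  ceph osd pool set rbd pgp_num 2048   # start actually placing data into them

As far as I understand, raising pgp_num remaps existing PGs onto the OSDs, which is presumably why the rebalance percentage fell back after the change.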
Anyway… what is the best way to solve this problem?

Thanks

> On Jul 3, 2016, at 1:43 PM, Wido den Hollander <w...@42on.com> wrote:
>
>> On 3 July 2016 at 11:02, Roozbeh Shafiee <roozbeh.shaf...@gmail.com> wrote:
>>
>> Yes, you're right, but I had 0 objects/s of recovery last night. When I
>> changed pg/pgp from 1400 to 2048, rebalancing sped up, but the rebalancing
>> percentage went back to 53%.
>
> Why did you change that? I would not change that value while a cluster is
> still in recovery.
>
>> I have run into this situation again and again since I dropped the failed
>> OSD: whenever I increase pg/pgp, rebalancing ends up stuck at 0 objects/s
>> with a low transfer speed.
>
> Hard to judge at this point. You might want to try restarting osd.27 and
> see if that gets things going again. It seems to be involved in many of the
> PGs that are in the 'backfilling' state.
>
> Wido
>
>> Thanks
>>
>>> On Jul 3, 2016, at 1:25 PM, Wido den Hollander <w...@42on.com> wrote:
>>>
>>>> On 3 July 2016 at 10:50, Roozbeh Shafiee <roozbeh.shaf...@gmail.com> wrote:
>>>>
>>>> Thanks for the quick response, Wido.
>>>>
>>>> The "ceph -s" output is pasted here:
>>>> http://pastie.org/10897747
>>>>
>>>> And this is the output of "ceph health detail":
>>>> http://pastebin.com/vMeURWC9
>>>
>>> It seems the cluster is still backfilling PGs, and your 'ceph -s' shows
>>> so: 'recovery io 62375 kB/s, 15 objects/s'
>>>
>>> It will just take some time before it finishes.
>>>
>>> Wido
>>>
>>>> Thank you
>>>>
>>>>> On Jul 3, 2016, at 1:10 PM, Wido den Hollander <w...@42on.com> wrote:
>>>>>
>>>>>> On 3 July 2016 at 10:34, Roozbeh Shafiee <roozbeh.shaf...@gmail.com> wrote:
>>>>>>
>>>>>> Hi list,
>>>>>>
>>>>>> A few days ago one of my OSDs failed and I dropped it from the
>>>>>> cluster, but I have been in HEALTH_WARN ever since. After turning off
>>>>>> the OSD, the self-healing system started to rebalance data across the
>>>>>> other OSDs.
>>>>>>
>>>>>> My question is: at the end of rebalancing, the process does not
>>>>>> complete, and I get this message at the end of the "ceph -s" output:
>>>>>>
>>>>>> recovery io 1456 KB/s, 0 object/s
>>>>>
>>>>> Could you post the exact output of 'ceph -s'?
>>>>>
>>>>> There is something more that needs to be shown.
>>>>>
>>>>> 'ceph health detail' might also tell you more.
>>>>>
>>>>> Wido
>>>>>
>>>>>> How can I get back to HEALTH_OK again?
>>>>>>
>>>>>> My cluster details are:
>>>>>>
>>>>>> - 27 OSDs
>>>>>> - 3 MONs
>>>>>> - 2048 pg/pgp
>>>>>> - Each OSD has 4 TB of space
>>>>>> - CentOS 7.2 with the 3.10 Linux kernel
>>>>>> - Ceph Hammer
>>>>>>
>>>>>> Thank you,
>>>>>> Roozbeh
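PS: Following Wido's suggestion, I will try restarting osd.27 first. If I read the docs correctly, on CentOS 7.2 with Hammer that is something like one of these, depending on how the daemons are managed on that node:

  # with the sysvinit/service wrapper that Hammer ships
  sudo /etc/init.d/ceph restart osd.27

  # or, on hosts where the OSDs run under systemd units
  sudo systemctl restart ceph-osd@27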
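In the meantime I am watching the recovery with the following (all standard commands, as far as I know; <pgid> is a placeholder for a PG ID taken from the health output):

  ceph -s                      # overall status, including the recovery io line
  ceph health detail           # lists the PGs that are degraded or backfilling
  ceph pg dump_stuck unclean   # only the PGs that are stuck unclean
  ceph pg <pgid> query         # details for one PG, e.g. which OSDs it involves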
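For completeness, when I dropped the failed OSD I followed the usual removal steps from the docs, roughly as below (<id> being the failed OSD's number, not osd.27):

  ceph osd out <id>                # stop new data from being mapped to it
  # stop the daemon on its host, then:
  ceph osd crush remove osd.<id>   # remove it from the CRUSH map
  ceph auth del osd.<id>           # delete its authentication key
  ceph osd rm <id>                 # remove the OSD from the cluster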