Just to update the case for others: setting

    ceph config set osd/class:ssd osd_recovery_sleep 0.001
    ceph config set osd/class:hdd osd_recovery_sleep 0.05

had the desired effect.
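As a quick check that such per-class overrides are actually applied, one can ask a running OSD for its effective value. This is only a sketch; osd.0 and osd.12 below are placeholder ids for an SSD-backed and an HDD-backed OSD, not ids taken from this cluster:

    ceph config show osd.0 osd_recovery_sleep      # effective value on an SSD-backed OSD
    ceph config show osd.12 osd_recovery_sleep     # effective value on an HDD-backed OSD
    ceph config dump | grep osd_recovery_sleep     # all overrides stored in the mon config database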
I'm running another massive rebalancing operation right now and these settings seem to help. It would be nice if one could use a pool name in a filter, though (osd/pool:NAME). I have two different pools on the same SSDs and only objects from one of these pools require the lower sleep setting.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Joachim Kraftmayer <joachim.kraftma...@clyso.com>
Sent: 03 December 2020 16:49:51
To: 胡 玮文; Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Increase number of objects in flight during recovery

Hi Frank,

these are the values we used to reduce the recovery impact before Luminous:

# reduce recovery impact
osd max backfills
osd recovery max active
osd recovery max single start
osd recovery op priority
osd recovery threads
osd backfill scan max
osd backfill scan min

I do not know how many OSDs and PGs you have in your cluster, but backfill performance depends on the OSDs, PGs and objects/PG.

Regards, Joachim

___________________________________
Clyso GmbH

On 03.12.2020 at 12:35, 胡 玮文 wrote:
> Hi,
>
> There is an “OSD recovery priority” dialog box in the web dashboard.
> The configuration options it changes include:
>
> osd_max_backfills
> osd_recovery_max_active
> osd_recovery_max_single_start
> osd_recovery_sleep
>
> Tuning these configs may help. “High” priority corresponds to 4, 4, 4, 0,
> respectively. Some of these also have an _ssd/_hdd variant.
>
>> On 3 Dec 2020, at 17:11, Frank Schilder <fr...@dtu.dk> wrote:
>>
>> Hi all,
>>
>> I have the opposite problem to the one discussed in "slow down keys/s in recovery":
>> I need to increase the number of objects in flight during rebalance. All
>> remapped PGs are already in state backfilling, but it looks like no more than
>> 8 objects/s are transferred per PG at a time. The pool sits on high-performance
>> SSDs and could easily handle 100 or more object transfers per second
>> simultaneously. Is there any way to increase the number of transfers per second
>> or of simultaneous transfers? Increasing the options osd_max_backfills and
>> osd_recovery_max_active has no effect.
>>
>> Background: the pool in question (con-fs2-meta2) is the default data pool of a
>> CephFS and stores exclusively the kind of metadata that goes into this pool.
>> Storage consumption is reported as 0, but the number of objects is huge:
>>
>> NAME            ID    USED       %USED    MAX AVAIL    OBJECTS
>> con-fs2-meta1   12    216 MiB     0.02    933 GiB       13311115
>> con-fs2-meta2   13    0 B         0       933 GiB      118389897
>> con-fs2-data    14    698 TiB    72.15    270 TiB      286826739
>>
>> Unfortunately, there were no recommendations on dimensioning the PG count for
>> this pool, so I used the same value for con-fs2-meta1 and con-fs2-meta2. In
>> hindsight, this was potentially a bad idea; the meta2 pool should have a much
>> higher PG count or a much more aggressive recovery policy.
>>
>> I now need to rebalance PGs on meta2 and it is going way too slowly compared
>> with the performance of the SSDs it sits on. In a way, I would like to keep the
>> PG count where it is, but increase the recovery rate for this pool by a factor
>> of 10. Please let me know what options I have.
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
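For readers who want to apply the dashboard's “high” recovery priority from the command line, here is a minimal sketch based on the values quoted above (4, 4, 4, 0). The exact values the dashboard applies may vary by release, so treat this as an illustration rather than a verified equivalent:

    # roughly what the dashboard's "high" recovery priority sets (4 / 4 / 4 / 0)
    ceph config set osd osd_max_backfills 4
    ceph config set osd osd_recovery_max_active 4
    ceph config set osd osd_recovery_max_single_start 4
    ceph config set osd osd_recovery_sleep 0

    # drop the overrides again once the rebalance has finished
    ceph config rm osd osd_max_backfills
    ceph config rm osd osd_recovery_max_active
    ceph config rm osd osd_recovery_max_single_start
    ceph config rm osd osd_recovery_sleep

As in Frank's update, the same options can also be scoped per device class with the osd/class:ssd and osd/class:hdd masks instead of the plain osd section.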