Hi Eugen,

We tried that already. osd_max_backfills is set to 24 and
osd_recovery_max_active is set to 20.
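
In case it helps anyone compare numbers, here is a minimal sketch of how these two values could be read back from the cluster. It assumes the `ceph` CLI is on the PATH and that the options are stored in the central config database, so `ceph config get osd <option>` returns them. The helper names `config_get_cmd` and `read_recovery_options` are made up for this example.

```python
import shutil
import subprocess

# The two options discussed in this thread.
RECOVERY_OPTIONS = ("osd_max_backfills", "osd_recovery_max_active")


def config_get_cmd(option, who="osd"):
    """Build the `ceph config get` command line for one option (hypothetical helper)."""
    return ["ceph", "config", "get", who, option]


def read_recovery_options(who="osd"):
    """Return {option: value} from the cluster, or {} if the ceph CLI is not installed."""
    if shutil.which("ceph") is None:
        return {}
    values = {}
    for option in RECOVERY_OPTIONS:
        result = subprocess.run(
            config_get_cmd(option, who),
            capture_output=True,
            text=True,
            check=True,
        )
        values[option] = result.stdout.strip()
    return values


if __name__ == "__main__":
    for name, value in read_recovery_options().items():
        print(f"{name} = {value}")
```

This only reads what is stored in the central configuration for the `osd` section; a value overridden on an individual daemon could differ from what is printed here.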

On Mon, Dec 12, 2022 at 3:47 PM Eugen Block <ebl...@nde.ag> wrote:

> Hi,
>
> there are many threads discussing recovery throughput, have you tried
> any of the solutions? First thing to try is to increase
> osd_recovery_max_active and osd_max_backfills. What are the current
> values in your cluster?
>
>
> Quoting Monish Selvaraj <mon...@xaasability.com>:
>
> > Hi,
> >
> > Our Ceph cluster consists of 20 hosts and 240 OSDs.
> >
> > We use an erasure-coded pool together with a cache pool.
> >
> > Some time back, 2 hosts went down and the PGs went into a degraded state.
> > We got the 2 hosts back up after some time. The PGs then started
> > recovering, but it is taking a long time (months). While this was
> > happening, the cluster held 664.4 M objects and 987 TB of data. The
> > recovery status has not changed; 88 PGs remain degraded.
> >
> > During this period, we increased the PG count from 256 to 512 for the
> > data pool (the erasure-coded pool).
> >
> > We also observed (over one week) that recovery is very slow; the current
> > recovery rate is around 750 Mibs.
> >
> > Is there any way to increase this recovery throughput?
> >
> > *Ceph version: Quincy*
> >
>
>
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>