Hi Istvan,

Did the PG split involve using more OSDs than before? If so, then increasing 
these values (apart from the sleep) should not have a negative impact on 
client I/O compared to before the split, and it should accelerate the whole 
process.
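
For what it's worth, here is a minimal sketch of how those values could be 
bumped at runtime; the numbers are only illustrative, not a recommendation, 
so adjust them to what your NVMe OSDs can absorb:

    # Apply to all running OSDs at once:
    ceph tell 'osd.*' injectargs '--osd_max_trimming_pgs=4 --osd_pg_max_concurrent_snap_trims=4'

    # Or persist the values in the cluster configuration:
    ceph config set osd osd_max_trimming_pgs 4
    ceph config set osd osd_pg_max_concurrent_snap_trims 4
    # The sleep can stay at 0.1 (or be lowered); increasing it would only slow trimming:
    ceph config set osd osd_snap_trim_sleep 0.1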

Did you reshard the buckets as discussed in the other thread?
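
If not, a minimal sketch with radosgw-admin (the bucket name is just an 
example; pick a shard count that fits your object count):

    # Check the current shard count and object count for the bucket:
    radosgw-admin bucket stats --bucket=mybucket

    # Reshard the index to a higher shard count:
    radosgw-admin bucket reshard --bucket=mybucket --num-shards=101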

Regards,
Frédéric.

----- On 29 Nov 24, at 3:30, Istvan Szabo, Agoda istvan.sz...@agoda.com wrote:

> Hi,
> 
> When we scale the placement groups on a pool located in an all-NVMe cluster,
> the snaptrimming speed degrades a lot.
> Currently we are running with these values so as not to degrade client ops
> while still making some progress on snaptrimming, but it is terrible.
> (Octopus 15.2.17 on Ubuntu 20.04)
> 
> --osd_max_trimming_pgs=2
> --osd_snap_trim_sleep=0.1
> --osd_pg_max_concurrent_snap_trims=2
> 
> We have a big pool which used to have 128 PGs, and at that size the
> snaptrimming took around 45-60 minutes.
> Since it is impossible to do maintenance on the cluster with ~600 GB PG sizes,
> because that can easily max out the cluster (which we did), we increased it to
> 1024 PGs, and the snaptrimming duration increased to 3.5 hours.
> 
> Is there any good solution that we are missing to fix this?
> 
> On the hardware level I've changed the server profile to tune some NUMA
> settings, but it seems like that didn't help either.
> 
> Thank you
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
