Hi, I sent another version of this message with pictures, which is awaiting moderation since it is so big - apologies for that.
In the meantime I got approval to share the output of some of the commands - see attached.

The setup:
- 19.2.2 cluster deployed with cephadm, 7 nodes
- two networks: cluster (2 x 100Gb) and public (2 x 25Gb)
- it provides an S3 bucket as a target for my backups
- the underlying pool (default.rgw.buckets.data) uses an EC 4+2 profile with a storage class for spinning disks
- all spinning disks keep WAL/DB on NVMe

The amount of data grew pretty fast and, since I started the pool with pg_autoscale_mode = warn, I decided to increase the number of PGs manually (from 128 to 256). As expected, the backfilling started ... and it never ended. Even now, after more than a week, I still have about 29 PGs backfilling and 13 in backfill_wait.

What worries me is that the number of backfilling PGs varies very little over time (e.g. 28 and 12), ALTHOUGH there is constant "recovery" traffic between 250 and 350 MiB/s. There is no OSD or capacity issue (if I enable pg_autoscale_mode the cluster health is OK). The "recovery" seems to be doing something, but the number of objects remains the same.

Since recovery should run over the cluster network and the amount of data in the pool is not huge, I am not sure why it takes so many days - it actually seems stuck. The only strange thing I noticed is a discrepancy between the pg_num and pgp_num the pool currently has ... and what autoscale-status reports.

Any help / suggestions would be very much appreciated.

What I have tried so far:
- increased recovery speed (changed the mclock profile to "high_recovery_ops" and overrode various parameters: recovery_max_active, recovery_max_active_hdd, etc.)
- redeployed some of the OSDs that were UP_PRIMARY but part of the backfill_wait PGs
- queried the PGs and looked for a "stuck reason"
- stopped scrub and deep-scrub
- repaired (some of) the PGs
- changed pg_autoscale_mode to true
- checked the balancer status

Many thanks
Steven
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
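To make the "this should not take days" intuition concrete, here is a back-of-envelope estimate. The data volume below is my own hypothetical example, not a figure from the cluster; the 300 MiB/s is the sustained recovery rate reported above. At that rate, even tens of TiB of misplaced data should move in well under a day, so a week-long backfill suggests throttling or gradual PG splitting rather than raw throughput:

```shell
# Back-of-envelope backfill duration at the observed recovery rate.
# 20 TiB of misplaced data is a HYPOTHETICAL example volume.
misplaced_bytes=$((20 * 1024 * 1024 * 1024 * 1024))  # 20 TiB (assumed)
rate_bytes=$((300 * 1024 * 1024))                    # ~300 MiB/s (as reported)
seconds=$((misplaced_bytes / rate_bytes))
echo "Estimated: $((seconds / 3600)) hours"          # prints "Estimated: 19 hours"
```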
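Regarding the pg_num / pgp_num discrepancy: since Nautilus the mons raise pg_num and pgp_num gradually toward a target rather than all at once, which can look like a never-ending trickle of backfill. A sketch of read-only commands to check this (assuming the pool name from this post and a node with an admin keyring):

```shell
# Current effective values on the pool.
ceph osd pool get default.rgw.buckets.data pg_num
ceph osd pool get default.rgw.buckets.data pgp_num

# 'ls detail' also shows pg_num_target / pgp_num_target; if the current
# values are still below the targets, the mons are still stepping the
# split/backfill gradually, which would match the observed behaviour.
ceph osd pool ls detail | grep default.rgw.buckets.data

# Compare against what the autoscaler thinks the pool should have.
ceph osd pool autoscale-status
```

These are ordinary Ceph CLI calls and need a live cluster, so treat them as an ops fragment rather than something runnable standalone.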