[ceph-users] Re: upmap balancer and consequences of osds briefly marked out

2020-05-04 Thread Dan van der Ster
Right, it would freeze the PGs in place at the time upmap-remapped is run. You need to keep running the upmap balancer afterwards to restore the optimized state. I don't quite understand your question about a failed / replaced osd, but yes it is relevant here. Suppose you have osds 0, 1, 2, and 3 …
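
For illustration only (not Dan's example, which is cut off above): if CRUSH wants a PG on one osd but its data currently sits on another, a single pg-upmap-items entry can pin it where it is and avoid backfill; the balancer can later remove that exception gradually. A minimal Python sketch, assuming the ceph CLI is available and using made-up PG/OSD ids:

    # Hypothetical sketch: pin one remapped PG onto the osd that already
    # holds its data, so no backfill is triggered. All ids are invented.
    import subprocess

    pgid = "1.7f"               # example PG id (made up)
    wanted, current = 2, 3      # CRUSH wants osd.2, the data is on osd.3

    # "ceph osd pg-upmap-items <pgid> <from> <to>" remaps the PG from
    # osd.<from> to osd.<to>; here it keeps the PG on osd.3 for now.
    subprocess.run(
        ["ceph", "osd", "pg-upmap-items", pgid, str(wanted), str(current)],
        check=True,
    )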

[ceph-users] Re: upmap balancer and consequences of osds briefly marked out

2020-05-04 Thread Dan van der Ster
Hi Dylan, The backfillfull_ratio, which defaults to 0.9, prevents backfilling into an osd which is getting too full. So the worst-case scenario is that your cluster will have some osds getting up to 90% full, at which point the upmap balancer should start putting things back into place. Also, c…
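
As a rough way to see how close osds are to that limit, a minimal sketch (assuming the ceph CLI is reachable and the default 0.90 backfillfull_ratio; check "ceph osd dump" for the actual value on your cluster):

    # Minimal sketch: flag osds whose utilization is approaching the
    # default 0.90 backfillfull_ratio.
    import json, subprocess

    BACKFILLFULL = 0.90

    out = subprocess.run(["ceph", "osd", "df", "--format", "json"],
                         capture_output=True, text=True, check=True).stdout
    for osd in json.loads(out)["nodes"]:
        util = osd["utilization"] / 100.0   # reported as a percentage
        if util >= BACKFILLFULL - 0.05:     # within 5% of the limit
            print(f"osd.{osd['id']} is {util:.1%} full; backfill into it "
                  f"stops at {BACKFILLFULL:.0%}")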

[ceph-users] Re: upmap balancer and consequences of osds briefly marked out

2020-05-03 Thread Anthony D'Atri
Do I misunderstand this script, or does it not _quite_ do what’s desired here? I fully get the scenario of applying a full-cluster map to allow incremental topology changes. To be clear, if this is run to effectively freeze backfill during / following a traumatic event, it will freeze that adap…

[ceph-users] Re: upmap balancer and consequences of osds briefly marked out

2020-05-01 Thread Dylan McCulloch
Thanks Dan, that looks like a really neat method & script for a few use-cases. We've actually used several of the scripts in that repo over the years, so many thanks for sharing. That method will definitely help in the scenario in which a set of unnecessary pg remaps has been triggered and ca…

[ceph-users] Re: upmap balancer and consequences of osds briefly marked out

2020-05-01 Thread Dan van der Ster
Hi, You're correct that all the relevant upmap entries are removed when an OSD is marked out. You can try this script, which will recreate them and get the cluster back to HEALTH_OK quickly: https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py Cheers, Dan …
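
The idea behind the script, as a simplified sketch (not the actual upmap-remapped.py, which handles more cases): for each PG that is currently remapped, emit pg-upmap-items entries mapping its up set back onto its acting set, so the data stays where it already is:

    # Simplified sketch of the idea only -- the real upmap-remapped.py is
    # more careful (pool types, existing upmaps, etc.). Assumes a recent
    # ceph release where "ceph pg ls" returns a "pg_stats" array.
    import json, subprocess

    out = subprocess.run(["ceph", "pg", "ls", "remapped", "--format", "json"],
                         capture_output=True, text=True, check=True).stdout
    for pg in json.loads(out).get("pg_stats", []):
        # Pair the CRUSH-computed up set with the current acting set.
        pairs = [(u, a) for u, a in zip(pg["up"], pg["acting"]) if u != a]
        if not pairs:
            continue
        cmd = ["ceph", "osd", "pg-upmap-items", pg["pgid"]]
        for u, a in pairs:
            cmd += [str(u), str(a)]
        print(" ".join(cmd))    # review the commands, then run them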