Hi Josh,

That's another interesting dimension... Indeed, a cluster with plenty of free capacity could be balanced by workload/IOPS, but once it reaches maybe 60 or 70% full, I think capacity would need to take priority.
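To make that concrete, here's a toy weighting sketch in Python (made-up thresholds and score inputs, not anything the mgr balancer or Jonas's tool actually implements):

    # Hypothetical illustration only: blend a capacity score and a workload
    # score, shifting the weight towards capacity once the cluster passes
    # roughly 60-70% full.
    def placement_score(capacity_imbalance: float,
                        workload_imbalance: float,
                        cluster_fullness: float) -> float:
        """Lower is better; all inputs normalised to 0..1."""
        if cluster_fullness <= 0.60:
            capacity_weight = 0.0          # plenty of space: balance IOPS
        elif cluster_fullness >= 0.70:
            capacity_weight = 1.0          # getting full: balance capacity
        else:
            # linear ramp between the two regimes
            capacity_weight = (cluster_fullness - 0.60) / 0.10
        return (capacity_weight * capacity_imbalance
                + (1.0 - capacity_weight) * workload_imbalance)

    # A half-full cluster optimises for workload, a nearly full one for capacity:
    print(placement_score(0.3, 0.8, 0.50))   # -> 0.8
    print(placement_score(0.3, 0.8, 0.90))   # -> 0.3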
But to be honest I don't really understand the workload/IOPS balancing use case. Can you describe some of the scenarios you have in mind?

.. Dan

On Wed, 20 Oct 2021, 20:45 Josh Salomon, <jsalo...@redhat.com> wrote:

> Just another point of view:
> The current balancer balances the capacity, but this is not enough. The
> balancer should also balance the workload, and we plan on adding primary
> balancing for Quincy. In order to balance the workload you should work pool
> by pool, because pools have different workloads. So while the observation
> about the +1 PGs is correct, I believe the correct solution should be
> taking this into consideration while still balancing capacity pool by pool.
> Capacity balancing is a functional requirement, while workload balancing
> is a performance requirement, so it is important only for very loaded
> systems (loaded in terms of high IOPS, not nearly full systems).
>
> I would appreciate comments on this thought.
>
> On Wed, 20 Oct 2021, 20:57 Dan van der Ster, <d...@vanderster.com> wrote:
>
>> Hi Jonas,
>>
>> From your readme:
>>
>> "the best possible solution is some OSDs having an offset of 1 PG to the
>> ideal count. As a PG-distribution-optimization is done per pool, without
>> checking other pools' distribution at all, some devices will be the +1 more
>> often than others. At worst one OSD is the +1 for each pool in the cluster."
>>
>> That's an interesting observation/flaw which hadn't occurred to me
>> before. I think we don't ever see it in practice in our clusters because
>> we do not have multiple large pools on the same OSDs.
>>
>> How large are the variances in your real clusters? I hope the example in
>> your readme isn't from real life??
>>
>> Cheers, Dan
>>
>> On Wed, 20 Oct 2021, 15:11 Jonas Jelten, <jel...@in.tum.de> wrote:
>>
>>> Hi!
>>>
>>> I've been working on this for quite some time now and I think it's ready
>>> for some broader testing and feedback.
>>>
>>> https://github.com/TheJJ/ceph-balancer
>>>
>>> It's an alternative standalone balancer implementation, optimizing for
>>> equal OSD storage utilization and PG placement across all pools.
>>>
>>> It doesn't change your cluster in any way; it just prints the commands
>>> you can run to apply the PG movements.
>>> Please play around with it :)
>>>
>>> Quickstart example: generate 10 PG movements on hdd to stdout
>>>
>>>     ./placementoptimizer.py -v balance --max-pg-moves 10 --only-crushclass hdd | tee /tmp/balance-upmaps
>>>
>>> When there are remapped PGs (e.g. after applying the above upmaps), you
>>> can inspect progress with:
>>>
>>>     ./placementoptimizer.py showremapped
>>>     ./placementoptimizer.py showremapped --by-osd
>>>
>>> And you can get a nice pool and OSD usage overview:
>>>
>>>     ./placementoptimizer.py show --osds --per-pool-count --sort-utilization
>>>
>>> Of course there are many more features and optimizations to be added,
>>> but it has already served us very well in reclaiming terabytes of
>>> previously unavailable storage where the `mgr balancer` could no longer
>>> optimize.
>>>
>>> What do you think?
>>>
>>> Cheers
>>> -- Jonas
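As an aside, here is a tiny toy example of the "+1 PG per pool" effect quoted from the readme above (hypothetical pool sizes and OSD count, not numbers from any real cluster):

    # Toy illustration of the "+1 PG" stacking effect: each pool is spread as
    # evenly as integer math allows, but if every pool independently hands its
    # remainder to the same OSDs, one OSD collects the "+1" for every pool.
    osds = ["osd.0", "osd.1", "osd.2"]
    pools = {"rbd": 32, "cephfs_data": 64, "rgw": 16}   # PGs per pool (made up)

    pg_per_osd = {osd: 0 for osd in osds}
    for pool, pg_num in pools.items():
        base, remainder = divmod(pg_num, len(osds))
        for i, osd in enumerate(osds):
            # per-pool balancing: the first `remainder` OSDs get one extra PG
            pg_per_osd[osd] += base + (1 if i < remainder else 0)

    print(pg_per_osd)
    # {'osd.0': 39, 'osd.1': 37, 'osd.2': 36}
    # osd.0 picked up the extra PG in every single pool.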