We derive appropriate values for each model/firmware of disk using fio tests, and set them explicitly. We set

    osd_mclock_skip_benchmark = true

to disable the inbuilt benchmark altogether. I don't know what the algorithm is for automatically re-running the inbuilt benchmark, but with that parameter set to true it never kicks in.
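A sketch of that workflow, assuming the capacity option is set per device class (the option is documented as osd_mclock_max_capacity_iops_hdd in current releases) and that /dev/sdX is a spare, unprovisioned device — the exact fio job here is illustrative, not our literal methodology:

```shell
# Measure steady-state 4 KiB random-write IOPS for this disk model
# (destructive: run only against a device with no data on it).
fio --name=hdd-iops --filename=/dev/sdX --direct=1 \
    --rw=randwrite --bs=4k --iodepth=16 --numjobs=1 \
    --runtime=60 --time_based --group_reporting

# Disable the inbuilt OSD benchmark cluster-wide...
ceph config set global osd_mclock_skip_benchmark true

# ...and set the measured capacity explicitly; 150 is a placeholder
# for the fio result. A device-class mask applies it to all HDDs.
ceph config set osd/class:hdd osd_mclock_max_capacity_iops_hdd 150
```

Setting the value at the device-class (or per-host) level keeps all drives of one model on the same figure, which is the point of the exercise.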

Using that approach we have never had a problem with mclock. Under Reef we found it pretty good; under Squid it is better again, and in most cases better than wpq.

Chris

On 11/06/2025 09:12, Michel Jouvin wrote:
Janne,

Thanks for your answer, I'll do as you suggest and see if we observe negative side effects. We are struggling with slow deep scrubs (as described in https://tracker.ceph.com/issues/69078), and I'm wondering if the OSDs with low values may contribute to the problem...

Michel

On 11/06/2025 at 09:37, Janne Johansson wrote:
Den tis 10 juni 2025 kl 18:59 skrev Michel Jouvin
<michel.jou...@ijclab.in2p3.fr>:
a little bit surprised that the osd_mclock_max_capacity_iops_hdd computed
for each OSD is so different (basically a 2x difference between the lowest
and highest values).
Also, the documentation explains that you can define a value that you
measured and seems to suggest that once defined, it will not be updated.
Am I right? If yes, does it mean that once the automatic bench has
determined a value the only way to update it is to delete it from the
config and restart the OSD (if you want the automatic bench to
update/redefine it)?
I think your assessment is correct on all details. I guess you would
take a decent value from the high end of your range and set it on all
drives, to "compensate" for the tests having been run at various times.
Not necessarily the exact highest, but if it was showing between 100
and 200 iops, then perhaps 150 or 175 could be reasonable for all
drives, and unless it causes problems just leave it there for the hdd
drives.
It's hard to tell from the outside which is worse: a drive that gets
benchmarked at only 100 because the system happened to be a bit busier
than usual at test time, and hence gets less background IO scheduled
to it (scrubs, repairs and so on), or a drive that genuinely can only
deliver 100 for some reason but is hard-coded to 150, so it is given
50% too many non-client-IO requests.
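The delete-and-restart path asked about above can be sketched as follows, assuming osd.12 as the target and osd_mclock_max_capacity_iops_hdd as the per-OSD value the automatic bench stores (per current documentation):

```shell
# Inspect the value the automatic bench registered for this OSD
ceph config show osd.12 osd_mclock_max_capacity_iops_hdd

# Remove the stored value, then restart the OSD so the bench
# (unless skipped) re-measures and re-registers a capacity
ceph config rm osd.12 osd_mclock_max_capacity_iops_hdd
ceph orch daemon restart osd.12   # or: systemctl restart ceph-osd@12
```

To pin a value instead and prevent further automatic updates, set it explicitly and skip the benchmark, as described earlier in the thread.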

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io