I’ll put in a PR today

> On May 26, 2025, at 4:40 AM, Eugen Block <ebl...@nde.ag> wrote:
> 
> Right, that’s what I found out as well here 
> (https://heiterbiswolkig.blogs.nde.ag/2024/09/06/pgs-not-deep-scrubbed-in-time/),
>  but I also kind of hoped that this would have been corrected in the 
> meantime. I don’t remember right now if I created a tracker. I’ll check when 
> I have time.
> 
> Quoting Michel Jouvin <michel.jou...@ijclab.in2p3.fr>:
> 
>> Eugen,
>> 
>> Thanks for 'ceph config help', I always forget about it! But it doesn't 
>> really help in this case:
>> 
>> -----
>> 
>> osd_deep_scrub_interval - Deep scrub each PG (i.e., verify data checksums) 
>> at least this often
>>   (float, advanced)
>>   Default: 604800.000000
>>   Can update at runtime: true
>>   Services: [osd]
>> -----
>> 
>> And clearly it is wrong: it should mention osd + mgr, or probably better 
>> 'global'. If you modify it only for the osd section (as we did), you end up 
>> with deep scrubs being properly scheduled by the OSDs, but with your cluster 
>> reported in WARNING state with an incredibly high number of late deep scrubs, 
>> which can be worrying...
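For reference, a quick way to see where a value is set and what each daemon actually resolves (a sketch; `osd.0` is just an example daemon name, and these commands assume a recent Ceph release):

```shell
# Every value stored in the cluster configuration database, with its section
ceph config dump | grep deep_scrub

# What a specific running daemon actually uses
ceph config show osd.0 osd_deep_scrub_interval

# What the mgr would resolve for the same option
ceph config get mgr osd_deep_scrub_interval
```

If the last two disagree, the value was most likely set only for the osd section rather than globally.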
>> 
>> Michel
>> 
>>> On 26/05/2025 at 09:56, Eugen Block wrote:
>>> It’s reported by the mgr, so you’ll have to pass either global, or mgr and 
>>> osd, to the configuration change. You can also run `ceph config help 
>>> {CONFIG}` to see which services are related to that configuration value.
>>> 
>>> Quoting Michel Jouvin <michel.jou...@ijclab.in2p3.fr>:
>>> 
>>>> The page I checked, 
>>>> https://docs.ceph.com/en/reef/rados/configuration/osd-config-ref/, just 
>>>> describes the parameters, mentioning that they can be defined for all OSDs 
>>>> or for a specific one. But I have found nothing there about the fact that 
>>>> some of these parameters must be defined globally, since they control both 
>>>> the OSD behaviour and the alarms generated by mon...
>>>> 
>>>> Michel
>>>> 
>>>> On 26/05/2025 at 09:31, Gregory Orange wrote:
>>>>> This is a great illustration of the need for this to be global. Is it
>>>>> documented that way?
>>>>> 
>>>>> There was a discussion on Slack a couple of weeks ago where someone was
>>>>> asserting that it should be an osd value whereas we always use global -
>>>>> well, ever since we hit the same problem as you, a few years ago!
>>>>> 
>>>>> 
>>>>> On 26/5/25 15:21, Michel Jouvin wrote:
>>>>>> Sorry for the noise, I found the mistake right after sending this
>>>>>> message. We did a `ceph config set osd osd_deep_scrub_interval ...`
>>>>>> instead of a `ceph config set global ...`. As a result, only the OSDs saw
>>>>>> the change. After fixing this, the cluster was back to HEALTH_OK
>>>>>> immediately!
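Spelled out, the difference is just the target section passed to `ceph config set` (a sketch; 1209600 is 14 days expressed in seconds, i.e. 14 * 86400):

```shell
# What we had done: only the [osd] section sees the new interval
ceph config set osd osd_deep_scrub_interval 1209600

# The fix: set it globally so the mgr's scrub warning logic uses it too
ceph config set global osd_deep_scrub_interval 1209600
```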
>>>>>> 
>>>>>> Michel
>>>>>> 
>>>>>> On 26/05/2025 at 09:17, Michel Jouvin wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Last week we increased osd_deep_scrub_interval from 10 days to 14 days,
>>>>>>> since we tended to permanently have 1 PG with a late deep scrub (the PG
>>>>>>> changing all the time). We did it with `ceph config set ...`. From
>>>>>>> what we have seen, the deep scrubs are now spread over 14 days (the
>>>>>>> oldest are 14 days old), meaning that the OSDs took this change into
>>>>>>> account (without being restarted). But the number of late deep scrubs
>>>>>>> reported by `ceph -s` is ~700, which is unexpected. Does it mean that
>>>>>>> the mon (which is in charge of the report, if I am right) has not seen
>>>>>>> the change (the mons have not been restarted)?
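In case it helps anyone check the same thing, the late PGs can be listed and counted from the health output (a sketch; the grep pattern assumes the wording current Ceph releases use in `ceph health detail`):

```shell
# List the PGs the mgr considers late on deep scrub, then count them
ceph health detail | grep 'not deep-scrubbed since' | wc -l
```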
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> 
>>>>>>> Michel
>>>>>>> 
>>>>>>> 
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list -- ceph-users@ceph.io
>>>>>> To unsubscribe send an email to ceph-users-le...@ceph.io
