The code checks the pg with the oldest scrub_stamp/deep_scrub_stamp to see 
whether the osd_scrub_min_interval/osd_deep_scrub_interval time has elapsed.  
So the output you are showing with the very old scrub stamps shouldn’t happen 
under default settings.  As soon as deep-scrub is re-enabled, the 5 PGs with 
the oldest stamp should be the first to get run.

A PG needs to be both active and clean to be scrubbed.  If any weren’t 
active+clean, then even a manual scrub would do nothing.

Now that I’m looking at the code I see that your symptom is possible if the 
values of osd_scrub_min_interval or osd_scrub_max_interval are larger than your 
osd_deep_scrub_interval.  Should the osd_scrub_min_interval be greater than 
osd_deep_scrub_interval, there won't be a deep scrub until the 
osd_scrub_min_interval has elapsed.  If an OSD is under load and the 
osd_scrub_max_interval is greater than the osd_deep_scrub_interval, there won't 
be a deep scrub until osd_scrub_max_interval has elapsed.
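
To make that interaction concrete, here is a rough sketch of the rule as 
described above.  It is only a simplified model, not the actual OSD scheduling 
code; the function name effective_deep_scrub_wait and the "loaded" flag are 
made up for illustration, and the example values are arbitrary.

```shell
# Hypothetical helper: models only the rule described in this message,
# not the real OSD scheduler.
# Args: $1 = osd_scrub_min_interval, $2 = osd_scrub_max_interval,
#       $3 = osd_deep_scrub_interval (all in seconds),
#       $4 = 1 if the OSD is under load, 0 otherwise.
effective_deep_scrub_wait() {
    # A regular scrub is gated by the min interval, or by the max
    # interval when the OSD is under load.
    if [ "$4" -eq 1 ]; then
        gate=$2
    else
        gate=$1
    fi
    # The deep scrub cannot run before that gate has elapsed, so the
    # effective wait is the larger of the gate and the deep interval.
    if [ "$gate" -gt "$3" ]; then
        echo "$gate"
    else
        echo "$3"
    fi
}

# Example: min = 14 days, max = 28 days, deep = 7 days, OSD not loaded:
#   effective_deep_scrub_wait 1209600 2419200 604800 0
```

With those example values the deep scrub is held back to 14 days even though 
osd_deep_scrub_interval asks for 7.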

Please check the 3 interval config values.  Verify that your PGs are 
active+clean just to be sure.
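
Something along these lines could be used for both checks.  This is a sketch 
under a few assumptions: the function names are made up, the first function 
has to run on the host holding that OSD's admin socket (osd.0 is only a 
placeholder id), and the second assumes the PG id and state are the first two 
columns of the input, as in "ceph pg dump pgs_brief" output.

```shell
# Print the three scrub intervals for one OSD via its admin socket.
# $1 is the OSD id (defaults to 0, which is only a placeholder).
show_scrub_intervals() {
    for opt in osd_scrub_min_interval osd_scrub_max_interval \
               osd_deep_scrub_interval; do
        ceph daemon "osd.${1:-0}" config get "$opt"
    done
}

# Read PG lines on stdin (column 1 = PG id, column 2 = state) and print
# every PG whose state lacks either the "active" or the "clean" flag.
# States such as active+clean+scrubbing+deep still count as active+clean.
pgs_not_active_clean() {
    awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ {
        n = split($2, flags, "+")
        active = 0; clean = 0
        for (i = 1; i <= n; i++) {
            if (flags[i] == "active") active = 1
            if (flags[i] == "clean")  clean = 1
        }
        if (!(active && clean)) print $1, $2
    }'
}

# Usage on a live cluster:
#   show_scrub_intervals 0
#   ceph pg dump pgs_brief | pgs_not_active_clean
```

If the second command prints nothing, every PG has both flags set.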

David


On May 20, 2014, at 5:21 PM, Mike Dawson <mike.daw...@cloudapt.com> wrote:

> Today I noticed that deep-scrub is consistently missing some of my Placement 
> Groups, leaving me with the following distribution of PGs and the last day 
> they were successfully deep-scrubbed.
> 
> # ceph pg dump all | grep active | awk '{ print $20}' | sort -k1 | uniq -c
>      5 2013-11-06
>    221 2013-11-20
>      1 2014-02-17
>     25 2014-02-19
>     60 2014-02-20
>      4 2014-03-06
>      3 2014-04-03
>      6 2014-04-04
>      6 2014-04-05
>     13 2014-04-06
>      4 2014-04-08
>      3 2014-04-10
>      2 2014-04-11
>     50 2014-04-12
>     28 2014-04-13
>     14 2014-04-14
>      3 2014-04-15
>     78 2014-04-16
>     44 2014-04-17
>      8 2014-04-18
>      1 2014-04-20
>     16 2014-05-02
>     69 2014-05-04
>    140 2014-05-05
>    569 2014-05-06
>   9231 2014-05-07
>    103 2014-05-08
>    514 2014-05-09
>   1593 2014-05-10
>    393 2014-05-16
>   2563 2014-05-17
>   1283 2014-05-18
>   1640 2014-05-19
>   1979 2014-05-20
> 
> I have been running the default "osd deep scrub interval" of once per week, 
> but have disabled deep-scrub on several occasions in an attempt to avoid the 
> associated degraded cluster performance I have written about before.
> 
> To get the PGs longest in need of a deep-scrub started, I set the 
> nodeep-scrub flag, and wrote a script to manually kick off deep-scrub 
> according to age. It is processing as expected.
> 
> Do you consider this a feature request or a bug? Perhaps the code that 
> schedules PGs to deep-scrub could be improved to prioritize PGs that have 
> needed a deep-scrub the longest.
> 
> Thanks,
> Mike Dawson
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
