Hi, nice coincidence that you mention that today; I've just debugged the exact same problem on a setup where deep_scrub_interval was increased.
The solution was to set the deep_scrub_interval directly on all pools instead (which was better for this particular setup anyway):

    ceph osd pool set <pool> deep_scrub_interval <deep_scrub_in_seconds>

(a short loop for applying this to every pool is sketched after the quoted message below)

Here's the code that generates the warning:
https://github.com/ceph/ceph/blob/v14.2.4/src/mon/PGMap.cc#L3058

* There's no obvious bug in the code; no reason why it shouldn't work with
  the option, unless "pool->opts.get(pool_opts_t::DEEP_SCRUB_INTERVAL, x)"
  returns the wrong thing when the option isn't configured for a pool
* I've used "config diff" to check that all mons use the correct value for
  deep_scrub_interval
* mon_warn_pg_not_deep_scrubbed_ratio is a little odd because the warning
  triggers at (mon_warn_pg_not_deep_scrubbed_ratio + 1) * deep_scrub_interval,
  which is somewhat unexpected, so by default at 125% of the configured
  interval

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Mon, Dec 9, 2019 at 5:17 PM Robert LeBlanc <rob...@leblancnet.us> wrote:
> I've increased the deep_scrub interval on the OSDs on our Nautilus cluster
> with the following added to the [osd] section:
>
> osd_deep_scrub_interval = 2600000
>
> And I started seeing
>
> 1518 pgs not deep-scrubbed in time
>
> in ceph -s. So I added
>
> mon_warn_pg_not_deep_scrubbed_ratio = 1
>
> since the default would start warning with a whole week left to scrub. But
> the messages persist. The cluster has been running for a month with these
> settings. Here is an example of the output. As you can see, some of these
> are not even two weeks old, nowhere close to the 75% of 4 weeks.
>
> pg 6.1f49 not deep-scrubbed since 2019-11-09 23:04:55.370373
> pg 6.1f47 not deep-scrubbed since 2019-11-18 16:10:52.561204
> pg 6.1f44 not deep-scrubbed since 2019-11-18 15:48:16.825569
> pg 6.1f36 not deep-scrubbed since 2019-11-20 05:39:00.309340
> pg 6.1f31 not deep-scrubbed since 2019-11-27 02:48:45.347680
> pg 6.1f30 not deep-scrubbed since 2019-11-11 21:34:15.795622
> pg 6.1f2d not deep-scrubbed since 2019-11-24 11:37:39.502829
> pg 6.1f27 not deep-scrubbed since 2019-11-25 07:38:58.689315
> pg 6.1f25 not deep-scrubbed since 2019-11-20 00:13:43.048569
> pg 6.1f1a not deep-scrubbed since 2019-11-09 15:08:43.516666
> pg 6.1f19 not deep-scrubbed since 2019-11-25 10:24:47.884332
> 1468 more pgs...
> Mon Dec 9 08:12:01 PST 2019
>
> There is very little data on the cluster, so it's not a problem of
> deep-scrubs taking too long:
>
> $ ceph df
> RAW STORAGE:
>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
>     hdd       6.3 PiB     6.1 PiB     153 TiB     154 TiB           2.39
>     nvme      5.8 TiB     5.6 TiB     138 GiB     197 GiB           3.33
>     TOTAL     6.3 PiB     6.2 PiB     154 TiB     154 TiB           2.39
>
> POOLS:
>     POOL                           ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
>     .rgw.root                       1     3.0 KiB           7     3.0 KiB         0       1.8 PiB
>     default.rgw.control             2         0 B           8         0 B         0       1.8 PiB
>     default.rgw.meta                3     7.4 KiB          24     7.4 KiB         0       1.8 PiB
>     default.rgw.log                 4      11 GiB         341      11 GiB         0       1.8 PiB
>     default.rgw.buckets.data        6     100 TiB      41.84M     100 TiB      1.82       4.2 PiB
>     default.rgw.buckets.index       7      33 GiB         574      33 GiB         0       1.8 PiB
>     default.rgw.buckets.non-ec      8     8.1 MiB          22     8.1 MiB         0       1.8 PiB
>
> Please help me figure out what I'm doing wrong with these settings.
>
> Thanks,
> Robert LeBlanc
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
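P.S. the loop mentioned above, for applying the per-pool setting everywhere in one go; a minimal sketch only, assuming you want to reuse the 2600000-second value from your [osd] section (plain bash, not tested against your cluster):

    # assumption: 2600000 s (~30 days) is the interval you actually want on every pool
    for pool in $(ceph osd pool ls); do
        # store deep_scrub_interval as a per-pool option so the mon's warning
        # check picks it up via the pool options instead of the global
        # osd_deep_scrub_interval
        ceph osd pool set "$pool" deep_scrub_interval 2600000
    done

    # spot-check one pool (name taken from your "ceph df" output)
    ceph osd pool get default.rgw.buckets.data deep_scrub_interval

And as a sanity check on the numbers: if the (ratio + 1) * interval formula above is really what the mon applies, then with mon_warn_pg_not_deep_scrubbed_ratio = 1 and a 2600000 s (~30 day) interval a PG should only be flagged after about 5200000 s (~60 days) without a deep scrub, so PGs last deep-scrubbed in November shouldn't appear in that list at all.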
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com