You should probably have used 2048 PGs here, following the usual target of
~100 PGs per OSD.
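
As a rough sanity check, with 16 * 12 = 192 spinning-disk OSDs and k+m = 9
shards per PG (each shard counts as one PG instance on its OSD):

    4096 PGs * 9 shards / 192 OSDs = 192 PG instances per OSD
    2048 PGs * 9 shards / 192 OSDs =  96 PG instances per OSD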
Just increase the mon_max_pg_per_osd option; ~200 PGs per OSD is still
okay-ish, and your cluster will grow out of it :)
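
For example (the value 300 is just an illustration, pick whatever fits your
growth plans), on Luminous you can bump it at runtime with injectargs and
persist it in ceph.conf. As far as I remember the OSD-side hard limit is
derived from this option as well, so set it on the OSDs too:

    ceph tell mon.* injectargs '--mon_max_pg_per_osd=300'
    ceph tell osd.* injectargs '--mon_max_pg_per_osd=300'

    # ceph.conf, [global] section, so it survives restarts:
    mon_max_pg_per_osd = 300

injectargs may warn that the change is not observed until restart on some
daemons, which is another reason to also put it into ceph.conf.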

Paul

2018-08-01 19:55 GMT+02:00 Alexandros Afentoulis <alexaf+c...@noc.grnet.gr>:

> Hello people :)
>
> we are facing a situation quite similar to the one described here:
> http://tracker.ceph.com/issues/23117
>
> Namely:
>
> we have a Luminous cluster consisting of 16 hosts, where each host holds
> 12 OSDs on spinning disks and 4 OSDs on SSDs. Let's forget the SSDs for
> now since they're not used atm.
>
> We have an Erasure Coding pool (k=6, m=3) with 4096 PGs, residing on the
> spinning disks, with the failure domain set to host.
>
> After taking a host (and its OSDs) out for maintenance, we're trying
> to put the OSDs back in. While the cluster starts recovering we observe
>
> > Reduced data availability: 170 pgs inactive
>
> and
>
> > 170  activating+remapped
>
> This eventually leads to slow/stuck requests and we have to take the
> OSDs out again.
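>
> As a rough estimate (assuming the EC pool spans all 16 * 12 = 192
> spinning-disk OSDs, with each of the k+m = 9 shards counting as one PG
> instance on its OSD), while the host was out the remaining 180 OSDs averaged
>
>     4096 PGs * 9 shards / 180 OSDs = ~205 PG instances per OSD
>
> which is already above our mon_max_pg_per_osd of 200.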
>
> While searching around we came across the already mentioned issue on the
> tracker [1] and we're wondering whether "PG overdose protection" [2] is what
> we're really facing now.
>
> Our cluster is configured with:
>
> "mon_max_pg_per_osd": "200",
> "osd_max_pg_per_osd_hard_ratio": "2.000000",
>
> What's more, we observed that the PG distribution among the OSDs is
> not uniform, e.g.:
>
> > ID  CLASS WEIGHT    REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME
> >  -1       711.29004        -   666T   165T   500T     0    0   - root default
> > -17        44.68457        - 45757G 11266G 34491G 24.62 0.99   -     host rd3-1427
> >   9   hdd   3.66309  1.00000  3751G   976G  2774G 26.03 1.05 212         osd.9
> >  30   hdd   3.66309  1.00000  3751G   961G  2789G 25.64 1.03 209         osd.30
> >  46   hdd   3.66309  1.00000  3751G   902G  2848G 24.07 0.97 196         osd.46
> >  61   hdd   3.66309  1.00000  3751G   877G  2873G 23.40 0.94 190         osd.61
> >  76   hdd   3.66309  1.00000  3751G   984G  2766G 26.24 1.05 214         osd.76
> >  92   hdd   3.66309  1.00000  3751G   894G  2856G 23.84 0.96 194         osd.92
> > 107   hdd   3.66309  1.00000  3751G   881G  2869G 23.50 0.94 191         osd.107
> > 123   hdd   3.66309  1.00000  3751G   973G  2777G 25.97 1.04 212         osd.123
> > 138   hdd   3.66309  1.00000  3751G   975G  2775G 26.01 1.05 212         osd.138
> > 156   hdd   3.66309  1.00000  3751G   813G  2937G 21.69 0.87 176         osd.156
> > 172   hdd   3.66309  1.00000  3751G  1016G  2734G 27.09 1.09 221         osd.172
> > 188   hdd   3.66309  1.00000  3751G   998G  2752G 26.62 1.07 217         osd.188
>
> Could these OSDs, holding more than 200 PGs, contribute to the problem?
>
> Is there any way to confirm that we're hitting the "PG overdose
> protection"? If that's the case, how can we restore our cluster back to normal?
>
> Apart from getting these OSDs back into service, we're also concerned about
> the overall choice of the number of PGs (4096) for that (6,3) EC pool.
>
> Any help appreciated,
> Alex
>
> [1] http://tracker.ceph.com/issues/23117
> [2] https://ceph.com/community/new-luminous-pg-overdose-protection/
>



-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
