You should probably have used 2048, following the usual target of ~100
PGs per OSD. Just increase the mon_max_pg_per_osd option; ~200 PGs per
OSD is still okay-ish, and your cluster will grow out of it :)
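For context, the arithmetic behind those numbers: 16 hosts x 12 HDD OSDs
= 192 OSDs, and with k=6, m=3 every PG occupies k+m = 9 OSDs, so

  4096 PGs * 9 / 192 OSDs = ~192 PGs per OSD

which sits just under your mon_max_pg_per_osd of 200 even in steady
state (your osd df output confirms it: 176 to 221 PGs per OSD). During
recovery an OSD temporarily holds PGs for both the old and the new
mapping, so the count spikes past the limit and overdose protection
leaves those PGs stuck activating. With 2048 PGs you'd be at
2048 * 9 / 192 = 96 PGs per OSD, i.e. the usual ~100 target.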
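To confirm that this is what you're hitting: watch the PGS column of
"ceph osd df tree" while the OSDs rejoin and see whether OSDs climb past
the limit, and check the log of an OSD that hosts a stuck PG. If I
remember correctly, a Luminous OSD that refuses to activate a PG for
this reason logs a "withhold creation of pg" message (wording from
memory, I may be off):

  # per-OSD PG counts (PGS column), grouped by host
  ceph osd df tree

  # grep an affected OSD's log for overdose messages (osd.9 as an example)
  grep -i 'withhold creation' /var/log/ceph/ceph-osd.9.log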
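To raise the limit, something like the following should do (300 is just
an example value; injectargs may warn that the change requires a
restart, in which case restart the daemons):

  # bump the soft limit at runtime on mons and OSDs
  ceph tell mon.* injectargs '--mon_max_pg_per_osd 300'
  ceph tell osd.* injectargs '--mon_max_pg_per_osd 300'

  # and persist it in ceph.conf under [global] so it survives restarts:
  mon_max_pg_per_osd = 300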
Paul

2018-08-01 19:55 GMT+02:00 Alexandros Afentoulis <alexaf+c...@noc.grnet.gr>:
> Hello people :)
>
> we are facing a situation quite similar to the one described here:
> http://tracker.ceph.com/issues/23117
>
> Namely:
>
> we have a Luminous cluster consisting of 16 hosts, where each host
> holds 12 OSDs on spinning disks and 4 OSDs on SSDs. Let's forget the
> SSDs for now, since they're not in use at the moment.
>
> We have an Erasure Coding pool (k=6, m=3) with 4096 PGs, residing on
> the spinning disks, with failure domain "host".
>
> After taking a host (and its OSDs) out for maintenance, we're trying
> to put the OSDs back in. While the cluster starts recovering we observe
>
> > Reduced data availability: 170 pgs inactive
>
> and
>
> > 170 activating+remapped
>
> This eventually leads to slow/stuck requests and we have to take the
> OSDs out again.
>
> While searching around we came across the already-mentioned issue on
> the tracker [1], and we're wondering whether "PG overdose protection"
> [2] is what we're really facing.
>
> Our cluster features:
>
> > "mon_max_pg_per_osd": "200",
> > "osd_max_pg_per_osd_hard_ratio": "2.000000",
>
> What is more, we observed that the PG distribution among the OSDs is
> not uniform, e.g.:
>
> > ID  CLASS WEIGHT    REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME
> >  -1       711.29004        -   666T   165T   500T     0    0    - root default
> > -17        44.68457        - 45757G 11266G 34491G 24.62 0.99    - host rd3-1427
> >   9 hdd     3.66309  1.00000  3751G   976G  2774G 26.03 1.05  212 osd.9
> >  30 hdd     3.66309  1.00000  3751G   961G  2789G 25.64 1.03  209 osd.30
> >  46 hdd     3.66309  1.00000  3751G   902G  2848G 24.07 0.97  196 osd.46
> >  61 hdd     3.66309  1.00000  3751G   877G  2873G 23.40 0.94  190 osd.61
> >  76 hdd     3.66309  1.00000  3751G   984G  2766G 26.24 1.05  214 osd.76
> >  92 hdd     3.66309  1.00000  3751G   894G  2856G 23.84 0.96  194 osd.92
> > 107 hdd     3.66309  1.00000  3751G   881G  2869G 23.50 0.94  191 osd.107
> > 123 hdd     3.66309  1.00000  3751G   973G  2777G 25.97 1.04  212 osd.123
> > 138 hdd     3.66309  1.00000  3751G   975G  2775G 26.01 1.05  212 osd.138
> > 156 hdd     3.66309  1.00000  3751G   813G  2937G 21.69 0.87  176 osd.156
> > 172 hdd     3.66309  1.00000  3751G  1016G  2734G 27.09 1.09  221 osd.172
> > 188 hdd     3.66309  1.00000  3751G   998G  2752G 26.62 1.07  217 osd.188
>
> Could these OSDs, holding more than 200 PGs, contribute to the problem?
>
> Is there any way to confirm that we're hitting "PG overdose
> protection"? If that's true, how can we restore our cluster back to
> normal?
>
> Apart from getting these OSDs back to work, we're concerned about the
> overall choice of the number of PGs (4096) for that (6,3) EC pool.
>
> Any help appreciated,
> Alex
>
> [1] http://tracker.ceph.com/issues/23117
> [2] https://ceph.com/community/new-luminous-pg-overdose-protection/


-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com