Hi,

We added some more OSDs to the cluster and it was fixed.

Karun Josy
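A minimal sketch of how the fix might be verified once the new OSDs finish backfilling, using the PG ID (3.4) and the k=5, m=3 layout from the thread below (adjust the PG ID for your own cluster):

$ ceph health detail                      # overall cluster health, lists any stuck PGs
$ ceph pg dump pgs_brief | grep remapped  # should return nothing once the PG is clean
$ ceph pg 3.4 query                       # inspect the "up" and "acting" sets directly

The remapped PG should drop out of the dump, and in the query output the "up" and "acting" sets should match with no NONE entry, i.e. CRUSH can now place all 8 shards of the erasure-coded PG.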
On Tue, Jan 2, 2018 at 6:21 AM, 한승진 <yongi...@gmail.com> wrote:
> Are all osds the same version?
> I recently experienced a similar situation.
>
> I upgraded all osds to the exact same version and reset the pool
> configuration like below:
>
> ceph osd pool set <pool-name> min_size 5
>
> I have a 5+2 erasure code; the important thing is not the min_size number
> but the re-configuration, I think.
> I hope this helps you.
>
> On Dec 19, 2017 at 5:25 AM, "Karun Josy" <karunjo...@gmail.com> wrote:
>
>> I think what happened is this:
>>
>> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
>>
>> Note
>>
>> Sometimes, typically in a “small” cluster with few hosts (for instance
>> with a small testing cluster), the fact to take out the OSD can spawn a
>> CRUSH corner case where some PGs remain stuck in the active+remapped
>> state
>>
>> It's a small cluster with an unequal number of osds, and one of the OSD
>> disks failed and I had taken it out.
>> I have already purged it, so I cannot use the reweight option mentioned
>> in that link.
>>
>> So, any other workarounds?
>> Will adding more disks clear it?
>>
>> Karun Josy
>>
>> On Mon, Dec 18, 2017 at 9:06 AM, David Turner <drakonst...@gmail.com> wrote:
>>
>>> Maybe try outing the disk that should have a copy of the PG, but
>>> doesn't. Then mark it back in. It might check that it has everything
>>> properly and pull a copy of the data it's missing. I dunno.
>>>
>>> On Sun, Dec 17, 2017, 10:00 PM Karun Josy <karunjo...@gmail.com> wrote:
>>>
>>>> Tried restarting all osds. Still no luck.
>>>>
>>>> Will adding a new disk to any of the servers force a rebalance and
>>>> fix it?
>>>>
>>>> Karun Josy
>>>>
>>>> On Sun, Dec 17, 2017 at 12:22 PM, Cary <dynamic.c...@gmail.com> wrote:
>>>>
>>>>> Karun,
>>>>>
>>>>> Could you paste in the output from "ceph health detail"? Which OSD
>>>>> was just added?
>>>>>
>>>>> Cary
>>>>> -Dynamic
>>>>>
>>>>> On Sun, Dec 17, 2017 at 4:59 AM, Karun Josy <karunjo...@gmail.com> wrote:
>>>>> > Any help would be appreciated!
>>>>> >
>>>>> > Karun Josy
>>>>> >
>>>>> > On Sat, Dec 16, 2017 at 11:04 PM, Karun Josy <karunjo...@gmail.com> wrote:
>>>>> >>
>>>>> >> Hi,
>>>>> >>
>>>>> >> Repair didn't fix the issue.
>>>>> >>
>>>>> >> In the pg dump details, I notice this NONE. It seems the pg is
>>>>> >> missing from one of the OSDs:
>>>>> >>
>>>>> >> [0,2,NONE,4,12,10,5,1]
>>>>> >> [0,2,1,4,12,10,5,1]
>>>>> >>
>>>>> >> Is there no way Ceph corrects this automatically? Do I have to
>>>>> >> edit/troubleshoot it manually?
>>>>> >>
>>>>> >> Karun
>>>>> >>
>>>>> >> On Sat, Dec 16, 2017 at 10:44 PM, Cary <dynamic.c...@gmail.com> wrote:
>>>>> >>>
>>>>> >>> Karun,
>>>>> >>>
>>>>> >>> Running ceph pg repair should not cause any problems. It may not fix
>>>>> >>> the issue though. If that does not help, there is more information at
>>>>> >>> the link below.
>>>>> >>> http://ceph.com/geen-categorie/ceph-manually-repair-object/
>>>>> >>>
>>>>> >>> I recommend not rebooting or restarting while Ceph is repairing or
>>>>> >>> recovering. If possible, wait until the cluster is in a healthy state
>>>>> >>> first.
>>>>> >>>
>>>>> >>> Cary
>>>>> >>> -Dynamic
>>>>> >>>
>>>>> >>> On Sat, Dec 16, 2017 at 2:05 PM, Karun Josy <karunjo...@gmail.com> wrote:
>>>>> >>> > Hi Cary,
>>>>> >>> >
>>>>> >>> > No, I didn't try to repair it.
>>>>> >>> > I am comparatively new to Ceph. Is it okay to try to repair it?
>>>>> >>> > Or should I take any precautions while doing it?
>>>>> >>> >
>>>>> >>> > Karun Josy
>>>>> >>> >
>>>>> >>> > On Sat, Dec 16, 2017 at 2:08 PM, Cary <dynamic.c...@gmail.com> wrote:
>>>>> >>> >>
>>>>> >>> >> Karun,
>>>>> >>> >>
>>>>> >>> >> Did you attempt a "ceph pg repair <pgid>"? Replace <pgid> with the
>>>>> >>> >> pg ID that needs to be repaired, 3.4.
>>>>> >>> >>
>>>>> >>> >> Cary
>>>>> >>> >> -D123
>>>>> >>> >>
>>>>> >>> >> On Sat, Dec 16, 2017 at 8:24 AM, Karun Josy <karunjo...@gmail.com> wrote:
>>>>> >>> >> > Hello,
>>>>> >>> >> >
>>>>> >>> >> > I added 1 disk to the cluster and after rebalancing, it shows
>>>>> >>> >> > 1 PG is in the remapped state. How can I correct it?
>>>>> >>> >> >
>>>>> >>> >> > (I had to restart some osds during the rebalancing as there
>>>>> >>> >> > were some slow requests)
>>>>> >>> >> >
>>>>> >>> >> > $ ceph pg dump | grep remapped
>>>>> >>> >> > dumped all
>>>>> >>> >> > 3.4  981  0  0  0  0  2655009792  1535  1535
>>>>> >>> >> > active+clean+remapped  2017-12-15 22:07:21.663964
>>>>> >>> >> > 2824'785115  2824:2297888  [0,2,NONE,4,12,10,5,1]  0
>>>>> >>> >> > [0,2,1,4,12,10,5,1]  0  2288'767367  2017-12-14 11:00:15.576741
>>>>> >>> >> > 417'518549  2017-12-08 03:56:14.006982
>>>>> >>> >> >
>>>>> >>> >> > That PG belongs to an erasure pool with a k=5, m=3 profile;
>>>>> >>> >> > the failure domain is host.
>>>>> >>> >> >
>>>>> >>> >> > ===========
>>>>> >>> >> >
>>>>> >>> >> > $ ceph osd tree
>>>>> >>> >> > ID  CLASS  WEIGHT    TYPE NAME         STATUS  REWEIGHT  PRI-AFF
>>>>> >>> >> > -1         16.94565  root default
>>>>> >>> >> > -3          2.73788      host ceph-a1
>>>>> >>> >> >  0    ssd   1.86469          osd.0         up   1.00000  1.00000
>>>>> >>> >> > 14    ssd   0.87320          osd.14        up   1.00000  1.00000
>>>>> >>> >> > -5          2.73788      host ceph-a2
>>>>> >>> >> >  1    ssd   1.86469          osd.1         up   1.00000  1.00000
>>>>> >>> >> > 15    ssd   0.87320          osd.15        up   1.00000  1.00000
>>>>> >>> >> > -7          1.86469      host ceph-a3
>>>>> >>> >> >  2    ssd   1.86469          osd.2         up   1.00000  1.00000
>>>>> >>> >> > -9          1.74640      host ceph-a4
>>>>> >>> >> >  3    ssd   0.87320          osd.3         up   1.00000  1.00000
>>>>> >>> >> >  4    ssd   0.87320          osd.4         up   1.00000  1.00000
>>>>> >>> >> > -11         1.74640      host ceph-a5
>>>>> >>> >> >  5    ssd   0.87320          osd.5         up   1.00000  1.00000
>>>>> >>> >> >  6    ssd   0.87320          osd.6         up   1.00000  1.00000
>>>>> >>> >> > -13         1.74640      host ceph-a6
>>>>> >>> >> >  7    ssd   0.87320          osd.7         up   1.00000  1.00000
>>>>> >>> >> >  8    ssd   0.87320          osd.8         up   1.00000  1.00000
>>>>> >>> >> > -15         1.74640      host ceph-a7
>>>>> >>> >> >  9    ssd   0.87320          osd.9         up   1.00000  1.00000
>>>>> >>> >> > 10    ssd   0.87320          osd.10        up   1.00000  1.00000
>>>>> >>> >> > -17         2.61960      host ceph-a8
>>>>> >>> >> > 11    ssd   0.87320          osd.11        up   1.00000  1.00000
>>>>> >>> >> > 12    ssd   0.87320          osd.12        up   1.00000  1.00000
>>>>> >>> >> > 13    ssd   0.87320          osd.13        up   1.00000  1.00000
>>>>> >>> >> >
>>>>> >>> >> > Karun
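For future reference, the reweight workaround from the add-or-rm-osds note quoted above only applies while the failed OSD still exists in the CRUSH map (here it had already been purged, which is why adding OSDs was the way out). Roughly, with {osd-id} as a placeholder for the failed OSD, it looks like this:

$ ceph osd in {osd-id}                    # undo the "out" that triggered the corner case
$ ceph osd crush reweight osd.{osd-id} 0  # drain it by CRUSH weight instead
# wait for the data migration to finish, then remove the OSD as usual:
$ ceph osd out {osd-id}
$ ceph osd purge {osd-id} --yes-i-really-mean-it

The difference is that reweighting to 0 also lowers the weight of the host bucket, which lets CRUSH find a complete placement again in a small cluster, whereas marking the OSD out leaves the bucket weight unchanged.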
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com