Hi,

We added some more OSDs to the cluster and it was fixed.
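
For reference, it was roughly along these lines (the device name is just an
example; your deployment tooling may differ):

# on each new OSD host
ceph-volume lvm create --data /dev/sdX

# then watch recovery and confirm the PG clears
ceph -s
ceph pg dump | grep remapped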

Karun Josy

On Tue, Jan 2, 2018 at 6:21 AM, 한승진 <yongi...@gmail.com> wrote:

> Are all OSDs the same version?
> I recently experienced a similar situation.
>
> I upgraded all OSDs to the exact same version and re-applied the pool
> configuration like below:
>
> ceph osd pool set <pool-name> min_size 5
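>
> For example, with a hypothetical pool named "ecpool" (use your own pool
> name), you can check the current value first and then re-apply it:
>
> ceph osd pool get ecpool min_size
> ceph osd pool set ecpool min_size 5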
>
> I have a 5+2 erasure code. The important thing is not the min_size value
> itself but the re-configuration, I think.
> I hope this helps you.
>
> On Tue, Dec 19, 2017 at 5:25 AM, Karun Josy <karunjo...@gmail.com> wrote:
>
> I think what happened is this :
>>
>> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
>>
>>
>> Note
>>
>>
>> Sometimes, typically in a “small” cluster with few hosts (for instance
>> with a small testing cluster), the fact to take out the OSD can spawn a
>> CRUSH corner case where some PGs remain stuck in the active+remapped
>> state.
>>
>> It's a small cluster with an unequal number of OSDs. One of the OSD disks
>> failed and I had taken it out.
>> I have already purged it, so I cannot use the reweight option mentioned
>> in that link.
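>>
>> (As far as I understand, the workaround there is along the lines of
>> "ceph osd crush reweight osd.<id> 0" before removal, which of course
>> requires the OSD to still exist in the CRUSH map.)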
>>
>>
>> So, are there any other workarounds?
>> Will adding more disks clear it?
>>
>> Karun Josy
>>
>> On Mon, Dec 18, 2017 at 9:06 AM, David Turner <drakonst...@gmail.com>
>> wrote:
>>
>>> Maybe try marking out the OSD that should have a copy of the PG but
>>> doesn't, then mark it back in. It might check that it has everything
>>> properly and pull a copy of the data it's missing. I dunno.
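>>>
>>> Something like this, replacing <id> with the OSD that should hold the
>>> missing copy:
>>>
>>> ceph osd out <id>
>>> # wait for peering/backfill to settle, then
>>> ceph osd in <id>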
>>>
>>> On Sun, Dec 17, 2017, 10:00 PM Karun Josy <karunjo...@gmail.com> wrote:
>>>
>>>> Tried restarting all OSDs. Still no luck.
>>>>
>>>> Will adding a new disk to any of the servers force a rebalance and fix
>>>> it?
>>>>
>>>> Karun Josy
>>>>
>>>> On Sun, Dec 17, 2017 at 12:22 PM, Cary <dynamic.c...@gmail.com> wrote:
>>>>
>>>>> Karun,
>>>>>
>>>>>  Could you paste in the output from "ceph health detail"? Which OSD
>>>>> was just added?
>>>>>
>>>>> Cary
>>>>> -Dynamic
>>>>>
>>>>> On Sun, Dec 17, 2017 at 4:59 AM, Karun Josy <karunjo...@gmail.com>
>>>>> wrote:
>>>>> > Any help would be appreciated!
>>>>> >
>>>>> > Karun Josy
>>>>> >
>>>>> > On Sat, Dec 16, 2017 at 11:04 PM, Karun Josy <karunjo...@gmail.com>
>>>>> wrote:
>>>>> >>
>>>>> >> Hi,
>>>>> >>
>>>>> >> Repair didn't fix the issue.
>>>>> >>
>>>>> >> In the pg dump details, I notice this NONE. It seems the PG is
>>>>> >> missing from one of the OSDs:
>>>>> >>
>>>>> >> [0,2,NONE,4,12,10,5,1]
>>>>> >> [0,2,1,4,12,10,5,1]
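>>>>> >>
>>>>> >> For what it's worth, something like this should show the full
>>>>> >> peering/acting detail for that PG:
>>>>> >>
>>>>> >> ceph pg 3.4 query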
>>>>> >>
>>>>> >> Is there no way Ceph corrects this automatically? Do I have to
>>>>> >> troubleshoot it manually?
>>>>> >>
>>>>> >> Karun
>>>>> >>
>>>>> >> On Sat, Dec 16, 2017 at 10:44 PM, Cary <dynamic.c...@gmail.com>
>>>>> wrote:
>>>>> >>>
>>>>> >>> Karun,
>>>>> >>>
>>>>> >>>  Running ceph pg repair should not cause any problems. It may not
>>>>> >>> fix the issue though. If that does not help, there is more
>>>>> >>> information at the link below.
>>>>> >>> http://ceph.com/geen-categorie/ceph-manually-repair-object/
>>>>> >>>
>>>>> >>> I recommend not rebooting or restarting while Ceph is repairing or
>>>>> >>> recovering. If possible, wait until the cluster is in a healthy
>>>>> >>> state first.
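>>>>> >>>
>>>>> >>> For example, keep an eye on:
>>>>> >>>
>>>>> >>> ceph -s
>>>>> >>> ceph health detail
>>>>> >>>
>>>>> >>> until the cluster reports HEALTH_OK before restarting anything.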
>>>>> >>>
>>>>> >>> Cary
>>>>> >>> -Dynamic
>>>>> >>>
>>>>> >>> On Sat, Dec 16, 2017 at 2:05 PM, Karun Josy <karunjo...@gmail.com>
>>>>> wrote:
>>>>> >>> > Hi Cary,
>>>>> >>> >
>>>>> >>> > No, I didn't try to repair it.
>>>>> >>> > I am comparatively new to Ceph. Is it okay to try to repair it?
>>>>> >>> > Or should I take any precautions while doing it?
>>>>> >>> >
>>>>> >>> > Karun Josy
>>>>> >>> >
>>>>> >>> > On Sat, Dec 16, 2017 at 2:08 PM, Cary <dynamic.c...@gmail.com>
>>>>> wrote:
>>>>> >>> >>
>>>>> >>> >> Karun,
>>>>> >>> >>
>>>>> >>> >>  Did you attempt a "ceph pg repair <pgid>"? Replace <pgid> with
>>>>> >>> >> the PG ID that needs to be repaired, 3.4.
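>>>>> >>> >>
>>>>> >>> >> In this case that would be:
>>>>> >>> >>
>>>>> >>> >> ceph pg repair 3.4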
>>>>> >>> >>
>>>>> >>> >> Cary
>>>>> >>> >> -D123
>>>>> >>> >>
>>>>> >>> >> On Sat, Dec 16, 2017 at 8:24 AM, Karun Josy <
>>>>> karunjo...@gmail.com>
>>>>> >>> >> wrote:
>>>>> >>> >> > Hello,
>>>>> >>> >> >
>>>>> >>> >> > I added 1 disk to the cluster and after rebalancing, it shows
>>>>> >>> >> > 1 PG is in the remapped state. How can I correct it?
>>>>> >>> >> >
>>>>> >>> >> > (I had to restart some OSDs during the rebalancing as there
>>>>> >>> >> > were some slow requests.)
>>>>> >>> >> >
>>>>> >>> >> > $ ceph pg dump | grep remapped
>>>>> >>> >> > dumped all
>>>>> >>> >> > 3.4  981  0  0  0  0  2655009792  1535  1535
>>>>> >>> >> > active+clean+remapped  2017-12-15 22:07:21.663964
>>>>> >>> >> > 2824'785115  2824:2297888  [0,2,NONE,4,12,10,5,1]  0
>>>>> >>> >> > [0,2,1,4,12,10,5,1]  0  2288'767367
>>>>> >>> >> > 2017-12-14 11:00:15.576741  417'518549
>>>>> >>> >> > 2017-12-08 03:56:14.006982
>>>>> >>> >> >
>>>>> >>> >> > That PG belongs to an erasure pool with a k=5, m=3 profile;
>>>>> >>> >> > the failure domain is host.
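>>>>> >>> >> > (For context, that profile would have been created with
>>>>> >>> >> > something along the lines of "ceph osd erasure-code-profile
>>>>> >>> >> > set <profile-name> k=5 m=3 crush-failure-domain=host".)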
>>>>> >>> >> >
>>>>> >>> >> > ===========
>>>>> >>> >> >
>>>>> >>> >> > $ ceph osd tree
>>>>> >>> >> > ID  CLASS WEIGHT   TYPE NAME         STATUS REWEIGHT PRI-AFF
>>>>> >>> >> >  -1       16.94565 root default
>>>>> >>> >> >  -3        2.73788     host ceph-a1
>>>>> >>> >> >   0   ssd  1.86469         osd.0         up  1.00000 1.00000
>>>>> >>> >> >  14   ssd  0.87320         osd.14        up  1.00000 1.00000
>>>>> >>> >> >  -5        2.73788     host ceph-a2
>>>>> >>> >> >   1   ssd  1.86469         osd.1         up  1.00000 1.00000
>>>>> >>> >> >  15   ssd  0.87320         osd.15        up  1.00000 1.00000
>>>>> >>> >> >  -7        1.86469     host ceph-a3
>>>>> >>> >> >   2   ssd  1.86469         osd.2         up  1.00000 1.00000
>>>>> >>> >> >  -9        1.74640     host ceph-a4
>>>>> >>> >> >   3   ssd  0.87320         osd.3         up  1.00000 1.00000
>>>>> >>> >> >   4   ssd  0.87320         osd.4         up  1.00000 1.00000
>>>>> >>> >> > -11        1.74640     host ceph-a5
>>>>> >>> >> >   5   ssd  0.87320         osd.5         up  1.00000 1.00000
>>>>> >>> >> >   6   ssd  0.87320         osd.6         up  1.00000 1.00000
>>>>> >>> >> > -13        1.74640     host ceph-a6
>>>>> >>> >> >   7   ssd  0.87320         osd.7         up  1.00000 1.00000
>>>>> >>> >> >   8   ssd  0.87320         osd.8         up  1.00000 1.00000
>>>>> >>> >> > -15        1.74640     host ceph-a7
>>>>> >>> >> >   9   ssd  0.87320         osd.9         up  1.00000 1.00000
>>>>> >>> >> >  10   ssd  0.87320         osd.10        up  1.00000 1.00000
>>>>> >>> >> > -17        2.61960     host ceph-a8
>>>>> >>> >> >  11   ssd  0.87320         osd.11        up  1.00000 1.00000
>>>>> >>> >> >  12   ssd  0.87320         osd.12        up  1.00000 1.00000
>>>>> >>> >> >  13   ssd  0.87320         osd.13        up  1.00000 1.00000
>>>>> >>> >> >
>>>>> >>> >> >
>>>>> >>> >> >
>>>>> >>> >> > Karun
>>>>> >>> >> >
>>>>> >>> >> >
>>>>> >>> >
>>>>> >>> >
>>>>> >>
>>>>> >>
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>>
>>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
