The replication size is 3 and min_size is 2. Yes, the PGs don't have enough copies. Ceph should recover from this state by itself to ensure durability.
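For reference, the effective per-pool values behind those numbers can be confirmed with the standard ceph CLI; a minimal sketch, using the .users pool from the ceph -s output below as an example:

    ceph osd pool get .users size        # replicated copies kept per object
    ceph osd pool get .users min_size    # minimum copies required to serve I/O

The same two commands apply to any of the other pools.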
@Tupper: In this bug, each node is hosting only three OSDs. In my setup, every node has 23 OSDs, so this should not be the issue.

On Tue, May 3, 2016 at 7:00 PM, Varada Kari <varada.k...@sandisk.com> wrote:
> PGs are degraded because they don't have enough copies of the data. What
> is your replication size?
>
> You can refer to
> http://docs.ceph.com/docs/master/rados/operations/pg-states/ for PG states.
>
> Varada
>
> On Tuesday 03 May 2016 06:56 PM, Gaurav Bafna wrote:
>> Also, the old PGs are not mapped to the down OSD, as seen from the
>> ceph health detail:
>>
>> pg 5.72 is active+undersized+degraded, acting [16,49]
>> pg 5.4e is active+undersized+degraded, acting [16,38]
>> pg 5.32 is active+undersized+degraded, acting [39,19]
>> pg 5.37 is active+undersized+degraded, acting [43,1]
>> pg 5.2c is active+undersized+degraded, acting [47,18]
>> pg 5.27 is active+undersized+degraded, acting [26,19]
>> pg 6.13 is active+undersized+degraded, acting [30,16]
>> pg 4.17 is active+undersized+degraded, acting [47,20]
>> pg 7.a is active+undersized+degraded, acting [38,2]
>>
>> From pg query of 7.a:
>>
>> {
>>     "state": "active+undersized+degraded",
>>     "snap_trimq": "[]",
>>     "epoch": 857,
>>     "up": [
>>         38,
>>         2
>>     ],
>>     "acting": [
>>         38,
>>         2
>>     ],
>>     "actingbackfill": [
>>         "2",
>>         "38"
>>     ],
>>     "info": {
>>         "pgid": "7.a",
>>         "last_update": "0'0",
>>         "last_complete": "0'0",
>>         "log_tail": "0'0",
>>         "last_user_version": 0,
>>         "last_backfill": "MAX",
>>         "purged_snaps": "[]",
>>         "history": {
>>             "epoch_created": 13,
>>             "last_epoch_started": 818,
>>             "last_epoch_clean": 818,
>>             "last_epoch_split": 0,
>>             "same_up_since": 817,
>>             "same_interval_since": 817,
>>
>> Complete pg query info at: http://pastebin.com/ZHB6M4PQ
>>
>> On Tue, May 3, 2016 at 6:46 PM, Gaurav Bafna <baf...@gmail.com> wrote:
>>> Thanks Tupper for replying.
>>>
>>> Shouldn't the PGs be remapped to other OSDs?
>>>
>>> Yes, removing the OSD from the cluster results in a full recovery,
>>> but that should not be needed, right?
>>>
>>> On Tue, May 3, 2016 at 6:31 PM, Tupper Cole <tc...@redhat.com> wrote:
>>>> The degraded PGs are mapped to the down OSD and have not been mapped to a new
>>>> OSD. Removing the OSD would likely result in a full recovery.
>>>>
>>>> As a note, having two monitors (or any even number of monitors) is not
>>>> recommended. If either monitor goes down, you will lose quorum. The
>>>> recommended number of monitors for any cluster is at least three.
>>>>
>>>> On Tue, May 3, 2016 at 8:42 AM, Gaurav Bafna <baf...@gmail.com> wrote:
>>>>> Hi Cephers,
>>>>>
>>>>> I am running a very small cluster of 3 storage and 2 monitor nodes.
>>>>>
>>>>> After I kill one OSD daemon, the cluster never recovers fully: 9 PGs
>>>>> remain undersized for no apparent reason.
>>>>>
>>>>> After I restart that one OSD daemon, the cluster recovers in no time.
>>>>>
>>>>> The size of all pools is 3 and min_size is 2.
>>>>>
>>>>> Can anybody please help?
>>>>>
>>>>> Output of "ceph -s":
>>>>>
>>>>>     cluster fac04d85-db48-4564-b821-deebda046261
>>>>>      health HEALTH_WARN
>>>>>             9 pgs degraded
>>>>>             9 pgs stuck degraded
>>>>>             9 pgs stuck unclean
>>>>>             9 pgs stuck undersized
>>>>>             9 pgs undersized
>>>>>             recovery 3327/195138 objects degraded (1.705%)
>>>>>             pool .users pg_num 512 > pgp_num 8
>>>>>      monmap e2: 2 mons at {dssmon2=10.140.13.13:6789/0,dssmonleader1=10.140.13.11:6789/0}
>>>>>             election epoch 1038, quorum 0,1 dssmonleader1,dssmon2
>>>>>      osdmap e857: 69 osds: 68 up, 68 in
>>>>>       pgmap v106601: 896 pgs, 9 pools, 435 MB data, 65047 objects
>>>>>             279 GB used, 247 TB / 247 TB avail
>>>>>             3327/195138 objects degraded (1.705%)
>>>>>                  887 active+clean
>>>>>                    9 active+undersized+degraded
>>>>>   client io 395 B/s rd, 0 B/s wr, 0 op/s
>>>>>
>>>>> ceph health detail output:
>>>>>
>>>>> HEALTH_WARN 9 pgs degraded; 9 pgs stuck degraded; 9 pgs stuck unclean; 9 pgs stuck undersized; 9 pgs undersized; recovery 3327/195138 objects degraded (1.705%); pool .users pg_num 512 > pgp_num 8
>>>>> pg 7.a is stuck unclean for 322742.938959, current state active+undersized+degraded, last acting [38,2]
>>>>> pg 5.27 is stuck unclean for 322754.823455, current state active+undersized+degraded, last acting [26,19]
>>>>> pg 5.32 is stuck unclean for 322750.685684, current state active+undersized+degraded, last acting [39,19]
>>>>> pg 6.13 is stuck unclean for 322732.665345, current state active+undersized+degraded, last acting [30,16]
>>>>> pg 5.4e is stuck unclean for 331869.103538, current state active+undersized+degraded, last acting [16,38]
>>>>> pg 5.72 is stuck unclean for 331871.208948, current state active+undersized+degraded, last acting [16,49]
>>>>> pg 4.17 is stuck unclean for 331822.771240, current state active+undersized+degraded, last acting [47,20]
>>>>> pg 5.2c is stuck unclean for 323021.274535, current state active+undersized+degraded, last acting [47,18]
>>>>> pg 5.37 is stuck unclean for 323007.574395, current state active+undersized+degraded, last acting [43,1]
>>>>> pg 7.a is stuck undersized for 322487.284302, current state active+undersized+degraded, last acting [38,2]
>>>>> pg 5.27 is stuck undersized for 322487.287164, current state active+undersized+degraded, last acting [26,19]
>>>>> pg 5.32 is stuck undersized for 322487.285566, current state active+undersized+degraded, last acting [39,19]
>>>>> pg 6.13 is stuck undersized for 322487.287168, current state active+undersized+degraded, last acting [30,16]
>>>>> pg 5.4e is stuck undersized for 331351.476170, current state active+undersized+degraded, last acting [16,38]
>>>>> pg 5.72 is stuck undersized for 331351.475707, current state active+undersized+degraded, last acting [16,49]
>>>>> pg 4.17 is stuck undersized for 322487.280309, current state active+undersized+degraded, last acting [47,20]
>>>>> pg 5.2c is stuck undersized for 322487.286347, current state active+undersized+degraded, last acting [47,18]
>>>>> pg 5.37 is stuck undersized for 322487.280027, current state active+undersized+degraded, last acting [43,1]
>>>>> pg 7.a is stuck degraded for 322487.284340, current state active+undersized+degraded, last acting [38,2]
>>>>> pg 5.27 is stuck degraded for 322487.287202, current state active+undersized+degraded, last acting [26,19]
>>>>> pg 5.32 is stuck degraded for 322487.285604, current state active+undersized+degraded, last acting [39,19]
>>>>> pg 6.13 is stuck degraded for 322487.287207, current state active+undersized+degraded, last acting [30,16]
>>>>> pg 5.4e is stuck degraded for 331351.476209, current state active+undersized+degraded, last acting [16,38]
>>>>> pg 5.72 is stuck degraded for 331351.475746, current state active+undersized+degraded, last acting [16,49]
>>>>> pg 4.17 is stuck degraded for 322487.280348, current state active+undersized+degraded, last acting [47,20]
>>>>> pg 5.2c is stuck degraded for 322487.286386, current state active+undersized+degraded, last acting [47,18]
>>>>> pg 5.37 is stuck degraded for 322487.280066, current state active+undersized+degraded, last acting [43,1]
>>>>> pg 5.72 is active+undersized+degraded, acting [16,49]
>>>>> pg 5.4e is active+undersized+degraded, acting [16,38]
>>>>> pg 5.32 is active+undersized+degraded, acting [39,19]
>>>>> pg 5.37 is active+undersized+degraded, acting [43,1]
>>>>> pg 5.2c is active+undersized+degraded, acting [47,18]
>>>>> pg 5.27 is active+undersized+degraded, acting [26,19]
>>>>> pg 6.13 is active+undersized+degraded, acting [30,16]
>>>>> pg 4.17 is active+undersized+degraded, acting [47,20]
>>>>> pg 7.a is active+undersized+degraded, acting [38,2]
>>>>> recovery 3327/195138 objects degraded (1.705%)
>>>>> pool .users pg_num 512 > pgp_num 8
>>>>>
>>>>> My crush map is default.
>>>>>
>>>>> Ceph.conf is:
>>>>>
>>>>> [osd]
>>>>> osd mkfs type=xfs
>>>>> osd recovery threads=2
>>>>> osd disk thread ioprio class=idle
>>>>> osd disk thread ioprio priority=7
>>>>> osd journal=/var/lib/ceph/osd/ceph-$id/journal
>>>>> filestore flusher=False
>>>>> osd op num shards=3
>>>>> debug osd=5
>>>>> osd disk threads=2
>>>>> osd data=/var/lib/ceph/osd/ceph-$id
>>>>> osd op num threads per shard=5
>>>>> osd op threads=4
>>>>> keyring=/var/lib/ceph/osd/ceph-$id/keyring
>>>>> osd journal size=4096
>>>>>
>>>>> [global]
>>>>> filestore max sync interval=10
>>>>> auth cluster required=cephx
>>>>> osd pool default min size=3
>>>>> osd pool default size=3
>>>>> public network=10.140.13.0/26
>>>>> objecter inflight op_bytes=1073741824
>>>>> auth service required=cephx
>>>>> filestore min sync interval=1
>>>>> fsid=fac04d85-db48-4564-b821-deebda046261
>>>>> keyring=/etc/ceph/keyring
>>>>> cluster network=10.140.13.0/26
>>>>> auth client required=cephx
>>>>> filestore xattr use omap=True
>>>>> max open files=65536
>>>>> objecter inflight ops=2048
>>>>> osd pool default pg num=512
>>>>> log to syslog = true
>>>>> #err to syslog = true
>>>>>
>>>>> --
>>>>> Gaurav Bafna
>>>>> 9540631400
>>>>> _______________________________________________
>>>>> ceph-users mailing list
>>>>> ceph-users@lists.ceph.com
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>
>>>> --
>>>>
>>>> Thanks,
>>>> Tupper Cole
>>>> Senior Storage Consultant
>>>> Global Storage Consulting, Red Hat
>>>> tc...@redhat.com
>>>> phone: +01 919-720-2612
>>>
>>>
>>> --
>>> Gaurav Bafna
>>> 9540631400
>>
>>
>
--
Gaurav Bafna
9540631400
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
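For reference, the removal step Tupper suggests above (and which Gaurav confirms triggers a full recovery) is the standard OSD removal sequence; a minimal sketch, with <id> standing in for the down OSD's number, which is not given in the thread:

    ceph osd out <id>                 # mark the OSD out so its PGs remap elsewhere
    ceph osd crush remove osd.<id>    # remove it from the CRUSH map
    ceph auth del osd.<id>            # delete its cephx key
    ceph osd rm <id>                  # remove it from the OSD map

Separately, the warning "pool .users pg_num 512 > pgp_num 8" can be cleared by raising pgp_num to match pg_num (note this triggers some rebalancing):

    ceph osd pool set .users pgp_num 512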