I’d like to better understand why the down OSD would cause the PG to get stuck after CRUSH was able to locate enough OSDs to map the PG.
Is this some form of safety catch that prevents it from recovering, even though OSD.116 is no longer important for data integrity? Marking the OSD lost is an option here, but it’s not really lost … it just takes some time to get the machine rebooted. I’m still working out my operational procedures for Ceph, and marking the OSD lost only to have it pop back up once the system reboots could be an issue that I’m not yet sure how to resolve. Can an OSD be marked as ‘found’ once it returns to the network?

-Chris

From: Goncalo Borges <goncalo.bor...@sydney.edu.au>
Date: Monday, August 15, 2016 at 11:36 PM
To: "Heller, Chris" <chel...@akamai.com>, "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] PG is in 'stuck unclean' state, but all acting OSD are up

Hi Chris...

The precise osd set you see now, [79,8,74], was obtained on epoch 104536, but only after a lot of tries, as shown by the recovery section. In the first try (on epoch 100767) osd.116 was selected somehow (maybe it was up at the time?), and the pg probably got stuck because that osd went down during the recovery process:

    "recovery_state": [
        { "name": "Started\/Primary\/Peering\/GetInfo",
          "enter_time": "2016-08-11 11:45:06.052568",
          "requested_info_from": []},
        { "name": "Started\/Primary\/Peering",
          "enter_time": "2016-08-11 11:45:06.052558",
          "past_intervals": [
              { "first": 100767,
                "last": 100777,
                "maybe_went_rw": 1,
                "up": [79, 116, 74],
                "acting": [79, 116, 74],
                "primary": 79,
                "up_primary": 79},

The pg query also shows:

    "peering_blocked_by": [
        { "osd": 116,
          "current_lost_at": 0,
          "comment": "starting or marking this osd lost may let us proceed"}

Maybe you can check the documentation in [1] and see whether you want to follow the suggestion inside the pg query and mark osd.116 as lost. This should only be done after proper evaluation on your side.

Another thing I found strange is that in the recovery section there are a lot of tries where you do not get a proper osd set. The very last recovery try was on epoch 104540:

    { "first": 104536,
      "last": 104540,
      "maybe_went_rw": 1,
      "up": [2147483647, 8, 74],
      "acting": [2147483647, 8, 74],
      "primary": 8,
      "up_primary": 8}

From [2]: "When CRUSH fails to find enough OSDs to map to a PG, it will show as a 2147483647 which is ITEM_NONE or no OSD found." This could be an artifact of the peering being blocked by osd.116, or a genuine problem where you are not able to get a proper osd set. That could happen for a variety of reasons: network issues, osds being almost full, or simply because the system can't find 3 osds in 3 different hosts.

Cheers
Goncalo

[1] http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#placement-group-down-peering-failure
[2] http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/
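For reference, a minimal sketch of the commands the suggestions above map to. The pg id (4.2a8) and osd id (116) are the ones from this thread, /tmp/osdmap is just an arbitrary scratch file, and the 'lost' command should only be run after you have evaluated that the surviving copies are sufficient:

# ceph osd lost 116 --yes-i-really-mean-it      (tell the cluster to stop waiting for osd.116 so peering can proceed)
# ceph osd getmap -o /tmp/osdmap                (grab the current osdmap)
# osdmaptool /tmp/osdmap --test-map-pg 4.2a8    (check whether CRUSH can map this pg to a full set of osds at all)

As far as I know there is no explicit 'mark as found' command: when the rebooted host's osd daemon starts again, it reports back to the monitors, is marked up, and normal recovery/backfill takes over.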
On 08/16/2016 11:42 AM, Heller, Chris wrote:

Output of `ceph pg dump_stuck`:

# ceph pg dump_stuck
ok
pg_stat  state         up          up_primary  acting      acting_primary
4.2a8    down+peering  [79,8,74]   79          [79,8,74]   79
4.c3     down+peering  [56,79,67]  56          [56,79,67]  56

-Chris

From: Goncalo Borges <goncalo.bor...@sydney.edu.au>
Date: Monday, August 15, 2016 at 9:03 PM
To: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>, "Heller, Chris" <chel...@akamai.com>
Subject: Re: [ceph-users] PG is in 'stuck unclean' state, but all acting OSD are up

Hi Heller...

Can you actually post the result of ceph pg dump_stuck?

Cheers
G.

On 08/15/2016 10:19 PM, Heller, Chris wrote:

I’d like to better understand the current state of my Ceph cluster. I currently have 2 PGs that are in the ‘stuck unclean’ state:

# ceph health detail
HEALTH_WARN 2 pgs down; 2 pgs peering; 2 pgs stuck inactive; 2 pgs stuck unclean
pg 4.2a8 is stuck inactive for 124516.777791, current state down+peering, last acting [79,8,74]
pg 4.c3 is stuck inactive since forever, current state down+peering, last acting [56,79,67]
pg 4.2a8 is stuck unclean for 124536.223284, current state down+peering, last acting [79,8,74]
pg 4.c3 is stuck unclean since forever, current state down+peering, last acting [56,79,67]
pg 4.2a8 is down+peering, acting [79,8,74]
pg 4.c3 is down+peering, acting [56,79,67]

While my cluster does currently have some down OSDs, none are in the acting set for either PG:

# ceph osd tree | grep down
 73  1.00000  osd.73   down  0  1.00000
 96  1.00000  osd.96   down  0  1.00000
110  1.00000  osd.110  down  0  1.00000
116  1.00000  osd.116  down  0  1.00000
120  1.00000  osd.120  down  0  1.00000
126  1.00000  osd.126  down  0  1.00000
124  1.00000  osd.124  down  0  1.00000
119  1.00000  osd.119  down  0  1.00000

I’ve queried one of the two PGs, and see that recovery is currently blocked on OSD.116, which is indeed down, but is not part of the acting set of OSDs for that PG: http://pastebin.com/Rg2hK9GE

This is all with Ceph version 0.94.3:

# ceph version
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)

Why does this PG remain ‘stuck unclean’? Are there steps I can take to unstick it, given that all the acting OSDs are up and in?

(* Re-sent, now that I’m subscribed to the list *)

-Chris
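For anyone hitting the same symptom, a rough sketch of the commands used in this thread to narrow things down; the grep is only a quick filter over the JSON that the full pg query (pasted above) contains:

# ceph health detail                           (lists the stuck pgs and their acting sets)
# ceph pg dump_stuck inactive                  (same information in tabular form)
# ceph pg 4.2a8 query | grep -A 4 blocked_by   (reveals the "peering_blocked_by" entry naming osd.116)
# ceph osd tree | grep down                    (confirms osd.116 is among the down osds, even though it is not in the acting set)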
--
Goncalo Borges
Research Computing
ARC Centre of Excellence for Particle Physics at the Terascale
School of Physics A28 | University of Sydney, NSW 2006
T: +61 2 93511937
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com