Ok, great, glad you got your issue sorted. I’m still battling along with mine.
From: Karun Josy [mailto:karunjo...@gmail.com]
Sent: 13 December 2017 12:22
To: n...@fisk.me.uk
Cc: ceph-users <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Health Error : Request Stuck

Hi Nick,

Finally, I was able to correct the issue!

We found many slow requests in ceph health detail and saw that a few OSDs were slowing the whole cluster down. Initially the cluster was unusable: there were 10 PGs in "activating+remapped" state as well as the slow requests. The slow requests were concentrated on 2 OSDs, so we restarted those OSD daemons one by one, which cleared the blocked requests and made the cluster usable again.

However, 4 PGs were still inactive. So I took one of the OSDs with slow requests down for some time and allowed the cluster to rebalance, and that worked! To be honest, I'm not exactly sure it's the correct way.

P.S.: I had upgraded to Luminous 12.2.2 yesterday.

Karun Josy

On Wed, Dec 13, 2017 at 4:31 PM, Nick Fisk <n...@fisk.me.uk> wrote:

Hi Karun,

I too am experiencing something very similar, with a PG stuck in activating+remapped state after re-introducing an OSD back into the cluster as Bluestore, although this new OSD is not the one listed against the PGs stuck activating. I also see the same thing as you, where the up set is different from the acting set.

Can I just ask what Ceph version you are running, and for the output of ceph osd tree?

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Karun Josy
Sent: 13 December 2017 07:06
To: ceph-users <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Health Error : Request Stuck

Cluster is unusable because of inactive PGs. How can we correct it?

=============
ceph pg dump_stuck inactive
ok
PG_STAT STATE               UP            UP_PRIMARY ACTING        ACTING_PRIMARY
1.4b    activating+remapped [5,2,0,13,1]  5          [5,2,13,1,4]  5
1.35    activating+remapped [2,7,0,1,12]  2          [2,7,1,12,9]  2
1.12    activating+remapped [1,3,5,0,7]   1          [1,3,5,7,2]   1
1.4e    activating+remapped [1,3,0,9,2]   1          [1,3,0,9,5]   1
2.3b    activating+remapped [13,1,0]      13         [13,1,2]      13
1.19    activating+remapped [2,13,8,9,0]  2          [2,13,8,9,1]  2
1.1e    activating+remapped [2,3,1,10,0]  2          [2,3,1,10,5]  2
2.29    activating+remapped [1,0,13]      1          [1,8,11]      1
1.6f    activating+remapped [8,2,0,4,13]  8          [8,2,4,13,1]  8
1.74    activating+remapped [7,13,2,0,4]  7          [7,13,2,4,1]  7
=============

Karun Josy

On Wed, Dec 13, 2017 at 8:27 AM, Karun Josy <karunjo...@gmail.com> wrote:

Hello,

We added a new disk to the cluster, and while it is rebalancing we are getting error warnings.

=============
Overall status: HEALTH_ERR
REQUEST_SLOW: 1824 slow requests are blocked > 32 sec
REQUEST_STUCK: 1022 stuck requests are blocked > 4096 sec
=============

The load on the servers seems to be very low. How can I correct it?

Karun
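For anyone hitting the same symptoms, the diagnostic commands referenced in this thread are all standard Ceph CLI calls; a minimal sequence might look like the following (the PG ID is just an example taken from the listing above):

=============
ceph -s                        # overall status, including REQUEST_SLOW / REQUEST_STUCK counts
ceph health detail             # shows which OSDs the slow/stuck requests are blocked on
ceph pg dump_stuck inactive    # PGs that have not gone active (as in the listing above)
ceph pg 1.4b query             # peering/activation detail for one stuck PG
ceph osd tree                  # CRUSH layout and OSD up/in state, as Nick requested
=============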
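The recovery steps Karun describes (restarting the OSDs holding the slow requests, then temporarily taking one of them down so the cluster rebalances) map roughly onto the commands below. This is only a sketch: it assumes systemd-managed OSDs on Luminous, and the OSD IDs are placeholders, not the actual slow OSDs from this cluster.

=============
ceph osd set noout                 # optional: avoid triggering rebalancing while daemons restart
systemctl restart ceph-osd@5       # restart the OSDs with slow requests, one at a time
systemctl restart ceph-osd@13
ceph osd unset noout

ceph osd out 13                    # take the remaining problem OSD out and let the cluster rebalance
# ...wait for recovery to finish, then bring it back in:
ceph osd in 13
=============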