Can you post your whole crushmap?

ceph osd getcrushmap -o crushmap
crushtool -d crushmap -o crushmap.txt


Paul


2018-06-14 22:39 GMT+02:00 Oliver Schulz <oliver.sch...@tu-dortmund.de>:

> Thanks, Greg!!
>
> I reset all the OSD weights to 1.00, and I think I'm in a much
> better state now. The only trouble left in "ceph health detail" is
>
> PG_DEGRADED Degraded data redundancy: 4/404985012 objects degraded
> (0.000%), 3 pgs degraded
>     pg 2.47 is active+recovery_wait+degraded+remapped, acting [177,68,187]
>     pg 2.1fd is active+recovery_wait+degraded+remapped, acting [36,83,185]
>     pg 2.748 is active+recovery_wait+degraded, acting [31,8,149]
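>
> (For reference, resetting the weights was just one "ceph osd reweight"
> call per OSD, along the lines of
>
>     ceph osd reweight <osd-id> 1.0
>
> for every OSD that reweight-by-utilization had touched.)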
>
> (There's a lot of misplaced PGs now, obviously). The interesting
> thing is that my "lost" PG is back, too, with three acting OSDs.
>
> Maybe I dodged the bullet - what do you think?
>
> One question: Is there a way to give recovery of the three
> degraded PGs priority over backfilling the misplaced ones?
> I tried "ceph pg force-recovery" but it didn't seem to have
> any effect; they were still in "recovery_wait" afterwards.
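>
> (What I ran was, roughly,
>
>     ceph pg force-recovery 2.47 2.1fd 2.748
>
> with the PG IDs taken from the health output above.)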
>
>
> Cheers,
>
> Oliver
>
>
> On 14.06.2018 22:09, Gregory Farnum wrote:
>
>> On Thu, Jun 14, 2018 at 4:07 PM Oliver Schulz
>> <oliver.sch...@tu-dortmund.de> wrote:
>>
>>     Hi Greg,
>>
>>     I increased the hard limit and rebooted everything. The
>>     PG without acting OSDs still has none, but I also have
>>     quite a few PGs that look like this now:
>>
>>           pg 1.79c is stuck undersized for 470.640254, current state
>>     active+undersized+degraded, last acting [179,154]
>>
>>     I had that problem before (only two acting OSDs on a few PGs);
>>     I always solved it by setting the primary OSD to out and then
>>     back in a few seconds later (resulting in a very quick recovery,
>>     then all was fine again). But maybe that's not the ideal solution?
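>>
>>     (Concretely, for a PG like 1.79c with acting [179,154], that would be
>>
>>           ceph osd out 179
>>           ceph osd in 179
>>
>>     with a few seconds in between.)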
>>
>>     Here's "ceph pg map" for one of them:
>>
>>           osdmap e526060 pg 1.79c (1.79c) -> up [179,154] acting [179,154]
>>
>>     I also have two PGs that have only one acting OSD now:
>>
>>           osdmap e526060 pg 0.58a (0.58a) -> up [174] acting [174]
>>           osdmap e526060 pg 2.139 (2.139) -> up [61] acting [61]
>>
>>     How can I make Ceph assign three OSDs to all of these weird PGs?
>>     Before the reboot, they all did have three OSDs assigned (except for
>>     the one that has none), and they were not shown as degraded.
>>
>>
>>       > If it's the second, then fixing the remapping problem will
>>       > resolve it.
>>       > That's probably/hopefully just by undoing the
>>       > reweight-by-utilization changes.
>>
>>     How do I do that, best? Just set all the weights back to 1.00?
>>
>>
>> Yeah. This is probably the best way to fix up the other undersized PGs —
>> at least, assuming it doesn't result in an over-full PG!
>>
>> I don't work with overflowing OSDs/clusters often, but my suspicion is
>> you're better off with something like CERN's reweight scripts than using
>> reweight-by-utilization. Unless it's improved without my noticing, that
>> algorithm just isn't very good. :/
>> -Greg
>>
>>
>>
>>     Cheers,
>>
>>     Oliver
>>
>>
>>     P.S.: Thanks so much for helping!
>>
>>
>>
>>     On 14.06.2018 21:37, Gregory Farnum wrote:
>>      > On Thu, Jun 14, 2018 at 3:26 PM Oliver Schulz
>>      > <oliver.sch...@tu-dortmund.de> wrote:
>>      >
>>      >     But the contents of the remapped PGs should still be
>>      >     Ok, right? What confuses me is that they don't
>>      >     backfill - why don't they "move" to where they belong?
>>      >
>>      >     As for the PG hard limit, yes, I ran into this. Our
>>      >     cluster had been very (very) full, but I wanted the
>>      >     new OSD nodes to use bluestore, so I updated to
>>      >     Luminous before I added the additional storage. I
>>      >     temporarily increased the pg hard limit and after
>>      >     a while (and after adding the new OSDs) the cluster
>>      >     seemed to be in a decent state again. Afterwards,
>>      >     I set the PG hard limit back to normal.
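>>      >
>>      >     (The setting I mean is osd_max_pg_per_osd_hard_ratio, if I
>>      >     recall the name correctly, raised temporarily in ceph.conf
>>      >     on the OSD nodes, something like
>>      >
>>      >           [osd]
>>      >           osd max pg per osd hard ratio = 5
>>      >
>>      >     followed by an OSD restart.)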
>>      >
>>      >     I don't have a "too many PGs per OSD" health warning,
>>      >     currently - should I still increase the PG hard limit?
>>      >
>>      >
>>      > Well, it's either the hard limit getting hit, or the fact that the
>>      > PG isn't getting mapped to any OSD and there not being an existing
>>      > primary to take responsibility for remapping it.
>>      >
>>      > If it's the second, then fixing the remapping problem will resolve
>>      > it. That's probably/hopefully just by undoing the
>>      > reweight-by-utilization changes.
>>      >
>>      >
>>      >     On 14.06.2018 20:58, Gregory Farnum wrote:
>>      >      > Okay, I can’t tell you what happened to that one pg, but you’ve
>>      >      > got another 445 remapped pgs and that’s not a good state to be
>>      >      > in. It was probably your use of reweight-by-utilization. :/ I am
>>      >      > pretty sure the missing PG and remapped ones have the same root
>>      >      > cause, and it’s possible but by no means certain fixing one will
>>      >      > fix the others.
>>      >      >
>>      >      >
>>      >      > ...oh, actually, the most likely cause just came up in an
>>      >      > unrelated conversation. You’ve probably run into the pg overdose
>>      >      > protection that was added in luminous. Check the list archives
>>      >      > for the exact name, but you’ll want to increase the pg hard limit
>>      >      > and restart the osds that exceeded the previous/current setting.
>>      >      > -Greg
>>      >      > On Thu, Jun 14, 2018 at 2:33 PM Oliver Schulz
>>      >      > <oliver.sch...@tu-dortmund.de> wrote:
>>      >      >
>>      >      >     I'm not running the balancer, but I did
>>      >      >     reweight-by-utilization a few times recently.
>>      >      >
>>      >      >     "ceph osd tree" and "ceph -s" say:
>>      >      >
>>      >      >     https://gist.github.com/oschulz/36d92af84851ec42e09ce1f3cacbc110
>>      >      >
>>      >      >
>>      >      >
>>      >      >     On 14.06.2018 20:23, Gregory Farnum wrote:
>>      >      >      > Well, if this pg maps to no osds, something has certainly
>>      >      >      > gone wrong with your crush map. What’s the crush rule it’s
>>      >      >      > using, and what’s the output of “ceph osd tree”?
>>      >      >      > Are you running the manager’s balancer module or something
>>      >      >      > that might be putting explicit mappings into the osd map
>>      >      >      > and might have broken it?
>>      >      >      >
>>      >      >      > I’m not certain off-hand about the pg reporting, but I
>>      >      >      > believe if it’s reporting the state as unknown that means
>>      >      >      > *no* running osd which contains any copy of that pg. That’s
>>      >      >      > not something which ceph could do on its own without
>>      >      >      > failures of osds. What’s the output of “ceph -s”?
>>      >      >      > On Thu, Jun 14, 2018 at 2:15 PM Oliver Schulz
>>      >      >      > <oliver.sch...@tu-dortmund.de> wrote:
>>      >      >      >
>>      >      >      >     Dear Greg,
>>      >      >      >
>>      >      >      >     no, it's a very old cluster (continuous operation since
>>      >      >      >     2013, with multiple extensions). It's a production
>>      >      >      >     cluster and there's about 300TB of valuable data on it.
>>      >      >      >
>>      >      >      >     We recently updated to luminous and added more OSDs (a
>>      >      >      >     month ago or so), but everything seemed Ok since then.
>>      >      >      >     We didn't have any disk failures, but we had trouble
>>      >      >      >     with the MDS daemons in the last days, so there were a
>>      >      >      >     few reboots.
>>      >      >      >
>>      >      >      >     Is it somehow possible to find this "lost" PG again?
>>      >      >      >     Since it's in the metadata pool, large parts of our
>>      >      >      >     CephFS directory tree are currently unavailable. I
>>      >      >      >     turned the MDS daemons off for now ...
>>      >      >      >
>>      >      >      >
>>      >      >      >     Cheers
>>      >      >      >
>>      >      >      >     Oliver
>>      >      >      >
>>      >      >      >     On 14.06.2018 19:59, Gregory Farnum wrote:
>>      >      >      >      > Is this a new cluster? Or did the crush map change
>>      >      >      >      > somehow recently? One way this might happen is if
>>      >      >      >      > CRUSH just failed entirely to map a pg, although I
>>      >      >      >      > think if the pg exists anywhere it should still be
>>      >      >      >      > getting reported as inactive.
>>      >      >      >      > On Thu, Jun 14, 2018 at 8:40 AM Oliver Schulz
>>      >      >      >      > <oliver.sch...@tu-dortmund.de> wrote:
>>      >      >      >      >
>>      >      >      >      >     Dear all,
>>      >      >      >      >
>>      >      >      >      >     I have a serious problem with our Ceph cluster:
>>      >      >      >      >     One of our PGs somehow ended up in this state
>>      >      >      >      >     (reported by "ceph health detail"):
>>      >      >      >      >
>>      >      >      >      >           pg 1.XXX is stuck inactive for ...,
>>      >      >      >      >           current state unknown, last acting []
>>      >      >      >      >
>>      >      >      >      >     Also, "ceph pg map 1.xxx" reports:
>>      >      >      >      >
>>      >      >      >      >           osdmap e525812 pg 1.721 (1.721) -> up []
>>      >      >      >      >           acting []
>>      >      >      >      >
>>      >      >      >      >     I can't use "ceph pg 1.XXX query", it just hangs
>>      >      >      >      >     with no output.
>>      >      >      >      >
>>      >      >      >      >     All OSDs are up and in, I have MON quorum, all
>>      >      >      >      >     other PGs seem to be fine.
>>      >      >      >      >
>>      >      >      >      >     How can I diagnose/fix this? Unfortunately, the
>>      >      >      >      >     PG in question is part of the CephFS metadata
>>      >      >      >      >     pool ...
>>      >      >      >      >
>>      >      >      >      >     Any help would be very, very much appreciated!
>>      >      >      >      >
>>      >      >      >      >
>>      >      >      >      >     Cheers,
>>      >      >      >      >
>>      >      >      >      >     Oliver
>>      >      >      >
>>      >      >
>>      >
>>
>



-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
