ceph-mds --undump-journal <rank> <journal-file>

Looks like it accidentally (or on purpose? you can break things with it) got left out of the help text.
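For reference, the whole repair sequence would look roughly like the following. This is only a sketch, using rank 0 and the dump file name from your earlier mail; <name> is whatever MDS id you normally pass to -i, and you'll want to double-check it against your own setup before running anything:

    # dump the journal for rank 0 to a file (the step you already did)
    ceph-mds -i <name> --dump-journal 0 journaldumptgho
    # discard the damaged journal; replay will get a single reset event instead
    ceph-mds -i <name> --reset-journal 0
    # feed the dumped journal back in -- the flag missing from the help text
    ceph-mds -i <name> --undump-journal 0 journaldumptgho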
On Tue, Oct 14, 2014 at 8:19 AM, Jasper Siero <jasper.si...@target-holding.nl> wrote:
> Hello Greg,
>
> I dumped the journal successfully to a file:
>
> journal is 9483323613~134215459
> read 134213311 bytes at offset 9483323613
> wrote 134213311 bytes at offset 9483323613 to journaldumptgho
> NOTE: this is a _sparse_ file; you can
> $ tar cSzf journaldumptgho.tgz journaldumptgho
> to efficiently compress it while preserving sparseness.
>
> I see the option for resetting the mds journal but I can't find the option
> for undumping/importing the journal:
>
> usage: ceph-mds -i name [flags] [[--journal_check rank]|[--hot-standby][rank]]
>   -m monitorip:port
>         connect to monitor at given address
>   --debug_mds n
>         debug MDS level (e.g. 10)
>   --dump-journal rank filename
>         dump the MDS journal (binary) for rank.
>   --dump-journal-entries rank filename
>         dump the MDS journal (JSON) for rank.
>   --journal-check rank
>         replay the journal for rank, then exit
>   --hot-standby rank
>         start up as a hot standby for rank
>   --reset-journal rank
>         discard the MDS journal for rank, and replace it with a single
>         event that updates/resets inotable and sessionmap on replay.
>
> Do you know how to "undump" the journal back into ceph?
>
> Jasper
>
> ________________________________________
> From: Gregory Farnum [g...@inktank.com]
> Sent: Friday, October 10, 2014 23:45
> To: Jasper Siero
> CC: ceph-users
> Subject: Re: [ceph-users] mds isn't working anymore after osd's running full
>
> Ugh, "debug journaler", not "debug journaled."
>
> That said, the filer output tells me that you're missing an object out
> of the MDS log. (200.000008f5) I think this issue should be resolved
> if you "dump" the journal to a file, "reset" it, and then "undump" it.
> (These are commands you can invoke from ceph-mds.)
> I haven't done this myself in a long time, so there may be some hard
> edges around it. In particular, I'm not sure if the dumped journal
> file will stop when the data stops, or if it will be a little too
> long. If so, we can fix that by truncating the dumped file to the
> proper length and resetting and undumping again.
> (And just to harp on it, this journal manipulation is a lot simpler in
> Giant... ;) )
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> On Wed, Oct 8, 2014 at 7:11 AM, Jasper Siero
> <jasper.si...@target-holding.nl> wrote:
>> Hello Greg,
>>
>> No problem, thanks for looking into the log. I attached the log to this email.
>> I'm looking forward to the new release because it would be nice to have
>> more possibilities to diagnose problems.
>>
>> Kind regards,
>>
>> Jasper Siero
>> ________________________________________
>> From: Gregory Farnum [g...@inktank.com]
>> Sent: Tuesday, October 7, 2014 19:45
>> To: Jasper Siero
>> CC: ceph-users
>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running full
>>
>> Sorry; I guess this fell off my radar.
>>
>> The issue here is not that it's waiting for an osdmap; it got the
>> requested map and went into replay mode almost immediately. In fact
>> the log looks good except that it seems to finish replaying the log
>> and then simply fail to transition into active. Generate a new one,
>> adding in "debug journaled = 20" and "debug filer = 20", and we can
>> probably figure out how to fix it.
>> (This diagnosis is much easier in the upcoming Giant!)
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
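(For anyone digging this thread up later: those debug options go in ceph.conf on the node running the MDS, and per the correction above it is "journaler", not "journaled". A minimal sketch of the relevant section -- this assumes you restart the MDS afterwards so the new settings take effect:)

    [mds]
        debug mds = 20
        debug journaler = 20
        debug filer = 20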
>>
>> On Tue, Oct 7, 2014 at 7:55 AM, Jasper Siero
>> <jasper.si...@target-holding.nl> wrote:
>>> Hello Gregory,
>>>
>>> We still have the same problems with our test ceph cluster and didn't
>>> receive a reply from you after I sent you the requested log files. Do you
>>> know if it's possible to get our cephfs filesystem working again or is it
>>> better to give up the files on cephfs and start over again?
>>>
>>> We restarted the cluster several times but it's still degraded:
>>>
>>> [root@th1-mon001 ~]# ceph -w
>>>     cluster c78209f5-55ea-4c70-8968-2231d2b05560
>>>      health HEALTH_WARN mds cluster is degraded
>>>      monmap e3: 3 mons at
>>> {th1-mon001=10.1.2.21:6789/0,th1-mon002=10.1.2.22:6789/0,th1-mon003=10.1.2.23:6789/0},
>>> election epoch 432, quorum 0,1,2 th1-mon001,th1-mon002,th1-mon003
>>>      mdsmap e190: 1/1/1 up {0=th1-mon001=up:replay}, 1 up:standby
>>>      osdmap e2248: 12 osds: 12 up, 12 in
>>>       pgmap v197548: 492 pgs, 4 pools, 60297 MB data, 470 kobjects
>>>             124 GB used, 175 GB / 299 GB avail
>>>                  491 active+clean
>>>                    1 active+clean+scrubbing+deep
>>>
>>> One placement group stays in the deep scrubbing phase.
>>>
>>> Kind regards,
>>>
>>> Jasper Siero
>>>
>>> ________________________________________
>>> From: Jasper Siero
>>> Sent: Thursday, August 21, 2014 16:43
>>> To: Gregory Farnum
>>> Subject: RE: [ceph-users] mds isn't working anymore after osd's running full
>>>
>>> I did restart it, but you are right about the epoch number, which has changed;
>>> the situation looks the same, though.
>>>
>>> 2014-08-21 16:33:06.032366 7f9b5f3cd700  1 mds.0.27  need osdmap epoch 1994, have 1993
>>> 2014-08-21 16:33:06.032368 7f9b5f3cd700  1 mds.0.27  waiting for osdmap 1994 (which blacklists prior instance)
>>>
>>> I started the mds with the debug options and attached the log.
>>>
>>> Thanks,
>>>
>>> Jasper
>>> ________________________________________
>>> From: Gregory Farnum [g...@inktank.com]
>>> Sent: Wednesday, August 20, 2014 18:38
>>> To: Jasper Siero
>>> CC: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running full
>>>
>>> After restarting your MDS, it still says it has epoch 1832 and needs
>>> epoch 1833? I think you didn't really restart it.
>>> If the epoch numbers have changed, can you restart it with "debug mds
>>> = 20", "debug objecter = 20", "debug ms = 1" in the ceph.conf and post
>>> the resulting log file somewhere?
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
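(If you hit that "need osdmap epoch X, have Y" message again, it can help to compare it against the epoch the monitors are actually at. A quick sketch, nothing more:)

    # current osdmap epoch as reported by the monitors
    ceph osd stat
    # the first line of the full map dump also shows the epoch
    ceph osd dump | head -1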
>>>
>>> On Wed, Aug 20, 2014 at 12:49 AM, Jasper Siero
>>> <jasper.si...@target-holding.nl> wrote:
>>>> Unfortunately that doesn't help. I restarted both the active and standby
>>>> mds but that doesn't change the state of the mds. Is there a way to force
>>>> the mds to look at the 1832 epoch (or earlier) instead of 1833 (need
>>>> osdmap epoch 1833, have 1832)?
>>>>
>>>> Thanks,
>>>>
>>>> Jasper
>>>> ________________________________________
>>>> From: Gregory Farnum [g...@inktank.com]
>>>> Sent: Tuesday, August 19, 2014 19:49
>>>> To: Jasper Siero
>>>> CC: ceph-users@lists.ceph.com
>>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running full
>>>>
>>>> On Mon, Aug 18, 2014 at 6:56 AM, Jasper Siero
>>>> <jasper.si...@target-holding.nl> wrote:
>>>>> Hi all,
>>>>>
>>>>> We have a small ceph cluster running version 0.80.1 with cephfs on five
>>>>> nodes.
>>>>> Last week some osd's were full and shut themselves down. To help the osd's
>>>>> start again I added some extra osd's and moved some placement group
>>>>> directories on the full osd's (which have a copy on another osd) to another
>>>>> place on the node (as mentioned in
>>>>> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/).
>>>>> After clearing some space on the full osd's I started them again. After a
>>>>> lot of deep scrubbing and two pg inconsistencies which needed to be repaired,
>>>>> everything looked fine except the mds, which is still in the replay state
>>>>> and stays that way.
>>>>> The log below says that the mds needs osdmap epoch 1833 and has 1832.
>>>>>
>>>>> 2014-08-18 12:29:22.268248 7fa786182700  1 mds.-1.0 handle_mds_map standby
>>>>> 2014-08-18 12:29:22.273995 7fa786182700  1 mds.0.25 handle_mds_map i am now mds.0.25
>>>>> 2014-08-18 12:29:22.273998 7fa786182700  1 mds.0.25 handle_mds_map state change up:standby --> up:replay
>>>>> 2014-08-18 12:29:22.274000 7fa786182700  1 mds.0.25 replay_start
>>>>> 2014-08-18 12:29:22.274014 7fa786182700  1 mds.0.25  recovery set is
>>>>> 2014-08-18 12:29:22.274016 7fa786182700  1 mds.0.25  need osdmap epoch 1833, have 1832
>>>>> 2014-08-18 12:29:22.274017 7fa786182700  1 mds.0.25  waiting for osdmap 1833 (which blacklists prior instance)
>>>>>
>>>>> # ceph status
>>>>>     cluster c78209f5-55ea-4c70-8968-2231d2b05560
>>>>>      health HEALTH_WARN mds cluster is degraded
>>>>>      monmap e3: 3 mons at
>>>>> {th1-mon001=10.1.2.21:6789/0,th1-mon002=10.1.2.22:6789/0,th1-mon003=10.1.2.23:6789/0},
>>>>> election epoch 362, quorum 0,1,2 th1-mon001,th1-mon002,th1-mon003
>>>>>      mdsmap e154: 1/1/1 up {0=th1-mon001=up:replay}, 1 up:standby
>>>>>      osdmap e1951: 12 osds: 12 up, 12 in
>>>>>       pgmap v193685: 492 pgs, 4 pools, 60297 MB data, 470 kobjects
>>>>>             124 GB used, 175 GB / 299 GB avail
>>>>>                  492 active+clean
>>>>>
>>>>> # ceph osd tree
>>>>> # id    weight   type name              up/down  reweight
>>>>> -1      0.2399   root default
>>>>> -2      0.05997          host th1-osd001
>>>>> 0       0.01999                  osd.0   up       1
>>>>> 1       0.01999                  osd.1   up       1
>>>>> 2       0.01999                  osd.2   up       1
>>>>> -3      0.05997          host th1-osd002
>>>>> 3       0.01999                  osd.3   up       1
>>>>> 4       0.01999                  osd.4   up       1
>>>>> 5       0.01999                  osd.5   up       1
>>>>> -4      0.05997          host th1-mon003
>>>>> 6       0.01999                  osd.6   up       1
>>>>> 7       0.01999                  osd.7   up       1
>>>>> 8       0.01999                  osd.8   up       1
>>>>> -5      0.05997          host th1-mon002
>>>>> 9       0.01999                  osd.9   up       1
>>>>> 10      0.01999                  osd.10  up       1
>>>>> 11      0.01999                  osd.11  up       1
>>>>>
>>>>> What is the way to get the mds up and running again?
>>>>>
>>>>> I still have all the placement group directories which I moved from the full
>>>>> osds which were down to create disk space.
>>>>
>>>> Try just restarting the MDS daemon. This sounds a little familiar, so I
>>>> think it's a known bug which may be fixed in a later dev or point
>>>> release on the MDS, but it's a soft-state rather than a disk-state
>>>> issue.
>>>> -Greg
>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
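(On the earlier "try just restarting the MDS daemon" advice: on a 0.80.x install that usually means something along these lines. A sketch only -- it assumes the mds name from the status output above and that your daemons are managed by the stock sysvinit script; adjust for however your distro starts the daemons:)

    # restart the MDS daemon on the node that runs it
    sudo service ceph restart mds.th1-mon001
    # then watch whether it makes it past up:replay
    ceph mds stat
    ceph -w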