Hi John,

Thanks, I will look into it. Is there a release date for Giant yet?
Jasper
________________________________________
From: john.sp...@inktank.com [john.sp...@inktank.com] on behalf of John Spray [john.sp...@redhat.com]
Sent: Thursday, 16 October 2014 12:23
To: Jasper Siero
CC: Gregory Farnum; ceph-users
Subject: Re: [ceph-users] mds isn't working anymore after osd's running full

Following up: the firefly fix for undump is https://github.com/ceph/ceph/pull/2734

Jasper: if you still need to try undumping on this existing firefly cluster, you can download ceph-mds packages built from the wip-firefly-undump branch from http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/ref/

Cheers,
John

On Wed, Oct 15, 2014 at 8:15 PM, John Spray <john.sp...@redhat.com> wrote:
> Sadly, undump has been broken for quite some time (it was fixed in giant as part of creating cephfs-journal-tool). If there's a one-line fix for this then it's probably worth putting in firefly, since it's a long-term supported branch -- I'll do that now.
>
> John
>
> On Wed, Oct 15, 2014 at 8:23 AM, Jasper Siero <jasper.si...@target-holding.nl> wrote:
>> Hello Greg,
>>
>> The dump and reset of the journal were successful:
>>
>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph --dump-journal 0 journaldumptgho-mon001
>> journal is 9483323613~134215459
>> read 134213311 bytes at offset 9483323613
>> wrote 134213311 bytes at offset 9483323613 to journaldumptgho-mon001
>> NOTE: this is a _sparse_ file; you can
>>   $ tar cSzf journaldumptgho-mon001.tgz journaldumptgho-mon001
>> to efficiently compress it while preserving sparseness.
>>
>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph --reset-journal 0
>> old journal was 9483323613~134215459
>> new journal start will be 9621733376 (4194304 bytes past old end)
>> writing journal head
>> writing EResetJournal entry
>> done
>>
>> Undumping the journal was not successful, and looking into the error, "client_lock.is_locked()" is shown several times. The mds is not running when I start the undumping, so maybe I have forgotten something?
>>
>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph --undump-journal 0 journaldumptgho-mon001
>> undump journaldumptgho-mon001
>> start 9483323613 len 134213311
>> writing header 200.00000000
>> osdc/Objecter.cc: In function 'ceph_tid_t Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 time 2014-10-15 09:09:32.020287
>> osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())
>> ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>> 1: /usr/bin/ceph-mds() [0x80f15e]
>> 2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>> 3: (main()+0x1632) [0x569c62]
>> 4: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>> 5: /usr/bin/ceph-mds() [0x567d99]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>> 2014-10-15 09:09:32.021313 7fec3e5ad7a0 -1 osdc/Objecter.cc: In function 'ceph_tid_t Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 time 2014-10-15 09:09:32.020287
>> osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())
>>
>> ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>> 1: /usr/bin/ceph-mds() [0x80f15e]
>> 2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>> 3: (main()+0x1632) [0x569c62]
>> 4: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>> 5: /usr/bin/ceph-mds() [0x567d99]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>
>> 0> 2014-10-15 09:09:32.021313 7fec3e5ad7a0 -1 osdc/Objecter.cc: In function 'ceph_tid_t Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 time 2014-10-15 09:09:32.020287
>> osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())
>>
>> ceph version 0.80.5 (38b73c67d375a2552d8ed67843c
>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --p8a65c2c0feba6)
>> 1: /usr/bin/ceph-mds() [0x80f15e]
>> 2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>> 3: (main()+0x1632) [0x569c62]
>> 4: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>> 5: /usr/bin/ceph-mds() [0x567d99]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>
>> terminate called after throwing an instance of 'ceph::FailedAssertion'
>> *** Caught signal (Aborted) **
>> in thread 7fec3e5ad7a0
>> ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>> 1: /usr/bin/ceph-mds() [0x82ef61]
>> 2: (()+0xf710) [0x7fec3d9a6710]
>> 3: (gsignal()+0x35) [0x7fec3ca7c635]
>> 4: (abort()+0x175) [0x7fec3ca7de15]
>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fec3d336a5d]
>> 6: (()+0xbcbe6) [0x7fec3d334be6]
>> 7: (()+0xbcc13) [0x7fec3d334c13]
>> 8: (()+0xbcd0e) [0x7fec3d334d0e]
>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f2) [0x94b812]
>> 10: /usr/bin/ceph-mds() [0x80f15e]
>> 11: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>> 12: (main()+0x1632) [0x569c62]
>> 13: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>> 14: /usr/bin/ceph-mds() [0x567d99]
>> 2014-10-15 09:09:32.024248 7fec3e5ad7a0 -1 *** Caught signal (Aborted) **
>> in thread 7fec3e5ad7a0
>>
>> ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>> 1: /usr/bin/ceph-mds() [0x82ef61]
>> 2: (()+0xf710) [0x7fec3d9a6710]
>> 3: (gsignal()+0x35) [0x7fec3ca7c635]
>> 4: (abort()+0x175) [0x7fec3ca7de15]
>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fec3d336a5d]
>> 6: (()+0xbcbe6) [0x7fec3d334be6]
>> 7: (()+0xbcc13) [0x7fec3d334c13]
>> 8: (()+0xbcd0e) [0x7fec3d334d0e]
>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f2) [0x94b812]
>> 10: /usr/bin/ceph-mds() [0x80f15e]
>> 11: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>> 12: (main()+0x1632) [0x569c62]
>> 13: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>> 14: /usr/bin/ceph-mds() [0x567d99]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>
>> 0> 2014-10-15 09:09:32.024248 7fec3e5ad7a0 -1 *** Caught signal (Aborted) **
>> in thread 7fec3e5ad7a0
>>
>> ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>> 1: /usr/bin/ceph-mds() [0x82ef61]
>> 2: (()+0xf710) [0x7fec3d9a6710]
>> 3: (gsignal()+0x35) [0x7fec3ca7c635]
>> 4: (abort()+0x175) [0x7fec3ca7de15]
>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fec3d336a5d]
>> 6: (()+0xbcbe6) [0x7fec3d334be6]
>> 7: (()+0xbcc13) [0x7fec3d334c13]
>> 8: (()+0xbcd0e) [0x7fec3d334d0e]
>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f2) [0x94b812]
>> 10: /usr/bin/ceph-mds() [0x80f15e]
>> 11: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>> 12: (main()+0x1632) [0x569c62]
>> 13: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>> 14: /usr/bin/ceph-mds() [0x567d99]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>
>> Aborted
>>
>> Jasper
>> ________________________________________
>> From: Gregory Farnum [g...@inktank.com]
>> Sent: Tuesday, 14 October 2014 23:40
>> To: Jasper Siero
>> CC: ceph-users
>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running full
>>
>> ceph-mds --undump-journal <rank> <journal-file>
>> Looks like it accidentally (or on purpose? you can break things with it) got left out of the help text.
>>
>> On Tue, Oct 14, 2014 at 8:19 AM, Jasper Siero <jasper.si...@target-holding.nl> wrote:
>>> Hello Greg,
>>>
>>> I dumped the journal successfully to a file:
>>>
>>> journal is 9483323613~134215459
>>> read 134213311 bytes at offset 9483323613
>>> wrote 134213311 bytes at offset 9483323613 to journaldumptgho
>>> NOTE: this is a _sparse_ file; you can
>>>   $ tar cSzf journaldumptgho.tgz journaldumptgho
>>> to efficiently compress it while preserving sparseness.
>>>
>>> I see the option for resetting the mds journal, but I can't find the option for undumping/importing the journal:
>>>
>>> usage: ceph-mds -i name [flags] [[--journal_check rank]|[--hot-standby][rank]]
>>>   -m monitorip:port
>>>         connect to monitor at given address
>>>   --debug_mds n
>>>         debug MDS level (e.g. 10)
>>>   --dump-journal rank filename
>>>         dump the MDS journal (binary) for rank.
>>>   --dump-journal-entries rank filename
>>>         dump the MDS journal (JSON) for rank.
>>>   --journal-check rank
>>>         replay the journal for rank, then exit
>>>   --hot-standby rank
>>>         start up as a hot standby for rank
>>>   --reset-journal rank
>>>         discard the MDS journal for rank, and replace it with a single event that updates/resets inotable and sessionmap on replay.
>>>
>>> Do you know how to "undump" the journal back into ceph?
>>>
>>> Jasper
>>>
>>> ________________________________________
>>> From: Gregory Farnum [g...@inktank.com]
>>> Sent: Friday, 10 October 2014 23:45
>>> To: Jasper Siero
>>> CC: ceph-users
>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running full
>>>
>>> Ugh, "debug journaler", not "debug journaled."
>>>
>>> That said, the filer output tells me that you're missing an object out of the MDS log (200.000008f5). I think this issue should be resolved if you "dump" the journal to a file, "reset" it, and then "undump" it. (These are commands you can invoke from ceph-mds.)
>>> I haven't done this myself in a long time, so there may be some hard edges around it. In particular, I'm not sure if the dumped journal file will stop when the data stops, or if it will be a little too long.
>>> If so, we can fix that by truncating the dumped file to the proper length and resetting and undumping again.
>>> (And just to harp on it, this journal manipulation is a lot simpler in Giant... ;) )
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>> On Wed, Oct 8, 2014 at 7:11 AM, Jasper Siero <jasper.si...@target-holding.nl> wrote:
>>>> Hello Greg,
>>>>
>>>> No problem, thanks for looking into the log. I attached the log to this email.
>>>> I'm looking forward to the new release because it would be nice to have more possibilities to diagnose problems.
>>>>
>>>> Kind regards,
>>>>
>>>> Jasper Siero
>>>> ________________________________________
>>>> From: Gregory Farnum [g...@inktank.com]
>>>> Sent: Tuesday, 7 October 2014 19:45
>>>> To: Jasper Siero
>>>> CC: ceph-users
>>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running full
>>>>
>>>> Sorry; I guess this fell off my radar.
>>>>
>>>> The issue here is not that it's waiting for an osdmap; it got the requested map and went into replay mode almost immediately. In fact the log looks good except that it seems to finish replaying the log and then simply fail to transition into active. Generate a new one, adding in "debug journaled = 20" and "debug filer = 20", and we can probably figure out how to fix it.
>>>> (This diagnosis is much easier in the upcoming Giant!)
>>>> -Greg
>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>
>>>>
>>>> On Tue, Oct 7, 2014 at 7:55 AM, Jasper Siero <jasper.si...@target-holding.nl> wrote:
>>>>> Hello Gregory,
>>>>>
>>>>> We still have the same problems with our test ceph cluster and didn't receive a reply from you after I sent you the requested log files. Do you know if it's possible to get our cephfs filesystem working again, or is it better to give up the files on cephfs and start over again?
>>>>>
>>>>> We restarted the cluster several times but it's still degraded:
>>>>> [root@th1-mon001 ~]# ceph -w
>>>>>     cluster c78209f5-55ea-4c70-8968-2231d2b05560
>>>>>      health HEALTH_WARN mds cluster is degraded
>>>>>      monmap e3: 3 mons at {th1-mon001=10.1.2.21:6789/0,th1-mon002=10.1.2.22:6789/0,th1-mon003=10.1.2.23:6789/0}, election epoch 432, quorum 0,1,2 th1-mon001,th1-mon002,th1-mon003
>>>>>      mdsmap e190: 1/1/1 up {0=th1-mon001=up:replay}, 1 up:standby
>>>>>      osdmap e2248: 12 osds: 12 up, 12 in
>>>>>       pgmap v197548: 492 pgs, 4 pools, 60297 MB data, 470 kobjects
>>>>>             124 GB used, 175 GB / 299 GB avail
>>>>>                  491 active+clean
>>>>>                    1 active+clean+scrubbing+deep
>>>>>
>>>>> One placement group stays in the deep scrubbing phase.
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> Jasper Siero
>>>>>
>>>>>
>>>>> ________________________________________
>>>>> From: Jasper Siero
>>>>> Sent: Thursday, 21 August 2014 16:43
>>>>> To: Gregory Farnum
>>>>> Subject: RE: [ceph-users] mds isn't working anymore after osd's running full
>>>>>
>>>>> I did restart it, and you are right that the epoch number has changed, but the situation looks the same:
>>>>> 2014-08-21 16:33:06.032366 7f9b5f3cd700 1 mds.0.27 need osdmap epoch 1994, have 1993
>>>>> 2014-08-21 16:33:06.032368 7f9b5f3cd700 1 mds.0.27 waiting for osdmap 1994 (which blacklists prior instance)
>>>>> I started the mds with the debug options and attached the log.
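A minimal sketch of how the debug settings requested in this thread could be collected in ceph.conf on the MDS host, assuming they go under the [mds] section (the daemon has to be restarted for settings read from the file to take effect):

    [mds]
        # settings Greg asked for when diagnosing the stuck replay
        debug mds = 20
        debug objecter = 20
        debug ms = 1
        # settings for the journal/filer diagnosis ("journaler", not "journaled")
        debug journaler = 20
        debug filer = 20

If the admin socket is available, the same values can usually also be injected at runtime (for example with "ceph daemon mds.th1-mon001 config set debug_mds 20"), but editing ceph.conf and restarting the MDS is the simpler path here.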
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jasper
>>>>> ________________________________________
>>>>> From: Gregory Farnum [g...@inktank.com]
>>>>> Sent: Wednesday, 20 August 2014 18:38
>>>>> To: Jasper Siero
>>>>> CC: ceph-users@lists.ceph.com
>>>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running full
>>>>>
>>>>> After restarting your MDS, it still says it has epoch 1832 and needs epoch 1833? I think you didn't really restart it.
>>>>> If the epoch numbers have changed, can you restart it with "debug mds = 20", "debug objecter = 20", "debug ms = 1" in the ceph.conf and post the resulting log file somewhere?
>>>>> -Greg
>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>>
>>>>>
>>>>> On Wed, Aug 20, 2014 at 12:49 AM, Jasper Siero <jasper.si...@target-holding.nl> wrote:
>>>>>> Unfortunately that doesn't help. I restarted both the active and standby mds, but that doesn't change the state of the mds. Is there a way to force the mds to look at the 1832 epoch (or earlier) instead of 1833 (need osdmap epoch 1833, have 1832)?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Jasper
>>>>>> ________________________________________
>>>>>> From: Gregory Farnum [g...@inktank.com]
>>>>>> Sent: Tuesday, 19 August 2014 19:49
>>>>>> To: Jasper Siero
>>>>>> CC: ceph-users@lists.ceph.com
>>>>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running full
>>>>>>
>>>>>> On Mon, Aug 18, 2014 at 6:56 AM, Jasper Siero <jasper.si...@target-holding.nl> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> We have a small ceph cluster running version 0.80.1 with cephfs on five nodes.
>>>>>>> Last week some osd's were full and shut themselves down. To help the osd's start again, I added some extra osd's and moved some placement group directories on the full osd's (which have a copy on another osd) to another place on the node (as mentioned in http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/).
>>>>>>> After clearing some space on the full osd's I started them again. After a lot of deep scrubbing and two pg inconsistencies which needed to be repaired, everything looked fine except the mds, which is still in the replay state and stays that way.
>>>>>>> The log below says that the mds needs osdmap epoch 1833 and has 1832.
>>>>>>>
>>>>>>> 2014-08-18 12:29:22.268248 7fa786182700 1 mds.-1.0 handle_mds_map standby
>>>>>>> 2014-08-18 12:29:22.273995 7fa786182700 1 mds.0.25 handle_mds_map i am now mds.0.25
>>>>>>> 2014-08-18 12:29:22.273998 7fa786182700 1 mds.0.25 handle_mds_map state change up:standby --> up:replay
>>>>>>> 2014-08-18 12:29:22.274000 7fa786182700 1 mds.0.25 replay_start
>>>>>>> 2014-08-18 12:29:22.274014 7fa786182700 1 mds.0.25 recovery set is
>>>>>>> 2014-08-18 12:29:22.274016 7fa786182700 1 mds.0.25 need osdmap epoch 1833, have 1832
>>>>>>> 2014-08-18 12:29:22.274017 7fa786182700 1 mds.0.25 waiting for osdmap 1833 (which blacklists prior instance)
>>>>>>>
>>>>>>> # ceph status
>>>>>>>     cluster c78209f5-55ea-4c70-8968-2231d2b05560
>>>>>>>      health HEALTH_WARN mds cluster is degraded
>>>>>>>      monmap e3: 3 mons at {th1-mon001=10.1.2.21:6789/0,th1-mon002=10.1.2.22:6789/0,th1-mon003=10.1.2.23:6789/0}, election epoch 362, quorum 0,1,2 th1-mon001,th1-mon002,th1-mon003
>>>>>>>      mdsmap e154: 1/1/1 up {0=th1-mon001=up:replay}, 1 up:standby
>>>>>>>      osdmap e1951: 12 osds: 12 up, 12 in
>>>>>>>       pgmap v193685: 492 pgs, 4 pools, 60297 MB data, 470 kobjects
>>>>>>>             124 GB used, 175 GB / 299 GB avail
>>>>>>>                  492 active+clean
>>>>>>>
>>>>>>> # ceph osd tree
>>>>>>> # id    weight   type name              up/down  reweight
>>>>>>> -1      0.2399   root default
>>>>>>> -2      0.05997          host th1-osd001
>>>>>>> 0       0.01999                  osd.0   up       1
>>>>>>> 1       0.01999                  osd.1   up       1
>>>>>>> 2       0.01999                  osd.2   up       1
>>>>>>> -3      0.05997          host th1-osd002
>>>>>>> 3       0.01999                  osd.3   up       1
>>>>>>> 4       0.01999                  osd.4   up       1
>>>>>>> 5       0.01999                  osd.5   up       1
>>>>>>> -4      0.05997          host th1-mon003
>>>>>>> 6       0.01999                  osd.6   up       1
>>>>>>> 7       0.01999                  osd.7   up       1
>>>>>>> 8       0.01999                  osd.8   up       1
>>>>>>> -5      0.05997          host th1-mon002
>>>>>>> 9       0.01999                  osd.9   up       1
>>>>>>> 10      0.01999                  osd.10  up       1
>>>>>>> 11      0.01999                  osd.11  up       1
>>>>>>>
>>>>>>> What is the way to get the mds up and running again?
>>>>>>>
>>>>>>> I still have all the placement group directories which I moved from the full osds (which were down) to create disk space.
>>>>>>
>>>>>> Try just restarting the MDS daemon. This sounds a little familiar, so I think it's a known bug which may be fixed in a later dev or point release of the MDS, but it's a soft-state rather than a disk-state issue.
>>>>>> -Greg
>>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
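For readers who hit the same problem, a condensed sketch of the recovery sequence discussed in this thread, assuming a firefly-era cluster managed with sysvinit scripts like the one above; the MDS name, rank 0 and dump filename are the ones from Jasper's cluster, and the undump step needs a ceph-mds build containing the fix from pull request 2734 (the wip-firefly-undump packages or a giant release), since the stock 0.80.5 binary aborts with the client_lock assertion shown earlier:

    # stop the MDS so nothing else touches the journal while it is manipulated
    service ceph stop mds.th1-mon001

    # dump the rank 0 journal to a local file, then reset the on-disk journal
    /usr/bin/ceph-mds -i th1-mon001 -c /etc/ceph/ceph.conf --cluster ceph --dump-journal 0 journaldumptgho-mon001
    /usr/bin/ceph-mds -i th1-mon001 -c /etc/ceph/ceph.conf --cluster ceph --reset-journal 0

    # write the dumped journal back (this is the step that requires the fixed binary)
    /usr/bin/ceph-mds -i th1-mon001 -c /etc/ceph/ceph.conf --cluster ceph --undump-journal 0 journaldumptgho-mon001

    # start the MDS again and watch it leave up:replay
    service ceph start mds.th1-mon001
    ceph -w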