Hi Luke,

(copying list back in)

You should stop all MDS services before attempting to use
--reset-journal (but make sure mons and OSDs are running).  The status
of the mds map shouldn't make a difference.

John

On Tue, Mar 18, 2014 at 5:23 PM, Luke Jing Yuan <jyl...@mimos.my> wrote:
> Hi John,
>
> I noticed that while we were running with the --reset-journal option, ceph.log 
> kept showing lines like the following:
>
> 2014-03-19 01:17:09.977892 mon.0 10.4.118.21:6789/0 192 : [INF] mdsmap 
> e29851: 1/1/1 up {0=mon01=up:replay(laggy or crashed)}
>
> And the mdsmap epoch just keeps increasing. Is this what we should 
> expect? Also, should we consider using the "ceph mds" command to fail the mds 
> before running with --reset-journal?
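
To keep an eye on the epoch and state without reading ceph.log by hand, something like the sketch below could pull them out of each mdsmap line. It assumes the line format shown above (with the wrapped line joined back together); parse_mdsmap and is_laggy are hypothetical helpers, not part of any Ceph tool.

```python
import re

# The cluster log line quoted above, joined back onto one line.
SAMPLE_LINE = (
    "2014-03-19 01:17:09.977892 mon.0 10.4.118.21:6789/0 192 : [INF] mdsmap "
    "e29851: 1/1/1 up {0=mon01=up:replay(laggy or crashed)}"
)

MDSMAP_RE = re.compile(r"mdsmap e(\d+): .*?\{(.*?)\}")


def parse_mdsmap(line):
    """Return (epoch, {rank: (name, state)}) or None if the line has no mdsmap."""
    m = MDSMAP_RE.search(line)
    if m is None:
        return None
    epoch = int(m.group(1))
    ranks = {}
    for entry in m.group(2).split(","):
        # Each entry looks like "rank=name=state".
        rank, name, state = entry.split("=", 2)
        ranks[int(rank)] = (name, state)
    return epoch, ranks


def is_laggy(state):
    """True if the state string carries the 'laggy or crashed' annotation."""
    return "laggy or crashed" in state
```

Running parse_mdsmap over successive log lines gives a series of epochs and states that can be compared over time.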
>
> Apologies for asking so many times. Thanks in advance.
>
> Regards,
> Luke
>
> -----Original Message-----
> From: John Spray [mailto:john.sp...@inktank.com]
> Sent: Tuesday, 18 March, 2014 8:12 PM
> To: Luke Jing Yuan
> Cc: Wong Ming Tat; Mohd Bazli Ab Karim
> Subject: Re: [ceph-users] Ceph MDS replaying journal
>
> That command should be almost instant, so it sounds like it has become stuck. 
>  Run with "-d --debug-mds=20" to get more output.  One way it can get stuck 
> is if the "-i" argument doesn't correspond to the host you're running on, in 
> which case it gets stuck trying to find keys.  I put "-i mon0" in the example 
> command because that looked like the host you were running on, but perhaps 
> you're running from somewhere else.
>
> I must emphasize that this is all very unsupported.  If you have data that is 
> critical for your users you should preferably restore it from backups.
>
> John
>
> On Tue, Mar 18, 2014 at 12:04 PM, Luke Jing Yuan <jyl...@mimos.my> wrote:
>> Hi John,
>>
>> We are using the 2nd option you mentioned, but after more than 10 hours of 
>> running we have no idea whether it's working or when it will complete. Is 
>> there any way for us to monitor the progress further? We dare not use the 
>> newfs option as there is data that is critical to our users. Kindly advise.
>>
>> Thanks.
>>
>> Regards,
>> Luke
>>
>> -----Original Message-----
>> From: ceph-users-boun...@lists.ceph.com
>> [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Luke Jing Yuan
>> Sent: Tuesday, 18 March, 2014 2:33 PM
>> To: John Spray
>> Cc: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Ceph MDS replaying journal
>>
>> Hi John,
>>
>> Is there a way for us to verify that step 2 is working properly? We are 
>> seeing the process running for almost 4 hours but there is no indication 
>> when it will end. Thanks.
>>
>> Regards,
>> Luke
>>
>> -----Original Message-----
>> From: John Spray [mailto:john.sp...@inktank.com]
>> Sent: Tuesday, 18 March, 2014 5:13 AM
>> To: Luke Jing Yuan
>> Cc: Wong Ming Tat; ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Ceph MDS replaying journal
>>
>> Thanks for sending the logs so quickly.
>>
>> 626 2014-03-18 00:58:01.009623 7fba5cbbe700 10 mds.0.journal 
>> EMetaBlob.replay sessionmap v8632368 -(1|2) == table 7235981 prealloc
>> [1000041df86~1] used 1000041db9e
>> 627 2014-03-18 00:58:01.009627 7fba5cbbe700 20 mds.0.journal  (session
>> prealloc [10000373451~3e8])
>> 628 2014-03-18 00:58:01.010696 7fba5cbbe700 -1 mds/journal.cc: In function 
>> 'void EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)'
>> thread 7fba5cbbe700 time 2014-03-18 00:58:01.009644
>>
>> The first line indicates that the version of SessionMap loaded from disk is 
>> 7235981 while the version updated in the journal is 8632368.
>> The difference is much larger than one would expect, as we are only a few 
>> events into the journal at the point of the failure.  The assertion is 
>> checking that the inode claimed by the journal is in the range allocated to 
>> the client session, and it is failing because the stale sessionmap version 
>> is in use.
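
To make the numbers concrete, here is a small Python sketch that pulls the two versions and the inode ranges out of the log lines quoted above. It assumes the bracketed ranges use a hexadecimal "start~length" notation, and the helper names are hypothetical. It only illustrates the mismatch; it is not the check the MDS itself performs.

```python
import re

# The first two log lines quoted above, joined back onto single lines.
REPLAY_LINE = (
    "mds.0.journal EMetaBlob.replay sessionmap v8632368 -(1|2) == table "
    "7235981 prealloc [1000041df86~1] used 1000041db9e"
)
SESSION_LINE = "mds.0.journal  (session prealloc [10000373451~3e8])"


def version_gap(line):
    """Journal sessionmap version minus the version loaded from disk."""
    m = re.search(r"sessionmap v(\d+) .*?== table (\d+)", line)
    return int(m.group(1)) - int(m.group(2))


def parse_interval(text):
    """Parse 'start~length' (assumed hex) into (start, end_exclusive)."""
    start, length = (int(part, 16) for part in text.split("~"))
    return start, start + length


def used_ino(line):
    """The inode number the journal entry claims to have used."""
    return int(re.search(r"used\s+([0-9a-f]+)", line).group(1), 16)


def session_range(line):
    """The inode range preallocated to the client session."""
    m = re.search(r"session\s+prealloc \[([0-9a-f]+~[0-9a-f]+)\]", line)
    return parse_interval(m.group(1))


gap = version_gap(REPLAY_LINE)
low, high = session_range(SESSION_LINE)
in_range = low <= used_ino(REPLAY_LINE) < high
```

For the lines above, gap comes out as 1396387 and in_range is False, which matches the picture of a stale SessionMap whose preallocated range does not cover the inode the journal refers to.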
>>
>> In version 0.72.2, there was a bug in the MDS that caused failures to
>> write the SessionMap object to disk to be ignored.  This could result
>> in a situation where there is an inconsistency between the contents of
>> the log and the contents of the SessionMap object.  A check was added
>> to avoid this in the latest code (b0dce8a0).
>>
>> In a future release we will be adding tools for repairing damaged systems in 
>> cases like this, but at the moment your options are quite limited.
>>  * If the data is replaceable then you might simply use "ceph mds newfs" to 
>> start from scratch.
>>  * If you can cope with losing some of the most recent modifications but 
>> keeping most of the filesystem, you could try the experimental journal reset 
>> function:
>>      ceph-mds -i mon0 -d  --reset-journal 0
>>    This is destructive: it will discard any metadata updates that have been 
>> written to the journal but not to the backing store.  However, it is less 
>> destructive than newfs.  It may crash when it completes; look for output 
>> like this at the beginning, before any stack trace, to indicate success:
>>    writing journal head
>>    writing EResetJournal entry
>>    done
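
If the output of the reset is captured to a file, a small check like the following could confirm that the three marker lines appear, in order, before anything that looks like a crash. This is only a sketch: it assumes the markers appear on lines of their own exactly as written above, and it treats a line containing "FAILED assert" as the start of a stack trace, which is a guess based on the MDS log quoted in this thread.

```python
MARKERS = ("writing journal head", "writing EResetJournal entry", "done")


def reset_journal_succeeded(output):
    """True if all markers appear in order before any 'FAILED assert' line."""
    remaining = list(MARKERS)
    for raw in output.splitlines():
        line = raw.strip()
        if "FAILED assert" in line:
            # A crash before the last marker means the reset did not finish.
            return False
        if line == remaining[0]:
            remaining.pop(0)
            if not remaining:
                return True
    # Ran out of output without seeing every marker.
    return False
```

A crash that shows up only after the "done" line still counts as success here, in line with the note above that the tool may crash when it completes.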
>>
>> We are looking forward to making the MDS and associated tools more resilient 
>> ahead of making the filesystem a fully supported part of ceph.
>>
>> John
>>
>> On Mon, Mar 17, 2014 at 5:09 PM, Luke Jing Yuan <jyl...@mimos.my> wrote:
>>> Hi John,
>>>
>>> Thanks for responding to our issues, attached is the ceph.log file as per 
>>> request. As for the ceph-mds.log, I will have to send it in 3 parts later 
>>> due to our SMTP server's policy.
>>>
>>> Regards,
>>> Luke
>>>
>>> -----Original Message-----
>>> From: ceph-users-boun...@lists.ceph.com
>>> [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of John Spray
>>> Sent: Tuesday, 18 March, 2014 12:57 AM
>>> To: Wong Ming Tat
>>> Cc: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] Ceph MDS replaying journal
>>>
>>> Clarification: in step 1, stop the MDS service on *all* MDS servers (I 
>>> notice there are standby daemons in the "ceph status" output).
>>>
>>> John
>>>
>>> On Mon, Mar 17, 2014 at 4:45 PM, John Spray <john.sp...@inktank.com> wrote:
>>>> Hello,
>>>>
>>>> To understand what's gone wrong here, we'll need to increase the
>>>> verbosity of the logging from the MDS service and then try
>>>> starting it again.
>>>>
>>>> 1. Stop the MDS service (on Ubuntu this would be "stop
>>>> ceph-mds-all")
>>>> 2. Move your old log file away so that we will have a fresh one:
>>>> mv /var/log/ceph/ceph-mds.mon01.log /var/log/ceph/ceph-mds.mon01.log.old
>>>> 3. Start the mds service manually (so that it just tries once
>>>> instead of flapping):
>>>> ceph-mds -i mon01 -f --debug-mds=20 --debug-journaler=10
>>>>
>>>> The resulting log file may be quite big so you may want to gzip it
>>>> before sending it to the list.
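
For the compression step, plain "gzip" on the command line is enough; if you would rather script it, a minimal Python sketch using only the standard library might look like this. gzip_log is a hypothetical helper name.

```python
import gzip
import shutil


def gzip_log(src, dst=None):
    """Compress src into dst (default: src + '.gz') and return the new path.

    The original file is left in place.
    """
    if dst is None:
        dst = src + ".gz"
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        # Stream in chunks so a large log does not have to fit in memory.
        shutil.copyfileobj(fin, fout)
    return dst
```

For example, gzip_log("/var/log/ceph/ceph-mds.mon01.log") would write /var/log/ceph/ceph-mds.mon01.log.gz next to the log from step 2.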
>>>>
>>>> In addition to the MDS log, please attach your cluster log
>>>> (/var/log/ceph/ceph.log).
>>>>
>>>> Thanks,
>>>> John
>>>>
>>>> On Mon, Mar 17, 2014 at 7:02 AM, Wong Ming Tat <mt.w...@mimos.my> wrote:
>>>>> Hi,
>>>>>
>>>>>
>>>>>
>>>>> I am receiving the MDS replaying journal error shown below.
>>>>>
>>>>> I hope someone can give some information to solve this problem.
>>>>>
>>>>>
>>>>>
>>>>> # ceph health detail
>>>>>
>>>>> HEALTH_WARN mds cluster is degraded
>>>>>
>>>>> mds cluster is degraded
>>>>>
>>>>> mds.mon01 at x.x.x.x:6800/26426 rank 0 is replaying journal
>>>>>
>>>>>
>>>>>
>>>>> # ceph -s
>>>>>
>>>>>     cluster xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>>>
>>>>>      health HEALTH_WARN mds cluster is degraded
>>>>>
>>>>>      monmap e1: 3 mons at
>>>>> {mon01=x.x.x.x:6789/0,mon02=x.x.x.y:6789/0,mon03=x.x.x.z:6789/0},
>>>>> election epoch 1210, quorum 0,1,2 mon01,mon02,mon03
>>>>>
>>>>>      mdsmap e17020: 1/1/1 up {0=mon01=up:replay}, 2 up:standby
>>>>>
>>>>>      osdmap e20195: 24 osds: 24 up, 24 in
>>>>>
>>>>>       pgmap v1424671: 3300 pgs, 6 pools, 793 GB data, 3284 kobjects
>>>>>
>>>>>             1611 GB used, 87636 GB / 89248 GB avail
>>>>>
>>>>>                 3300 active+clean
>>>>>
>>>>>   client io 2750 kB/s rd, 0 op/s
>>>>>
>>>>>
>>>>>
>>>>> # cat /var/log/ceph/ceph-mds.mon01.log
>>>>>
>>>>> 2014-03-16 18:40:41.894404 7f0f2875c700  0 mds.0.server
>>>>> handle_client_file_setlock: start: 0, length: 0, client: 324186, pid:
>>>>> 30684, pid_ns: 18446612141968944256, type: 4
>>>>>
>>>>>
>>>>>
>>>>> 2014-03-16 18:49:09.993985 7f0f24645700  0 -- x.x.x.x:6801/3739 >>
>>>>> y.y.y.y:0/1662262473 pipe(0x728d2780 sd=26 :6801 s=0 pgs=0 cs=0 l=0
>>>>> c=0x100adc6e0).accept peer addr is really y.y.y.y:0/1662262473
>>>>> (socket is y.y.y.y:33592/0)
>>>>>
>>>>> 2014-03-16 18:49:10.000197 7f0f24645700  0 -- x.x.x.x:6801/3739 >>
>>>>> y.y.y.y:0/1662262473 pipe(0x728d2780 sd=26 :6801 s=0 pgs=0 cs=0 l=0
>>>>> c=0x100adc6e0).accept connect_seq 0 vs existing 1 state standby
>>>>>
>>>>> 2014-03-16 18:49:10.000239 7f0f24645700  0 -- x.x.x.x:6801/3739 >>
>>>>> y.y.y.y:0/1662262473 pipe(0x728d2780 sd=26 :6801 s=0 pgs=0 cs=0 l=0
>>>>> c=0x100adc6e0).accept peer reset, then tried to connect to us,
>>>>> replacing
>>>>>
>>>>> 2014-03-16 18:49:10.550726 7f4c34671780  0 ceph version 0.72.2
>>>>> (a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mds, pid 13282
>>>>>
>>>>> 2014-03-16 18:49:10.826713 7f4c2f6f8700  1 mds.-1.0 handle_mds_map
>>>>> standby
>>>>>
>>>>> 2014-03-16 18:49:10.984992 7f4c2f6f8700  1 mds.0.14 handle_mds_map
>>>>> i am now mds.0.14
>>>>>
>>>>> 2014-03-16 18:49:10.985010 7f4c2f6f8700  1 mds.0.14 handle_mds_map
>>>>> state change up:standby --> up:replay
>>>>>
>>>>> 2014-03-16 18:49:10.985017 7f4c2f6f8700  1 mds.0.14 replay_start
>>>>>
>>>>> 2014-03-16 18:49:10.985024 7f4c2f6f8700  1 mds.0.14  recovery set is
>>>>>
>>>>> 2014-03-16 18:49:10.985027 7f4c2f6f8700  1 mds.0.14  need osdmap
>>>>> epoch 3446, have 3445
>>>>>
>>>>> 2014-03-16 18:49:10.985030 7f4c2f6f8700  1 mds.0.14  waiting for
>>>>> osdmap 3446 (which blacklists prior instance)
>>>>>
>>>>> 2014-03-16 18:49:16.945500 7f4c2f6f8700  0 mds.0.cache creating
>>>>> system inode with ino:100
>>>>>
>>>>> 2014-03-16 18:49:16.945747 7f4c2f6f8700  0 mds.0.cache creating
>>>>> system inode with ino:1
>>>>>
>>>>> 2014-03-16 18:49:17.358681 7f4c2b5e1700 -1 mds/journal.cc: In
>>>>> function 'void EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)'
>>>>> thread 7f4c2b5e1700 time 2014-03-16 18:49:17.356336
>>>>>
>>>>> mds/journal.cc: 1316: FAILED assert(i == used_preallocated_ino)
>>>>>
>>>>>
>>>>>
>>>>> ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
>>>>>
>>>>> 1: (EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)+0x7587)
>>>>> [0x5af5e7]
>>>>>
>>>>> 2: (EUpdate::replay(MDS*)+0x3a) [0x5b67ea]
>>>>>
>>>>> 3: (MDLog::_replay_thread()+0x678) [0x79dbb8]
>>>>>
>>>>> 4: (MDLog::ReplayThread::entry()+0xd) [0x58bded]
>>>>>
>>>>> 5: (()+0x7e9a) [0x7f4c33a96e9a]
>>>>>
>>>>> 6: (clone()+0x6d) [0x7f4c3298b3fd]
>>>>>
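
When scanning a large MDS log for the failure, a short script can pick out the assertion lines. The sketch below assumes the "file: line: FAILED assert(expr)" format seen in the log above; find_failed_asserts is a hypothetical name.

```python
import re

# Matches e.g. "mds/journal.cc: 1316: FAILED assert(i == used_preallocated_ino)"
ASSERT_RE = re.compile(r"(\S+): (\d+): FAILED assert\((.*)\)\s*$")


def find_failed_asserts(log_text):
    """Return a list of (source_file, line_number, expression) tuples."""
    hits = []
    for line in log_text.splitlines():
        m = ASSERT_RE.search(line)
        if m:
            hits.append((m.group(1), int(m.group(2)), m.group(3)))
    return hits
```

Applied to the log above, this picks out the single assertion in mds/journal.cc at line 1316.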
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Wong Ming Tat
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ________________________________
>>>>> DISCLAIMER:
>>>>>
>>>>> This e-mail (including any attachments) is for the addressee(s)
>>>>> only and may be confidential, especially as regards personal data.
>>>>> If you are not the intended recipient, please note that any
>>>>> dealing, review, distribution, printing, copying or use of this
>>>>> e-mail is strictly prohibited. If you have received this email in
>>>>> error, please notify the sender immediately and delete the original 
>>>>> message (including any attachments).
>>>>>
>>>>>
>>>>> MIMOS Berhad is a research and development institution under the
>>>>> purview of the Malaysian Ministry of Science, Technology and
>>>>> Innovation. Opinions, conclusions and other information in this
>>>>> e-mail that do not relate to the official business of MIMOS Berhad
>>>>> and/or its subsidiaries shall be understood as neither given nor
>>>>> endorsed by MIMOS Berhad and/or its subsidiaries and neither MIMOS
>>>>> Berhad nor its subsidiaries accepts responsibility for the same.
>>>>> All liability arising from or in connection with computer viruses
>>>>> and/or corrupted e-mails is excluded to the fullest extent permitted by 
>>>>> law.
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> ceph-users mailing list
>>>>> ceph-users@lists.ceph.com
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>
>>
>
