Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-30 Thread Yan, Zheng
-- > From: Yan, Zheng [mailto:uker...@gmail.com] > Sent: Tuesday, April 29, 2014 10:13 PM > To: Mohd Bazli Ab Karim > Cc: Luke Jing Yuan; Wong Ming Tat > Subject: Re: [ceph-users] Ceph mds laggy and failed assert in function replay > mds/journal.cc > > On Tue, Apr 29, 2014 at

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-30 Thread Mohd Bazli Ab Karim
n01 which is now > running as active as a single MDS at this moment. > After the MDS became active, it did not send a beacon to the monitor. It seems like the MDS was busy doing something else. If this issue still happens, set debug_mds=10 and send the log to me. Regards Yan, Zheng > Regards
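
For readers following the thread, a minimal sketch of how the debug_mds=10 setting requested above might be written in ceph.conf. Placing it in the [mds] section and restarting the ceph-mds daemon afterwards are assumptions, not steps stated in the message:

```ini
# Sketch only: raise MDS debug logging as requested in the reply above.
# Putting this under [mds] (rather than [global]) is an assumption.
[mds]
    debug mds = 10
```

At this level the MDS log grows quickly, so the setting is usually reverted once the log has been collected.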

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-29 Thread Luke Jing Yuan
: Tuesday, 29 April, 2014 3:36 PM To: Jingyuan Luke Cc: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc On Tue, Apr 29, 2014 at 3:13 PM, Jingyuan Luke wrote: > Hi, > > Assuming we got MDS wor

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-29 Thread Yan, Zheng
On Tue, Apr 29, 2014 at 3:13 PM, Jingyuan Luke wrote: > Hi, > > Assuming we get the MDS back on track, should we still leave > mds_wipe_sessions in ceph.conf, or remove it and restart the MDS? > Thanks. No. It has been several hours. The MDS still has not finished replaying the journal? R

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-29 Thread Jingyuan Luke
Hi, Assuming we get the MDS back on track, should we still leave mds_wipe_sessions in ceph.conf, or remove it and restart the MDS? Thanks. Regards, Luke On Tue, Apr 29, 2014 at 2:12 PM, Yan, Zheng wrote: > On Tue, Apr 29, 2014 at 11:24 AM, Jingyuan Luke wrote: >> Hi, >> >> We had appli

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-28 Thread Yan, Zheng
On Tue, Apr 29, 2014 at 11:24 AM, Jingyuan Luke wrote: > Hi, > > We applied the patch, recompiled ceph, and updated > ceph.conf as suggested. When we re-ran ceph-mds we noticed the > following: > > > 2014-04-29 10:45:22.260798 7f90b971d700 0 log [WRN] : replayed op > client.3

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-28 Thread Jingyuan Luke
Hi, We applied the patch, recompiled ceph, and updated ceph.conf as suggested. When we re-ran ceph-mds we noticed the following: 2014-04-29 10:45:22.260798 7f90b971d700 0 log [WRN] : replayed op client.324186:51366457,12681393 no session for client.324186 2014-04-29 10:45:2

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-26 Thread Yan, Zheng
On Sat, Apr 26, 2014 at 9:56 AM, Jingyuan Luke wrote: > Hi Greg, > > Actually our cluster is pretty empty, but we suspect we had a temporary > network disconnection to one of our OSDs; we are not sure if this caused the > problem. > > Anyway, we don't mind trying the method you mentioned. How can we do that?

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-26 Thread Jingyuan Luke
Hi Greg, Actually our cluster is pretty empty, but we suspect we had a temporary network disconnection to one of our OSDs; we are not sure if this caused the problem. Anyway, we don't mind trying the method you mentioned. How can we do that? Regards, Luke On Saturday, April 26, 2014, Gregory Farnum wrote:

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-25 Thread Luke Jing Yuan
...@vger.kernel.org; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc Hmm, it looks like your on-disk SessionMap is horrendously out of date. Did your cluster get full at some point? In any case, we're working on tools to repair this no

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-25 Thread Gregory Farnum
Hmm, it looks like your on-disk SessionMap is horrendously out of date. Did your cluster get full at some point? In any case, we're working on tools to repair this now but they aren't ready for use yet. Probably the only thing you could do is create an empty sessionmap with a higher version than t

[ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-24 Thread Mohd Bazli Ab Karim
Dear Ceph-devel, ceph-users, I am currently facing an issue with my ceph mds server. The ceph-mds daemon will not come back up. I tried running it manually with ceph-mds -i mon01 -d, but it hits a failed assert(session) at line 1303 in mds/journal.cc and aborts. Can someone shed