On Tuesday, July 23, 2013 at 4:46 PM, pe...@2force.nl wrote:
> On 2013-07-22 18:20, Joao Eduardo Luis wrote:
> > On 07/22/2013 04:59 PM, pe...@2force.nl wrote:
> > > Hi Joao,
> > >  
> > > I have sent you the link to the monitor files. I stopped one other
> > > monitor to have a consistent tarball but now it won't start, crashing
> > > with the same error message. I hope there is a trick to get it working
> > > again because now I only have one monitor working and I don't want to
> > > end up losing data again (I had this happen once before).
> > >  
> >  
> >  
> > Thanks! This is the very next thing in my queue!
> >  
> > -Joao
>  
> Hi Joao,
>  
> Any update on this issue perhaps? It seems I'm not the only one with
> this problem. Our cluster isn't working anymore (only 1 monitor left), so
> I'd recommend that anyone running 0.61.5 not reboot or restart their
> monitors until it is known what is going on :(
>  
>  


I just rebooted one mon server running 0.61.5 (had to!)
and it didn't crash (yet?). I guess I was lucky…

Cheers, Dan



  
>  
> > >  
> > > Thanks,
> > >  
> > > Peter
> > >  
> > > On 2013-07-22 17:31, Joao Eduardo Luis wrote:
> > > > On 07/22/2013 12:33 PM, pe...@2force.nl wrote:
> > > > > Hello,
> > > > >  
> > > > > > After a reboot one of our monitors is unable to start. We did an upgrade
> > > > > > from 0.61.4 to 0.61.5 last week without problems (the monitor restarted
> > > > > > just fine).
> > > > >  
> > > > > > We are getting the following error (I think it is the same as
> > > > > > http://tracker.ceph.com/issues/5704). I might have missed it on the list
> > > > > > though. If you want I can send the contents of the monitor directory.
> > > > >  
> > > >  
> > > >  
> > > > > That monitor store would be greatly appreciated! If you could bundle
> > > > > the store of two other monitors it would be great.
> > > >  
> > > > -Joao
> > > >  
> > > >  
> > > > > 2013-07-22 13:24:02.183558 7fd06127e780 0 ceph version 0.61.5
> > > > > (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979), process ceph-mon, pid  
> > > > > 28540
> > > > > 2013-07-22 13:24:02.251205 7fd05d320700 -1 asok(0x207e000)  
> > > > > AdminSocket:
> > > > > request 'mon_status' not defined
> > > > > 2013-07-22 13:24:02.357287 7fd06127e780 1 mon.narr9@-1(probing) e1
> > > > > preinit fsid 97e515bb-d334-4fa7-8b53-7d85615809fd
> > > > > 2013-07-22 13:24:02.374158 7fd06127e780 -1 mon/OSDMonitor.cc 
> > > > > (http://OSDMonitor.cc): In
> > > > > function 'virtual void OSDMonitor::update_from_paxos(bool*)' thread
> > > > > 7fd06127e780 time 2013-07-22 13:24:02.373344
> > > > > mon/OSDMonitor.cc (http://OSDMonitor.cc): 132: FAILED 
> > > > > assert(latest_bl.length() != 0)
> > > > >  
> > > > > ceph version 0.61.5 (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979)
> > > > > 1: /usr/bin/ceph-mon() [0x5073d6]
> > > > > 2: (PaxosService::refresh(bool*)+0x19b) [0x4edd4b]
> > > > > 3: (Monitor::refresh_from_paxos(bool*)+0x57) [0x48e5a7]
> > > > > 4: (Monitor::init_paxos()+0xf5) [0x48e755]
> > > > > 5: (Monitor::preinit()+0x6ac) [0x4a4e7c]
> > > > > 6: (main()+0x1c19) [0x483559]
> > > > > 7: (__libc_start_main()+0xed) [0x7fd05f4da76d]
> > > > > 8: /usr/bin/ceph-mon() [0x485e7d]
> > > > > NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> > > > > needed to interpret this.
> > > > >  
> > > > > --- begin dump of recent events ---
> > > > > -26> 2013-07-22 13:24:02.181870 7fd06127e780 5 asok(0x207e000)
> > > > > register_command perfcounters_dump hook 0x2073010
> > > > > -25> 2013-07-22 13:24:02.181908 7fd06127e780 5 asok(0x207e000)
> > > > > register_command 1 hook 0x2073010
> > > > > -24> 2013-07-22 13:24:02.181915 7fd06127e780 5 asok(0x207e000)
> > > > > register_command perf dump hook 0x2073010
> > > > > -23> 2013-07-22 13:24:02.181929 7fd06127e780 5 asok(0x207e000)
> > > > > register_command perfcounters_schema hook 0x2073010
> > > > > -22> 2013-07-22 13:24:02.181939 7fd06127e780 5 asok(0x207e000)
> > > > > register_command 2 hook 0x2073010
> > > > > -21> 2013-07-22 13:24:02.181941 7fd06127e780 5 asok(0x207e000)
> > > > > register_command perf schema hook 0x2073010
> > > > > -20> 2013-07-22 13:24:02.181945 7fd06127e780 5 asok(0x207e000)
> > > > > register_command config show hook 0x2073010
> > > > > -19> 2013-07-22 13:24:02.181954 7fd06127e780 5 asok(0x207e000)
> > > > > register_command config set hook 0x2073010
> > > > > -18> 2013-07-22 13:24:02.181957 7fd06127e780 5 asok(0x207e000)
> > > > > register_command log flush hook 0x2073010
> > > > > -17> 2013-07-22 13:24:02.181959 7fd06127e780 5 asok(0x207e000)
> > > > > register_command log dump hook 0x2073010
> > > > > -16> 2013-07-22 13:24:02.181964 7fd06127e780 5 asok(0x207e000)
> > > > > register_command log reopen hook 0x2073010
> > > > > -15> 2013-07-22 13:24:02.183558 7fd06127e780 0 ceph version  
> > > > > 0.61.5
> > > > > (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979), process ceph-mon, pid  
> > > > > 28540
> > > > > -14> 2013-07-22 13:24:02.186703 7fd06127e780 5 asok(0x207e000)  
> > > > > init
> > > > > /var/run/ceph/ceph-mon.narr9.asok
> > > > > -13> 2013-07-22 13:24:02.186734 7fd06127e780 5 asok(0x207e000)
> > > > > bind_and_listen /var/run/ceph/ceph-mon.narr9.asok
> > > > > -12> 2013-07-22 13:24:02.186780 7fd06127e780 5 asok(0x207e000)
> > > > > register_command 0 hook 0x20720b0
> > > > > -11> 2013-07-22 13:24:02.186790 7fd06127e780 5 asok(0x207e000)
> > > > > register_command version hook 0x20720b0
> > > > > -10> 2013-07-22 13:24:02.186798 7fd06127e780 5 asok(0x207e000)
> > > > > register_command git_version hook 0x20720b0
> > > > > -9> 2013-07-22 13:24:02.186806 7fd06127e780 5 asok(0x207e000)
> > > > > register_command help hook 0x20730d0
> > > > > -8> 2013-07-22 13:24:02.186850 7fd05d320700 5 asok(0x207e000)
> > > > > entry start
> > > > > -7> 2013-07-22 13:24:02.251205 7fd05d320700 -1 asok(0x207e000)
> > > > > AdminSocket: request 'mon_status' not defined
> > > > > -6> 2013-07-22 13:24:02.357202 7fd06127e780 1 --
> > > > > 10.255.0.25:6789/0 learned my addr 10.255.0.25:6789/0
> > > > > -5> 2013-07-22 13:24:02.357215 7fd06127e780 1
> > > > > accepter.accepter.bind my_inst.addr is 10.255.0.25:6789/0  
> > > > > need_addr=0
> > > > > -4> 2013-07-22 13:24:02.357242 7fd06127e780 5 adding auth
> > > > > protocol: cephx
> > > > > -3> 2013-07-22 13:24:02.357245 7fd06127e780 5 adding auth
> > > > > protocol: cephx
> > > > > -2> 2013-07-22 13:24:02.357287 7fd06127e780 1
> > > > > mon.narr9@-1(probing) e1 preinit fsid
> > > > > 97e515bb-d334-4fa7-8b53-7d85615809fd
> > > > > -1> 2013-07-22 13:24:02.372987 7fd06127e780 4
> > > > > mon.narr9@-1(probing).mds e182116 new map
> > > > > 0> 2013-07-22 13:24:02.374158 7fd06127e780 -1  
> > > > > mon/OSDMonitor.cc (http://OSDMonitor.cc):
> > > > > In function 'virtual void OSDMonitor::update_from_paxos(bool*)'  
> > > > > thread
> > > > > 7fd06127e780 time 2013-07-22 13:24:02.373344
> > > > > mon/OSDMonitor.cc (http://OSDMonitor.cc): 132: FAILED 
> > > > > assert(latest_bl.length() != 0)
> > > > >  
> > > > > ceph version 0.61.5 (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979)
> > > > > 1: /usr/bin/ceph-mon() [0x5073d6]
> > > > > 2: (PaxosService::refresh(bool*)+0x19b) [0x4edd4b]
> > > > > 3: (Monitor::refresh_from_paxos(bool*)+0x57) [0x48e5a7]
> > > > > 4: (Monitor::init_paxos()+0xf5) [0x48e755]
> > > > > 5: (Monitor::preinit()+0x6ac) [0x4a4e7c]
> > > > > 6: (main()+0x1c19) [0x483559]
> > > > > 7: (__libc_start_main()+0xed) [0x7fd05f4da76d]
> > > > > 8: /usr/bin/ceph-mon() [0x485e7d]
> > > > > NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> > > > > needed to interpret this.
> > > > >  
> > > > > --- logging levels ---
> > > > > 0/ 5 none
> > > > > 0/ 1 lockdep
> > > > > 0/ 1 context
> > > > > 1/ 1 crush
> > > > > 1/ 5 mds
> > > > > 1/ 5 mds_balancer
> > > > > 1/ 5 mds_locker
> > > > > 1/ 5 mds_log
> > > > > 1/ 5 mds_log_expire
> > > > > 1/ 5 mds_migrator
> > > > > 0/ 1 buffer
> > > > > 0/ 1 timer
> > > > > 0/ 1 filer
> > > > > 0/ 1 striper
> > > > > 0/ 1 objecter
> > > > > 0/ 5 rados
> > > > > 0/ 5 rbd
> > > > > 0/ 5 journaler
> > > > > 0/ 5 objectcacher
> > > > > 0/ 5 client
> > > > > 0/ 5 osd
> > > > > 0/ 5 optracker
> > > > > 0/ 5 objclass
> > > > > 1/ 3 filestore
> > > > > 1/ 3 journal
> > > > > 0/ 5 ms
> > > > > 1/ 5 mon
> > > > > 0/10 monc
> > > > > 0/ 5 paxos
> > > > > 0/ 5 tp
> > > > > 1/ 5 auth
> > > > > 1/ 5 crypto
> > > > > 1/ 1 finisher
> > > > > 1/ 5 heartbeatmap
> > > > > 1/ 5 perfcounter
> > > > > 1/ 5 rgw
> > > > > 1/ 5 hadoop
> > > > > 1/ 5 javaclient
> > > > > 1/ 5 asok
> > > > > 1/ 1 throttle
> > > > > -2/-2 (syslog threshold)
> > > > > -1/-1 (stderr threshold)
> > > > > max_recent 10000
> > > > > max_new 1000
> > > > > log_file /var/log/ceph/ceph-mon.narr9.log
> > > > > --- end dump of recent events ---
> > > > > 2013-07-22 13:24:02.376004 7fd06127e780 -1 *** Caught signal
> > > > > (Aborted) **
> > > > > in thread 7fd06127e780
> > > > >  
> > > > > ceph version 0.61.5 (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979)
> > > > > 1: /usr/bin/ceph-mon() [0x59743a]
> > > > > 2: (()+0xfcb0) [0x7fd060919cb0]
> > > > > 3: (gsignal()+0x35) [0x7fd05f4ef425]
> > > > > 4: (abort()+0x17b) [0x7fd05f4f2b8b]
> > > > > 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d)  
> > > > > [0x7fd05fe4169d]
> > > > > 6: (()+0xb5846) [0x7fd05fe3f846]
> > > > > 7: (()+0xb5873) [0x7fd05fe3f873]
> > > > > 8: (()+0xb596e) [0x7fd05fe3f96e]
> > > > > 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> > > > > const*)+0x1df) [0x64f6ef]
> > > > > 10: /usr/bin/ceph-mon() [0x5073d6]
> > > > > 11: (PaxosService::refresh(bool*)+0x19b) [0x4edd4b]
> > > > > 12: (Monitor::refresh_from_paxos(bool*)+0x57) [0x48e5a7]
> > > > > 13: (Monitor::init_paxos()+0xf5) [0x48e755]
> > > > > 14: (Monitor::preinit()+0x6ac) [0x4a4e7c]
> > > > > 15: (main()+0x1c19) [0x483559]
> > > > > 16: (__libc_start_main()+0xed) [0x7fd05f4da76d]
> > > > > 17: /usr/bin/ceph-mon() [0x485e7d]
> > > > > NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> > > > > needed to interpret this.
> > > > >  
> > > > > --- begin dump of recent events ---
> > > > > 0> 2013-07-22 13:24:02.376004 7fd06127e780 -1 *** Caught  
> > > > > signal
> > > > > (Aborted) **
> > > > > in thread 7fd06127e780
> > > > >  
> > > > > ceph version 0.61.5 (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979)
> > > > > 1: /usr/bin/ceph-mon() [0x59743a]
> > > > > 2: (()+0xfcb0) [0x7fd060919cb0]
> > > > > 3: (gsignal()+0x35) [0x7fd05f4ef425]
> > > > > 4: (abort()+0x17b) [0x7fd05f4f2b8b]
> > > > > 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d)  
> > > > > [0x7fd05fe4169d]
> > > > > 6: (()+0xb5846) [0x7fd05fe3f846]
> > > > > 7: (()+0xb5873) [0x7fd05fe3f873]
> > > > > 8: (()+0xb596e) [0x7fd05fe3f96e]
> > > > > 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> > > > > const*)+0x1df) [0x64f6ef]
> > > > > 10: /usr/bin/ceph-mon() [0x5073d6]
> > > > > 11: (PaxosService::refresh(bool*)+0x19b) [0x4edd4b]
> > > > > 12: (Monitor::refresh_from_paxos(bool*)+0x57) [0x48e5a7]
> > > > > 13: (Monitor::init_paxos()+0xf5) [0x48e755]
> > > > > 14: (Monitor::preinit()+0x6ac) [0x4a4e7c]
> > > > > 15: (main()+0x1c19) [0x483559]
> > > > > 16: (__libc_start_main()+0xed) [0x7fd05f4da76d]
> > > > > 17: /usr/bin/ceph-mon() [0x485e7d]
> > > > > NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> > > > > needed to interpret this.
> > > > >  
> > > > > --- logging levels ---
> > > > > 0/ 5 none
> > > > > 0/ 1 lockdep
> > > > > 0/ 1 context
> > > > > 1/ 1 crush
> > > > > 1/ 5 mds
> > > > > 1/ 5 mds_balancer
> > > > > 1/ 5 mds_locker
> > > > > 1/ 5 mds_log
> > > > > 1/ 5 mds_log_expire
> > > > > 1/ 5 mds_migrator
> > > > > 0/ 1 buffer
> > > > > 0/ 1 timer
> > > > > 0/ 1 filer
> > > > > 0/ 1 striper
> > > > > 0/ 1 objecter
> > > > > 0/ 5 rados
> > > > > 0/ 5 rbd
> > > > > 0/ 5 journaler
> > > > > 0/ 5 objectcacher
> > > > > 0/ 5 client
> > > > > 0/ 5 osd
> > > > > 0/ 5 optracker
> > > > > 0/ 5 objclass
> > > > > 1/ 3 filestore
> > > > > 1/ 3 journal
> > > > > 0/ 5 ms
> > > > > 1/ 5 mon
> > > > > 0/10 monc
> > > > > 0/ 5 paxos
> > > > > 0/ 5 tp
> > > > > 1/ 5 auth
> > > > > 1/ 5 crypto
> > > > > 1/ 1 finisher
> > > > > 1/ 5 heartbeatmap
> > > > > 1/ 5 perfcounter
> > > > > 1/ 5 rgw
> > > > > 1/ 5 hadoop
> > > > > 1/ 5 javaclient
> > > > > 1/ 5 asok
> > > > > 1/ 1 throttle
> > > > > -2/-2 (syslog threshold)
> > > > > -1/-1 (stderr threshold)
> > > > > max_recent 10000
> > > > > max_new 1000
> > > > > log_file /var/log/ceph/ceph-mon.narr9.log
> > > > > --- end dump of recent events ---
> > > > >  
> > > > > Cheers,
> > > > >  
> > > > > Peter
> > > > >  
> > > > > _______________________________________________
> > > > > ceph-users mailing list
> > > > > > ceph-users@lists.ceph.com
> > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > > >  
> > > >  
> > > >  
> > >  
> > >  
> >  
>  
>  
>  


