Sam,

Thanks for taking a look. It does seem to fit my issue. Would just removing
the 5.0_head directory be appropriate, or would using ceph-objectstore-tool
be better?
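For reference, this is roughly what I had sketched out for the
ceph-objectstore-tool route. It is only a sketch: the paths assume osd.3 is
on the default /var/lib/ceph filestore layout with a colocated journal, and
I would double-check the flags against the 0.94.1 man page before running
anything.

    # with osd.3 stopped and its data filesystem still mounted
    # (paths below are my assumption of the default layout)

    # confirm the stray PG really is present in osd.3's store
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 \
        --journal-path /var/lib/ceph/osd/ceph-3/journal \
        --op list-pgs | grep '^5\.'

    # keep a backup export of 5.0 before touching it
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 \
        --journal-path /var/lib/ceph/osd/ceph-3/journal \
        --pgid 5.0 --op export --file /root/pg-5.0.export

    # then remove the leftover PG from osd.3's store
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 \
        --journal-path /var/lib/ceph/osd/ceph-3/journal \
        --pgid 5.0 --op remove

My thinking was that this is safer than an rm of current/5.0_head, since the
tool should also clean up the PG's associated metadata rather than just the
directory, but please correct me if that's the wrong idea.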
Thanks,
Berant

On Mon, May 18, 2015 at 1:47 PM, Samuel Just <sj...@redhat.com> wrote:

> You have most likely hit http://tracker.ceph.com/issues/11429. There are
> some workarounds in the bugs marked as duplicates of that bug, or you can
> wait for the next hammer point release.
> -Sam
>
> ----- Original Message -----
> From: "Berant Lemmenes" <ber...@lemmenes.com>
> To: ceph-users@lists.ceph.com
> Sent: Monday, May 18, 2015 10:24:38 AM
> Subject: [ceph-users] OSD unable to start (giant -> hammer)
>
> Hello all,
>
> I've encountered a problem when upgrading my single node home cluster from
> giant to hammer, and I would greatly appreciate any insight.
>
> I upgraded the packages as normal, then restarted the mon and, once that
> came back, restarted the first OSD (osd.3). However, it subsequently won't
> start and crashes with the following failed assertion:
>
> > osd/OSD.h: 716: FAILED assert(ret)
> > ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
> > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f) [0xb1784f]
> > 2: (OSD::load_pgs()+0x277b) [0x6850fb]
> > 3: (OSD::init()+0x1448) [0x6930b8]
> > 4: (main()+0x26b9) [0x62fd89]
> > 5: (__libc_start_main()+0xed) [0x7f2345bc976d]
> > 6: ceph-osd() [0x635679]
> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> >
> > --- logging levels ---
> > 0/ 5 none
> > 0/ 1 lockdep
> > 0/ 1 context
> > 1/ 1 crush
> > 1/ 5 mds
> > 1/ 5 mds_balancer
> > 1/ 5 mds_locker
> > 1/ 5 mds_log
> > 1/ 5 mds_log_expire
> > 1/ 5 mds_migrator
> > 0/ 1 buffer
> > 0/ 1 timer
> > 0/ 1 filer
> > 0/ 1 striper
> > 0/ 1 objecter
> > 0/ 5 rados
> > 0/ 5 rbd
> > 0/ 5 rbd_replay
> > 0/ 5 journaler
> > 0/ 5 objectcacher
> > 0/ 5 client
> > 0/ 5 osd
> > 0/ 5 optracker
> > 0/ 5 objclass
> > 1/ 3 filestore
> > 1/ 3 keyvaluestore
> > 1/ 3 journal
> > 0/ 5 ms
> > 1/ 5 mon
> > 0/10 monc
> > 1/ 5 paxos
> > 0/ 5 tp
> > 1/ 5 auth
> > 1/ 5 crypto
> > 1/ 1 finisher
> > 1/ 5 heartbeatmap
> > 1/ 5 perfcounter
> > 1/ 5 rgw
> > 1/10 civetweb
> > 1/ 5 javaclient
> > 1/ 5 asok
> > 1/ 1 throttle
> > 0/ 0 refs
> > 1/ 5 xio
> > -2/-2 (syslog threshold)
> > 99/99 (stderr threshold)
> > max_recent 10000
> > max_new 1000
> > log_file
> > --- end dump of recent events ---
> > terminate called after throwing an instance of 'ceph::FailedAssertion'
> > *** Caught signal (Aborted) **
> > in thread 7f2347f71780
> > ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
> > 1: ceph-osd() [0xa1fe55]
> > 2: (()+0xfcb0) [0x7f2346fb1cb0]
> > 3: (gsignal()+0x35) [0x7f2345bde0d5]
> > 4: (abort()+0x17b) [0x7f2345be183b]
> > 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f234652f69d]
> > 6: (()+0xb5846) [0x7f234652d846]
> > 7: (()+0xb5873) [0x7f234652d873]
> > 8: (()+0xb596e) [0x7f234652d96e]
> > 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x259) [0xb17a29]
> > 10: (OSD::load_pgs()+0x277b) [0x6850fb]
> > 11: (OSD::init()+0x1448) [0x6930b8]
> > 12: (main()+0x26b9) [0x62fd89]
> > 13: (__libc_start_main()+0xed) [0x7f2345bc976d]
> > 14: ceph-osd() [0x635679]
> > 2015-05-18 13:02:33.643064 7f2347f71780 -1 *** Caught signal (Aborted) **
> > in thread 7f2347f71780
> >
> > ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
> > 1: ceph-osd() [0xa1fe55]
> > 2: (()+0xfcb0) [0x7f2346fb1cb0]
> > 3: (gsignal()+0x35) [0x7f2345bde0d5]
> > 4: (abort()+0x17b) [0x7f2345be183b]
> > 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f234652f69d]
> > 6: (()+0xb5846) [0x7f234652d846]
> > 7: (()+0xb5873) [0x7f234652d873]
> > 8: (()+0xb596e) [0x7f234652d96e]
> > 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x259) [0xb17a29]
> > 10: (OSD::load_pgs()+0x277b) [0x6850fb]
> > 11: (OSD::init()+0x1448) [0x6930b8]
> > 12: (main()+0x26b9) [0x62fd89]
> > 13: (__libc_start_main()+0xed) [0x7f2345bc976d]
> > 14: ceph-osd() [0x635679]
> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> >
> > --- begin dump of recent events ---
> > 0> 2015-05-18 13:02:33.643064 7f2347f71780 -1 *** Caught signal (Aborted) **
> > in thread 7f2347f71780
> >
> > ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
> > 1: ceph-osd() [0xa1fe55]
> > 2: (()+0xfcb0) [0x7f2346fb1cb0]
> > 3: (gsignal()+0x35) [0x7f2345bde0d5]
> > 4: (abort()+0x17b) [0x7f2345be183b]
> > 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f234652f69d]
> > 6: (()+0xb5846) [0x7f234652d846]
> > 7: (()+0xb5873) [0x7f234652d873]
> > 8: (()+0xb596e) [0x7f234652d96e]
> > 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x259) [0xb17a29]
> > 10: (OSD::load_pgs()+0x277b) [0x6850fb]
> > 11: (OSD::init()+0x1448) [0x6930b8]
> > 12: (main()+0x26b9) [0x62fd89]
> > 13: (__libc_start_main()+0xed) [0x7f2345bc976d]
> > 14: ceph-osd() [0x635679]
> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> >
> > --- logging levels ---
> > 0/ 5 none
> > 0/ 1 lockdep
> > 0/ 1 context
> > 1/ 1 crush
> > 1/ 5 mds
> > 1/ 5 mds_balancer
> > 1/ 5 mds_locker
> > 1/ 5 mds_log
> > 1/ 5 mds_log_expire
> > 1/ 5 mds_migrator
> > 0/ 1 buffer
> > 0/ 1 timer
> > 0/ 1 filer
> > 0/ 1 striper
> > 0/ 1 objecter
> > 0/ 5 rados
> > 0/ 5 rbd
> > 0/ 5 rbd_replay
> > 0/ 5 journaler
> > 0/ 5 objectcacher
> > 0/ 5 client
> > 0/ 5 osd
> > 0/ 5 optracker
> > 0/ 5 objclass
> > 1/ 3 filestore
> > 1/ 3 keyvaluestore
> > 1/ 3 journal
> > 0/ 5 ms
> > 1/ 5 mon
> > 0/10 monc
> > 1/ 5 paxos
> > 0/ 5 tp
> > 1/ 5 auth
> > 1/ 5 crypto
> > 1/ 1 finisher
> > 1/ 5 heartbeatmap
> > 1/ 5 perfcounter
> > 1/ 5 rgw
> > 1/10 civetweb
> > 1/ 5 javaclient
> > 1/ 5 asok
> > 1/ 1 throttle
> > 0/ 0 refs
> > 1/ 5 xio
> > -2/-2 (syslog threshold)
> > 99/99 (stderr threshold)
> > max_recent 10000
> > max_new 1000
> > log_file
> > --- end dump of recent events ---
>
> I've included a 'ceph osd dump' here:
> http://pastebin.com/RKbaY7nv
>
> ceph osd tree:
>
> > ceph osd tree
> > ID  WEIGHT   TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
> > -1  24.14000 root default
> > -3         0     rack unknownrack
> > -2         0         host ceph-test
> > -4  24.14000     host ceph01
> >  0   1.50000         osd.0           down        0          1.00000
> >  2   1.50000         osd.2           down        0          1.00000
> >  3   1.50000         osd.3           down  1.00000          1.00000
> >  5   2.00000         osd.5             up  1.00000          1.00000
> >  6   2.00000         osd.6             up  1.00000          1.00000
> >  7   2.00000         osd.7             up  1.00000          1.00000
> >  8   2.00000         osd.8             up  1.00000          1.00000
> >  9   2.00000         osd.9             up  1.00000          1.00000
> > 10   2.00000         osd.10            up  1.00000          1.00000
> >  4   4.00000         osd.4             up  1.00000          1.00000
> >  1   3.64000         osd.1             up  1.00000          1.00000
>
> Note that osd.0 and osd.2 were down prior to the upgrade and the cluster
> was healthy (these are failed disks that have been out for some time, just
> not removed from CRUSH).
>
> I've also included a log with OSD debugging set to 20 here:
>
> https://dl.dropboxusercontent.com/u/1043493/osd.3.log.gz
>
> Looking through that file, it appears the last pg that it loads
> successfully is 2.3f6; then it moves on to 5.0:
>
> > -3> 2015-05-18 12:25:24.292091 7f6f407f9780 10 osd.3 39533 load_pgs loaded pg[2.3f6( v 39533'289849 (37945'286848,39533'289849] local-les=39532 n=99 ec=1 les/c 39532/39532 39531/39531/39523) [5,4,3] r=2 lpr=39533 pi=34961-39530/34 crt=39533'289846 lcod 0'0 inactive NOTIFY] log((37945'286848,39533'289849], crt=39533'289846)
> > -2> 2015-05-18 12:25:24.292100 7f6f407f9780 10 osd.3 39533 pgid 5.0 coll 5.0_head
> > -1> 2015-05-18 12:25:24.570188 7f6f407f9780 20 osd.3 0 get_map 34144 - loading and decoding 0x411fd80
> > 0> 2015-05-18 12:26:02.758914 7f6f407f9780 -1 osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f6f407f9780 time 2015-05-18 12:25:24.620468
> >
> > osd/OSD.h: 716: FAILED assert(ret)
>
> [snip]
>
> I don't see 5.0 in a pg dump, though.
>
> Thanks in advance!
> Berant
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
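P.S. In case it helps anyone hitting this later: the way I was cross-checking
for leftover PG directories from deleted pools was roughly the following (the
paths assume the default filestore layout for osd.3, so adjust as needed):

    # pools that still exist according to the cluster
    ceph osd lspools

    # PG directories actually present on the stopped OSD
    ls -d /var/lib/ceph/osd/ceph-3/current/*_head

Any <pgid>_head directory whose pool id doesn't appear in the lspools output
(pool 5, by the look of it in my case) seems like a candidate for the cleanup
discussed above.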
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com