You are running into https://tracker.ceph.com/issues/24423; I've fixed it here: https://github.com/ceph/ceph/pull/22585
The fix has already been backported and will be in 13.2.1.
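Until 13.2.1 is available, one possible workaround (my guess at the trigger, not something I have verified on 13.2.0, and it assumes the replacement disk holds nothing you want to keep) is to redo the deployment without --osd-id, since the id-reuse path seems to be what sets the crash off; the lowest free id, 19 in your case, will normally be handed out again anyway. Roughly:

# ceph osd safe-to-destroy osd.19            (sanity check; should pass, osd.19 is already drained)
# systemctl stop ceph-osd@19
# ceph osd purge 19 --yes-i-really-mean-it
# umount /var/lib/ceph/osd/ceph-19           (clears the leftover tmpfs mount you noticed)
# ceph-volume lvm zap --destroy /dev/sdh     (also removes the LV/VG the failed create left behind)
# ceph-volume lvm create --data /dev/sdh     (no --osd-id this time)

If you would rather keep forcing --osd-id 19, waiting for 13.2.1 and redoing the create after the upgrade should also work.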
Paul

2018-06-27 8:40 GMT+02:00 Steffen Winther Sørensen <ste...@gmail.com>:
> List,
>
> Had a failed disk behind an OSD in a Mimic Cluster 13.2.0, so I tried
> following the doc on removal of an OSD.
>
> I did:
>
> # ceph osd crush reweight osd.19 0
> waited for rebalancing to finish and cont.:
> # ceph osd out 19
> # systemctl stop ceph-osd@19
> # ceph osd purge 19 --yes-i-really-mean-it
>
> verified that osd.19 was out of the map w/ ceph osd tree
>
> Still found this tmpfs mounted though, to my surprise:
> tmpfs      7.8G   48K  7.8G   1% /var/lib/ceph/osd/ceph-19
>
> Replaced the failed drive and then attempted:
>
> # ceph-volume lvm zap /dev/sdh
> # ceph-volume lvm create --osd-id 19 --data /dev/sdh
> Running command: /bin/ceph-authtool --gen-print-key
> Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 5352d594-aa19-4147-a884-ca2c5775aa1b
> Running command: /usr/sbin/vgcreate --force --yes ceph-a2ebf47b-fa4a-43ce-b087-12dbafb5796e /dev/sdh
>  stderr: WARNING: Device for PV CdiFOZ-n89Z-G5EF-JBBV-GFfU-bDRV-VJQHho not found or rejected by a filter.
>  stderr: WARNING: Device for PV CdiFOZ-n89Z-G5EF-JBBV-GFfU-bDRV-VJQHho not found or rejected by a filter.
>  stderr: /dev/ceph-a6541e3f-0a7f-4268-823c-668c515b5edc/osd-block-efae9323-b934-408e-a4f9-1e1f62d88f2d: read failed after 0 of 4096 at 0: Input/output error
>  /dev/ceph-a6541e3f-0a7f-4268-823c-668c515b5edc/osd-block-efae9323-b934-408e-a4f9-1e1f62d88f2d: read failed after 0 of 4096 at 146775408640: Input/output error
>  /dev/ceph-a6541e3f-0a7f-4268-823c-668c515b5edc/osd-block-efae9323-b934-408e-a4f9-1e1f62d88f2d: read failed after 0 of 4096 at 146775465984: Input/output error
>  stderr: /dev/ceph-a6541e3f-0a7f-4268-823c-668c515b5edc/osd-block-efae9323-b934-408e-a4f9-1e1f62d88f2d: read failed after 0 of 4096 at 4096: Input/output error
>  stderr: WARNING: Device for PV CdiFOZ-n89Z-G5EF-JBBV-GFfU-bDRV-VJQHho not found or rejected by a filter.
>  stdout: Physical volume "/dev/sdh" successfully created.
>  stdout: Volume group "ceph-a2ebf47b-fa4a-43ce-b087-12dbafb5796e" successfully created
> Running command: /usr/sbin/lvcreate --yes -l 100%FREE -n osd-block-5352d594-aa19-4147-a884-ca2c5775aa1b ceph-a2ebf47b-fa4a-43ce-b087-12dbafb5796e
>  stdout: Logical volume "osd-block-5352d594-aa19-4147-a884-ca2c5775aa1b" created.
> Running command: /bin/ceph-authtool --gen-print-key
> Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-19
> Running command: /bin/chown -R ceph:ceph /dev/dm-9
> Running command: /bin/ln -s /dev/ceph-a2ebf47b-fa4a-43ce-b087-12dbafb5796e/osd-block-5352d594-aa19-4147-a884-ca2c5775aa1b /var/lib/ceph/osd/ceph-19/block
> Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-19/activate.monmap
>  stderr: got monmap epoch 1
> Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-19/keyring --create-keyring --name osd.19 --add-key AQBY1TBbN8I+HxAAMHGWKLgJugmtzdqllQh5sA==
>  stdout: creating /var/lib/ceph/osd/ceph-19/keyring
>  stdout: added entity osd.19 auth auth(auid = 18446744073709551615 key=AQBY1TBbN8I+HxAAMHGWKLgJugmtzdqllQh5sA== with 0 caps)
> Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-19/keyring
> Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-19/
> Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 19 --monmap /var/lib/ceph/osd/ceph-19/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-19/ --osd-uuid 5352d594-aa19-4147-a884-ca2c5775aa1b --setuser ceph --setgroup ceph
> --> ceph-volume lvm prepare successful for: /dev/sdh
> Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-a2ebf47b-fa4a-43ce-b087-12dbafb5796e/osd-block-5352d594-aa19-4147-a884-ca2c5775aa1b --path /var/lib/ceph/osd/ceph-19
> Running command: /bin/ln -snf /dev/ceph-a2ebf47b-fa4a-43ce-b087-12dbafb5796e/osd-block-5352d594-aa19-4147-a884-ca2c5775aa1b /var/lib/ceph/osd/ceph-19/block
> Running command: /bin/chown -R ceph:ceph /dev/dm-9
> Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-19
> Running command: /bin/systemctl enable ceph-volume@lvm-19-5352d594-aa19-4147-a884-ca2c5775aa1b
>  stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-19-5352d594-aa19-4147-a884-ca2c5775aa1b.service to /usr/lib/systemd/system/ceph-volume@.service.
> Running command: /bin/systemctl start ceph-osd@19
> --> ceph-volume lvm activate successful for osd ID: 19
> --> ceph-volume lvm create successful for: /dev/sdh
>
> verified that osd.19 was in the map with:
> # ceph osd tree
> ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
> -1       3.20398 root default
> -9       0.80099     host n1
> 18   hdd 0.13350         osd.18     up  1.00000 1.00000
> 19   hdd 0.13350         osd.19   down        0 1.00000
> 20   hdd 0.13350         osd.20     up  1.00000 1.00000
> 21   hdd 0.13350         osd.21     up  1.00000 1.00000
> 22   hdd 0.13350         osd.22     up  1.00000 1.00000
> 23   hdd 0.13350         osd.23     up  1.00000 1.00000
>
> Only it fails to launch:
> # systemctl start ceph-osd@19
> # systemctl status ceph-osd@19
> ● ceph-osd@19.service - Ceph object storage daemon osd.19
>    Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: disabled)
>    Active: activating (auto-restart) (Result: signal) since Mon 2018-06-25 13:44:35 CEST; 3s ago
>   Process: 2046453 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=killed, signal=ABRT)
>   Process: 2046447 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
>  Main PID: 2046453 (code=killed, signal=ABRT)
>
> Jun 25 13:44:35 n1.sprawl.dk ceph-osd[2046453]: 8: (OSD::handle_osd_map(MOSDMap*)+0x1020) [0x56353eac71f0]
> Jun 25 13:44:35 n1.sprawl.dk ceph-osd[2046453]: 9: (OSD::_dispatch(Message*)+0xa1) [0x56353eac9d21]
> Jun 25 13:44:35 n1.sprawl.dk ceph-osd[2046453]: 10: (OSD::ms_dispatch(Message*)+0x56) [0x56353eaca066]
> Jun 25 13:44:35 n1.sprawl.dk ceph-osd[2046453]: 11: (DispatchQueue::entry()+0xb5a) [0x7f302acce74a]
> Jun 25 13:44:35 n1.sprawl.dk ceph-osd[2046453]: 12: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f302ad6ef2d]
> Jun 25 13:44:35 n1.sprawl.dk ceph-osd[2046453]: 13: (()+0x7e25) [0x7f30277b0e25]
> Jun 25 13:44:35 n1.sprawl.dk ceph-osd[2046453]: 14: (clone()+0x6d) [0x7f30268a1bad]
> Jun 25 13:44:35 n1.sprawl.dk ceph-osd[2046453]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> Jun 25 13:44:35 n1.sprawl.dk systemd[1]: Unit ceph-osd@19.service entered failed state.
> Jun 25 13:44:35 n1.sprawl.dk systemd[1]: ceph-osd@19.service failed.
>
> The osd.19 log shows:
>
> --- begin dump of recent events ---
>      0> 2018-06-25 13:48:47.139 7fc6b91c5700 -1 *** Caught signal (Aborted) **
>  in thread 7fc6b91c5700 thread_name:ms_dispatch
>
>  ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable)
>  1: (()+0x8e1870) [0x55da2ff6e870]
>  2: (()+0xf6d0) [0x7fc6c97ba6d0]
>  3: (gsignal()+0x37) [0x7fc6c87db277]
>  4: (abort()+0x148) [0x7fc6c87dc968]
>  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x25d) [0x7fc6ccc5a69d]
>  6: (()+0x286727) [0x7fc6ccc5a727]
>  7: (OSDService::get_map(unsigned int)+0x4a) [0x55da2faa3dda]
>  8: (OSD::handle_osd_map(MOSDMap*)+0x1020) [0x55da2fa511f0]
>  9: (OSD::_dispatch(Message*)+0xa1) [0x55da2fa53d21]
>  10: (OSD::ms_dispatch(Message*)+0x56) [0x55da2fa54066]
>  11: (DispatchQueue::entry()+0xb5a) [0x7fc6cccd074a]
>  12: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fc6ccd70f2d]
>  13: (()+0x7e25) [0x7fc6c97b2e25]
>  14: (clone()+0x6d) [0x7fc6c88a3bad]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> Any hints would be appreciated, TIA!
>
> /Steffen
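One more thing, unrelated to the crash: the "Device for PV ... not found or rejected by a filter" warnings and the I/O errors on the old osd-block LV in your ceph-volume output look like LVM still holding on to the volume group of the dead drive. Once the failed disk is physically out you can clean up that stale metadata, roughly along these lines (an untested sketch on my side; the VG name is simply copied from your output, so confirm with pvs/vgs/lvs before removing anything):

# pvs ; vgs ; lvs
# vgreduce --removemissing --force ceph-a6541e3f-0a7f-4268-823c-668c515b5edc
# vgremove --force ceph-a6541e3f-0a7f-4268-823c-668c515b5edc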
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com