I did some digging around and yes, it is exactly as you said: leftover systemd unit files were still trying to bring up the previous OSDs at boot. We removed them and now everything works properly. Thank you for the help.
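For anyone who runs into the same thing, the cleanup is roughly the following (a sketch only; the exact unit names depend on the old OSD id and fsid, here taken from the osd.213 entry in the traceback quoted below):

    # list any leftover ceph-volume activation units on the node
    systemctl list-units --all 'ceph-volume@*'

    # disable the stale instance(s); this removes the enablement symlink
    systemctl disable ceph-volume@lvm-213-22800a80-2445-41a3-8643-69b4b84d598a.service
    systemctl daemon-reload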
Jean-Philippe Méthot
Openstack system administrator / Administrateur système Openstack
PlanetHoster inc.

> On Oct 29, 2019, at 15:34, Bryan Stillwell <bstillw...@godaddy.com> wrote:
>
> On Oct 29, 2019, at 11:23 AM, Jean-Philippe Méthot <jp.met...@planethoster.info> wrote:
>>
>> A few months back, we had one of our OSD node motherboards die. At the time, we simply waited for recovery and purged the OSDs that were on the dead node. We just replaced that node and added the drives back as new OSDs. At the Ceph administration level, everything looks fine: there are no duplicate OSDs when I run map commands or ask Ceph to list which OSDs are on the node. However, on the OSD node, in /var/log/ceph/ceph-volume, I see that every time the server boots, ceph-volume tries to search for OSD fsids that don't exist. Here's the error:
>>
>> [2019-10-29 13:12:02,864][ceph_volume][ERROR ] exception caught by decorator
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 59, in newfunc
>>     return f(*a, **kw)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/main.py", line 148, in main
>>     terminal.dispatch(self.mapper, subcommand_args)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
>>     instance.main()
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/main.py", line 40, in main
>>     terminal.dispatch(self.mapper, self.argv)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
>>     instance.main()
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, in is_root
>>     return func(*a, **kw)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/trigger.py", line 70, in main
>>     Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/activate.py", line 339, in main
>>     self.activate(args)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, in is_root
>>     return func(*a, **kw)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/activate.py", line 249, in activate
>>     raise RuntimeError('could not find osd.%s with fsid %s' % (osd_id, osd_fsid))
>> RuntimeError: could not find osd.213 with fsid 22800a80-2445-41a3-8643-69b4b84d598a
>>
>> Of course this fsid isn't listed anywhere in Ceph. Where does ceph-volume get this fsid from? Even when looking at the code, it's not particularly obvious. This is Ceph Mimic running on CentOS 7 with BlueStore.
>
> That's not the cluster fsid, but the OSD fsid. Try running this command on your OSD node to get more details:
>
> ceph-volume inventory --format json-pretty
>
> My guess is there are some systemd files laying around for the old OSDs, or you were using 'ceph-volume simple' in the past (check for /etc/ceph/osd/).
>
> Bryan
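P.S. For the archives, the checks Bryan suggested amount to something like this on the OSD node (paths are the usual defaults; adjust if your setup differs):

    # symlinks that make systemd re-activate OSDs at boot
    ls -l /etc/systemd/system/multi-user.target.wants/ | grep ceph-volume

    # JSON metadata left behind if 'ceph-volume simple scan' was ever used
    ls /etc/ceph/osd/ 2>/dev/null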