On Oct 29, 2019, at 11:23 AM, Jean-Philippe Méthot <jp.met...@planethoster.info> wrote:
> A few months back, we had one of our OSD node motherboards die. At the time,
> we simply waited for recovery and purged the OSDs that were on the dead node.
> We just replaced that node and added back the drives as new OSDs. At the ceph
> administration level, everything looks fine, no duplicate OSDs when I execute
> map commands or ask Ceph to list what OSDs are on the node. However, on the
> OSD node, in /var/log/ceph/ceph-volume, I see that every time the server
> boots, ceph-volume tries to search for OSD fsids that don’t exist. Here’s the
> error:
>
> [2019-10-29 13:12:02,864][ceph_volume][ERROR ] exception caught by decorator
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 59, in newfunc
>     return f(*a, **kw)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/main.py", line 148, in main
>     terminal.dispatch(self.mapper, subcommand_args)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
>     instance.main()
>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/main.py", line 40, in main
>     terminal.dispatch(self.mapper, self.argv)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
>     instance.main()
>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, in is_root
>     return func(*a, **kw)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/trigger.py", line 70, in main
>     Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/activate.py", line 339, in main
>     self.activate(args)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, in is_root
>     return func(*a, **kw)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/activate.py", line 249, in activate
>     raise RuntimeError('could not find osd.%s with fsid %s' % (osd_id, osd_fsid))
> RuntimeError: could not find osd.213 with fsid 22800a80-2445-41a3-8643-69b4b84d598a
>
> Of course this fsid ID isn’t listed anywhere in Ceph. Where does ceph-volume
> get this fsid from? Even when looking at the code, it’s not particularly
> obvious. This is ceph mimic running on CentOS 7 and bluestore.

That's not the cluster fsid, but the OSD fsid. Try running this command on
your OSD node to get more details:

    ceph-volume inventory --format json-pretty

My guess is that there are some systemd files lying around for the old OSDs,
or that you were using 'ceph-volume simple' in the past (check for
/etc/ceph/osd/).

Bryan
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
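
One way to check the "systemd files" guess is to list the ceph-volume units that are still enabled on the node and compare their OSD ids and fsids with what the cluster knows about. The sketch below is only an illustration under a few assumptions: that the units are enabled as symlinks named ceph-volume@lvm-<osd id>-<osd fsid>.service (or ceph-volume@simple-...), and that they live in /etc/systemd/system/multi-user.target.wants. The function names parse_unit_name and find_osd_units are made up for this example and are not part of ceph-volume.

```python
import os
import re

# Assumed unit-name layout: ceph-volume@<lvm|simple>-<osd id>-<osd fsid>.service
UNIT_RE = re.compile(
    r"^ceph-volume@(?P<kind>lvm|simple)-(?P<osd_id>\d+)-"
    r"(?P<osd_fsid>[0-9a-fA-F-]{36})\.service$"
)


def parse_unit_name(name):
    """Return (kind, osd_id, osd_fsid) for a ceph-volume unit name, else None."""
    match = UNIT_RE.match(name)
    if match is None:
        return None
    return match.group("kind"), match.group("osd_id"), match.group("osd_fsid")


def find_osd_units(wants_dir="/etc/systemd/system/multi-user.target.wants"):
    """List the ceph-volume units found in wants_dir (an assumed default path)."""
    found = []
    if not os.path.isdir(wants_dir):
        return found
    for name in sorted(os.listdir(wants_dir)):
        parsed = parse_unit_name(name)
        if parsed is not None:
            found.append(parsed)
    return found


if __name__ == "__main__":
    for kind, osd_id, osd_fsid in find_osd_units():
        print("%s osd.%s fsid %s" % (kind, osd_id, osd_fsid))
```

Any id/fsid pair printed here that does not correspond to a current OSD on the node would be a candidate leftover from the purged OSDs, such as the osd.213 entry in the error above.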