On Oct 29, 2019, at 11:23 AM, Jean-Philippe Méthot <jp.met...@planethoster.info> wrote:
> A few months back, one of our OSD node motherboards died. At the time, we
> simply waited for recovery and purged the OSDs that were on the dead node.
> We just replaced that node and added the drives back as new OSDs. At the
> Ceph administration level, everything looks fine: no duplicate OSDs when I
> run map commands or ask Ceph to list which OSDs are on the node. However,
> on the OSD node, in /var/log/ceph/ceph-volume, I see that every time the
> server boots, ceph-volume searches for OSD fsids that don’t exist. Here’s
> the error:
> 
> [2019-10-29 13:12:02,864][ceph_volume][ERROR ] exception caught by decorator
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 59, in newfunc
>     return f(*a, **kw)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/main.py", line 148, in main
>     terminal.dispatch(self.mapper, subcommand_args)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
>     instance.main()
>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/main.py", line 40, in main
>     terminal.dispatch(self.mapper, self.argv)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
>     instance.main()
>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, in is_root
>     return func(*a, **kw)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/trigger.py", line 70, in main
>     Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/activate.py", line 339, in main
>     self.activate(args)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, in is_root
>     return func(*a, **kw)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/activate.py", line 249, in activate
>     raise RuntimeError('could not find osd.%s with fsid %s' % (osd_id, osd_fsid))
> RuntimeError: could not find osd.213 with fsid 22800a80-2445-41a3-8643-69b4b84d598a
> 
> Of course, this fsid isn’t listed anywhere in Ceph. Where does ceph-volume
> get it from? Even when looking at the code, it’s not particularly obvious.
> This is Ceph Mimic running on CentOS 7 with BlueStore.

That's not the cluster fsid, but the OSD fsid. Try running this command on
your OSD node to get more details:

ceph-volume inventory --format json-pretty
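
You can also cross-check the LVM tags: ceph-volume lvm records each OSD's id
and fsid as tags on its logical volume, so if the fsid from the error shows
up in neither listing, it is coming from somewhere outside LVM. A minimal
check, run as root on the OSD node:

ceph-volume lvm list

# each ceph-volume LV should carry tags like ceph.osd_id=... and ceph.osd_fsid=...
lvs -o lv_path,lv_tags | grep ceph.osd_fsid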

My guess is that there are some systemd unit files lying around for the old
OSDs, or that you were using 'ceph-volume simple' in the past (check for
/etc/ceph/osd/).
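
On the systemd side: boot-time activation goes through instances of the
ceph-volume@ template unit, whose instance name encodes the OSD id and fsid
(lvm-<id>-<fsid>); that is what 'ceph-volume lvm trigger' parses and hands
to Activate() in your traceback. A rough sketch of how to find and clear a
stale unit follows; the unit name below is reconstructed from your error, so
verify it against what is actually enabled on your node before disabling
anything:

# list all ceph-volume unit instances; stale ones reference purged OSDs
systemctl list-units --all 'ceph-volume@*'
ls /etc/systemd/system/multi-user.target.wants/ | grep ceph-volume

# leftover 'ceph-volume simple' metadata, if any, lives here
ls /etc/ceph/osd/

# if a stale unit matches the purged OSD, disable it so it stops firing at boot
systemctl disable ceph-volume@lvm-213-22800a80-2445-41a3-8643-69b4b84d598a.service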

Bryan
