I did some digging around and yes, it is exactly as you said: systemd unit files 
for the previous OSDs were still there and kept trying to start them at boot. We 
removed them and now everything works properly. Thank you for the help.
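
In case anyone hits this later: the fix was essentially just disabling the stale 
units and checking that nothing else was left enabled, roughly like this (the unit 
name is rebuilt from the osd.213 error quoted below, so the OSD id and fsid will 
differ on another cluster):

# unit name reconstructed from the error below; substitute your own id/fsid
systemctl disable ceph-volume@lvm-213-22800a80-2445-41a3-8643-69b4b84d598a.service
# confirm no stale ceph-volume units remain enabled
ls /etc/systemd/system/multi-user.target.wants/ | grep ceph-volume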


Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.




> On Oct 29, 2019, at 3:34 PM, Bryan Stillwell <bstillw...@godaddy.com> wrote:
> 
> On Oct 29, 2019, at 11:23 AM, Jean-Philippe Méthot 
> <jp.met...@planethoster.info> wrote:
>> A few months back, we had one of our OSD node motherboards die. At the time, 
>> we simply waited for recovery and purged the OSDs that were on the dead 
>> node. We just replaced that node and added back the drives as new OSDs. At 
>> the ceph administration level, everything looks fine, no duplicate OSDs when 
>> I execute map commands or ask Ceph to list what OSDs are on the node. 
>> However, on the OSD node, in /var/log/ceph/ceph-volume, I see that every 
>> time the server boots, ceph-volume tries to search for OSD fsids that don’t 
>> exist. Here’s the error:
>> 
>> [2019-10-29 13:12:02,864][ceph_volume][ERROR ] exception caught by decorator
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 59, in newfunc
>>     return f(*a, **kw)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/main.py", line 148, in main
>>     terminal.dispatch(self.mapper, subcommand_args)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
>>     instance.main()
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/main.py", line 40, in main
>>     terminal.dispatch(self.mapper, self.argv)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
>>     instance.main()
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, in is_root
>>     return func(*a, **kw)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/trigger.py", line 70, in main
>>     Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/activate.py", line 339, in main
>>     self.activate(args)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, in is_root
>>     return func(*a, **kw)
>>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/lvm/activate.py", line 249, in activate
>>     raise RuntimeError('could not find osd.%s with fsid %s' % (osd_id, osd_fsid))
>> RuntimeError: could not find osd.213 with fsid 22800a80-2445-41a3-8643-69b4b84d598a
>> 
>> Of course this fsid isn’t listed anywhere in Ceph. Where does ceph-volume 
>> get it from? Even when looking at the code, it’s not particularly obvious. 
>> This is Ceph Mimic running on CentOS 7 with BlueStore.
> 
> That's not the cluster fsid, but the osd fsid.  Try running this command on 
> your OSD node to get more details:
> 
> ceph-volume inventory --format json-pretty
> 
> My guess is there are some systemd files lying around for the old OSDs, or 
> you were using 'ceph-volume simple' in the past (check for /etc/ceph/osd/).
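> 
> For example, something like this should show any leftovers (standard 
> ceph-volume locations assumed; adjust the paths if your layout differs):
> 
> ls /etc/systemd/system/multi-user.target.wants/ | grep ceph-volume
> ls /etc/ceph/osd/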
> 
> Bryan
> 
