Is there anything in the OSD logs? I might be misremembering, but when I fiddle with my test clusters, the podman/systemd control sometimes breaks; after a reboot it's usually fine. Does the OSD stop if you simply run 'systemctl stop ceph-{CEPH_FSID}@osd.14' directly on the node? I would also inspect the 'podman ps' (or 'docker ps') output on that node; maybe the daemon is in some error state even though it shows as running.
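Roughly, on the OSD node (just a sketch; the FSID and unit name depend on your deployment, and cephadm typically names the container something like ceph-<fsid>-osd-14):

   ceph fsid                                   # look up the cluster FSID
   systemctl stop ceph-<fsid>@osd.14.service   # stop the unit directly via systemd
   podman ps -a | grep osd                     # check the container state on the node
   journalctl -u ceph-<fsid>@osd.14.service -n 100   # recent daemon logs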

Quoting Alan Murrell <a...@t-net.ca>:

> You can always fail the mgr (ceph mgr fail) and retry.

I failed the current active mgr (cephnode01) and the standby took over, but 'ceph orch daemon stop osd.14' still did not work.
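(For completeness, the sequence was roughly this; in recent releases 'ceph mgr fail' with no argument fails the currently active mgr, and 'ceph -s' is just to confirm the failover:)

   ceph mgr fail                  # active mgr fails over to the standby
   ceph -s                        # confirm which mgr is active now
   ceph orch daemon stop osd.14   # retry the stop
   ceph orch ps | grep osd.14     # check the state the orchestrator reports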

Something *seems* to be wonky with this OSD, but everything appears healthy. I even ran a SMART test against the HDD itself, and it reported healthy.
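(Roughly what I mean by a SMART test; /dev/sdX is a placeholder for the OSD's backing device:)

   smartctl -a /dev/sdX           # full SMART attributes and health summary
   smartctl -t short /dev/sdX     # start a short self-test
   smartctl -l selftest /dev/sdX  # read the self-test results afterwards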

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

