Ubuntu 20.04.3, Octopus 15.2.13, cephadm + podman

After a routine reboot, none of the OSDs on one host came back up. After a few
iterations of cephadm deploy and fixing a missing config file, the daemons
remain in the error state, but neither journalctl nor systemctl shows anything
beyond an exit-status error. I notice that the /var/lib/ceph/* directories on
that host no longer have consistent owner:group settings, but ownership is not
set consistently across the rest of the cluster either, and all the other
hosts are working.

Where can I find more detailed logs, or do I need to adjust a log level first?
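My guess is that the relevant knob is the cephadm/mgr log level rather than
the OSDs themselves, something along these lines, but I am not sure that is
the right place to look:

ceph config set mgr mgr/cephadm/log_to_cluster_level debug
# then watch / re-read the cluster log
ceph -W cephadm --watch-debug
ceph log last cephadm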
thanks.

root@rnk-00:~# ceph health detail
HEALTH_WARN 9 failed cephadm daemon(s)
[WRN] CEPHADM_FAILED_DAEMON: 9 failed cephadm daemon(s)
    daemon osd.62 on rnk-06 is in error state
    daemon osd.54 on rnk-06 is in error state
    daemon osd.60 on rnk-06 is in error state
    daemon osd.57 on rnk-06 is in error state
    daemon osd.56 on rnk-06 is in error state
    daemon osd.61 on rnk-06 is in error state
    daemon osd.58 on rnk-06 is in error state
    daemon osd.59 on rnk-06 is in error state
    daemon osd.55 on rnk-06 is in error state
root@rnk-00:~#
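
(If the ownership really is the problem: as far as I understand, the
containers run ceph as uid/gid 167, so I assume the repair would be something
like the line below, with <fsid> as a placeholder, but I did not want to start
chowning things blindly before asking.)

chown -R 167:167 /var/lib/ceph/<fsid>/osd.62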