Hi,

On 6/30/25 at 16:50, Robert Sander wrote:
> By marking the MDS as repaired, you mean the command "ceph mds repaired storage_cluster:0", right?
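For the archive: "ceph mds repaired" clears the damaged flag on a rank in the FSMap so that a standby MDS can try to take the rank over again. A typical sequence (filesystem name storage_cluster and rank 0 taken from this thread) would look roughly like:

```shell
# Show the filesystem state and which rank is affected
# (fs name from this thread)
ceph fs status storage_cluster

# Clear the "damaged" flag on rank 0; a standby MDS should then
# attempt to take over the rank and replay the journal
ceph mds repaired storage_cluster:0
```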
It looks like we hit this bug: https://tracker.ceph.com/issues/65094 ("No subtrees found for root MDS rank!")

From the MDS log:

Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.0.1753398 handle_mds_map i am now mds.0.1753398
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.0.1753398 handle_mds_map state change up:reconnect --> up:rejoin
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.0.1753398 rejoin_start
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.0.1753398 rejoin_joint_start
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.0.1753398 rejoin_done
Jul 01 08:50:56 sn04 ceph-mds[1563881]: log_channel(cluster) log [ERR] : No subtrees found for root MDS rank!
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.beacon.storage_cluster.sn04.cbvzzu set_want_state: up:rejoin -> down:damaged
Jul 01 08:50:56 sn04 ceph-mds[1563881]: log_client log_queue is 1 last_log 1 sent 0 num 1 unsent 1 sending 1
Jul 01 08:50:56 sn04 ceph-mds[1563881]: log_client will send 2025-07-01T06:50:56.151556+0000 mds.storage_cluster.sn04.cbvzzu (mds.0) 1 : cluster [ERR] No subtrees found for root MDS rank!
Jul 01 08:50:56 sn04 ceph-mds[1563881]: monclient: _send_mon_message to mon.sn03 at v2:192.168.91.53:3300/0
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.beacon.storage_cluster.sn04.cbvzzu Sending beacon down:damaged seq 11440
Jul 01 08:50:56 sn04 ceph-mds[1563881]: monclient: _send_mon_message to mon.sn03 at v2:192.168.91.53:3300/0
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.beacon.storage_cluster.sn04.cbvzzu received beacon reply up:rejoin seq 11439 rtt 1.011
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.beacon.storage_cluster.sn04.cbvzzu received beacon reply down:damaged seq 11440 rtt 0.0880002
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu respawn!
Jul 01 08:50:56 sn04 ceph-28ca2bfa-d87e-11ed-83a3-1070fddda30f-mds-storage_cluster-sn04-cbvzzu[1563770]: -9999> 2025-07-01T06:50:56.150+0000 7f5daca13640 -1 log_channel(cluster) log [ERR] : No subtrees found for root MDS rank!
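At this point the rank is flagged as damaged in the FSMap, which is why no standby picks it up. For anyone wanting to confirm the same condition on their own cluster, the flag is visible via the standard ceph CLI (nothing cluster-specific assumed here):

```shell
# The damaged rank shows up in the health output ...
ceph health detail

# ... and in the FSMap itself, in the "damaged" field
ceph fs dump | grep -i damaged
```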
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu e: '/usr/bin/ceph-mds'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 0: '/usr/bin/ceph-mds'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 1: '-n'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 2: 'mds.storage_cluster.sn04.cbvzzu'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 3: '-f'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 4: '--setuser'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 5: 'ceph'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 6: '--setgroup'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 7: 'ceph'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 8: '--default-log-to-file=false'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 9: '--default-log-to-journald=true'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 10: '--default-log-to-stderr=false'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu respawning with exe /usr/bin/ceph-mds
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu exe_path /proc/self/exe
Jul 01 08:50:56 sn04 ceph-28ca2bfa-d87e-11ed-83a3-1070fddda30f-mds-storage_cluster-sn04-cbvzzu[1563770]: ignoring --setuser ceph since I am not root
Jul 01 08:50:56 sn04 ceph-28ca2bfa-d87e-11ed-83a3-1070fddda30f-mds-storage_cluster-sn04-cbvzzu[1563770]: ignoring --setgroup ceph since I am not root
Jul 01 08:50:56 sn04 ceph-mds[1563881]: ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable), process ceph-mds, pid 2
Jul 01 08:50:56 sn04 ceph-mds[1563881]: main not setting numa affinity
Jul 01 08:50:56 sn04 ceph-mds[1563881]: pidfile_write: ignore empty --pid-file
Jul 01 08:50:56 sn04 ceph-28ca2bfa-d87e-11ed-83a3-1070fddda30f-mds-storage_cluster-sn04-cbvzzu[1563770]: starting mds.storage_cluster.sn04.cbvzzu at
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu Updating MDS map to version 1753401 from mon.2
Jul 01 08:50:57 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu Updating MDS map to version 1753402 from mon.2
Jul 01 08:50:57 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu Monitors have assigned me to become a standby.

And that's it.

The MDS journal integrity seems to be OK:

# /usr/bin/cephfs-journal-tool --rank=storage_cluster:all journal inspect
Overall journal integrity: OK

How do we get this filesystem online again?

Regards
--
Robert Sander
Linux Consultant

Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin
https://www.heinlein-support.de

Tel: +49 30 405051 - 0
Fax: +49 30 405051 - 19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin