Hi,

On 6/30/25 at 16:50, Robert Sander wrote:

By "marking the MDS as repaired" you mean the command "ceph mds repaired 
storage_cluster:0", right?
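For reference, a rough sketch of what I understand that to involve (assuming the standard Ceph CLI, with our filesystem name `storage_cluster`):

```
# Clear the "damaged" flag on rank 0 so the monitors will try
# to assign a standby MDS to that rank again:
ceph mds repaired storage_cluster:0

# Then watch whether the rank comes back up or gets marked
# damaged again:
ceph fs status storage_cluster
ceph health detail
```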

It looks like we hit this bug: https://tracker.ceph.com/issues/65094

From the MDS log: "No subtrees found for root MDS rank!"

Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.0.1753398 handle_mds_map i am now 
mds.0.1753398
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.0.1753398 handle_mds_map state change 
up:reconnect --> up:rejoin
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.0.1753398 rejoin_start
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.0.1753398 rejoin_joint_start
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.0.1753398 rejoin_done
Jul 01 08:50:56 sn04 ceph-mds[1563881]: log_channel(cluster) log [ERR] : No 
subtrees found for root MDS rank!
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.beacon.storage_cluster.sn04.cbvzzu 
set_want_state: up:rejoin -> down:damaged
Jul 01 08:50:56 sn04 ceph-mds[1563881]: log_client  log_queue is 1 last_log 1 
sent 0 num 1 unsent 1 sending 1
Jul 01 08:50:56 sn04 ceph-mds[1563881]: log_client  will send 
2025-07-01T06:50:56.151556+0000 mds.storage_cluster.sn04.cbvzzu (mds.0) 1 : 
cluster [ERR] No subtrees found for root MDS rank!
Jul 01 08:50:56 sn04 ceph-mds[1563881]: monclient: _send_mon_message to 
mon.sn03 at v2:192.168.91.53:3300/0
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.beacon.storage_cluster.sn04.cbvzzu 
Sending beacon down:damaged seq 11440
Jul 01 08:50:56 sn04 ceph-mds[1563881]: monclient: _send_mon_message to 
mon.sn03 at v2:192.168.91.53:3300/0
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.beacon.storage_cluster.sn04.cbvzzu 
received beacon reply up:rejoin seq 11439 rtt 1.011
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.beacon.storage_cluster.sn04.cbvzzu 
received beacon reply down:damaged seq 11440 rtt 0.0880002
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu respawn!
Jul 01 08:50:56 sn04 
ceph-28ca2bfa-d87e-11ed-83a3-1070fddda30f-mds-storage_cluster-sn04-cbvzzu[1563770]:
 -9999> 2025-07-01T06:50:56.150+0000 7f5daca13640 -1 log_channel(cluster) log 
[ERR] : No subtrees found for root MDS rank!
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  e: 
'/usr/bin/ceph-mds'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  0: 
'/usr/bin/ceph-mds'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  1: '-n'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  2: 
'mds.storage_cluster.sn04.cbvzzu'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  3: '-f'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  4: 
'--setuser'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  5: 
'ceph'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  6: 
'--setgroup'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  7: 
'ceph'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  8: 
'--default-log-to-file=false'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  9: 
'--default-log-to-journald=true'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  10: 
'--default-log-to-stderr=false'
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 
respawning with exe /usr/bin/ceph-mds
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu  
exe_path /proc/self/exe
Jul 01 08:50:56 sn04 
ceph-28ca2bfa-d87e-11ed-83a3-1070fddda30f-mds-storage_cluster-sn04-cbvzzu[1563770]:
 ignoring --setuser ceph since I am not root
Jul 01 08:50:56 sn04 
ceph-28ca2bfa-d87e-11ed-83a3-1070fddda30f-mds-storage_cluster-sn04-cbvzzu[1563770]:
 ignoring --setgroup ceph since I am not root
Jul 01 08:50:56 sn04 ceph-mds[1563881]: ceph version 18.2.4 
(e7ad5345525c7aa95470c26863873b581076945d) reef (stable), process ceph-mds, pid 
2
Jul 01 08:50:56 sn04 ceph-mds[1563881]: main not setting numa affinity
Jul 01 08:50:56 sn04 ceph-mds[1563881]: pidfile_write: ignore empty --pid-file
Jul 01 08:50:56 sn04 
ceph-28ca2bfa-d87e-11ed-83a3-1070fddda30f-mds-storage_cluster-sn04-cbvzzu[1563770]:
 starting mds.storage_cluster.sn04.cbvzzu at
Jul 01 08:50:56 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 
Updating MDS map to version 1753401 from mon.2
Jul 01 08:50:57 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 
Updating MDS map to version 1753402 from mon.2
Jul 01 08:50:57 sn04 ceph-mds[1563881]: mds.storage_cluster.sn04.cbvzzu 
Monitors have assigned me to become a standby.

And that's it. The MDS journal itself seems to be intact:

# /usr/bin/cephfs-journal-tool --rank=storage_cluster:all journal inspect
Overall journal integrity: OK

How do we get this filesystem online again?

Regards
--
Robert Sander
Linux Consultant

Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: +49 30 405051 - 0
Fax: +49 30 405051 - 19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io