Hi Christophe, 

You could, but it won't help since the journal is empty. What you can do to fix 
the fs metadata is to run the commands below from the disaster-recovery-experts 
documentation [1], in this particular order: 

# Prevent client access to the fs and take it down. 
ceph fs set cfs_irods_test refuse_client_session true 
ceph fs set cfs_irods_test joinable false 
ceph fs set cfs_irods_test down true 
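
# Optional sanity check, not part of the documented sequence: at this point 
# the fs should no longer report an active MDS 
ceph fs status cfs_irods_test 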

# Reset the MDS tables (session, snap, inode) and the journal 
cephfs-table-tool cfs_irods_test:0 reset session 
cephfs-table-tool cfs_irods_test:0 reset snap 
cephfs-table-tool cfs_irods_test:0 reset inode 

cephfs-journal-tool --rank cfs_irods_test:0 journal reset --force 
cephfs-data-scan init --force-init --filesystem cfs_irods_test 
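
# Optional check, not in the documented sequence: the freshly reset journal 
# should now pass an integrity check 
cephfs-journal-tool --rank cfs_irods_test:0 journal inspect 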

# Rescan data and fix metadata (the commented commands below are left as a 
# reference on how to parallelize these scan tasks across several workers) 
#for i in {0..15} ; do cephfs-data-scan scan_frags --filesystem cfs_irods_test --force-corrupt --worker_n $i --worker_m 16 & done 
#for i in {0..15} ; do cephfs-data-scan scan_extents --filesystem cfs_irods_test --worker_n $i --worker_m 16 & done 
#for i in {0..15} ; do cephfs-data-scan scan_inodes --filesystem cfs_irods_test --force-corrupt --worker_n $i --worker_m 16 & done 
#for i in {0..15} ; do cephfs-data-scan scan_links --filesystem cfs_irods_test --worker_n $i --worker_m 16 & done 
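# If you ever use those parallel versions, make sure each scan phase has fully 
# completed before starting the next one, e.g. by waiting for the background 
# workers: 
#for i in {0..15} ; do cephfs-data-scan scan_extents --filesystem cfs_irods_test --worker_n $i --worker_m 16 & done ; wait 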

cephfs-data-scan scan_frags --filesystem cfs_irods_test --force-corrupt 
cephfs-data-scan scan_extents --filesystem cfs_irods_test 
cephfs-data-scan scan_inodes --filesystem cfs_irods_test --force-corrupt 
cephfs-data-scan scan_links --filesystem cfs_irods_test 
cephfs-data-scan cleanup --filesystem cfs_irods_test 

# ceph mds repaired 0   <---- should not be necessary 

# Set the fs back online and accessible 
ceph fs set cfs_irods_test down false 
ceph fs set cfs_irods_test joinable true 
ceph fs set cfs_irods_test refuse_client_session false 

An MDS should now start; if not, use 'ceph orch daemon restart mds.xxxxx' to 
start one. After remounting the fs, you should be able to access /testdir1 and 
/testdir2 in the fs root. 
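
If you want to double-check that rank 0 is back up:active: 

ceph mds stat 
ceph fs status cfs_irods_test 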

# Scrub the fs again to check that everything is OK. 
ceph tell mds.cfs_irods_test:0 scrub start / recursive,repair,force 
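
# Then check that the scrub completed and that the damage entries are gone 
# (the cluster should go back to HEALTH_OK) 
ceph tell mds.cfs_irods_test:0 scrub status 
ceph tell mds.cfs_irods_test:0 damage ls 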

Regards, 
Frédéric. 

[1] https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/ 

----- On 22 Apr 25, at 10:21, Christophe DIARRA <christophe.dia...@idris.fr> 
wrote: 

> Hello Frédéric,

> Thank you for your help.

> Following is the output you asked for:

> [root@fidrcmon-01 ~]# date
> Tue Apr 22 10:09:10 AM CEST 2025
> [root@fidrcmon-01 ~]# ceph tell mds.cfs_irods_test:0 scrub start /
> recursive,repair,force
> 2025-04-22T10:09:12.796+0200 7f43f6ffd640 0 client.86553 ms_handle_reset on
> v2:130.84.80.10:6800/3218663047
> 2025-04-22T10:09:12.818+0200 7f43f6ffd640 0 client.86559 ms_handle_reset on
> v2:130.84.80.10:6800/3218663047
> {
> "return_code": 0,
> "scrub_tag": "12e537bb-bb39-4f3b-ae09-e0a1ae6ce906",
> "mode": "asynchronous"
> }
> [root@fidrcmon-01 ~]# ceph tell mds.cfs_irods_test:0 scrub status
> 2025-04-22T10:09:31.760+0200 7f3f0f7fe640 0 client.86571 ms_handle_reset on
> v2:130.84.80.10:6800/3218663047
> 2025-04-22T10:09:31.781+0200 7f3f0f7fe640 0 client.86577 ms_handle_reset on
> v2:130.84.80.10:6800/3218663047
> {
> "status": "no active scrubs running",
> "scrubs": {}
> }
> [root@fidrcmon-01 ~]# cephfs-journal-tool --rank cfs_irods_test:0 event
> recover_dentries list
> 2025-04-16T18:24:56.802960+0200 0x7c334a SUBTREEMAP: ()
> [root@fidrcmon-01 ~]#

> Based on this output, can I run the other three commands provided in your
> message:
> ceph tell mds.0 flush journal
> ceph mds fail 0
> ceph tell mds.cfs_irods_test:0 scrub start / recursive

> Thanks, Christophe

> On 19/04/2025 12:55, Frédéric Nass wrote:

>> Hi Christophe, Hi David,

>> Could you share the output of the below command after running the scrubbing
>> with recursive,repair,force?

>> cephfs-journal-tool --rank cfs_irods_test:0 event recover_dentries list

>> Could be that the MDS recovered these 2 dentries in its journal already but
>> the status of the filesystem was not updated yet. I've seen this happening
>> before. If that's the case, you could try a flush, fail and re-scrub:

>> ceph tell mds.0 flush journal
>> ceph mds fail 0
>> ceph tell mds.cfs_irods_test:0 scrub start / recursive

>> This might clear the HEALTH_ERR. If not, then it will be easy to fix by
>> rebuilding / fixing the metadata from the data pools since this fs is empty.

>> Let us know,

>> Regards,
>> Frédéric.

>> ----- On 18 Apr 25, at 9:51, David <david.cas...@aevoo.fr> wrote:

>>> I also tend to think that the disk has nothing to do with the problem.

>>> My reading is that the inode associated with the dentry is missing.
>>> Can anyone correct me?

>>> Christophe informed me that the directories were emptied before the
>>> incident.

>>> I don't understand why scrubbing doesn't repair the metadata.
>>> Perhaps because the directory is empty?

>>> On Thu, 17 Apr 2025 at 19:06, Anthony D'Atri <anthony.da...@gmail.com>
>>> wrote:

>>>> HPE rebadges drives from manufacturers.  A quick search supports the idea
>>>> that this SKU is fulfilled at least partly by Kioxia, so not likely a PLP
>>>> issue.

>>>>> On Apr 17, 2025, at 11:39 AM, Christophe DIARRA
>>>>> <christophe.dia...@idris.fr> wrote:

>>>>> Hello David,

>>>>> The SSD model is VO007680JWZJL.

>>>>> I will delay the 'ceph tell mds.cfs_irods_test:0 damage rm 241447932'
>>>>> for the moment. If no other solution is found I will be obliged to use
>>>>> this command.

>>>>> I found 'dentry' in the logs when the cephfs cluster started:

>>>>>> Apr 16 17:29:53 mon-02 ceph-mds[2367]: mds.cfs_irods_test.mon-02.awuygq

>>>> Updating MDS map to version 15613 from mon.2

>>>>>> Apr 16 17:29:53 mon-02 ceph-mds[2367]: mds.0.15612 handle_mds_map i am

>>>> now mds.0.15612

>>>>>> Apr 16 17:29:53 mon-02 ceph-mds[2367]: mds.0.15612 handle_mds_map state

>>>> change up:starting --> up:active

>>>>>> Apr 16 17:29:53 mon-02 ceph-mds[2367]: mds.0.15612 active_start
>>>>>> Apr 16 17:29:53 mon-02 ceph-mds[2367]: mds.0.cache.den(0x1 testdir2)

>>>> loaded already *corrupt dentry*: [dentry #0x1/testdir2 [2,head] rep@0.0
>>>> NULL (dversion lock) pv=0 v=4442 ino=(nil) state=0 0x5617e18c8280]
>>>>>> Apr 16 17:29:53 mon-02 ceph-mds[2367]: mds.0.cache.den(0x1 testdir1)

>>>> loaded already *corrupt dentry*: [dentry #0x1/testdir1 [2,head] rep@0.0
>>>> NULL (dversion lock) pv=0 v=4442 ino=(nil) state=0 0x5617e18c8500]
>>>>>> Apr 16 17:29:53 mon-02 ceph-mon[2288]: Health check failed: 1

>>>> filesystem is offline (MDS_ALL_DOWN)

>>>>>> Apr 16 17:29:53 mon-02 ceph-mon[2288]: Health check failed: 1

>>>> filesystem is online with fewer MDS than max_mds (MDS_UP_LESS_THAN_MAX)

>>>>>> Apr 16 17:29:53 mon-02 ceph-mon[2288]: from='client.?

>>>> xx.xx.xx.8:0/3820885518' entity='client.admin' cmd='[{"prefix": "fs set",
>>>> "fs_name": "cfs_irods_test", "var": "down", "val":

>>>>>> "false"}]': finished
>>>>>> Apr 16 17:29:53 mon-02 ceph-mon[2288]: daemon

>>>> mds.cfs_irods_test.mon-02.awuygq assigned to filesystem cfs_irods_test as
>>>> rank 0 (now has 1 ranks)

>>>>>> Apr 16 17:29:53 mon-02 ceph-mon[2288]: Health check cleared:

>>>> MDS_ALL_DOWN (was: 1 filesystem is offline)

>>>>>> Apr 16 17:29:53 mon-02 ceph-mon[2288]: Health check cleared:

>>>> MDS_UP_LESS_THAN_MAX (was: 1 filesystem is online with fewer MDS than
>>>> max_mds)

>>>>>> Apr 16 17:29:53 mon-02 ceph-mon[2288]: daemon

>>>> mds.cfs_irods_test.mon-02.awuygq is now active in filesystem cfs_irods_test
>>>> as rank 0

>>>>>> Apr 16 17:29:54 mon-02 ceph-mgr[2444]: log_channel(cluster) log [DBG] :

>>>> pgmap v1721: 4353 pgs: 4346 active+clean, 7 active+clean+scrubbing+deep;
>>>> 3.9 TiB data, 417 TiB used, 6.4 PiB / 6.8 PiB avail; 1.4 KiB/s rd, 1 op/s

>>>>> If you need more extract from the log file please let me know.

>>>>> Thanks for your help,

>>>>> Christophe

>>>>> On 17/04/2025 13:39, David C. wrote:

>>>>>> If I'm not mistaken, this is a fairly rare situation.

>>>>>> The fact that it's the result of a power outage makes me think of a bad

>>>> SSD (like "S... Pro").

>>>>>> Does a grep of the dentry id in the MDS logs return anything?
>>>>>> Maybe some interesting information around this grep

>>>>>> In the heat of the moment, I have no other idea than to delete the

>>>> dentry.

>>>>>> ceph tell mds.cfs_irods_test:0 damage rm 241447932

>>>>>> However, in production, this results in the content (of dir

>>>> /testdir[12]) being abandoned.

>>>>>> On Thu, 17 Apr 2025 at 12:44, Christophe DIARRA
>>>>>> <christophe.dia...@idris.fr> wrote:

>>>>>> Hello David,

>>>>>>    Thank you for the tip about the scrubbing. I have tried the
>>>>>>    commands found in the documentation but it seems to have no effect:

>>>>>>    [root@mon-01 ~]#*ceph tell mds.cfs_irods_test:0 scrub start /

>>>> recursive,repair,force*

>>>>>> 2025-04-17T12:07:20.958+0200 7fd4157fa640  0 client.86301

>>>> ms_handle_reset on v2:130.84.80.10:6800/3218663047
>>>> 2025-04-17T12:07:20.979+0200 7fd4157fa640  0 client.86307
>>>> ms_handle_reset on v2:130.84.80.10:6800/3218663047

>>>>>> {
>>>>>>         "return_code": 0,
>>>>>>         "scrub_tag": "733b1c6d-a418-4c83-bc8e-b28b556e970c",
>>>>>>         "mode": "asynchronous"
>>>>>>    }

>>>>>>    [root@mon-01 ~]#*ceph tell mds.cfs_irods_test:0 scrub status*
>>>>>>    2025-04-17T12:07:30.734+0200 7f26cdffb640  0 client.86319

>>>> ms_handle_reset on v2:130.84.80.10:6800/3218663047
>>>> 2025-04-17T12:07:30.753+0200 7f26cdffb640  0 client.86325
>>>> ms_handle_reset on v2:130.84.80.10:6800/3218663047

>>>>>> {
>>>>>>         "status": "no active scrubs running",
>>>>>>         "scrubs": {}
>>>>>>    }
>>>>>>    [root@mon-01 ~]# ceph -s
>>>>>>       cluster:
>>>>>>         id:     b87276e0-1d92-11ef-a9d6-507c6f66ae2e
>>>>>>         *health: HEALTH_ERR             1 MDSs report damaged metadata*
>>>>>>             services:
>>>>>>         mon: 3 daemons, quorum mon-01,mon-03,mon-02 (age 19h)
>>>>>>         mgr: mon-02.mqaubn(active, since 19h), standbys: mon-03.gvywio,

>>>> mon-01.xhxqdi

>>>>>> mds: 1/1 daemons up, 2 standby
>>>>>>         osd: 368 osds: 368 up (since 18h), 368 in (since 3w)
>>>>>>             data:
>>>>>>         volumes: 1/1 healthy
>>>>>>         pools:   10 pools, 4353 pgs
>>>>>>         objects: 1.25M objects, 3.9 TiB
>>>>>>         usage:   417 TiB used, 6.4 PiB / 6.8 PiB avail
>>>>>>         pgs:     4353 active+clean

>>>>>>    Did I miss something ?

>>>>>>    The server didn't crash. I don't understand what you are meaning
>>>>>>    by "there may be a design flaw in the infrastructure (insecure
>>>>>>    cache, for example)".
>>>>>>    How to know if we have a design problem ? What should we check ?

>>>>>>    Best regards,

>>>>>>    Christophe

>>>>>>    On 17/04/2025 11:07, David C. wrote:

>>>>>>> Hello Christophe,

>>>>>>>   Check the file system scrubbing procedure =>
>>>>>>>   https://docs.ceph.com/en/latest/cephfs/scrub/
>>>>>>>   But this doesn't guarantee data recovery.

>>>>>>>    Was the cluster crashed?
>>>>>>>    Ceph should be able to handle it; there may be a design flaw in
>>>>>>>    the infrastructure (insecure cache, for example).

>>>>>>>    David

>>>>>>>   On Thu, 17 Apr 2025 at 10:44, Christophe DIARRA
>>>>>>>   <christophe.dia...@idris.fr> wrote:

>>>>>>>        Hello,

>>>>>>>        After an electrical maintenance I restarted our ceph cluster
>>>>>>>        but it
>>>>>>>        remains in an unhealthy state: HEALTH_ERR 1 MDSs report
>>>>>>>        damaged metadata.

>>>>>>>        How to repair this damaged metadata ?

>>>>>>>        To bring down the cephfs cluster I unmounted the fs from the
>>>>>>>        client
>>>>>>>        first and then did: ceph fs set cfs_irods_test down true

>>>>>>>        To bring up the cephfs cluster I did: ceph fs set
>>>>>>>        cfs_irods_test down false

>>>>>>>        Fortunately the cfs_irods_test fs is almost empty and is a fs
>>>>>>>        for tests. The ceph cluster is not in production yet.

>>>>>>>        Following is the current status:

>>>>>>>        [root@mon-01 ~]# ceph health detail
>>>>>>>        HEALTH_ERR 1 MDSs report damaged metadata
>>>>>>>        *[ERR] MDS_DAMAGE: 1 MDSs report damaged metadata
>>>>>>>             mds.cfs_irods_test.mon-03.vlmeuz(mds.0): Metadata damage
>>>>>>>        detected*

>>>>>>>        [root@mon-01 ~]# ceph -s
>>>>>>>           cluster:
>>>>>>>             id:     b87276e0-1d92-11ef-a9d6-507c6f66ae2e
>>>>>>>             health: HEALTH_ERR
>>>>>>>                     1 MDSs report damaged metadata

>>>>>>>           services:
>>>>>>>             mon: 3 daemons, quorum mon-01,mon-03,mon-02 (age 17h)
>>>>>>>             mgr: mon-02.mqaubn(active, since 17h), standbys:
>>>>>>>        mon-03.gvywio,
>>>>>>>        mon-01.xhxqdi
>>>>>>>             mds: 1/1 daemons up, 2 standby
>>>>>>>             osd: 368 osds: 368 up (since 17h), 368 in (since 3w)

>>>>>>>           data:
>>>>>>>             volumes: 1/1 healthy
>>>>>>>             pools:   10 pools, 4353 pgs
>>>>>>>             objects: 1.25M objects, 3.9 TiB
>>>>>>>             usage:   417 TiB used, 6.4 PiB / 6.8 PiB avail
>>>>>>>             pgs:     4353 active+clean

>>>>>>>        [root@mon-01 ~]# ceph fs ls
>>>>>>>        name: cfs_irods_test, metadata pool: cfs_irods_md_test, data
>>>>>>>        pools:
>>>>>>>        [cfs_irods_def_test cfs_irods_data_test ]

>>>>>>>        [root@mon-01 ~]# ceph mds stat
>>>>>>>        cfs_irods_test:1 {0=cfs_irods_test.mon-03.vlmeuz=up:active} 2
>>>>>>>        up:standby

>>>>>>>        [root@mon-01 ~]# ceph fs status
>>>>>>>        cfs_irods_test - 0 clients
>>>>>>>        ==============
>>>>>>>        RANK  STATE            MDS                  ACTIVITY     DNS   INOS   DIRS   CAPS
>>>>>>>         0    active  cfs_irods_test.mon-03.vlmeuz  Reqs:  0 /s   12    15     14     0
>>>>>>>                 POOL           TYPE     USED  AVAIL
>>>>>>>          cfs_irods_md_test   metadata  11.4M  34.4T
>>>>>>>          cfs_irods_def_test    data       0   34.4T
>>>>>>>        cfs_irods_data_test    data       0   4542T
>>>>>>>                    STANDBY MDS
>>>>>>>        cfs_irods_test.mon-01.hitdem
>>>>>>>        cfs_irods_test.mon-02.awuygq
>>>>>>>        MDS version: ceph version 18.2.2
>>>>>>>        (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
>>>>>>>        [root@mon-01 ~]#

>>>>>>>        [root@mon-01 ~]# ceph tell mds.cfs_irods_test:0 damage ls
>>>>>>>        2025-04-17T10:23:31.849+0200 7f4b87fff640  0 client.86181
>>>>>>>        ms_handle_reset on v2:130.84.80.10:6800/3218663047
>>>>>>>        2025-04-17T10:23:31.866+0200 7f4b87fff640  0 client.86187
>>>>>>>        ms_handle_reset on v2:130.84.80.10:6800/3218663047
>>>>>>>        [
>>>>>>>             {
>>>>>>>        *"damage_type": "dentry",*
>>>>>>>                 "id": 241447932,
>>>>>>>                 "ino": 1,
>>>>>>>                 "frag": "*",
>>>>>>>                 "dname": "testdir2",
>>>>>>>                 "snap_id": "head",
>>>>>>>                 "path": "/testdir2"
>>>>>>>             },
>>>>>>>             {
>>>>>>>        *"damage_type": "dentry"*,
>>>>>>>                 "id": 2273238993,
>>>>>>>                 "ino": 1,
>>>>>>>                 "frag": "*",
>>>>>>>                 "dname": "testdir1",
>>>>>>>                 "snap_id": "head",
>>>>>>>                 "path": "/testdir1"
>>>>>>>             }
>>>>>>>        ]
>>>>>>>        [root@mon-01 ~]#

>>>>>>>        Any help will be appreciated,

>>>>>>>        Thanks,

>>>>>>>        Christophe



_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
