Hi Venky,

thanks for your hint at https://tracker.ceph.com/issues/36349. We have finished the "scan_links" procedure, which produced thousands of lines in the console log.
<---- Example: Start ---->
#/:> cephfs-data-scan scan_links --filesystem cephfs
2025-05-23T03:36:05.228+0200 7f1c46321840 -1 datascan.scan_links: Remove duplicated ino 0x0x20009f7bcee from 0x100215e1933/tmpesktg21e
2025-05-23T03:36:05.228+0200 7f1c46321840 -1 datascan.scan_links: Remove duplicated ino 0x0x2000a0d086a from 0x2000a0d0869/part-00000-of-00001.data-00000-of-00001.tempstate181943161020189849
<---- Example: End ---->

This shows that a lot of metadata is broken; hopefully all of it can be repaired. We are not finished yet, but we are working on it.
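To get a feeling for the scale of the damage, we summarised the console output roughly like this (a sketch only; "scan_links.log" is just the file we captured the console output to, the name is our own):

```shell
# Count how many duplicated inodes scan_links removed
# (assumes the console output was saved to scan_links.log):
grep -c 'Remove duplicated ino' scan_links.log

# List the parent directory inodes with the most duplicates,
# to see whether the damage clusters in a few directories.
# The last field of each line is "<parent-ino>/<name>"; we keep
# only the parent inode before the slash:
grep 'Remove duplicated ino' scan_links.log \
    | awk '{print $NF}' | awk -F/ '{print $1}' \
    | sort | uniq -c | sort -rn | head
```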
Regarding the issue you pointed us to: the bug was first reported six years ago, and work on it stopped because it could not be reproduced ("can't reproduce"). Are there any hints as to what triggered this bug? Is there perhaps a cluster configuration that could trigger this CephFS behaviour (MDS in standby-replay, multi-MDS configuration, ...)?
Regards,
Michael

On 22.05.25 at 07:29, Venky Shankar wrote:
Hi Michael,

On Wed, May 21, 2025 at 10:09 PM Michael Götting <m...@techfak.uni-bielefeld.de> wrote:

Hi all,

we have the following problem with our CephFS setup (Ceph version 19.2.2). Today our two active MDS nodes failed; the nodes that were in standby-replay then took over and failed as well.

The CephFS system is set up as follows:
- 3 monitor nodes
- 4 MDS nodes
  - 2 active
  - 2 standby-replay
- CephFS pools
  - max_mds = 2
  - 1x metadata pool
  - 2x data pools (hdd_pool, ssd_pool)

<< ----------------- Ceph fs status output START ----------------- >>
cephfs - 0 clients
RANK  STATE   MDS  ACTIVITY  DNS  INOS  DIRS  CAPS
 0    failed
 1    failed
<< ----------------- Ceph fs status output END ----------------- >>

To restore the service, we followed the documentation at https://docs.ceph.com/en/quincy/cephfs/disaster-recovery-experts/?highlight=mds+repair and carried out the steps up to and including "MDS table wipes". We did not carry out the "MDS MAP RESET" step, as we were not sure whether we would then lose all the data from rank 1. We also carried out the steps under "Avoiding recovery roadblocks": https://docs.ceph.com/en/quincy/cephfs/troubleshooting/#avoiding-recovery-roadblocks

Parameters set on the MDS nodes:
mds advanced mds_abort_on_newly_corrupt_dentry false
mds advanced mds_bal_interval 0
mds basic mds_cache_memory_limit 274877906944
mds advanced mds_cache_trim_threshold 524288
mds advanced mds_go_bad_corrupt_dentry false
mds advanced mds_heartbeat_grace 3600.000000
mds advanced mds_min_caps_working_set 60000
mds advanced mds_oft_prefetch_dirfrags false

* After trying the recovery steps (truncating the journal), the MDS daemons are in a crash -> restart loop.
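Concretely, the steps from that document up to and including the table wipes boil down to roughly the following (rank 0 shown; with max_mds = 2 the journal steps have to be repeated for rank 1, and "backup.bin" is just the example file name from the docs):

```shell
# All MDS daemons must be stopped/failed before running these;
# the tools operate directly on the metadata pool objects.

# 1. Back up the journal before touching anything:
cephfs-journal-tool --rank=cephfs:0 journal export backup.bin

# 2. Recover whatever dentries can be salvaged from the journal:
cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary

# 3. Truncate the (possibly corrupt) journal:
cephfs-journal-tool --rank=cephfs:0 journal reset

# 4. "MDS table wipes": reset the session, snap and inode tables:
cephfs-table-tool all reset session
cephfs-table-tool all reset snap
cephfs-table-tool all reset inode
```

These cannot be run against a live cluster in a test environment, so treat the above as a paraphrase of the linked documentation rather than a verified transcript of our exact invocations.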
<< ----------------- Example log file mds-1: START ----------------- >>
-14> 2025-05-21T17:49:27.003+0200 7f36cfc7e640  1 mds.0.42052 active_start
-13> 2025-05-21T17:49:27.003+0200 7f36d2c84640 10 monclient: get_auth_request con 0x559a76e8a400 auth_method 0
-12> 2025-05-21T17:49:27.003+0200 7f36d2483640 10 monclient: get_auth_request con 0x559a7e041400 auth_method 0
-11> 2025-05-21T17:49:27.003+0200 7f36d3485640 10 monclient: get_auth_request con 0x559a76e8b000 auth_method 0
-10> 2025-05-21T17:49:27.003+0200 7f36d3485640 10 monclient: get_auth_request con 0x559ab89cb800 auth_method 0
-9> 2025-05-21T17:49:27.003+0200 7f36d2c84640 10 monclient: get_auth_request con 0x559a7bcc6400 auth_method 0
-8> 2025-05-21T17:49:27.015+0200 7f36cfc7e640  1 mds.0.42052 cluster recovered.
-7> 2025-05-21T17:49:27.015+0200 7f36cfc7e640  4 mds.0.42052 set_osd_epoch_barrier: epoch=492573
-6> 2025-05-21T17:49:27.015+0200 7f36cfc7e640  5 quiesce.mds.0 <quiesce_cluster_update> epoch:42055 me:7764062 leader:7764062 members:7764062
-5> 2025-05-21T17:49:27.015+0200 7f36cfc7e640  5 quiesce.mgr.0 <update_membership> starting the db mgr thread at epoch: 42055
-4> 2025-05-21T17:49:27.015+0200 7f36c5c6a640  5 quiesce.mgr.0 <quiesce_db_thread_main> Entering the main thread
-3> 2025-05-21T17:49:27.015+0200 7f36c5c6a640  5 quiesce.mgr.0 <membership_upkeep> a reset of the db has been requested
-2> 2025-05-21T17:49:27.015+0200 7f36c9471640 -1 mds.0.cache.den(0x1 techfak) newly corrupt dentry to be committed: [dentry #0x1/techfak [c,head] auth (dversion lock) pv=0 v=52947746 ino=0x1000a58d072 state=1073741824 | inodepin=1 0x559a755b2c80]
-1> 2025-05-21T17:49:27.015+0200 7f36c9471640 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.2.2/rpm/el9/BUILD/ceph-19.2.2/src/mds/MDCache.cc: In function 'void MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)' thread 7f36c9471640 time 2025-05-21T17:49:27.020101+0200
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.2.2/rpm/el9/BUILD/ceph-19.2.2/src/mds/MDCache.cc: 1687: FAILED ceph_assert(follows >= realm->get_newest_seq())

You are running into https://tracker.ceph.com/issues/36349 which got closed since it wasn't reproducible and there wasn't any more debug information to make progress. To recover from this situation, please refer here: https://tracker.ceph.com/issues/36349#note-5

ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x121) [0x7f36d5709cf9]
2: /usr/lib64/ceph/libceph-common.so.2(+0x182eb8) [0x7f36d5709eb8]
3: (MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)+0xac3) [0x559a51ca4583]
4: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, snapid_t)+0xbd) [0x559a51ca50cd]
5: (MDCache::predirty_journal_parents(boost::intrusive_ptr<MutationImpl>, EMetaBlob*, CInode*, CDir*, int, int, snapid_t)+0xe71) [0x559a51cab8f1]
6: (Locker::check_inode_max_size(CInode*, bool, unsigned long, unsigned long, utime_t)+0x473) [0x559a51d55a33]
7: (RecoveryQueue::_recovered(CInode*, int, unsigned long, utime_t)+0x390) [0x559a51d2e750]
8: (MDSContext::complete(int)+0x5c) [0x559a51e4617c]
9: (MDSIOContextBase::complete(int)+0x34c) [0x559a51e4884c]
10: /usr/bin/ceph-mds(+0x4f5970) [0x559a51eed970]
11: /usr/bin/ceph-mds(+0x160f0d) [0x559a51b58f0d]
12: (Finisher::finisher_thread_entry()+0x17d) [0x7f36d57c885d]
13: /lib64/libc.so.6(+0x8a0ca) [0x7f36d50a30ca]
14: /lib64/libc.so.6(+0x10f150) [0x7f36d5128150]

0> 2025-05-21T17:49:27.019+0200 7f36c9471640 -1 *** Caught signal (Aborted) ** in thread 7f36c9471640 thread_name:

ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)
1: /lib64/libc.so.6(+0x3ebf0) [0x7f36d5057bf0]
2: /lib64/libc.so.6(+0x8be0c) [0x7f36d50a4e0c]
3: raise()
4: abort()
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x17b) [0x7f36d5709d53]
6: /usr/lib64/ceph/libceph-common.so.2(+0x182eb8) [0x7f36d5709eb8]
7: (MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)+0xac3) [0x559a51ca4583]
8: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, snapid_t)+0xbd) [0x559a51ca50cd]
9: (MDCache::predirty_journal_parents(boost::intrusive_ptr<MutationImpl>, EMetaBlob*, CInode*, CDir*, int, int, snapid_t)+0xe71) [0x559a51cab8f1]
10: (Locker::check_inode_max_size(CInode*, bool, unsigned long, unsigned long, utime_t)+0x473) [0x559a51d55a33]
11: (RecoveryQueue::_recovered(CInode*, int, unsigned long, utime_t)+0x390) [0x559a51d2e750]
12: (MDSContext::complete(int)+0x5c) [0x559a51e4617c]
13: (MDSIOContextBase::complete(int)+0x34c) [0x559a51e4884c]
14: /usr/bin/ceph-mds(+0x4f5970) [0x559a51eed970]
15: /usr/bin/ceph-mds(+0x160f0d) [0x559a51b58f0d]
16: (Finisher::finisher_thread_entry()+0x17d) [0x7f36d57c885d]
17: /lib64/libc.so.6(+0x8a0ca) [0x7f36d50a30ca]
18: /lib64/libc.so.6(+0x10f150) [0x7f36d5128150]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
<< ----------------- Example log file mds-1: END ----------------- >>

<< ----------------- Ceph fs dump output START ----------------- >>
e41371
btime 2025-05-21T16:19:27.085643+0200
enable_multiple, ever_enabled_multiple: 1,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 3

Filesystem 'cephfs' (3)
fs_name cephfs
epoch 41370
flags 73 allow_snaps allow_multimds_snaps allow_standby_replay refuse_client_session
created 2024-03-31T23:36:25.302389+0200
modified 2025-05-21T16:19:09.237977+0200
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
max_xattr_size 65536
required_client_features {}
last_failure 0
last_failure_osd_epoch 492429
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2,11=minor log segments,12=quiesce subvolumes}
max_mds 2
in 0,1
up {1=7761302}
failed 0
damaged
stopped
data_pools [5,3]
metadata_pool 4
inline_data disabled
balancer
bal_rank_mask -1
standby_count_wanted 2
qdb_cluster leader: 7761302 members: 7761302
[mds.mds-1{1:7761302} state up:active seq 5 addr [v2:[2001:638:504:2011:9:3:1:1]:6800/2463826788,v1:[2001:638:504:2011:9:3:1:1]:6801/2463826788] compat {c=[1],r=[1],i=[1fff]}]
Standby daemons:
[mds.mds-2{-1:7771506} state up:standby seq 1 addr [v2:[2001:638:504:2011:6:3:2:2]:6800/2809657192,v1:[2001:638:504:2011:6:3:2:2]:6801/2809657192] compat {c=[1],r=[1],i=[1fff]}]
dumped fsmap epoch 41371
<< ----------------- Ceph fs dump output END ----------------- >>

But to be honest, out of all the things we tried, I don't know what to provide exactly.
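One detail in the dump above: the flags line still shows refuse_client_session, which comes from the "Avoiding recovery roadblocks" steps. Once the ranks are stable again, that has to be undone before clients can reconnect; a rough sketch (fs name taken from the dump, and the joinable step only applies if the fs was marked not joinable during recovery):

```shell
# Allow client sessions again after recovery:
ceph fs set cephfs refuse_client_session false

# If the filesystem was made non-joinable during recovery,
# allow MDS daemons to take over ranks again:
ceph fs set cephfs joinable true
```

These act on a live cluster, so they are shown here only as a reminder of the cleanup step, not as verified output.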
We can provide much more, but ... we really need the service back online, so any help would be very much appreciated.

Regards,
Michael
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io