On Sun, Jan 16, 2022 at 8:28 PM Frank Schilder <fr...@dtu.dk> wrote:
>
> I seem to have a problem. I cannot dump the mds tree:
>
> [root@ceph-08 ~]# ceph daemon mds.ceph-08 dump tree '~mdsdir/stray0'
> root inode is not in cache
> [root@ceph-08 ~]# ceph daemon mds.ceph-08 dump tree '~mds0/stray0'
> root inode is not in cache
> [root@ceph-08 ~]# ceph daemon mds.ceph-08 dump tree '~mds0' 0
> root inode is not in cache
> [root@ceph-08 ~]# ceph daemon mds.ceph-08 dump tree '~mdsdir' 0
> root inode is not in cache
>
> [root@ceph-08 ~]# ceph daemon mds.ceph-08 get subtrees | grep path
>     "path": "",
>     "path": "~mds0",
>
> Any idea what I can do?
This was fixed recently: https://github.com/ceph/ceph/pull/44313

> Thanks!
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder
> Sent: 16 January 2022 14:16:16
> To: 胡 玮文; Dan van der Ster
> Cc: ceph-users
> Subject: Re: [Warning Possible spam] Re: cephfs: [ERR] loaded dup inode
>
> That looks great! I think we suffer from the same issue. I will try it out. I assume running the script on a read-only mount will be enough?
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: 胡 玮文 <huw...@outlook.com>
> Sent: 14 January 2022 17:22:35
> To: Frank Schilder; Dan van der Ster
> Cc: ceph-users
> Subject: [Warning Possible spam] Re: cephfs: [ERR] loaded dup inode
>
> Hi Frank,
>
> I just looked into the exact same issue: conda generates a lot of strays. I created a Python script [1] to trigger reintegration efficiently.
>
> The script uses the CephFS Python binding and does not rely on the kernel Ceph client, so it should also bypass your sssd. It works by reading the "stray_prior_path" extracted from the MDS and guessing all possible paths the file might still be linked to, following the logic of conda.
>
> If you still want to use the shell, I have tested that `find /path/to/conda -printf '%n\n'` is enough to trigger the reintegration. But that is still too slow for us.
>
> Feel free to contact me for more info.
>
> Weiwen Hu
>
> [1]: https://gist.github.com/huww98/91cbff0782ad4f6673dcffccce731c05
>
> From: Frank Schilder<mailto:fr...@dtu.dk>
> Sent: 14 January 2022 20:04
> To: Dan van der Ster<mailto:d...@vanderster.com>
> Cc: ceph-users<mailto:ceph-users@ceph.io>
> Subject: [ceph-users] Re: cephfs: [ERR] loaded dup inode
>
> Hi Dan,
>
> Thanks a lot! I will try this. We have lots of users using lots of hard links (for example, Python Anaconda packages create thousands of them).
>
> Is there a command that forces "reintegration" without having to stat each file? "ls -lR" will stat every file, and this is very slow because we use sssd with AD for user IDs. What operation is required to trigger a reintegration? I could probably run a find with suitable arguments.
>
> Thanks a lot for any hints.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Dan van der Ster <d...@vanderster.com>
> Sent: 14 January 2022 12:30:51
> To: Frank Schilder
> Cc: ceph-users
> Subject: Re: [ceph-users] Re: cephfs: [ERR] loaded dup inode
>
> Hi Frank,
>
> We had this long ago, related to a user generating lots of hard links. Snapshots will have a similar effect. (In these cases, if a user deletes the original file, the file goes into stray until it is "reintegrated".)
>
> If you can find the dir where they're working, `ls -lR` will force those to reintegrate (you will see it working because the number of strays drops back down). You might have to ls -lR in a snap directory or in the current tree -- you have to browse around and experiment.
>
> Pacific does this reintegration automatically.
>
> -- dan
>
> On Fri, Jan 14, 2022 at 12:24 PM Frank Schilder <fr...@dtu.dk> wrote:
> >
> > Hi Venky,
> >
> > Thanks for your reply. I think the first type of message was a race condition: a user was running rm and find on the same folder at the same time.
> > The second type of message (duplicate inode in stray) might point to a somewhat more severe issue. For a while now I have observed that ".mds_cache.num_strays" is really large and, on average, constantly increasing:
> >
> > # ssh ceph-08 'ceph daemon mds.$(hostname -s) perf dump | jq .mds_cache.num_strays'
> > 1081531
> >
> > This is by no means justified by people deleting files. Our snapshots rotate completely every 3 days, and the stray buckets should get purged regularly. I have 2 questions:
> >
> > 1) Would a "cephfs-data-scan scan_links" detect and potentially resolve this problem (orphaned inodes in the stray buckets)?
> > 2) For a file system of our size, approximately how long would a "cephfs-data-scan scan_links" run (I need to estimate downtime)? I think I can execute up to 35-40 workers. The fs size is:
> >
> > ceph.dir.rbytes="2078289930815425"
> > ceph.dir.rentries="278320382"
> >
> > Thanks for your help!
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Venky Shankar <vshan...@redhat.com>
> > Sent: 12 January 2022 12:24
> > To: Frank Schilder
> > Cc: ceph-users
> > Subject: Re: [ceph-users] cephfs: [ERR] loaded dup inode
> >
> > On Tue, Jan 11, 2022 at 6:07 PM Frank Schilder <fr...@dtu.dk> wrote:
> > >
> > > Hi all,
> > >
> > > I found a bunch of error messages like the ones below in our Ceph log (2 different types). How bad is this, and should I do something?
> > >
> > > Ceph version is 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable).
> > >
> > > 2022-01-11 11:49:47.687010 [ERR] loaded dup inode 0x10011bac31c [4f8,head] v1046724308 at ~mds0/stray1/10011bac31c, but inode 0x10011bac31c.head v1046760378 already exists at [...]/miniconda3/envs/ffpy_gwa3/lib/python3.6/site-packages/python_dateutil-2.8.0.dist-info/INSTALLER
> > >
> > > 2022-01-11 11:49:47.682346 [ERR] loaded dup inode 0x10011bac7fc [4f8,head] v1046725418 at ~mds0/stray1/10011bac7fc, but inode 0x10011bac7fc.head v1046760674 already exists at ~mds0/stray2/10011bac7fc
> >
> > I've seen this earlier. Not sure how we end up with an inode in two stray directories, but it doesn't look serious.
> >
> > You could try stopping all MDSs and running `cephfs-data-scan scan_links` (courtesy Zheng) to see if the errors go away.
> >
> > > Best regards,
> > > =================
> > > Frank Schilder
> > > AIT Risø Campus
> > > Bygning 109, rum S14
> >
> > --
> > Cheers,
> > Venky

--
Cheers,
Venky
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
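
For anyone who wants to try the stat-to-reintegrate approach without the kernel client, here is a minimal sketch. It is not Weiwen's script [1] (which additionally derives the candidate paths from the MDS "stray_prior_path" and conda's layout); it assumes python3-cephfs is installed, that /etc/ceph/ceph.conf plus the default client keyring can mount the file system, and that candidate paths are fed in one per line on stdin. Each existing path is stat()ed through libcephfs, which should trigger the same reintegration as `ls -lR` or `find -printf '%n\n'`, minus the sssd/AD lookups:

#!/usr/bin/env python3
# Minimal sketch (NOT Weiwen's script): stat candidate hard-link paths via
# libcephfs to trigger stray reintegration, bypassing the kernel client and
# the sssd/AD uid lookups that make `ls -lR` slow.
# Assumptions: python3-cephfs is installed, /etc/ceph/ceph.conf plus the
# default client keyring can mount the fs, candidate paths arrive on stdin.

import sys

import cephfs  # python3-cephfs, the libcephfs binding


def trigger_reintegration(paths):
    fs = cephfs.LibCephFS(conffile='/etc/ceph/ceph.conf')
    fs.mount()
    try:
        found = 0
        for path in paths:
            try:
                # The path lookup is what matters: if this is a surviving
                # hard link of an inode sitting in a stray directory, the
                # MDS should reintegrate it on lookup.
                fs.stat(path)
                found += 1
            except cephfs.ObjectNotFound:
                pass  # guessed path does not exist; skip it
        return found
    finally:
        fs.unmount()
        fs.shutdown()


if __name__ == '__main__':
    candidates = [line.strip() for line in sys.stdin if line.strip()]
    print('stat()ed %d existing paths' % trigger_reintegration(candidates))

A hypothetical invocation would be `python3 reintegrate.py < candidate_paths.txt`; adjust the conffile and keyring to your environment, and watch mds_cache.num_strays drop as described above to confirm it is working.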