You could run debugfs on that OST and use "ls -l" to examine the O/*/d* directories for large objects, then use "stat" on any suspicious objects within debugfs to dump the parent FID, and run "lfs fid2path" on a client to determine the path.
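In command form, that workflow looks roughly like the sketch below. This is only an illustration: the OST device (/dev/sdX) and object name (O/0/d0/102) are hypothetical placeholders, and the commands are echoed rather than executed, since they only make sense against a live OST. The FID is the sample value from the fid2path man page.

```shell
#!/bin/sh
# Sketch of the debugfs -> fid2path workflow; placeholders throughout.
OST_DEV=/dev/sdX    # hypothetical OST block device on the OSS

# 1. On the OSS, list objects in one of the O/*/d* directories and look
#    for unexpectedly large ones ("-c" opens the device read-only):
CMD_LIST="debugfs -c -R 'ls -l O/0/d0' $OST_DEV"

# 2. Dump a suspicious object (102 is a made-up object name); the
#    output includes the parent FID of the file owning the object:
CMD_STAT="debugfs -c -R 'stat O/0/d0/102' $OST_DEV"

# 3. On a client, resolve that parent FID to a pathname (sample FID):
CMD_PATH="lfs fid2path /snap8 [0x200000403:0x11f:0x0]"

echo "$CMD_LIST"
echo "$CMD_STAT"
echo "$CMD_PATH"
```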
Alternately, see the "lctl-lfsck-start.8" man page for options to link orphan objects to the .lustre/lost+found directory, if you think there are no files referencing those objects.

Cheers, Andreas

> On Sep 4, 2021, at 00:54, Alastair Basden <[email protected]> wrote:
>
> Ah, of course - has to be done on a client.
>
> None of these files are on the dodgy OST.
>
> Any further suggestions? Essentially we have what seems to be a full OST
> with nothing on it.
>
> Thanks,
> Alastair.
>
>> On Sat, 4 Sep 2021, Andreas Dilger wrote:
>>
>> $ man lfs-fid2path.1
>>
>> lfs-fid2path(1)            user utilities            lfs-fid2path(1)
>>
>> NAME
>>        lfs fid2path - print the pathname(s) for a file identifier
>>
>> SYNOPSIS
>>        lfs fid2path [OPTION]... <FSNAME|MOUNT_POINT> <FID>...
>>
>> DESCRIPTION
>>        lfs fid2path maps a numeric Lustre File IDentifier (FID) to
>>        one or more pathnames that have hard links to that file. This
>>        allows resolving filenames for FIDs used in console error
>>        messages, and resolving all of the pathnames for a file that
>>        has multiple hard links. Pathnames are resolved relative to
>>        the MOUNT_POINT specified, or relative to the filesystem
>>        mount point if FSNAME is provided.
>>
>> OPTIONS
>>        -f, --print-fid
>>               Print the FID with the path.
>>
>>        -c, --print-link
>>               Print the current link number with each pathname or
>>               parent directory.
>>
>>        -l, --link=LINK
>>               If a file has multiple hard links, then print only the
>>               specified LINK, starting at link 0. If multiple FIDs
>>               are given, but only one pathname is needed for each
>>               file, use --link=0.
>>
>> EXAMPLES
>>        $ lfs fid2path /mnt/testfs [0x200000403:0x11f:0x0]
>>        /mnt/testfs/etc/hosts
>>
>> On Sep 3, 2021, at 14:51, Alastair Basden <[email protected]> wrote:
>>
>> Hi,
>>
>> lctl get_param mdt.*.exports.*.open_files returns:
>> [email protected]_files=
>> [0x20000b90e:0x10aa:0x0]
>> [email protected]_files=
>> [0x20000b90e:0x21b3:0x0]
>> [email protected]_files=
>> [0x20000b90e:0x21b3:0x0]
>> [0x20000b90e:0x21b4:0x0]
>> [0x20000b90c:0x1574:0x0]
>> [0x20000b90c:0x1575:0x0]
>> [0x20000b90c:0x1576:0x0]
>>
>> Doesn't seem to be many open, so I don't think it's a problem of open
>> files.
>>
>> Not sure which bit of this I need to use with lfs fid2path either...
>>
>> Cheers,
>> Alastair.
>>
>> On Fri, 3 Sep 2021, Andreas Dilger wrote:
>>
>> You can also check "mdt.*.exports.*.open_files" on the MDTs for a list
>> of FIDs open on each client, and use "lfs fid2path" to resolve them to
>> a pathname.
>>
>> On Sep 3, 2021, at 02:09, Degremont, Aurelien via lustre-discuss
>> <[email protected]> wrote:
>>
>> Hi
>>
>> It could be a bug, but most of the time this is due to an open-unlinked
>> file, typically a log file which is still in use: some process keeps
>> writing to it until it fills the OSTs it is using.
>>
>> Look for such files on your clients (use lsof).
>>
>> Aurélien
>>
>> On 03/09/2021 09:50, "lustre-discuss on behalf of Alastair Basden"
>> <[email protected] on behalf of [email protected]> wrote:
>>
>> Hi,
>>
>> We have a file system where each OST is a single SSD.
>>
>> One of those is reporting as 100% full (lfs df -h /snap8):
>>   snap8-OST004d_UUID    5.8T  2.0T  3.5T  37% /snap8[OST:77]
>>   snap8-OST004e_UUID    5.8T  5.5T  7.5G 100% /snap8[OST:78]
>>   snap8-OST004f_UUID    5.8T  2.0T  3.4T  38% /snap8[OST:79]
>>
>> However, I can't find any files on it:
>>   lfs find --ost snap8-OST004e /snap8/
>> returns nothing.
>>
>> I guess that it has filled up, and that there is some bug or other that
>> is now preventing proper behaviour - but I could be wrong.
>>
>> Does anyone have any suggestions?
>>
>> Essentially, I'd like to find some of the files and delete or migrate
>> some, and thus return it to useful production.
>>
>> Cheers,
>> Alastair.
>> _______________________________________________
>> lustre-discuss mailing list
>> [email protected]
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Lustre Principal Architect
>> Whamcloud

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
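For reference, the two checks suggested in the thread - listing FIDs held open via the MDT exports and resolving them, and looking for open-but-unlinked files on the clients - can be sketched as below. The commands are built as strings and echoed rather than executed, since they need a live Lustre filesystem; /snap8 and the FID are values quoted in the thread above.

```shell
#!/bin/sh
# Sketch only: echo the diagnostic commands instead of running them.

# 1. On the MDS, list FIDs open through each client export, then
#    resolve a FID of interest on a client (sample FID from the thread):
CMD_OPEN="lctl get_param mdt.*.exports.*.open_files"
CMD_RESOLVE="lfs fid2path /snap8 [0x20000b90e:0x21b3:0x0]"

# 2. On each client, list open files whose link count is zero, i.e.
#    deleted-but-open files that keep consuming OST space until the
#    owning process closes them:
CMD_LSOF="lsof +L1 /snap8"

for cmd in "$CMD_OPEN" "$CMD_RESOLVE" "$CMD_LSOF"; do
    echo "$cmd"
done
```

An open-unlinked file reported by "lsof +L1" releases its OST space once the process holding it open exits or is restarted.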
