Rick,

Thanks for sharing your experience. I did notice the note in the docs about the 
OST pools, which I've not configured for this file system, and I saved a copy of 
the llog output for each device.

I did run a zpool scrub on the MDT and it came back clean. It's definitely a 
mystery how this happened! 

I'll move forward with rewriting the configs.

Jesse


________________________________________
From: Mohr, Rick <[email protected]>
Sent: Tuesday, June 17, 2025 9:27 AM
To: Jesse Stroik; [email protected]
Subject: Re: [EXTERNAL] Re: [lustre-discuss] MDT refuses to mount: "no more 
free slots in catalog" "can't initialize llog"

Jesse,

In general, rewriting the configuration logs shouldn't cause you to lose data.  
A file's layout info is stored on the MDT, and if a layout references an OST 
object on OST001f, then after the config rewrite the client should be able to 
locate the object because it will know how to connect to the corresponding 
OSS.  I have rewritten Lustre config logs on several file systems in the 
past, and I never lost any data.  Your experience may differ depending on how 
your config logs got messed up in the first place.  If you had some kind of 
corruption on your MDT that messed up the config log, then I suppose it's 
possible that other data on the MDT could have gotten corrupted, which may cause 
problems accessing files.  But if the file system mounts, you can always run 
lfsck to detect/fix problems.  Just remember that rewriting the logs will erase 
some config info (like OST pools), so you will need to rerun those commands 
afterwards.
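
As a rough illustration of the "rerun those commands" step, the sketch below 
turns a saved record of pool membership into the lctl commands needed to 
recreate the pools after the rewrite. It only builds the command strings and 
does not run anything. The file system name "testfs", the pool name "fast", 
and the OST names are hypothetical; check the exact OST name format that 
lctl pool_add expects against the Lustre manual for your version.

```python
def pool_commands(fsname, pools):
    """Build lctl commands that recreate OST pools.

    fsname: file system name (hypothetical value in the example below).
    pools:  dict mapping pool name -> list of OST names, as recorded
            before the config logs were rewritten.
    """
    commands = []
    for pool, osts in pools.items():
        # Create the empty pool first, then add each OST to it.
        commands.append(f"lctl pool_new {fsname}.{pool}")
        for ost in osts:
            commands.append(f"lctl pool_add {fsname}.{pool} {ost}")
    return commands


if __name__ == "__main__":
    # Hypothetical pool layout saved before the rewrite.
    saved = {"fast": ["testfs-OST0000_UUID", "testfs-OST0001_UUID"]}
    for cmd in pool_commands("testfs", saved):
        print(cmd)
```

Printing the commands rather than executing them leaves room to review the 
list before running it on the MGS.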

--Rick


On 6/16/25, 8:41 AM, "lustre-discuss on behalf of Jesse Stroik via 
lustre-discuss" <[email protected] on behalf of [email protected]> wrote:

Hi Lustre users,

When reviewing the configuration logs prior to performing this work, I noticed 
that one of the OSTs in use is not listed in the configuration log for the MDT. 
The log goes from OST001e to OST0020, skipping OST001f, with no mention of 
OST001f anywhere. The OST's own configuration log looks normal, and the OST was 
roughly as full as any of the others, so it was getting data stored on it. 
This now raises a concern for me: is it likely that one of the OSTs will have 
data we cannot recover if I rewrite these logs? At this point the file system 
cannot mount, so I believe rewriting the logs is necessary in any case.
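
A check like the one described above (spotting that OST001f is absent between 
OST001e and OST0020) can be sketched in a few lines of Python. This is only an 
illustration: it assumes the target names have already been pulled out of the 
saved llog output into a list, and the "testfs" prefix is hypothetical. A gap 
in the index range is not necessarily an error, since OST indices do not have 
to be contiguous, so treat the result as a prompt for a closer look.

```python
import re


def missing_ost_indices(target_names):
    """Return OST labels missing between the lowest and highest index seen."""
    pattern = re.compile(r"OST([0-9a-fA-F]{4})")
    seen = set()
    for name in target_names:
        match = pattern.search(name)
        if match:
            # OST indices are written as four hex digits, e.g. OST001f.
            seen.add(int(match.group(1), 16))
    if not seen:
        return []
    expected = range(min(seen), max(seen) + 1)
    return [f"OST{idx:04x}" for idx in expected if idx not in seen]


if __name__ == "__main__":
    # Hypothetical names as they might appear in the MDT config log.
    names = ["testfs-OST001d_UUID", "testfs-OST001e_UUID", "testfs-OST0020_UUID"]
    print(missing_ost_indices(names))  # prints ['OST001f']
```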

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
