Are you using ZFS for that MDT? ZFS allocates inodes (dnodes) dynamically, but it will stop creating them once the pool itself runs out of space, so you could expand your zpool with additional disks.
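Roughly, that would look like the following; the pool name mdt0pool and the device paths are just placeholders for whatever your MDT actually uses:

    # How full is the MDT pool? ZFS stops allocating new dnodes (inodes)
    # once the pool runs out of free space:
    zpool list mdt0pool
    zfs list mdt0pool

    # Grow the pool by adding another vdev; a mirror is shown here to
    # match the redundancy an MDT usually needs:
    zpool add mdt0pool mirror /dev/sdX /dev/sdY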
If you are using ZFS and don't have extra space you can allocate, it might also be worth checking the ashift value on your MDT pool. For 512e disks the default ashift will be 12, i.e. 4k allocation blocks, so each inode uses at least 4k of space. It would be possible to back the pool up with zfs snapshot and zfs send, recreate it with ashift=9, and restore it, roughly as sketched below.
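This is an untested sketch of that check and the rebuild cycle, to be run with the MDT unmounted and Lustre stopped; the pool name, device paths, and backup target are placeholders. (512e disks expose 512-byte logical sectors, so ashift=9 is permitted, at the cost of some write amplification on the 4k physical sectors.)

    # Confirm the current ashift:
    zpool get ashift mdt0pool
    zdb -C mdt0pool | grep ashift

    # Back everything up, recreate the pool with ashift=9, restore:
    zfs snapshot -r mdt0pool@backup
    zfs send -R mdt0pool@backup > /backup/mdt0.zstream
    zpool destroy mdt0pool
    zpool create -o ashift=9 mdt0pool mirror /dev/sdX /dev/sdY
    zfs receive -F mdt0pool < /backup/mdt0.zstream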
Jesse

________________________________________
From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of Ihsan Ur Rahman <ihsanur...@gmail.com>
Sent: Tuesday, January 7, 2025 2:28 AM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] inodes are full for one of the MDT

Hello Lustre folks,

I am new to this forum and to Lustre as well. We have a Lustre system, and users are getting "no space left on device" errors. After checking, we realised that the inodes are full on one of the MDTs:

lfs df -ihv /mnt/lustre/
UUID                      Inodes   IUsed   IFree IUse% Mounted on
lustre-MDT0000_UUID       894.0M  894.0M      58  100% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID       894.0M     313  894.0M    1% /mnt/lustre[MDT:1]
lustre-MDT0002_UUID       894.0M     313  894.0M    1% /mnt/lustre[MDT:2]
lustre-OST0000_UUID         4.0G   26.2M    4.0G    1% /mnt/lustre[OST:0]
lustre-OST0001_UUID         4.0G   26.1M    4.0G    1% /mnt/lustre[OST:1]
lustre-OST0002_UUID         4.0G   28.1M    4.0G    1% /mnt/lustre[OST:2]
lustre-OST0003_UUID         4.0G   26.6M    4.0G    1% /mnt/lustre[OST:3]
lustre-OST0004_UUID         4.0G   28.2M    4.0G    1% /mnt/lustre[OST:4]
lustre-OST0005_UUID         4.0G   27.3M    4.0G    1% /mnt/lustre[OST:5]
lustre-OST0006_UUID         4.0G   27.5M    4.0G    1% /mnt/lustre[OST:6]
lustre-OST0007_UUID         4.0G   28.0M    4.0G    1% /mnt/lustre[OST:7]
lustre-OST0008_UUID         4.0G   27.5M    4.0G    1% /mnt/lustre[OST:8]
lustre-OST0009_UUID         4.0G   26.4M    4.0G    1% /mnt/lustre[OST:9]
lustre-OST000a_UUID         4.0G   27.9M    4.0G    1% /mnt/lustre[OST:10]
lustre-OST000b_UUID         4.0G   28.4M    4.0G    1% /mnt/lustre[OST:11]
lustre-OST000c_UUID         4.0G   28.3M    4.0G    1% /mnt/lustre[OST:12]
lustre-OST000d_UUID         4.0G   27.8M    4.0G    1% /mnt/lustre[OST:13]
lustre-OST000e_UUID         4.0G   27.6M    4.0G    1% /mnt/lustre[OST:14]
lustre-OST000f_UUID         4.0G   27.1M    4.0G    1% /mnt/lustre[OST:15]
lustre-OST0010_UUID         4.0G   26.5M    4.0G    1% /mnt/lustre[OST:16]
lustre-OST0011_UUID         4.0G   27.3M    4.0G    1% /mnt/lustre[OST:17]
lustre-OST0012_UUID         4.0G   27.1M    4.0G    1% /mnt/lustre[OST:18]
lustre-OST0013_UUID         4.0G   28.8M    4.0G    1% /mnt/lustre[OST:19]
lustre-OST0014_UUID         4.0G   28.2M    4.0G    1% /mnt/lustre[OST:20]
lustre-OST0015_UUID         4.0G   26.1M    4.0G    1% /mnt/lustre[OST:21]
lustre-OST0016_UUID         4.0G   27.2M    4.0G    1% /mnt/lustre[OST:22]
lustre-OST0017_UUID         4.0G   28.7M    4.0G    1% /mnt/lustre[OST:23]
lustre-OST0018_UUID         4.0G   28.5M    4.0G    1% /mnt/lustre[OST:24]
lustre-OST0019_UUID         4.0G   28.3M    4.0G    1% /mnt/lustre[OST:25]
lustre-OST001a_UUID         4.0G   27.3M    4.0G    1% /mnt/lustre[OST:26]
lustre-OST001b_UUID         4.0G   27.0M    4.0G    1% /mnt/lustre[OST:27]
lustre-OST001c_UUID         4.0G   28.8M    4.0G    1% /mnt/lustre[OST:28]
lustre-OST001d_UUID         4.0G   28.5M    4.0G    1% /mnt/lustre[OST:29]
filesystem_summary:         2.6G  894.0M    1.7G   34% /mnt/lustre

After some searching on Google, I found that open files on the compute nodes can keep inodes in use. With the command below I got the list of nodes where files were open:

lctl get_param mdt.*.exports.*.open_files

I logged in to each server, identified the open files with lsof, and killed the processes holding them, but the inode usage still has not dropped. Our primary goal is to bring the inode usage on MDT0000 from 100% down to below 90%, and then to share the inode load across the other two MDTs (see the sketch after this message). We need your guidance and support.

regards,
Ihsan
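For that last point, spreading metadata across MDT0001 and MDT0002, a hedged sketch, assuming DNE is enabled and a Lustre release recent enough for lfs migrate -m; the paths and MDT indices are examples only:

    # New directories can be placed directly on the underused MDTs
    # (by default a directory is created on the same MDT as its parent):
    lfs mkdir -i 1 /mnt/lustre/project-a    # place on MDT0001
    lfs mkdir -i 2 /mnt/lustre/project-b    # place on MDT0002

    # Or stripe a new directory's entries across all three MDTs:
    lfs mkdir -c 3 /mnt/lustre/shared

    # Existing directories can be migrated off MDT0000; test on a
    # small tree first, since migration is disruptive:
    lfs migrate -m 1 /mnt/lustre/some-old-dir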