Re: [lustre-discuss] changing inode size on MDT

Hebenstreit, Michael Thu, 03 Oct 2019 04:03:39 -0700

So you are saying on a zfs based Lustre there is no way to increase the number 
of available inodes? I have 8TB MDT with roughly 17G inodes


[root@elfsa1m1 ~]# df -h
Filesystem       Size  Used Avail Use% Mounted on
mdt0000          8.3T  256K  8.3T   1% /mdt0000

[root@elfsa1m1 ~]# df -i
Filesystem           Inodes  IUsed       IFree IUse% Mounted on
mdt0000         17678817874      6 17678817868    1% /mdt0000

Formating under Lustre 2.10.8

mkfs.lustre --mdt --backfstype=zfs --fsname=lfsarc01 --index=0 
--mgsnid="36.101.92.22@tcp" --reformat mdt0000/mdt0000

this translates to only 948M inodes on the Lustre FS.

[root@elfsa1m1 ~]# df -i
Filesystem           Inodes  IUsed       IFree IUse% Mounted on
mdt0000         17678817874      6 17678817868    1% /mdt0000
mdt0000/mdt0000   948016092    263   948015829    1% /lfs/lfsarc01/mdt

[root@elfsa1m1 ~]# df -h
Filesystem       Size  Used Avail Use% Mounted on
mdt0000          8.3T  256K  8.3T   1% /mdt0000
mdt0000/mdt0000  8.2T   24M  8.2T   1% /lfs/lfsarc01/mdt

and there is no reasonable option to provide more file entries except for 
adding another MDT?

Thanks
Michael

From: Andreas Dilger <[email protected]>
Sent: Wednesday, October 02, 2019 18:49
To: Hebenstreit, Michael <[email protected]>
Cc: Mohr Jr, Richard Frank <[email protected]>; [email protected]
Subject: Re: [lustre-discuss] changing inode size on MDT

There are several confusing/misleading comments on this thread that need to be 
clarified...

On Oct 2, 2019, at 13:45, Hebenstreit, Michael 
<[email protected]<mailto:[email protected]>> wrote:

http://wiki.lustre.org/Lustre_Tuning#Number_of_Inodes_for_MDS

Note that I've updated this page to reflect current defaults.  The Lustre 
Operations Manual has a much better description of these parameters.


and I'd like to use --mkfsoptions='-i 1024' to have more inodes in the MDT. We 
already run out of inodes on that FS (probably due to an ZFS bug in early IEEL 
version) - so I'd like to increase #inodes if possible.

The "-i 1024" option (bytes-per-inode ratio) is only needed for ldiskfs since 
it statically allocates the inodes at mkfs time, it is not relevant for ZFS 
since ZFS dynamically allocates inodes and blocks as needed.

On Oct 2, 2019, at 14:00, Colin Faber 
<[email protected]<mailto:[email protected]>> wrote:
With 1K inodes you won't have space to accommodate new features, IIRC the 
current minimal limit on modern lustre is 2K now. If you're running out of MDT 
space you might consider DNE and multiple MDT's to accommodate that larger name 
space.

To clarify, since Lustre 2.10 any new ldiskfs MDT will allocate 1024 bytes for 
the inode itself (-I 1024).  That allows enough space *within* the inode to 
efficiently store xattrs for more complex layouts (PFL, FLR, DoM).  If xattrs 
do not fit inside the inode itself then they will be stored in an external 4KB 
inode block.

The MDT is formatted with a bytes-per-inode *ratio* of 2.5KB, which means 
(approximately) one inode will be created for every 2.5kB of the total MDT 
size.  That 2.5KB of space includes the 1KB for the inode itself, plus space 
for a directory entry (or multiple if hard-linked), extra xattrs, the journal 
(up to 4GB for large MDTs), Lustre recovery logs, ChangeLogs, etc.  Each 
directory inode will have at least one 4KB block allocated.

So, it is _possible_ to reduce the inode *ratio* below 2.5KB if you know what 
you are doing (e.g. 2KB/inode or 1.5KB/inode, this can be an arbitrary number 
of bytes, it doesn't have to be an even multiple of anything) but it definitely 
isn't possible to have 1KB inode size and 1KB per inode ratio, as there 
wouldn't be *any* space left for directories, log files, journal, etc.

Cheers, Andreas
--
Andreas Dilger
Principal Lustre Architect
Whamcloud

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

Re: [lustre-discuss] changing inode size on MDT

Reply via email to