What OST index (number) were you trying to add?

Andreas is right:
Note that your "--index=0051" value is probably interpreted as an octal number 
"41", it should be "--index=0x0051" or "--index=0x51" (hex, to match the OST 
device name) or "--index=81" (decimal).

And you said:
I'm aware that index 51 actually translates to hex 33 (local-OST0033_UUID).

Ok, 0051 (octal, because of the leading zeros*) translates to decimal 41, as 
Andreas pointed out, but that's 0x29 in hexadecimal, not 0x33.  If you wanted 
decimal 51, then you'd have run mkfs.lustre with the wrong index.  For decimal 
51, you'd have to use --index=51, --index=0x33, or --index=0063.
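
To make the arithmetic above concrete, here is a small Python sketch. It 
assumes the C strtoul()-with-base-0 convention (leading "0x" means hex, a 
leading "0" means octal, anything else is decimal), which is what Andreas 
suggests is "probably" happening. Whether mkfs.lustre really parses --index 
this way is not confirmed here, and the helper names parse_c_style_int and 
ost_name are made up for illustration.

```python
def parse_c_style_int(text: str) -> int:
    """Parse an integer using the C base-0 convention (assumed, not
    confirmed, to be how the --index value is read)."""
    s = text.strip().lower()
    if s.startswith("0x"):
        return int(s[2:], 16)   # hexadecimal
    if len(s) > 1 and s.startswith("0"):
        return int(s[1:], 8)    # leading zero: octal
    return int(s, 10)           # plain decimal


def ost_name(fsname: str, index: int) -> str:
    """Format a target name with a 4-digit hex index, e.g. local-OST0033."""
    return f"{fsname}-OST{index:04x}"


for arg in ("0051", "0x51", "81", "0x33", "0063", "51"):
    value = parse_c_style_int(arg)
    print(f"--index={arg:<6} -> decimal {value:3d} -> {ost_name('local', value)}")
```

Under that assumption, 0051 comes out as decimal 41 (OST0029), while 0x33, 
0063 and 51 all come out as decimal 51 (OST0033).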

-Cory

p.s.
(*) BTW, the convention with leading zeros for octal can be googled or read 
about at https://en.wikipedia.org/wiki/Octal.


On 7/6/21, 12:35 AM, "lustre-discuss on behalf of David Cohen" 
<[email protected] on behalf of [email protected]> wrote:

Thanks Andreas,
I'm aware that index 51 actually translates to hex 33 (local-OST0033_UUID).
I don't believe that's the reason for the failed mount as it is only an index 
that I increase for every new OST and there are no duplicates.

lctl dk shows tens of thousands of lines repeating the same error after 
attempting to mount the OST:

00100000:10000000:26.0:1625546374.322973:0:248211:0:(osd_scrub.c:2039:osd_ios_scan_one())
 local-OST0033: fail to set LMA for init OI scrub: rc = -30
00100000:10000000:26.0:1625546374.322974:0:248211:0:(osd_scrub.c:2039:osd_ios_scan_one())
 local-OST0033: fail to set LMA for init OI scrub: rc = -30
00100000:10000000:26.0:1625546374.322975:0:248211:0:(osd_scrub.c:2039:osd_ios_scan_one())
 local-OST0033: fail to set LMA for init OI scrub: rc = -30
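
As a side note on reading "rc = -30": kernel-side return codes like this are 
normally negated errno values. The sketch below just looks the number up with 
Python's standard errno module; decode_rc is a hypothetical helper name, and 
the lookup reflects the errno table of the machine it runs on (on Linux, 30 is 
EROFS, "Read-only file system").

```python
import errno
import os


def decode_rc(rc: int) -> tuple:
    """Map a negative kernel-style return code to (errno name, message)."""
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


name, message = decode_rc(-30)
print(f"rc = -30 -> {name}: {message}")
```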

In /var/log/messages I see the following for dm-21, which is the new OST:

Jul  6 07:38:37 oss03 kernel: LDISKFS-fs warning (device dm-21): 
ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please 
wait.
Jul  6 07:39:19 oss03 kernel: LDISKFS-fs (dm-21): file extents enabled, maximum 
tree depth=5
Jul  6 07:39:19 oss03 kernel: LDISKFS-fs warning (device dm-21): 
ldiskfs_clear_journal_err:4862: Filesystem error recorded from previous mount: 
IO failure
Jul  6 07:39:19 oss03 kernel: LDISKFS-fs warning (device dm-21): 
ldiskfs_clear_journal_err:4863: Marking fs in need of filesystem check.
Jul  6 07:39:19 oss03 kernel: LDISKFS-fs (dm-21): warning: mounting fs with 
errors, running e2fsck is recommended
Jul  6 07:39:22 oss03 kernel: LDISKFS-fs (dm-21): recovery complete
Jul  6 07:39:22 oss03 kernel: LDISKFS-fs (dm-21): mounted filesystem with 
ordered data mode. Opts: user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc
Jul  6 07:39:22 oss03 kernel: LDISKFS-fs error (device dm-21): 
htree_dirblock_to_tree:1278: inode #2: block 21233: comm mount.lustre: bad 
entry in directory: rec_len is too small for name_len - offset=4084(4084), 
inode=0, rec_len=12, name_len=0
Jul  6 07:39:22 oss03 kernel: Aborting journal on device dm-21-8.
Jul  6 07:39:22 oss03 kernel: LDISKFS-fs (dm-21): Remounting filesystem 
read-only
Jul  6 07:39:24 oss03 kernel: LDISKFS-fs warning (device dm-21): kmmpd:187: 
kmmpd being stopped since filesystem has been remounted as readonly.
Jul  6 07:44:22 oss03 kernel: LDISKFS-fs (dm-21): error count since last fsck: 6
Jul  6 07:44:22 oss03 kernel: LDISKFS-fs (dm-21): initial error at time 
1625367384: htree_dirblock_to_tree:1278: inode 2: block 21233
Jul  6 07:44:22 oss03 kernel: LDISKFS-fs (dm-21): last error at time 
1625546362: htree_dirblock_to_tree:1278: inode 2: block 21233
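
The "initial error" and "last error" times in the last two messages are Unix 
epoch seconds. A quick way to turn them into readable dates is sketched below; 
epoch_to_utc is a hypothetical helper name and the output is in UTC, so it 
will differ from the syslog timestamps by the server's local offset.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> str:
    """Format Unix epoch seconds as a UTC timestamp string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


initial_error = epoch_to_utc(1625367384)
last_error = epoch_to_utc(1625546362)
print("initial error:", initial_error)  # 2021-07-04 02:56:24 UTC
print("last error:   ", last_error)     # 2021-07-06 04:39:22 UTC
```

The last error lines up with the 07:39:22 mount attempt if the server clock is 
at UTC+3 (an assumption), and the initial error was recorded about two days 
earlier.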

As I mentioned before, the mount never completes, so the only way out is a 
forced reboot.

Thanks,
David

On Tue, Jul 6, 2021 at 8:07 AM Andreas Dilger <[email protected]> wrote:



On Jul 5, 2021, at 09:05, David Cohen <[email protected]> wrote:

Hi,
I'm using Lustre 2.10.5 and lately tried to add a new OST.
The OST was formatted with the command below, which other than the index is the 
exact same one used for all the other OSTs in the system.

mkfs.lustre --reformat --mkfsoptions="-t ext4 -T huge" --ost --fsname=local 
--index=0051 --param ost.quota_type=ug 
--mountfsoptions='errors=remount-ro,extents,mballoc' --mgsnode=10.0.0.3@tcp 
--mgsnode=10.0.0.1@tcp --mgsnode=10.0.0.2@tcp --servicenode=10.0.0.3@tcp 
--servicenode=10.0.0.1@tcp --servicenode=10.0.0.2@tcp /dev/mapper/OST0051

Note that your "--index=0051" value is probably interpreted as an octal number 
"41", it should be "--index=0x0051" or "--index=0x51" (hex, to match the OST 
device name) or "--index=81" (decimal).



When trying to mount the OST with:
mount.lustre /dev/mapper/OST0051 /Lustre/OST0051

The system stays on 100% CPU (one core) forever and the mount never completes, 
not even after a week.

I tried tunefs.lustre --writeconf --erase-params on the MDS and all the other 
targets, but the behaviour remains the same.

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud






_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org