I don't know what the problem is here, but I would look for connectivity issues between the client and the OSS(s) that host the missing OSTs. I would imagine the lustre.log would show such errors in bulk. I have seen an IB subnet manager get into a weird state in which some nodes can no longer find a path to certain other nodes.
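
As a rough starting point, the connectivity check could look something like the sketch below, run as root on the affected client (n04). It only uses standard `lfs` and `lctl` subcommands. The function name `check_ost_paths`, the `LFS`/`LCTL` override variables, and the NID in the usage line are made up for illustration; substitute the real NIDs of the OSSs that serve the missing OSTs.

```shell
#!/bin/sh
# Sketch only: run as root on the client that is missing OSTs.
# LFS and LCTL are hypothetical override hooks, e.g. LFS=echo LCTL=echo
# for a dry run that just prints the commands' arguments.
check_ost_paths() {
    lfs_cmd=${LFS:-lfs}
    lctl_cmd=${LCTL:-lctl}

    # Ask every OSC on this client whether its OST is reachable.
    "$lfs_cmd" check osts

    # List the local Lustre devices, including one osc device per known OST.
    "$lctl_cmd" dl

    # Show which server NID each OSC is currently connected to.
    "$lctl_cmd" get_param 'osc.*.ost_conn_uuid'

    # LNet-ping each OSS NID given on the command line.
    for nid in "$@"; do
        "$lctl_cmd" ping "$nid" || echo "no LNet path to $nid" >&2
    done
}

# Example call (placeholder NID, not taken from this thread):
# check_ost_paths 192.0.2.10@o2ib
```

If the LNet ping to an OSS fails from n04 but works from a good node such as n02, that points at the network path (or the subnet manager) rather than at Lustre itself.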

Cameron

On 10/7/21 4:54 PM, Sid Young via lustre-discuss wrote:
G'Day all,

I have an odd situation where one compute node mounts /home and /lustre but only half the OSTs are present, while all the other nodes are fine. I'm not sure where to start on this one.

Good node:
[root@n02 ~]# lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
home-MDT0000_UUID     4473970688    30695424  4443273216   1% /home[MDT:0]
home-OST0000_UUID    51097721856 39839794176 11257662464  78% /home[OST:0]
home-OST0001_UUID    51097897984 40967138304 10130627584  81% /home[OST:1]
home-OST0002_UUID    51097705472 37731089408 13366449152  74% /home[OST:2]
home-OST0003_UUID    51097773056 41447411712  9650104320  82% /home[OST:3]

filesystem_summary:  204391098368 159985433600 44404843520  79% /home

UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID   5368816128    28246656  5340567424   1% /lustre[MDT:0]
lustre-OST0000_UUID  51098352640 10144093184 40954257408  20% /lustre[OST:0]
lustre-OST0001_UUID  51098497024  9584398336 41514096640  19% /lustre[OST:1]
lustre-OST0002_UUID  51098414080 11683002368 39415409664  23% /lustre[OST:2]
lustre-OST0003_UUID  51098514432 10475310080 40623202304  21% /lustre[OST:3]
lustre-OST0004_UUID  51098506240 11505326080 39593178112  23% /lustre[OST:4]
lustre-OST0005_UUID  51098429440  9272059904 41826367488  19% /lustre[OST:5]

filesystem_summary:  306590713856 62664189952 243926511616  21% /lustre

[root@n02 ~]#



Bad node:

[root@n04 ~]# lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
home-MDT0000_UUID     4473970688    30726400  4443242240   1% /home[MDT:0]
home-OST0002_UUID    51097703424 37732352000 13363446784  74% /home[OST:2]
home-OST0003_UUID    51097778176 41449634816  9646617600  82% /home[OST:3]

filesystem_summary:  102195481600 79181986816 23010064384  78% /home

UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID   5368816128    28246656  5340567424   1% /lustre[MDT:0]
lustre-OST0003_UUID  51098514432 10475310080 40623202304  21% /lustre[OST:3]
lustre-OST0004_UUID  51098511360 11505326080 39593183232  23% /lustre[OST:4]
lustre-OST0005_UUID  51098429440  9272059904 41826367488  19% /lustre[OST:5]

filesystem_summary:  153295455232 31252696064 122042753024  21% /lustre

[root@n04 ~]#
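
The difference between the two listings can also be pulled out mechanically, which helps when there are more OSTs than fit on a screen. The sketch below assumes the `lfs df` output from each node has been saved to a file; the function name `missing_osts` and the file names in the usage line are invented for the example.

```shell
# missing_osts GOOD_FILE BAD_FILE
# Print the OST UUIDs that appear in GOOD_FILE but not in BAD_FILE.
# Both files are expected to hold saved `lfs df` output.
missing_osts() {
    awk '
        $1 !~ /-OST[0-9a-fA-F]+_UUID$/ { next }        # skip headers, MDTs, summaries
        FILENAME == ARGV[1] { present[$1] = 1; next }  # first file read: the bad node
        !($1 in present)    { print $1 }               # second file read: the good node
    ' "$2" "$1"
}

# Example (hypothetical file names):
#   lfs df > /tmp/lfs_df.n02      # on the good node
#   lfs df > /tmp/lfs_df.n04      # on the bad node
#   missing_osts /tmp/lfs_df.n02 /tmp/lfs_df.n04
```

For the two listings above this reports home-OST0000 and home-OST0001 for /home, and lustre-OST0000 through lustre-OST0002 for /lustre.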



Sid Young
Translational Research Institute 


_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

