Hi,

We had an issue a few months ago with the underlying zpool for one of our OSTs. 
I managed to get it mounted in read-only mode and migrated all of the files off 
it with lfs migrate, then recreated the OST and reintroduced it. This all went 
pretty smoothly. At the same time I updated our progressive file layout using 
the following command:

lfs find . -type d -print0 | xargs -0 lfs setstripe -E 256M -c 1 -E eof -c -1

I then ran an lfs find to find all the files bigger than 256M and migrated them 
to this new layout.

I have since noticed that the OST that was reintroduced has been filling up 
more rapidly than the others, to the point where it is now full:

UUID                       bytes        Used   Available Use% Mounted on
scratchc-MDT0000_UUID        1.4T      108.0G        1.3T   8% /mnt/scratchc[MDT:0]
scratchc-OST0000_UUID       55.2T       55.2T       42.0M 100% /mnt/scratchc[OST:0]
scratchc-OST0001_UUID       55.2T       22.5T       32.7T  41% /mnt/scratchc[OST:1]
scratchc-OST0002_UUID       46.0T       19.3T       26.7T  43% /mnt/scratchc[OST:2]
scratchc-OST0003_UUID       46.0T       19.4T       26.6T  43% /mnt/scratchc[OST:3]
scratchc-OST0004_UUID       46.0T       19.5T       26.5T  43% /mnt/scratchc[OST:4]
scratchc-OST0005_UUID       55.2T       22.8T       32.5T  42% /mnt/scratchc[OST:5]

filesystem_summary:       303.8T      158.8T      145.0T  53% /mnt/scratchc
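
In case it is useful for keeping an eye on this, here is a rough sketch of a 
filter that flags any OST at or above a given Use% in output like the above. 
The function name ost_over_threshold and the default of 90 are my own 
placeholders, and it assumes the percentage appears on the same line as the 
OST UUID:

```shell
# Read "lfs df"-style output on stdin and print every OST whose Use%
# is at or above the limit given as $1 (hypothetical default: 90).
ost_over_threshold() {
    awk -v limit="${1:-90}" '
        # Only consider OST lines, e.g. scratchc-OST0000_UUID (not MDTs).
        $1 ~ /-OST[0-9a-fA-F]+_UUID$/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^[0-9]+%$/) {      # the Use% column
                    pct = $i
                    sub(/%/, "", pct)
                    if (pct + 0 >= limit + 0)
                        print $1, pct "%"
                }
        }'
}
```

It would be used as something like: lfs df -h /mnt/scratchc | ost_over_threshold 90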

For reference, as per the manual, I stopped new objects from being created on 
the OST while migrating the files off it by using the command:

lctl set_param osp.scratchc-OST0000-osc-MDT0000.max_create_count=0

To re-enable object creation after having rebuilt it, I copied the count from 
the other OSTs:

~]# lctl get_param osp.scratchc-*.max_create_count
osp.scratchc-OST0000-osc-MDT0000.max_create_count=20000
osp.scratchc-OST0001-osc-MDT0000.max_create_count=20000
osp.scratchc-OST0002-osc-MDT0000.max_create_count=20000
osp.scratchc-OST0003-osc-MDT0000.max_create_count=20000
osp.scratchc-OST0004-osc-MDT0000.max_create_count=20000
osp.scratchc-OST0005-osc-MDT0000.max_create_count=20000

As far as I can tell I haven't told Lustre to preferentially use that one OST, 
so I'm a little stumped as to why this has happened. It is possible that 
someone has changed the default layout on some of their directories, but I'm 
struggling to think of a quick way of checking this.
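
The best I have come up with so far is the untested sketch below, which walks 
the directories and prints any whose default layout names a specific starting 
OST. The function name dirs_pinned_to_ost is a placeholder of mine, and it 
assumes that "lfs getstripe -d" prints a "stripe_offset: N" field (or 
"lmm_stripe_offset: N" for composite layouts), which may differ between 
Lustre versions:

```shell
# Print directories under $1 whose default layout starts on OST index $2.
# Assumption: "lfs getstripe -d DIR" reports the starting OST as
# "stripe_offset: N" (or "lmm_stripe_offset: N"); -1 means "no preference".
dirs_pinned_to_ost() {
    lfs find "$1" -type d | while IFS= read -r dir; do
        if lfs getstripe -d "$dir" 2>/dev/null </dev/null |
            grep -Eq "stripe_offset:[[:space:]]*$2([[:space:]]|\$)"; then
            printf '%s\n' "$dir"
        fi
    done
}
```

For example: dirs_pinned_to_ost /mnt/scratchc 0. This will be slow on a large 
tree and will not cope with newlines in directory names.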

Has anyone else run into similar problems? I'm hoping there is something 
incredibly obvious that I've missed somewhere!

Thanks in advance!


Jon Marshall
High Performance Computing Specialist
IT and Scientific Computing Team
Cancer Research UK Cambridge Institute
Li Ka Shing Centre | Robinson Way | Cambridge | CB2 0RE

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org