Hi,

We had an issue a few months ago with the underlying zpool for one of our OSTs. I managed to get it mounted in read-only mode and migrated all of the files off it with lfs migrate, then recreated the OST and reintroduced it. This all went pretty smoothly. At the same time I updated our progressive file layout using the following command:

    lfs find . -type d -print0 | xargs -0 lfs setstripe -E 256M -c 1 -E eof -c -1

I then ran an lfs find to find all the files bigger than 256M and migrated them to this new layout.

I have since noticed that the OST that was reintroduced has been filling up more rapidly than the others, to the point where it is now full:

    UUID                     bytes    Used  Available  Use%  Mounted on
    scratchc-MDT0000_UUID     1.4T  108.0G       1.3T    8%  /mnt/scratchc[MDT:0]
    scratchc-OST0000_UUID    55.2T   55.2T      42.0M  100%  /mnt/scratchc[OST:0]
    scratchc-OST0001_UUID    55.2T   22.5T      32.7T   41%  /mnt/scratchc[OST:1]
    scratchc-OST0002_UUID    46.0T   19.3T      26.7T   43%  /mnt/scratchc[OST:2]
    scratchc-OST0003_UUID    46.0T   19.4T      26.6T   43%  /mnt/scratchc[OST:3]
    scratchc-OST0004_UUID    46.0T   19.5T      26.5T   43%  /mnt/scratchc[OST:4]
    scratchc-OST0005_UUID    55.2T   22.8T      32.5T   42%  /mnt/scratchc[OST:5]

    filesystem_summary:     303.8T  158.8T     145.0T   53%  /mnt/scratchc

For reference, I marked the OST as inactive to migrate the files off it, as per the manual, using the command:

    lctl set_param osp.scratchc-OST0000-osc-MDT0000.max_create_count=0

To reactivate it after having rebuilt it, I copied the count from the other OSTs:

    ~]# lctl get_param osp.scratchc-*.max_create_count
    osp.scratchc-OST0000-osc-MDT0000.max_create_count=20000
    osp.scratchc-OST0001-osc-MDT0000.max_create_count=20000
    osp.scratchc-OST0002-osc-MDT0000.max_create_count=20000
    osp.scratchc-OST0003-osc-MDT0000.max_create_count=20000
    osp.scratchc-OST0004-osc-MDT0000.max_create_count=20000
    osp.scratchc-OST0005-osc-MDT0000.max_create_count=20000

As far as I can tell I haven't told Lustre to preferentially use the one OST, so I'm a little stumped as to why this has happened. It is possible that someone has changed the default layout on some of their folders, but I'm struggling to think of a quick way of checking this.

Has anyone else run into similar problems? I'm hoping there is something incredibly obvious that I've missed somewhere!

Thanks in advance!
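
For anyone wanting to spot this kind of imbalance quickly, here is a minimal sketch that filters lfs df output for OSTs at or above a usage threshold. It does not answer the layout question and is not something from our setup: the function name flag_full_osts and the default threshold of 90% are illustrative assumptions, and it assumes the column order shown in the output above (Use% in the fifth column).

```shell
# Hypothetical helper: read `lfs df` output on stdin and print every OST
# whose Use% is at or above a threshold (default 90, an arbitrary choice).
flag_full_osts() {
    awk -v limit="${1:-90}" '
        # OST rows look like:
        #   scratchc-OST0000_UUID 55.2T 55.2T 42.0M 100% /mnt/scratchc[OST:0]
        # MDT rows, the header and the summary line do not match and are skipped.
        $1 ~ /-OST[0-9a-fA-F]+_UUID$/ {
            pct = $5
            sub(/%/, "", pct)            # strip the trailing percent sign
            if (pct + 0 >= limit + 0)    # force a numeric comparison
                print $1, $5
        }
    '
}

# Usage on a client (mount point taken from the output above):
#   lfs df -h /mnt/scratchc | flag_full_osts 90
```

Run against the figures above with a threshold of 90, this would list only scratchc-OST0000_UUID.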
Jon Marshall
High Performance Computing Specialist
IT and Scientific Computing Team
Cancer Research UK Cambridge Institute
Li Ka Shing Centre | Robinson Way | Cambridge | CB2 0RE
Web <http://www.cruk.cam.ac.uk/> | Facebook <http://www.facebook.com/cancerresearchuk> | Twitter <http://twitter.com/CR_UK>
