Hello list,

after upgrading a filesystem (with ZFS backend) from Lustre 2.10 to 2.12, I've started to see errors when stat()ing some of the older files: lstat() hangs, and after disabling auto-scrub on the OSSes it returns -1 with EREMCHG. I'm wondering why nobody else seems to be having this issue. Errors like the following appear on OST1 (but not OST0):

00080000:00020000:4.0:1585213596.566408:0:36486:0:(osd_object.c:481:osd_check_lma()) fs-OST0001: FID-in-LMA [0x100000000:0x145:0x0] does not match the object self-fid [0x100010000:0x145:0x0]

Has anyone else been seeing these?

The check that's failing here was added in commit 89ead21 (LU-7585 zfs: OI scrub for ZFS), which has been included since Lustre 2.11, so I'm guessing that the inconsistency was simply ignored by Lustre 2.10.

In lustre/osp/osp_internal.h I found the following comment:

> In 2.6+ ost_idx is packed into IDIF FID, while in 2.4 and 2.5 IDIF is
> always FID_SEQ_IDIF(0x100000000ULL), which does not include OST index
> in the seq.

Looking at the inaccessible files (and the OSS logs), it seems that the issue can be traced to lookup failures of objects on OST 1 with FID-in-LMA sequence number 0x100000000 (i.e. written by Lustre 2.4/2.5 according to the above, which is a reasonable assumption for the filesystem and files in question). For these objects, Lustre either erroneously adds the OST index to the self-fid or erroneously compares the new-style self-fid to the old-style FID-in-LMA.

If this is true, the error should occur for basically all files written by Lustre 2.4/2.5, except for those that have a stripe count of 1 and reside only on OST 0.

Does this make sense? Should osd_check_lma() simply be more lenient in its check in order to allow for old-style FIDs? Should I manually change the affected trusted.lma EAs on OST1 to include the OST index in the sequence number? Or would either of these cause issues in other places?

More details at https://jira.whamcloud.com/projects/LU/issues/LU-13392

Kind regards,
Knut Franke

-- 
Knut Franke
Systems Engineer
science + computing ag
Teamline: +49 7071 94 57 680
Hagellocher Weg 73
D-72070 Tübingen
Website: https://www.atos.net/
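
For illustration, the mismatch described in this post can be reproduced with a minimal sketch. It assumes that the new-style IDIF sequence carries the OST index in bits 16-31, which is consistent with the two FIDs in the log message (0x100000000 vs. 0x100010000 on OST index 1) but is not taken from the Lustre sources. The function names are hypothetical, and the "lenient" comparison only models the relaxation asked about above; whether such a relaxation is safe is exactly the open question.

```python
# Sketch only: the bit layout is an assumption inferred from the log line,
# and none of these helpers are Lustre APIs.

FID_SEQ_IDIF = 0x100000000  # base IDIF sequence, per the quoted comment
OST_IDX_SHIFT = 16          # assumed position of the OST index
OST_IDX_MASK = 0xFFFF << OST_IDX_SHIFT


def idif_seq(ost_idx: int) -> int:
    """New-style (2.6+) IDIF sequence with the OST index packed in."""
    return FID_SEQ_IDIF | ((ost_idx & 0xFFFF) << OST_IDX_SHIFT)


def strip_ost_idx(seq: int) -> int:
    """Drop the assumed OST index bits, giving the old-style sequence."""
    return seq & ~OST_IDX_MASK


def lma_matches(lma_seq: int, self_seq: int, lenient: bool = False) -> bool:
    """Compare the FID-in-LMA sequence with the object's self-fid sequence.

    Strict mode models the check that fails in the log message.
    Lenient mode (hypothetical) also accepts an old-style FID-in-LMA
    that differs from the self-fid only by the missing OST index.
    """
    if lma_seq == self_seq:
        return True
    return lenient and lma_seq == strip_ost_idx(self_seq)


old_style_lma = FID_SEQ_IDIF  # as written by 2.4/2.5, no OST index

for ost_idx in (0, 1):
    self_seq = idif_seq(ost_idx)
    print(
        f"OST{ost_idx}: self-fid seq {self_seq:#x}, "
        f"strict={lma_matches(old_style_lma, self_seq)}, "
        f"lenient={lma_matches(old_style_lma, self_seq, lenient=True)}"
    )
```

Under these assumptions the strict comparison passes on OST 0 and fails on OST 1, matching the observation that the errors appear only on OST1.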
