Thanks for that info, Michael. So it sounds like I could go ahead and
get the 2.8 upgrades done on everything, and then hold off on any
further upgrades. We don't have a very urgent user base, so that
would work okay for us.

Hopefully this is still an "active" issue that might be solved...

Thanks again,
Patrick


On 6/24/20 11:43 AM, Hebenstreit, Michael wrote:
I would not plan a direct upgrade until Whamcloud fixes the underlying issue. 
Currently the only viable way seems to be a step-by-step upgrade. I imagine 
you'd first upgrade to 2.10.8, and then copy all old files to a new place 
(something like: mkdir .new_copy; rsync -a * .new_copy; rm -rf *; mv 
.new_copy/* .; rmdir .new_copy) so that all files are re-created with the 
correct information. Knut's script is a hack and a last-minute resort.
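The copy-and-replace recipe above can be dry-run on a scratch directory first. A hedged sketch (I substitute cp -a for rsync -a purely for portability; on the real system the directory would live on the upgraded Lustre mount, and you should have a verified backup before doing this in place):

```shell
# Demonstrates the pattern from the mail above on a throwaway directory:
# copy everything into a hidden subdir, delete the originals, and move
# the copies back so every file is freshly re-created.
DIR=$(mktemp -d)              # stand-in for a directory on the Lustre mount
echo "data" > "$DIR/file1"    # example content

cd "$DIR"
mkdir .new_copy
cp -a -- * .new_copy/         # '*' skips dotfiles, so .new_copy is safe
rm -rf -- *                   # remove originals; hidden .new_copy survives
mv .new_copy/* .
rmdir .new_copy
cat file1                     # contents survive; the file object is new
```

On a real OST-backed directory the re-created files get current-format metadata, which is the whole point of the exercise.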

-----Original Message-----
From: lustre-discuss <[email protected]> On Behalf Of 
Patrick Shopbell
Sent: Wednesday, June 24, 2020 12:36
To: [email protected]
Subject: Re: [lustre-discuss] problem after upgrading 2.10.4 to 2.12.4


Hello all,
I have been following this discussion with interest, as we are in the process 
of a long-overdue upgrade of our small Lustre system. We are moving everything 
from

RHEL 6 + Lustre 2.5.2

to

RHEL 7 + Lustre 2.8.0

We are taking this route merely because 2.8.0 supported both RHEL 6 and 7, and 
so we could keep running, to some extent. (In reality, we have found that v2.8 
clients crash our v2.5 MGS on a pretty regular basis.)

Once our OS upgrades are done, the plan is to then take everything to

RHEL 7 + Lustre 2.12.x

 From what I gather on this thread, however, I should expect to have some 
difficulty reading most of my files, since we have been running 2.5 for a long 
time. So I should plan on running Knut's 'update_25_objects' on all of my 
OSTs? Is that correct? Will I need to do that at Lustre 2.8.0, or not until I 
get to v2.12? Also, I assume this issue is independent of the underlying 
filesystem - we are still running ldiskfs on our 12 OSTs, rather than ZFS.

Thanks so much. This list is always very helpful and interesting.
--
Patrick


On 6/24/20 1:16 AM, Franke, Knut wrote:
Am Dienstag, den 23.06.2020, 20:03 +0000 schrieb Hebenstreit, Michael:
> Is there any way to stop the scans on the OSTs?
Yes, by re-mounting them with -o noscrub. This doesn't fix the issue
though.
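For reference, that remount looks roughly like the following; this is a hedged admin sketch, with the device path and mount point as placeholders for your actual OST backend:

```shell
# Assumed device/mount names; adapt to your system. Remounting the OST
# with "-o noscrub" stops the OI scrub from running, but does NOT repair
# the bad FID sequences.
umount /mnt/ostX
mount -t lustre -o noscrub /dev/sdX /mnt/ostX
```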

> Is there any way to force the file system checks?
As shown in your second mail, the scrubs are already running.
Unfortunately, they don't (as of Lustre 2.12.4) fix the issue.

> Has anyone found a workaround for the FID sequence errors?
Yes, see the script attached to LU-13392. In short:

0. Make sure you have a backup. This might eat your lunch and fry your
cat for afters.
1. Enable the canmount property and set a mountpoint on the backend
dataset. For example:
     [oss]# zfs set canmount=on mountpoint=/mnt/ostX ${fsname}-ost/ost
2. Mount the target as 'zfs'. For example:
     [oss]# zfs mount ${fsname}-ost/ost
3. Run update_25_objects /mnt/ostX
4. Unmount and remount the OST as 'lustre'.

This will rewrite the extended attributes of OST objects created by
Lustre 2.4/2.5 to a format compatible with 2.12.
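Collected into one script, the steps above look roughly like this. A hedged sketch only: ${fsname}, the dataset name, and the mount point are placeholders from the example commands, and step 0 (a verified backup) still comes first:

```shell
#!/bin/sh
# Sketch of steps 1-4 above on one OSS; names are placeholders.
set -e
fsname=lustre                                 # assumed filesystem name
umount /mnt/ostX                              # stop serving the OST
zfs set canmount=on mountpoint=/mnt/ostX ${fsname}-ost/ost
zfs mount ${fsname}-ost/ost                   # mount backend as plain ZFS
update_25_objects /mnt/ostX                   # rewrite old-format xattrs
zfs umount ${fsname}-ost/ost
zfs set canmount=off ${fsname}-ost/ost        # back to Lustre-managed mounting
mount -t lustre ${fsname}-ost/ost /mnt/ostX   # remount as 'lustre'
```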

> Can I downgrade from 2.12.4 to 2.10.8 without destroying the FS?
We've done this successfully, but again - no guarantees.

> Has the error described in https://jira.whamcloud.com/browse/LU-13392
> been fixed in 2.12.5?
I don't think so.

Cheers,
Knut



--

*--------------------------------------------------------------------*
| Patrick Shopbell               Department of Astronomy             |
| [email protected]          Mail Code 249-17                    |
| (626) 395-4097                 California Institute of Technology  |
| (626) 568-9352  (FAX)          Pasadena, CA  91125                 |
| WWW: http://www.astro.caltech.edu/~pls/                            |
*--------------------------------------------------------------------*

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
