Scott,

I’d like to strongly second all of Jongwoo’s advice, particularly the 
suggestion to add new OSTs rather than replace existing ones, if possible.  That
procedure is so much simpler and involves a lot less messing around “under the 
hood”.  It takes you from a complex procedure with many steps to, essentially, 
copying a bunch of data around while your file system remains up, and adding 
and removing a few OSTs at either end.

It would also be non-destructive for your existing data.  One of the scary 
things about the original proposed process is that if something goes wrong 
partway through, the original data is already gone (or at least very hard to 
get).

Regards,
- Patrick
________________________________
From: lustre-discuss <[email protected]> on behalf of 
Jongwoo Han <[email protected]>
Sent: Thursday, February 28, 2019 5:36:54 AM
To: Scott Wood
Cc: [email protected]
Subject: Re: [lustre-discuss] Draining and replacing OSTs with larger volumes



On Thu, Feb 28, 2019 at 11:09 AM Scott Wood 
<[email protected]> wrote:
Hi folks,

Big upgrade process in the works and I had some questions.  Our current 
infrastructure has 5 HA pairs of OSSs and arrays with an HA pair of management 
and metadata servers who also share an array, all running lustre 2.10.3.  
Pretty standard stuff.  Our upgrade plan is as follows:

1) Deploy a new HA pair of OSSs with arrays populated with OSTs that are twice 
the size of our originals.
2) Follow the process in section 14.9 of the lustre docs to drain all OSTs in 
one of the existing HA pairs' arrays (see the sketch after this list)
3) Repopulate the first old pair of deactivated and drained arrays with new 
larger drives
4) Upgrade the offline OSSs from 2.10.3 to 2.10.latest?
5) Return them to service
6) Repeat steps 2-5 for the other 4 old HA pairs of OSSs and OSTs
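
A minimal sketch of the drain in step 2, following manual section 14.9, 
assuming a filesystem named "lustre", OST index 0005, and a client mount at 
/mnt/lustre (all placeholder names; substitute your own):

    # On the MDS: stop new objects being allocated to the OST being drained
    lctl set_param osp.lustre-OST0005*.max_create_count=0

    # On a client: migrate existing objects off that OST
    lfs find --ost lustre-OST0005_UUID /mnt/lustre | lfs_migrate -y

    # Check per-OST usage to confirm the OST is empty before pulling it
    lfs df /mnt/lustre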

I'd expect this would be doable without downtime as we'd only be taking arrays 
offline that have no objects on them, and we've added new arrays and OSSs 
before with no issues.  I have a few questions before we begin the process:

1) My interpretation of the docs is that we're OK to install them with 2.10.6 
(or 2.10.7, if it's out), as rolling upgrades within X.Y are supported.  Is 
that correct?

In theory, a rolling upgrade should work, but the generally recommended 
procedure is to stop the filesystem, unmount all MDS and OSS nodes, upgrade 
the packages, and bring them back up.  This prevents human error during 
repeated per-server upgrades.  Done correctly, it should take no more than 
two hours.
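
For what it's worth, a rough outline of that full-stop approach (host and 
mount names are placeholders, and the unmount order matters: clients first, 
then OSTs, then MDT, then MGT):

    # On every client
    umount /mnt/lustre

    # On each OSS: unmount all OSTs
    umount /mnt/ost0 /mnt/ost1

    # On the MDS: MDT first, MGT last
    umount /mnt/mdt
    umount /mnt/mgt

    # Upgrade Lustre packages on every server (exact package set varies
    # by distro and build)
    yum update "lustre*" kmod-lustre

    # Remount in reverse order: MGT, MDT, OSTs, then clients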

2) Until the whole process is complete, we'll have imbalanced OSTs.  I know 
that's not ideal, but is it all that big an issue?

A rolling upgrade will cause imbalance, but in the long run newly created 
files will be distributed evenly across the OSTs.  There is no need to worry 
about it in a one-shot upgrade scenario.
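
If you want to watch the imbalance while things are in flight, per-OST usage 
is visible from any client (mount point is a placeholder):

    # Show how full each OST is; the new, larger OSTs will fill faster
    lfs df -h /mnt/lustre

Once free space diverges past a threshold, the MDS allocator switches from 
round-robin to a space-weighted policy, so fuller OSTs receive proportionally 
fewer new objects.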

3) When draining the OSTs of files, section 14.9.3, point 2.a states that the 
lfs find | lfs migrate pipeline can take multiple OSTs as arguments, but I 
thought it would be better to run one instance of it per OST and distribute 
them across multiple clients.  Is that reasonable (and faster)?

Parallel redistribution is generally faster than migrating one OST at a time.  
If the MDT can endure the scanning load, run multiple migrate processes, each 
against one OST.
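
A sketch of that parallel layout, with placeholder client hostnames and the 
fsname "lustre":

    # On client1
    lfs find --ost lustre-OST0000_UUID /mnt/lustre | lfs_migrate -y
    # On client2
    lfs find --ost lustre-OST0001_UUID /mnt/lustre | lfs_migrate -y
    # ...and so on, one OST per client

Note that each lfs find instance still walks the whole namespace, so the 
metadata scanning load on the MDT grows with the number of parallel instances.
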
4) When the drives are replaced with bigger ones, can the original OST 
configuration files be restored to them as described in Docs section 14.9.5, or 
due the the size mismatch, will that be bad?

Since this process treats objects as files, the configuration should carry 
over the same way.
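
If restoring the old configuration files proves troublesome, one alternative 
is to format each replacement OST with the same index and the --replace flag, 
which re-registers it with the MGS as the same target (device, index, and 
MGS NID below are placeholders):

    mkfs.lustre --ost --reformat --replace --index=5 \
        --fsname=lustre --mgsnode=10.0.0.1@tcp /dev/sdb

The larger size itself is not a problem; mixed-size OSTs are supported, at 
the cost of the imbalance discussed in question 2.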

5) What questions should I be asking that I haven't thought of?


I do not know the size of the OSTs you are dealing with, but I think a 
migrate (empty), replace, migrate, replace cycle is a really painful process, 
as it will take a long time.  If circumstances allow, I suggest adding all 
the new OST arrays to the OSSs with new OST indices, migrating the OST 
objects, then deactivating and removing the old OSTs (see the sketch below).
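
A sketch of the tail end of that approach, once an old OST (placeholder 
index 0005) has been emptied:

    # On the MGS: permanently mark the old OST inactive for MDS and clients
    lctl conf_param lustre-OST0005.osc.active=0

    # Double-check nothing still has objects there before unmounting it
    lfs find --ost lustre-OST0005_UUID /mnt/lustre | wc -l

The old indices stay in the filesystem configuration and clients will list 
those OSTs as inactive, which is cosmetic rather than harmful.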

If that all goes well, and we did upgrade the OSSs to a newer 2.10.x, we'd 
follow it up with a migration of the MGT and MDT to one of the management 
servers, upgrade the other, fail them back, upgrade the second, and rebalance 
the MDT and MGT services back across the two.  We'd expect the usual pause in 
services as those migrate but other than that, fingers crossed, should all be 
good.  Are we missing anything?
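
For reference, that MDT/MGT move is the standard failover dance for a shared 
array, roughly as below (mount points and device are placeholders, and an HA 
manager such as Pacemaker would normally drive this rather than by-hand 
mounts):

    # On mds1: release the MDT
    umount /mnt/mdt

    # On mds2: take it over from the shared array
    mount -t lustre /dev/mapper/mdt_vol /mnt/mdt

    # Upgrade mds1, fail back, then repeat for the MGT and the second server

Clients block and then reconnect when the target comes up on its failover 
NID, which is the "usual pause in services" mentioned above.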


If this plan is unavoidable, the rolling migrate and upgrade should be 
planned carefully.  It would be better to build a solid procedure checklist 
by practicing on a virtual environment running identical versions.

Cheers
Scott

--
Jongwoo Han
+82-505-227-6108