Hi all,

In September we'll need to power down a CephFS cluster (currently
running Mimic) for a several-hour electrical intervention.

Having never done this before, I thought I'd check with the list.
Here's our planned procedure:

1. umount /cephfs from all HPC clients.
2. ceph osd set noout
3. wait until there is zero IO on the cluster
4. stop all MDSs (active + standby)
5. stop all OSDs.
(6. we won't stop the mons, as they are not affected by the
electrical intervention)
7. power off the cluster.
...
8. power on the cluster, OSDs first, then MDSs. wait for HEALTH_OK.
9. ceph osd unset noout
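
For reference, the command-level parts of the steps above could be sketched roughly as below. This is only a sketch under assumptions, not a tested procedure: it assumes the daemons are managed by systemd through the stock ceph-mds.target and ceph-osd.target units, and the function names and the DRY_RUN switch are made up for illustration. Unmounting the clients and waiting for zero IO stay manual. One caveat, as far as I know: a set noout flag raises a health warning by itself, so the cluster may sit at HEALTH_WARN rather than HEALTH_OK until the flag is unset, which is why the sketch only prints the status instead of waiting on it.

```shell
#!/bin/sh
# Hypothetical helper functions for the shutdown/startup order above.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 2, on one admin node: keep stopped OSDs from being marked out.
pre_shutdown() {
    run ceph osd set noout
}

# Step 4, on each MDS host (assumes systemd-managed daemons).
stop_mds() {
    run systemctl stop ceph-mds.target
}

# Step 5, on each OSD host (assumes systemd-managed daemons).
stop_osd() {
    run systemctl stop ceph-osd.target
}

# Step 8, after power-on: show cluster state for a manual check.
show_status() {
    run ceph status
}

# Step 9, once the OSDs and MDSs are back up.
post_startup() {
    run ceph osd unset noout
}
```

To try it, source the file and call the functions in order; with the default DRY_RUN=1, pre_shutdown just prints "+ ceph osd set noout" and changes nothing.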

Seems pretty simple... Are there any gotchas I'm missing? Maybe
there's some special procedure to stop the MDSs cleanly?

Cheers, dan
