On 11/4/19 2:52 PM, Alwin Antreich wrote:
Signed-off-by: Alwin Antreich <a.antre...@proxmox.com>
---
  pveceph.adoc | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++
  1 file changed, 54 insertions(+)

diff --git a/pveceph.adoc b/pveceph.adoc
index 087c4d0..127e3bb 100644
--- a/pveceph.adoc
+++ b/pveceph.adoc
@@ -331,6 +331,7 @@ network. In a Ceph cluster, you will usually have one OSD per physical disk.
  NOTE: By default an object is 4 MiB in size.
+[[pve_ceph_osd_create]]
  Creating OSDs
  ~~~~~~~~~~~~~
@@ -407,6 +408,7 @@ Starting with Ceph Nautilus, {pve} does not support creating such OSDs with
  ceph-volume lvm create --filestore --data /dev/sd[X] --journal /dev/sd[Y]
  ----
+[[pve_ceph_osd_destroy]]
  Destroying OSDs
  ~~~~~~~~~~~~~~~
@@ -724,6 +726,58 @@ pveceph pool destroy NAME
  ----
+Ceph maintenance
+----------------
+Replace OSDs
+~~~~~~~~~~~~
+One of the common maintenance tasks in Ceph is to replace a disk of an OSD. If
... the disk ...
+a disk already failed, you can go ahead and run through the steps in
+xref:pve_ceph_osd_destroy[Destroying OSDs]. As no data is accessible from the
+disk. Ceph will recreate those copies on the remaining OSDs if possible
... a disk is already in a failed state, the data on it is not accessible anymore and you can run through the steps in xref:pve_ceph_osd_destroy[Destroying OSDs]. Ceph will recreate the missing copies on the remaining OSDs if possible.
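Maybe also show how to spot failed OSDs first? IIRC `ceph osd tree down` filters the tree for down OSDs (please double check):
----
ceph osd tree down
----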

+
+For replacing a still functioning disk. From the GUI run through the steps as
+shown in xref:pve_ceph_osd_destroy[Destroying OSDs]. The only addition is to
+wait till the cluster shows 'HEALTH_OK' before stopping the OSD to destroy it.

To replace a still functioning disk via the GUI, run through the steps in xref:pve_ceph_osd_destroy[Destroying OSDs] with one addition: wait until the cluster shows 'HEALTH_OK' before stopping the OSD to destroy it.

+
+On the command line use the below commands.
... use the following commands:
+----
+ceph osd out osd.<id>
+----
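Might be worth noting that the rebalancing triggered by marking the OSD out can be watched, e.g. with:
----
ceph -s
----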
+
+You can check with the below command if the OSD can be already removed.
... with the command below if the OSD can be safely removed.
# or
... the following command if the OSD can be safely removed:
+----
+ceph osd safe-to-destroy osd.<id>
+----
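One could also poll until the check succeeds, something like (OSD id 4 only as an example):
----
while ! ceph osd safe-to-destroy osd.4; do sleep 60; done
----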
+
+Once the above check tells you that it is save to remove the OSD, you can
+continue with below commands.
... tells you that it is safe to remove the OSD, you can continue with the following commands:
+----
+systemctl stop ceph-osd@<id>.service
+pveceph osd destroy <id>
+----
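If the disk is going to be reused, maybe mention the cleanup option of `pveceph osd destroy`, which also wipes the disk's partition structures (please verify the exact flag):
----
pveceph osd destroy <id> --cleanup
----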
+
+Replace the old with the new disk and use the same procedure as described in
+xref:pve_ceph_osd_create[Creating OSDs].

Replace the old disk with the new one and use the same procedure...
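A short example would not hurt here either, e.g.:
----
pveceph osd create /dev/sd[X]
----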
+
+NOTE: With the default size/min_size (3/2) of a pool, recovery only starts when
+`size + 1` nodes are available.
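Perhaps also add how to check a pool's size/min_size, e.g.:
----
ceph osd pool get <poolname> size
ceph osd pool get <poolname> min_size
----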
+
+Run fstrim (discard)
+~~~~~~~~~~~~~~~~~~~~
+It is a good measure to run fstrim (discard) regularly on VMs or containers.
+This releases data blocks that the filesystem isn’t using anymore. It reduces
+data usage and the resource load.

... to run 'fstrim' (discard) ...
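Example commands would fit here, e.g. (in a VM, plain fstrim works if discard is enabled on the virtual disk; for containers there is `pct fstrim`):
----
# inside the VM
fstrim -av
# for a container, on the {pve} node
pct fstrim <vmid>
----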
+
+Scrub & Deep Scrub
+~~~~~~~~~~~~~~~~~~
+Ceph insures data integrity by 'scrubbing' placement groups. Ceph check every
... Ceph ensures data integrity by 'scrubbing' placement groups. Ceph checks every ...
+object in a PG for its health. There are two forms of Scrubbing, daily
# scrubbing lower case
+(metadata compare) and weekly. The latter reads the object and uses checksums
... The weekly scrub reads the objects and uses checksums ...
+to ensure data integrity. If a running scrub interferes with business needs,
+you can adjust the time of execution of Scrub footnote:[Ceph scrubbing
... adjust the time when scrubs are executed ...
+https://docs.ceph.com/docs/nautilus/rados/configuration/osd-config-ref/#scrubbing].
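An example for restricting scrubs to a time window might help, e.g. with osd_scrub_begin_hour/osd_scrub_end_hour (please verify the option names):
----
ceph config set osd osd_scrub_begin_hour 22
ceph config set osd osd_scrub_end_hour 6
----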
+
+
  Ceph monitoring and troubleshooting
  -----------------------------------
  A good start is to continuously monitor the ceph health from the start of

