Existing information is slightly modified and retained.

Add information:
* Mention and link to the sections "Troubleshooting" and "Replace OSDs"
* CLI commands (pveceph) must be executed on the affected node
* Check in advance the "Used (%)" of OSDs to avoid blocked I/O
* Check and wait until the OSD can be stopped safely
* Use `pveceph stop` instead of `systemctl stop ceph-osd@<ID>.service`
* Explain cleanup option a bit more
Signed-off-by: Alexander Zeidler <a.zeid...@proxmox.com>
---
 pveceph.adoc | 58 ++++++++++++++++++++++++++++------------------------
 1 file changed, 31 insertions(+), 27 deletions(-)

diff --git a/pveceph.adoc b/pveceph.adoc
index 4e1c1e2..754c401 100644
--- a/pveceph.adoc
+++ b/pveceph.adoc
@@ -502,33 +502,37 @@ ceph-volume lvm create --filestore --data /dev/sd[X] --journal /dev/sd[Y]
 Destroy OSDs
 ~~~~~~~~~~~~
 
-To remove an OSD via the GUI, first select a {PVE} node in the tree view and go
-to the **Ceph -> OSD** panel. Then select the OSD to destroy and click the **OUT**
-button. Once the OSD status has changed from `in` to `out`, click the **STOP**
-button. Finally, after the status has changed from `up` to `down`, select
-**Destroy** from the `More` drop-down menu.
-
-To remove an OSD via the CLI run the following commands.
-
-[source,bash]
-----
-ceph osd out <ID>
-systemctl stop ceph-osd@<ID>.service
-----
-
-NOTE: The first command instructs Ceph not to include the OSD in the data
-distribution. The second command stops the OSD service. Until this time, no
-data is lost.
-
-The following command destroys the OSD. Specify the '-cleanup' option to
-additionally destroy the partition table.
-
-[source,bash]
-----
-pveceph osd destroy <ID>
-----
-
-WARNING: The above command will destroy all data on the disk!
+If you experience problems with an OSD or its disk, try to
+xref:pve_ceph_mon_and_ts[troubleshoot] them first to decide if a
+xref:pve_ceph_osd_replace[replacement] is needed.
+
+To destroy an OSD:
+
+. Either open the web interface and select any {pve} node in the tree
+view, or open a shell on the node where the OSD to be deleted is
+located.
+
+. Go to the __Ceph -> OSD__ panel (`ceph osd df tree`). If the OSD to
+be deleted is still `up` and `in` (non-zero value at `AVAIL`), make
+sure that all OSDs have their `Used (%)` value well below the
+`nearfull_ratio` (default `85%`). This reduces the risk that the
+upcoming rebalancing causes OSDs to run full, thereby blocking I/O
+on Ceph pools.
+
+. If the OSD to be deleted is not `out` yet, select it and click on
+**Out** (`ceph osd out <id>`). This excludes it from data
+distribution and starts a rebalance.
+
+. Click on **Stop**, and if a warning appears, click on **Cancel** and
+try again shortly afterwards. When using the shell, check if it is
+safe to stop by reading the output of `ceph osd ok-to-stop <id>`.
+Once it is, run `pveceph stop --service osd.<id>`.
+
+. **Attention, this step removes the OSD from Ceph and deletes all
+disk data.** To continue, first click on **More -> Destroy**. Use the
+cleanup option to clean up the partition table and similar, enabling
+an immediate reuse of the disk in {pve}. Finally, click on **Remove**
+(`pveceph osd destroy <id> [--cleanup]`).
 
 [[pve_ceph_pools]]
-- 
2.39.5
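
For reference, the shell side of the procedure described in the patch could be sketched as follows. This is only an illustration and not part of the patch: the function name `destroy_osd` and the `RUN` dry-run variable are hypothetical, and the wait loop assumes that `ceph osd ok-to-stop` exits non-zero while stopping is not yet safe. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the CLI steps for destroying an OSD.
# Must be run on the node that hosts the OSD.
# Check `ceph osd df tree` beforehand: all OSDs should be well below
# the nearfull_ratio before the rebalance is triggered.

destroy_osd() {
    id="$1"
    if [ -z "$id" ]; then
        echo "usage: destroy_osd <id>" >&2
        return 1
    fi

    # Hypothetical dry-run switch: if RUN is unset, commands are only
    # echoed. Set RUN to an empty string to really execute them.
    run="${RUN-echo}"

    # Exclude the OSD from data distribution; this starts a rebalance.
    $run ceph osd out "$id"

    # Wait until the OSD can be stopped safely.
    # Assumption: a non-zero exit status means "not safe yet".
    until $run ceph osd ok-to-stop "$id"; do
        sleep 30
    done

    # Stop the OSD service through pveceph rather than systemctl.
    $run pveceph stop --service "osd.$id"

    # Remove the OSD from Ceph and delete all disk data; --cleanup
    # additionally cleans up the partition table for immediate reuse.
    $run pveceph osd destroy "$id" --cleanup
}
```

Calling `destroy_osd 7` prints the four commands in order for review. `RUN= destroy_osd 7` would execute them, which destroys all data on that OSD's disk.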