Hi,

One way to keep the OSD from being recreated on the faulty drive as soon as it is zapped (without disabling the OSD service entirely) is to set the _no_schedule label on the host with 'ceph orch host label add <hostname> _no_schedule' and to remove the label after the drive has been replaced.
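For example, roughly like this (an untested sketch; osd.33 and ceph-osd15 are taken from the log excerpt further down, adjust them to your environment):

---snip---
# keep cephadm from scheduling new daemons on this host
ceph orch host label add ceph-osd15 _no_schedule

# remove the failed OSD, preserving its id for the replacement
ceph orch osd rm 33 --replace

# ... physically replace the failed drive ...

# allow scheduling again; the OSD service can now redeploy osd.33 on the new disk
ceph orch host label rm ceph-osd15 _no_schedule
---snip---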
Best regards,
Frédéric.

Frédéric Nass
Senior Ceph Engineer
Ceph Ambassador, France
+49 89 215252-751
frederic.n...@clyso.com
www.clyso.com
Hohenzollernstr. 27, 80801 Munich
Utting a. A. | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE2754306

On Wed, 20 Aug 2025 at 10:42, Eugen Block <ebl...@nde.ag> wrote:

> Hi,
>
> I think I found the right place [0]:
>
> ---snip---
> if any_replace_params:
>     # mark destroyed in osdmap
>     if not osd.destroy():
>         raise orchestrator.OrchestratorError(
>             f"Could not destroy {osd}")
>     logger.info(
>         f"Successfully destroyed old {osd} on {osd.hostname}; ready for replacement")
>     if any_replace_params:
>         osd.zap = True
> ...
>
> if osd.zap:
>     # throws an exception if the zap fails
>     logger.info(f"Zapping devices for {osd} on {osd.hostname}")
>     osd.do_zap()
> ---snip---
>
> So if the replace flag is set, Ceph will zap the device(s). I compared
> the versions; the change came in between 19.2.0 and 19.2.1.
>
> On the one hand, I agree with the OP: if Ceph immediately zaps the
> drive(s), it will redeploy the destroyed OSD on the faulty disk.
> On the other hand, if you don't let cephadm zap the drives, you'll need
> manual intervention during the actual disk replacement, since the OSD
> will be purged, leaving the DB/WAL LVs on the disk.
>
> It would be interesting to learn what led to the decision to implement
> it like this, but I also don't see an "optimal" way of doing this. I
> wonder if it could make sense to zap only the DB/WAL devices, not the
> data device, in case of a replacement. Then, when the faulty disk gets
> replaced, the orchestrator could redeploy an OSD, since the data drive
> should be clean and there should be space for DB/WAL.
>
> Regards,
> Eugen
>
> [0]
> https://github.com/ceph/ceph/blob/v19.2.3/src/pybind/mgr/cephadm/services/osd.py#L909
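Another way to hold the orchestrator back during the replacement, instead of (or in addition to) the host label, is to temporarily flag the OSD service as unmanaged so cephadm stops creating OSDs from newly discovered devices. A rough sketch (untested; the service name is taken from the placement spec quoted further down):

---snip---
# stop cephadm from creating OSDs for this drivegroup
ceph orch set-unmanaged osd.dashboard-admin-1633624229976

# ... remove the OSD, replace the drive, clean up the old DB/Block LVs ...

# let the orchestrator act on the clean drive again
ceph orch set-managed osd.dashboard-admin-1633624229976
---snip---

On releases without set-unmanaged/set-managed, the same can be achieved by exporting the spec with 'ceph orch ls osd --export', adding 'unmanaged: true' to it, and re-applying it with 'ceph orch apply -i <spec.yaml>'.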
> Quoting Dmitrijs Demidovs <dmitrijs.demid...@carminered.eu>:
>
> > Hi List!
> >
> > We have Squid 19.2.2. It is a cephadm/Docker-based deployment
> > (recently upgraded from Pacific 16.2.15).
> > We are using 8 SAS drives for Block and 2 SSD drives for DB on every
> > OSD Host.
> >
> > Problem:
> >
> > One of the SAS Block drives failed on an OSD Host and we need to replace it.
> > When our Ceph cluster was running on Pacific, we usually performed
> > drive replacement using these steps:
> >
> > 1) Edit -> MARK osd.xx as OUT [re-balancing starts, wait until it is completed]
> > 2) Edit -> MARK osd.xx as DOWN
> > 3) Edit -> DELETE osd.xx [check "Preserve OSD ID" and "Yes, I am sure"]
> > 4) Edit -> DESTROY osd.xx
> > 5) Edit -> PURGE osd.xx [re-balancing starts, wait until it is completed]
> > 6) Set the "noout" and "norebalance" flags. Put the OSD Host in maintenance
> >    mode. Shut down the OSD Host. Replace the failed drive. Start the OSD Host.
> > 7) Wipe the old DB Logical Volume (LV) [dd if=/dev/zero
> >    of=/dev/ceph-xxx/osd-db-xxx bs=1M count=10 conv=fsync].
> > 8) Wipe the new Block disk. Destroy the old DB LV. Wait for automatic
> >    discovery and creation of the new osd.xx instance.
> >
> > In Pacific 16.2.15, after execution of the PURGE command, Ceph just
> > removed the old osd.xx instance from the cluster without deleting/zapping
> > the DB and Block LVs.
> >
> > Now in Squid 19.2.2 we see that Ceph behaves differently.
> > Execution of step 3 (Edit -> DELETE osd.xx) automatically executes
> > DESTROY and PURGE, and after that Ceph automatically performs
> > zapping and deletion of the DB LV and Block LV!
> > And after that its automatic discovery finds the "clean" SAS disk plus
> > free space on the SSD drive and happily forms a new osd.xx instance from
> > the failed drive that we need to replace :)
> >
> >
> > Questions:
> >
> > 1) What is the correct procedure to replace a failed Block drive in
> >    Ceph Squid 19.2.2?
> > 2) Is it possible to disable zapping?
> > 3) Is it possible to temporarily disable automatic discovery of new
> >    drives for the OSD service?
> >
> >
> > P.S.
> >
> > Here is our Placement Specification for the OSD service:
> >
> > [ceph: root@ceph-mon12 /]# ceph orch ls osd osd.dashboard-admin-1633624229976 --export
> > service_type: osd
> > service_id: dashboard-admin-1633624229976
> > service_name: osd.dashboard-admin-1633624229976
> > placement:
> >   host_pattern: '*'
> > spec:
> >   data_devices:
> >     rotational: true
> >   db_devices:
> >     rotational: false
> >   db_slots: 4
> >   filter_logic: AND
> >   objectstore: bluestore
> >
> >
> > Logs from Ceph:
> >
> > 12/8/25 08:54 AM [INF] Cluster is now healthy
> > 12/8/25 08:54 AM [INF] Health check cleared: PG_DEGRADED (was: Degraded data redundancy: 11/121226847 objects degraded (0.000%), 1 pg degraded)
> > 12/8/25 08:54 AM [WRN] Health check update: Degraded data redundancy: 11/121226847 objects degraded (0.000%), 1 pg degraded (PG_DEGRADED)
> > 12/8/25 08:54 AM [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 1 pg peering)
> > 12/8/25 08:54 AM [WRN] Health check failed: Degraded data redundancy: 12/121226847 objects degraded (0.000%), 2 pgs degraded (PG_DEGRADED)
> > 12/8/25 08:54 AM [WRN] Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)
> > 12/8/25 08:54 AM [INF] osd.33 [v2:10.10.10.105:6824/297218036,v1:10.10.10.105:6825/297218036] boot
> > 12/8/25 08:54 AM [WRN] OSD bench result of 1909.305644 IOPS is not within the threshold limit range of 50.000000 IOPS and 500.000000 IOPS for osd.33. IOPS capacity is unchanged at 315.000000 IOPS. The recommendation is to establish the osd's IOPS capacity using other benchmark tools (e.g. Fio) and then override osd_mclock_max_capacity_iops_[hdd|ssd].
> > 12/8/25 08:53 AM [INF] Deploying daemon osd.33 on ceph-osd15
> > 12/8/25 08:53 AM [INF] Found osd claims for drivegroup dashboard-admin-1633624229976 -> {'ceph-osd15': ['33']}
> > 12/8/25 08:53 AM [INF] Found osd claims -> {'ceph-osd15': ['33']}
> > 12/8/25 08:53 AM [INF] Detected new or changed devices on ceph-osd15
> > 12/8/25 08:52 AM [INF] Successfully zapped devices for osd.33 on ceph-osd15
> > 12/8/25 08:52 AM [INF] Zapping devices for osd.33 on ceph-osd15
> > 12/8/25 08:52 AM [INF] Successfully destroyed old osd.33 on ceph-osd15; ready for replacement
> > 12/8/25 08:52 AM [INF] Successfully removed osd.33 on ceph-osd15
> > 12/8/25 08:52 AM [INF] Removing key for osd.33
> > 12/8/25 08:52 AM [INF] Removing daemon osd.33 from ceph-osd15 -- ports []

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io