[ceph-users] Proper procedure to replace DB/WAL SSD

Caspar Smit Fri, 23 Feb 2018 05:39:04 -0800

Hi All,

What would be the proper way to preventively replace a DB/WAL SSD (one that
is nearing its DWPD/TBW limit but has not failed yet)?


It hosts the DB partitions for 5 OSDs.

Maybe something like:

1) ceph osd reweight the 5 OSDs to 0
2) let backfilling complete
3) destroy/remove the 5 OSDs
4) replace the SSD
5) create 5 new OSDs with a separate DB partition on the new SSD
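
To make steps 1-3 concrete, here is a rough sketch of how I imagine them as
commands. The OSD ids are made-up placeholders, and I am assuming Luminous or
later, where 'ceph osd purge' is available. By default it only prints the
commands instead of running them:

```shell
#!/bin/sh
# Sketch of the drain-and-remove part of the plan (steps 1-3).
# The OSD ids are hypothetical placeholders.
# DRY_RUN=1 (the default) only prints each command.
OSD_IDS="${OSD_IDS:-10 11 12 13 14}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Step 1: set the override weight of every affected OSD to 0.
drain_osds() {
    for id in $OSD_IDS; do
        run ceph osd reweight "$id" 0
    done
}

# Step 3: stop each OSD daemon and remove the OSD from the cluster.
remove_osds() {
    for id in $OSD_IDS; do
        run systemctl stop "ceph-osd@$id"
        run ceph osd purge "$id" --yes-i-really-mean-it
    done
}

drain_osds
echo "# step 2: wait here until 'ceph -s' shows all PGs active+clean"
remove_osds
```

Steps 4 and 5 (swapping the SSD and recreating the OSDs) are left out because
they depend on the deployment tool in use.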

When these 5 OSDs are big HDDs (8TB), a LOT of data has to be moved, so I
thought maybe the following would work:

1) ceph osd set noout
2) stop the 5 OSDs (systemctl stop)
3) 'dd' the old SSD to a new SSD of the same or bigger size
4) remove the old SSD
5) start the 5 OSDs (systemctl start)
6) let backfilling/recovery complete (only the delta written between the OSD
stop and now)
7) ceph osd unset noout
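
Spelled out as commands, I picture this second plan roughly as follows. This
is an untested sketch: the OSD ids and both device names are placeholders, and
it only prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch of the clone-in-place plan. All ids and device names below are
# hypothetical placeholders. DRY_RUN=1 (the default) only prints each command.
OSD_IDS="${OSD_IDS:-10 11 12 13 14}"
OLD_SSD="${OLD_SSD:-/dev/OLD_SSD}"
NEW_SSD="${NEW_SSD:-/dev/NEW_SSD}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

clone_db_ssd() {
    # Keep the cluster from marking the stopped OSDs out and rebalancing.
    run ceph osd set noout
    for id in $OSD_IDS; do
        run systemctl stop "ceph-osd@$id"
    done
    # Whole-device copy, so the partition table travels with the data.
    run dd "if=$OLD_SSD" "of=$NEW_SSD" bs=4M conv=fsync
    echo "# remove the old SSD here, before starting the OSDs again"
    for id in $OSD_IDS; do
        run systemctl start "ceph-osd@$id"
    done
    echo "# wait until 'ceph -s' shows recovery has finished"
    run ceph osd unset noout
}

clone_db_ssd
```

The reason for pulling the old SSD before step 5 is that a whole-device copy
leaves two disks carrying identical partition identifiers while both are
attached.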

Would this be a viable method to replace a DB SSD? Is there any udev/serial
number/UUID stuff that would prevent this from working?
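
For the UUID part, one thing I could compare before and after the swap is
where each OSD's DB link points. A small sketch, assuming the OSDs reference
their DB partition through a 'block.db' symlink under /var/lib/ceph/osd (this
may differ depending on how the OSDs were deployed):

```shell
#!/bin/sh
# Print where each OSD's block.db symlink resolves. OSD_DIR defaults to the
# usual data directory; that path and the block.db name are assumptions here.
OSD_DIR="${OSD_DIR:-/var/lib/ceph/osd}"

show_db_links() {
    for link in "$OSD_DIR"/*/block.db; do
        [ -L "$link" ] || continue   # skip if nothing matched or not a symlink
        printf '%s -> %s\n' "${link#"$OSD_DIR"/}" "$(readlink -f "$link")"
    done
}

show_db_links
```

If the links resolve through partition UUIDs rather than device serial
numbers, identical output before and after the copy would suggest the OSDs
can still find their DB partitions.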

Or is there another 'less hacky' way to replace a DB SSD without moving too
much data?

Kind regards,
Caspar

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
