On Jun 24, 2013, at 11:22 AM, Brian Candler <b.cand...@pobox.com> wrote:

> I'm just finding my way around the Ceph documentation. What I'm hoping to 
> build are servers with 24 data disks and one O/S disk. From what I've read, 
> the recommended configuration is to run 24 separate OSDs (or 23 if I have a 
> separate journal disk/SSD), and not have any sort of in-server RAID.
> 
> Obviously, disks are going to fail - and the documentation acknowledges this.
> 
> What I'm looking for is a documented procedure for replacing a failed disk, 
> but so far I have not been able to find one. Can you point me at the right 
> place please?
> 
> I'm looking for something step-by-step and as idiot-proof as possible :-)


The official documentation is maybe not 100% idiot-proof, but it is 
step-by-step:

http://ceph.com/docs/master/rados/operations/add-or-rm-osds/

If you lose a disk, you want to remove the OSD associated with it. This 
triggers a data migration, so you are back to full redundancy as soon as it 
finishes. When you get a replacement disk, you add an OSD for it (the same as 
if you were adding an entirely new disk). This also triggers a data migration, 
so the new disk is utilized immediately. Both halves of the cycle are sketched 
below.
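For concreteness, here is a condensed sketch of that remove-then-add cycle. 
The OSD id (12), mount path, hostname, and CRUSH weight are placeholders, and 
exact syntax varies a bit by release, so treat the page above as authoritative:

    # Remove the dead OSD (id 12 is a placeholder)
    ceph osd out 12                     # mark it out; recovery starts
    sudo service ceph stop osd.12       # stop the daemon on its host
    ceph osd crush remove osd.12        # drop it from the CRUSH map
    ceph auth del osd.12                # delete its cephx key
    ceph osd rm 12                      # remove it from the cluster

    # Add an OSD on the replacement drive
    ceph osd create                     # allocates and prints the new id
    # mkfs and mount the new drive at /var/lib/ceph/osd/ceph-12, then:
    ceph-osd -i 12 --mkfs --mkkey       # initialize the data dir and key
    ceph auth add osd.12 osd 'allow *' mon 'allow rwx' \
        -i /var/lib/ceph/osd/ceph-12/keyring
    ceph osd crush add osd.12 1.0 host=myhost   # weight/host are examples
    sudo service ceph start osd.12      # start it; backfill begins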

If you have a spare or replacement disk on hand when a disk goes bad, you may 
be able to save some data migration by doing the removal and re-adding within 
a short window (one way to hold that window open is sketched below). Otherwise, 
"drive replacement" looks exactly like retiring an OSD and adding a new one 
that happens to use the same drive slot.
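One standard way to buy yourself that short window is the cluster-wide noout 
flag, which stops Ceph from marking the down OSD out and kicking off the first 
migration. A minimal sketch, assuming a release that supports the flag:

    ceph osd set noout      # down OSDs will not be marked out automatically
    # ...swap the drive and rebuild the OSD as above...
    ceph osd unset noout    # restore normal behavior when done

Don't leave noout set longer than the swap takes, or a genuinely failed OSD 
will never be re-replicated.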

JN

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
