On 13.11.2013 at 09:34, Martin B Nielsen wrote:
Probably common sense, but I was bitten by this once in a similar situation.

If you run 3x replication and distribute the replicas over 3 hosts (is that the default now?), make sure that the remaining disks on the host with the failed disk have enough free space to absorb it. The remaining two disks will have to hold the content of the failed disk, and if they can't, your cluster will run full and halt.
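
As a rough illustration of that check, here is a minimal Python sketch. The function name, the per-disk numbers and the 0.95 full threshold are assumptions made for the example, and it assumes the failed disk's data spreads evenly over the surviving disks on the same host, which a real cluster will only approximate.

```python
def host_can_absorb(failed_used_gb, survivors, full_ratio=0.95):
    """Return True if the surviving disks on a host can take over the
    data of a failed disk without crossing the full threshold.

    failed_used_gb: data held by the failed disk, in GB
    survivors: list of (used_gb, capacity_gb) for the remaining disks
    full_ratio: assumed fraction of capacity at which a disk counts as full
    """
    total_used = failed_used_gb + sum(used for used, _ in survivors)
    total_capacity = sum(capacity for _, capacity in survivors)
    return total_used <= full_ratio * total_capacity


# Three 1000 GB disks at 60% each: the two survivors end up at 90%.
print(host_can_absorb(600, [(600, 1000), (600, 1000)]))
# The same disks at 70% each would need 105% of the remaining capacity.
print(host_can_absorb(700, [(700, 1000), (700, 1000)]))
```

The point of the arithmetic is that a host which looks comfortable at 70% per disk is already too full to lose one of three disks.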


I prefer to simply kill the OSD daemon (which marks it down), replace the HDD and start the OSD daemon again. This way no data is rebalanced and the new OSD automatically copies all the data it's missing. If replacing the faulty HDD, after killing the OSD, takes more than 15 minutes, make sure to run "ceph osd set noout" first, and don't forget to unset it again ("ceph osd unset noout") when done.
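
The order of operations described above can be sketched as a small Python helper that only builds the list of commands and does not run anything. The function name and the OSD id are hypothetical, and the "service ceph stop/start osd.N" lines assume a sysvinit-style setup; adjust them to however your daemons are managed. Only the noout commands come from the text above.

```python
def replacement_plan(osd_id, expect_long_outage):
    """Return the ordered commands for swapping the disk behind one OSD.

    Nothing is executed; the list is meant to be read or printed.
    """
    steps = []
    if expect_long_outage:
        # Keep the cluster from marking the OSD out and rebalancing.
        steps.append("ceph osd set noout")
    # Assumption: sysvinit-style service management.
    steps.append(f"service ceph stop osd.{osd_id}")
    steps.append("# replace the disk and prepare it for the OSD here")
    steps.append(f"service ceph start osd.{osd_id}")
    if expect_long_outage:
        # Easy to forget: without this the flag stays set cluster-wide.
        steps.append("ceph osd unset noout")
    return steps


for line in replacement_plan(osd_id=3, expect_long_outage=True):
    print(line)
```

Pairing the set and unset in one place is deliberate, since a forgotten noout flag silently disables the automatic recovery you would want for the next failure.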


ceph-users mailing list
