Hi all,
We have a Firefly Ceph cluster (running on Proxmox VE, but I don't think this
is relevant). We found that one OSD disk was showing quite a high number of
errors as reported by SMART, and also quite high wait times as reported
by munin, so we decided to replace it.
What I did was mark the OSD down/out, then remove it (removing its
partitions too). I then replaced the disk and created a new OSD, which was
created with the same ID as the removed one (I was hoping to avoid changing
the CRUSH map).
So everything has worked as expected, except one minor non-issue:
- Original OSD journal was on a separate SSD disk, which had partitions
#1 and #2 (journals of 2 OSDs).
- Original journal partition (#1) was removed
- A new partition has been created as #1, but it has been assigned space
after the last existing partition. So there is now a hole of 5GB at the
beginning of the SSD. Proxmox is using ceph-disk prepare for this, and I
saw in the docs (http://ceph.com/docs/master/man/8/ceph-disk/) that
ceph-disk prepare creates a new partition on the journal block device.
What I'm afraid of is that, given enough OSD replacements, Proxmox will no
longer find free space for new journals at the end of that SSD, even though
there would be plenty at the beginning.
Maybe the journal partition creation could be improved so that it also
detects free space at the beginning of the disk and between existing partitions?
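
To make the suggestion concrete, here is a rough Python sketch of the kind of gap search I have in mind. It is not how ceph-disk works today; the function name find_free_gaps and the disk layout in the example are made up for illustration (only the 5GB hole at the start matches what I see on our SSD), and offsets are in MiB for simplicity.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (first, last) offset, both inclusive


def find_free_gaps(partitions: List[Range], disk_start: int,
                   disk_end: int, min_size: int) -> List[Range]:
    """Return unallocated ranges of at least min_size units.

    Hypothetical helper: scans the space before, between and after
    the existing partitions instead of only looking past the last one.
    """
    gaps = []
    cursor = disk_start  # first offset not yet known to be allocated
    for start, end in sorted(partitions):
        if start - cursor >= min_size:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    # Space left after the last partition.
    if disk_end - cursor + 1 >= min_size:
        gaps.append((cursor, disk_end))
    return gaps


# Illustrative layout (assumed numbers, in MiB): a 30 GiB SSD where the old
# journal #1 (0..5119) was deleted, #2 is untouched, and the new #1 was
# placed after #2.
existing = [(10240, 15359), (5120, 10239)]
gaps = find_free_gaps(existing, disk_start=0, disk_end=30719, min_size=5120)
print(gaps)  # [(0, 5119), (15360, 30719)]
```

With a search like this, the 5GB hole at the start of the disk would be offered as a candidate for the next journal, rather than only the space after the last partition.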
Cheers
Eneko
--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943575997
943493611
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com