Hi

We currently have a 3-DC setup with 6 HDD servers in each DC, running replica 3 and EC 4+5 pools. This works fine for mostly dormant data, but of course not so well for workloads requiring low latency. It's mostly big RBDs exported via kernel NFS, with a bit of CephFS.

We want to start migrating some of these big RBD-backed NFS shares to NVMe. We currently have 3 E3.S servers, one in each DC, with an EC 4+5 pool using DC->OSD crush selection. We are wondering about the cheapest way to increase redundancy in the short term, since we don't have enterprise money.
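For context, the crush rule is along these lines (a simplified sketch with made-up names, not our exact rule):

    rule nvme_ec_4_5 {
        id 3
        type erasure
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        # only the NVMe OSDs
        step take default class nvme
        # 3 DCs x 3 OSDs each = 9 shards for k=4, m=5
        step choose indep 3 type datacenter
        step choose indep 3 type osd
        step emit
    }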

With what we have we can lose 1 DC (aka host in this case) plus 1 OSD and still be online, but there's nowhere to backfill to, and some hardware problems, like a faulty mainboard or CPU, can take down an entire "DC". Ideally we would have several more hosts in each DC so we had DC->Host->OSD, but that's not an option.
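To spell out the math (assuming the default min_size = k+1):

    k=4, m=5              -> 9 shards, 3 per DC
    min_size = k+1        = 5
    lose 1 DC (3 shards)  -> 6 shards left
    lose 1 more OSD       -> 5 shards left = min_size, still active

One more failure after that and the PGs go inactive, though with k=4 shards left the data itself would still be intact.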

I am thinking it could make sense to add just 1 more server in a 4th DC and keep the 4+5 rule as is, which would simply give us a buffer of 1 more failed host before we hit problems. Thoughts?
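If I understand the crush tooling right, that should just be something like this (bucket and host names hypothetical):

    ceph osd crush add-bucket dc4 datacenter
    ceph osd crush move dc4 root=default
    ceph osd crush move nvme-host4 datacenter=dc4

With the rule still choosing 3 datacenters out of what would then be 4, the spare DC should become a valid backfill target once a failed host's OSDs are marked out.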

Best regards,

Torkil

--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: tor...@drcmr.dk