Hello Ceph community,

We are provisioning a new Ceph cluster and would appreciate design advice. 
Below is our current hardware:

- 13 data servers each with 18 x 24 TB HDD and 2 x 3.5 TB SSD (6 servers in 
rack A, 7 in rack B)
- 3 control servers each with 2 x 960 GB SSD and 4 x 1.9 TB SSD (2 in rack A, 1 
in rack B)

The planned topology is:

- A single CephFS backed by a data pool
- The 3 control servers will run MON/MGR/MDS and other daemons. The 13 data 
servers will host OSDs and some extra MONs as needed.

Our current idea (constraint: we cannot add hardware) is, on each data server, 
to create a RAID1 from the two 3.5 TB SSDs and carve it into a 500 GB virtual 
disk for the OS and a 3 TB virtual disk for BlueStore DB/WAL. Each HDD will be 
a separate OSD, giving 18 x 24 TB = 432 TB of raw HDD per data node.
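
For clarity, here is the per-node arithmetic behind that layout (a quick 
back-of-the-envelope sketch in Python; all figures are the nominal ones from 
above, not exact binary sizes):

    # Back-of-the-envelope layout per data server (nominal decimal units).
    hdd_count, hdd_tb = 18, 24      # 18 x 24 TB HDDs, one OSD each
    ssd_raid1_tb      = 3.5         # 2 x 3.5 TB SSD in RAID1 -> 3.5 TB usable
    os_partition_tb   = 0.5         # 500 GB virtual disk for the OS
    dbwal_tb          = 3.0         # remaining ~3 TB virtual disk for DB/WAL

    raw_hdd_per_node_tb = hdd_count * hdd_tb            # 432 TB raw per node
    dbwal_per_osd_gb    = dbwal_tb * 1000 / hdd_count   # ~167 GB DB/WAL per OSD

    print(raw_hdd_per_node_tb, round(dbwal_per_osd_gb))  # 432 167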

1) Is it acceptable to host WAL/DB on a RAID1 virtual disk made from the two 
SSDs, or is that a bad idea for performance/reliability?

2) Is 3 TB of DB/WAL per data node (~167 GB per OSD) likely to be sufficient 
for 18 x 24 TB OSDs, or should we expect to need more DB space given CephFS 
metadata/object counts? (We've seen a 2% rule cited; that would be ~8.6 TB per 
node, which is far more than the 3 TB we have planned.)
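
To make the comparison concrete, here is the arithmetic behind those numbers 
(a sketch only; it assumes the 2% figure is applied to raw OSD capacity, which 
is our interpretation):

    # Compare the planned DB/WAL space with the cited "2% of OSD size" rule.
    osd_tb, osds_per_node  = 24, 18
    planned_db_per_node_tb = 3.0

    db_per_osd_2pct_gb  = 0.02 * osd_tb * 1000                        # 480 GB per OSD
    db_per_node_2pct_tb = db_per_osd_2pct_gb * osds_per_node / 1000   # 8.64 TB per node
    planned_db_per_osd_gb = planned_db_per_node_tb * 1000 / osds_per_node  # ~167 GB

    print(round(db_per_osd_2pct_gb), db_per_node_2pct_tb, round(planned_db_per_osd_gb))
    # 480 8.64 167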

3) Alternative: keep the WAL/DB on the HDDs (i.e., no separate DB/WAL device), 
and use the ~3 TB RAID1 SSD area plus the control servers' SSDs (4 x 1.9 TB 
each) as SSD OSDs for the CephFS metadata pool. Would that be a reasonable 
approach?

4) If we do SSD-backed metadata OSDs with 3x replication for the metadata pool, 
we were thinking of CRUSH domains: domain1 = 6 data servers in rack A (6 x 3 TB 
SSD OSDs), domain2 = 7 data servers in rack B (7 x 3 TB), domain3 = 3 control 
servers (12 x 1.9 TB SSD OSDs). Any pitfalls with that CRUSH layout?
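
For reference, the rough per-domain SSD capacity with that split (a sketch; it 
assumes the CRUSH rule places one replica per domain, so the pool's usable 
capacity is bounded by the smallest domain):

    # Raw SSD capacity per proposed CRUSH failure domain (nominal TB).
    domain1_tb = 6 * 3.0      # rack A data servers:  18.0 TB
    domain2_tb = 7 * 3.0      # rack B data servers:  21.0 TB
    domain3_tb = 3 * 4 * 1.9  # control servers:      22.8 TB

    # With 3x replication and one replica per domain, usable capacity is
    # limited by the smallest domain (before nearfull/full ratios).
    usable_tb = min(domain1_tb, domain2_tb, domain3_tb)
    print(domain1_tb, domain2_tb, domain3_tb, usable_tb)  # 18.0 21.0 22.8 18.0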

Any suggestions would be highly appreciated.

Thank you,
Gustavo