Hello Ceph community,

We are provisioning a new Ceph cluster and would appreciate design advice. Below is our current hardware:
- 13 data servers, each with 18 x 24 TB HDD and 2 x 3.5 TB SSD (6 servers in rack A, 7 in rack B)
- 3 control servers, each with 2 x 960 GB SSD and 4 x 1.9 TB SSD (2 in rack A, 1 in rack B)

The planned topology is:

- A single CephFS backed by a data pool
- The 3 control servers will run MON/MGR/MDS and other daemons; the 13 data servers will host OSDs and some extra MONs as needed.

Our current idea (constraint: we cannot add hardware) is that on each data server we build a RAID1 from the two 3.5 TB SSDs and carve it into a 500 GB virtual disk for the OS and a 3 TB virtual disk for the BlueStore DB/WAL. Each HDD becomes an OSD, giving 18 x 24 TB = 432 TB of raw HDD per data node. (A provisioning sketch is appended in the P.S. below.)

Our questions:

1) Is it acceptable to host the WAL/DB on a RAID1 virtual disk built from the two SSDs, or is that a bad idea for performance/reliability?

2) Is 3 TB of DB/WAL per data node (~167 GB per OSD) likely to be sufficient for 18 x 24 TB OSDs, or should we expect to need more DB space given CephFS metadata/object counts? (We have seen a 2% rule cited; that would be ~8.6 TB per node, which is much larger.)

3) Alternative: keep the WAL/DB on the HDDs (i.e., do not separate them) and use the 3 TB RAID1 SSD area plus the control servers' SSDs (4 x 1.9 TB each) as SSD OSDs for the CephFS metadata pool. Would that be a reasonable approach? (See the pool/rule sketch in the P.S.)

4) If we do SSD-backed metadata OSDs with 3x replication for the metadata pool, we were thinking of these CRUSH domains: domain1 = the 6 data servers in rack A (6 x 3 TB SSD OSDs), domain2 = the 7 data servers in rack B (7 x 3 TB), domain3 = the 3 control servers (12 x 1.9 TB SSD OSDs). Any pitfalls with that CRUSH layout? (See the CRUSH sketch in the P.S.)

Any suggestions would be highly appreciated.

Thank you,
Gustavo
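P.S. To make the above more concrete, here are rough sketches of what we have in mind. All device names, host names, pool names, and PG counts below are placeholders, not anything we have deployed yet.

Per-data-server OSD provisioning (assuming /dev/sdb..sds are the 18 HDDs and /dev/sdt is the 3 TB RAID1 virtual disk reserved for DB/WAL; the 500 GB OS virtual disk is left untouched). As far as we understand, "ceph-volume lvm batch" slices the DB device evenly across the OSDs it creates, which would give each OSD roughly 3 TB / 18 = ~166 GB of DB, and the WAL co-locates with the DB when no separate WAL device is given:

    # Dry run first; drop --report to actually create the OSDs.
    ceph-volume lvm batch --bluestore \
        /dev/sd{b..s} \
        --db-devices /dev/sdt \
        --report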
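For the alternative in question 3, the rough plan would be device-class-based CRUSH rules plus a 3x-replicated SSD metadata pool and an HDD data pool. (We have not settled on replicated vs. EC for the data pool; replicated is shown only to keep the sketch short, and the PG counts are placeholders we would let the autoscaler adjust.)

    # Rules restricted by device class, failure domain = host
    ceph osd crush rule create-replicated meta_ssd default host ssd
    ceph osd crush rule create-replicated data_hdd default host hdd
    # Pools and the filesystem
    ceph osd pool create cephfs_metadata 128 128 replicated meta_ssd
    ceph osd pool create cephfs_data 4096 4096 replicated data_hdd
    ceph osd pool set cephfs_metadata size 3
    ceph fs new cephfs cephfs_metadata cephfs_data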
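For question 4, the CRUSH layout we were imagining reuses the "rack" bucket type for the three domains and then places one replica per domain on SSD OSDs. Host names here (data-a01, data-b01, ctrl01) are made up:

    # Three domain buckets under the default root
    ceph osd crush add-bucket domain1 rack
    ceph osd crush add-bucket domain2 rack
    ceph osd crush add-bucket domain3 rack
    ceph osd crush move domain1 root=default
    ceph osd crush move domain2 root=default
    ceph osd crush move domain3 root=default
    # Repeat per host: 6 rack-A data hosts -> domain1,
    # 7 rack-B data hosts -> domain2, 3 control hosts -> domain3
    ceph osd crush move data-a01 rack=domain1
    ceph osd crush move data-b01 rack=domain2
    ceph osd crush move ctrl01 rack=domain3
    # 3x rule that picks one SSD OSD from each domain
    ceph osd crush rule create-replicated meta_ssd_3dom default rack ssd
    ceph osd pool set cephfs_metadata crush_rule meta_ssd_3dom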
