>> .
>
> At least 16 or 32 nodes
> in 16 racks.
> Erasure coding 8+3 with failure domain on rack level.

Ack.
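For reference, a rough sketch of the arithmetic behind 8+3 versus 4+2 with 16 racks as the failure domain. The helper `ec_profile_summary` and its field names are made up for illustration and are not a Ceph API. It assumes one chunk per rack, which is what a rack-level failure domain implies.

```python
from fractions import Fraction


def ec_profile_summary(k: int, m: int, failure_domains: int) -> dict:
    """Summarise a k+m erasure-coding profile (hypothetical helper, not a Ceph API).

    k: number of data chunks, m: number of coding chunks,
    failure_domains: number of racks available (assumes one chunk per rack).
    """
    if k < 1 or m < 1:
        raise ValueError("k and m must both be at least 1")
    width = k + m  # total chunks written per object
    if failure_domains < width:
        raise ValueError("need at least k+m failure domains to place every chunk")
    return {
        "profile": f"{k}+{m}",
        "chunks_per_object": width,
        # raw bytes stored per usable byte
        "raw_per_usable": Fraction(width, k),
        # fraction of raw capacity that ends up usable
        "usable_fraction": Fraction(k, width),
        # chunks (here: racks) that can be lost without losing data
        "tolerated_failures": m,
        # racks left over as recovery targets when one rack is lost
        "spare_domains": failure_domains - width,
    }


if __name__ == "__main__":
    for k, m in ((8, 3), (4, 2)):
        s = ec_profile_summary(k, m, failure_domains=16)
        print(
            s["profile"],
            "raw/usable =", float(s["raw_per_usable"]),
            "tolerates", s["tolerated_failures"], "rack failures,",
            s["spare_domains"], "spare racks",
        )
```

So 8+3 stores less raw data per usable byte than 4+2 and survives one more failure, at the cost of touching 11 racks per object instead of 6, which is where the slower scrub and recovery comes from.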
> We initially selected 8+3 over 4+2 because we expect rebuilds to take very
> long with nodes this big and we don't want to lose redundancy

Fair enough. You get more nines with m=3 for sure, though the wider profile itself will mean slower scrubs and recovery. I suspect you set mon_osd_down_out_subtree_limit?

>
>>>
>>> Splitting JBOD logically into 2 servers isn't an issue for us because we
>>> will replicate data on rack level and not host level.
>>>
>>>
>>> Common specifications for all variants
>>>
>>> 5-6GB of RAM per 1 HDD
>> Plus more for mons and other daemons? Especially MDS?
>
> Other daemons will be on some dedicated non-storage servers.

Ack. Had to ask.

> We aim for low RAM/HDD on storage nodes. Other daemons won't fit there.
>
>>> 2% of HDD capacity in NVMe devices for block.db (or none)
>>> 2x 50Gb or 2x 100Gb Ethernet per server (active-backup bonded interfaces)
>>> (CPU per OSD to be determined)
>>>
>>>
>>> Variant A1 is very unlikely to happen but we are curious what network
>>> interface speeds you would suggest for so many HDDs in one node.
>> 100GE bonded at the least. Depends on your workload.
>>>
>>> Variant A2 is most likely the one we will choose for large deployments.
>>>
>>> Variant B1/B2 for smaller deployments.
>>>
>>> Do any of you run Ceph on similar setups? Did you find any pitfalls
>>> with it?
>>>
>>> What are your minimal recommendations for network speed per HDD, CPU per
>>> HDD, etc?
>>>
>>> In our experience most of our servers, even in large clusters, never max
>>> out the network interfaces or CPUs. We almost never rebuild or rebalance
>>> whole servers. 27 HDD nodes of our biggest CephFS cluster with EC usually
>>> have only 2-3Gbps of network traffic.
>> Your workload is archival?
>
> Yes, mostly archival.
> We have big demand for S3 and CephFS.
> But we may move to a pure S3 cluster in the future.
