> On Apr 11, 2025, at 4:04 AM, gagan tiwari <gagan.tiw...@mathisys-india.com> wrote:
>
> Hi Anthony,
> Thanks for the reply!
>
> We will be using CephFS to access Ceph storage from clients, so this will also need the MDS daemon.

MDS is single-threaded, so unlike most Ceph daemons it benefits more from a high-frequency CPU than from core count.

> So, based on your advice, I am thinking of having 4 Dell PowerEdge servers. 3 of them will run the 3 Monitor daemons and one of them will run the MDS daemon.
>
> These Dell servers will have the following hardware:
>
> 1. 4 cores ( 8 threads ) ( can go for 8 cores and 16 threads )
>
> 2. 64G RAM
>
> 3. 2x4T Samsung SSD with RAID 1 to install the OS on and to run the Monitor and Metadata services.

That probably suffices for a small cluster. Are those Samsungs enterprise?

> OSD nodes will be upgraded to have 32 cores ( 64 threads ). Disk and RAM will remain the same ( 128G and 22x8T Samsung SSD ).

Which Samsung SSD? Using client SKUs for OSDs has a way of leading to heartbreak. 64 threads would be better for a 22x OSD node, though still a bit light. Are these SATA or NVMe?

> Actually, I want to use the OSD nodes to run OSD daemons and no other daemons, which is why I am thinking of having the 4 additional Dell servers mentioned above.

Colocation of daemons is common these days, especially with smaller clusters.

> Please advise if this plan will be better.

That'll work, but unless you already have those quite-modest 4x non-OSD nodes sitting around idle, you might consider just going with the OSD nodes and bumping the CPU again so you can colocate all the daemons.

> Thanks,
> Gagan
>
> On Wed, Apr 9, 2025 at 8:12 PM Anthony D'Atri <anthony.da...@gmail.com> wrote:
>
>>> We would start deploying Ceph with 4 hosts ( HP ProLiant servers ), each running RockyLinux 9.
>>>
>>> One of the hosts, called ceph-adm, will be a smaller one and will have the following hardware:
>>>
>>> 2x4T SSD with RAID 1 to install the OS on.
>>>
>>> 8 cores with 3600MHz freq.
>>>
>>> 64G RAM
>>>
>>> We are planning to run all Ceph daemons except the OSD daemons, like Monitor, Metadata, etc., on this host.
>>
>> 8 cores == 16 threads? Are you provisioning this node because you have it lying around idle?
>>
>> Note that you will want *at least* 3 Monitor daemons, which must be on different nodes. 5 is better, but at least 3. You'll also have Grafana, Prometheus, and MDS (if you're going with CephFS vs using S3 object storage or RBD block).
>>
>> 8c is likely on the light side for all of that. You would also benefit from not having that node be a single point of failure. I would suggest, if you can, raising this node to the spec of the planned 3x OSD nodes so you have 4x equivalent nodes, and spreading the non-OSD daemons across them.
>>
>> Note also that your OSD nodes will have node_exporter, crash, and other boilerplate daemons.
>>
>>> We will have 3 hosts to run OSDs, which will store the actual data.
>>>
>>> Each OSD host will have the following hardware:
>>>
>>> 2x4T SSD with RAID 1 to install the OS on.
>>>
>>> 22x8T SSD to store data ( OSDs ). We will use the entire disk without partitions.
>>
>> SAS, SATA, or NVMe SSDs? Which specific model? You really want to avoid client (desktop) models for Ceph, but you likely do not need to pay for higher-endurance mixed-use SKUs.
>>
>>> Each OSD host will have 128G RAM ( no swap space ).
>>
>> Thank you for skipping swap. Some people are really stuck in the past in that regard.
>>
>>> Each OSD host will have 16 cores.
>>
>> So 32 threads total? That is very light for 22 OSDs + other daemons. A common rule of thumb is at minimum 2 threads per HDD OSD, 4 per SAS/SATA SSD OSD, and 6 per NVMe SSD OSD, plus margin for the OS and other processes.
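
To put rough numbers on that, here is a quick Python sketch of the thread math. The 2/4/6 multipliers are just the rules of thumb above, and the 8-thread margin for the OS and colocated daemons is my own assumption, so treat the output as ballpark:

    # Rough CPU-thread sizing for one OSD node, per the rules of thumb above.
    # The os_margin default is an assumption; raise it if you colocate more daemons.
    THREADS_PER_OSD = {"hdd": 2, "sata_ssd": 4, "nvme_ssd": 6}

    def threads_needed(num_osds: int, media: str, os_margin: int = 8) -> int:
        """Estimate how many CPU threads one OSD node wants."""
        return num_osds * THREADS_PER_OSD[media] + os_margin

    print(threads_needed(22, "sata_ssd"))  # 96 -- so 64 threads is still a bit light
    print(threads_needed(22, "nvme_ssd"))  # 140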
>>> All 4 hosts will connect to each other via 10G NICs.
>>
>> Two ports with bonding? Redundant switches?
>>
>>> The 500T data
>>
>> The specs you list above include 528 TB of *raw* space. Be advised that with three OSD nodes you will necessarily be doing replication; for safety, replication with size=3. Taking into consideration TB vs TiB and headroom, you're looking at 133 TiB of usable space. You could go with size=2 to get 300 TB of usable space, but at increased risk of data unavailability or loss when drives/hosts fail or reboot.
>>
>> With at least 4 OSD nodes (even if they aren't fully populated with capacity drives) you could do EC for a more favorable raw:usable ratio, at the expense of slower writes and recovery. With 4 nodes you could in theory do 2,2 EC for 200 TiB of usable space, with 5 you could do 3,2 for 240 TiB usable, etc.
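
If it helps, here is a rough Python sketch of that raw-to-usable arithmetic. The 3 x 22 x 8 TB layout comes from your specs; the ~83% fill factor I use for headroom is my own assumption, so the outputs are ballpark, not promises:

    # Rough raw -> usable capacity math for 3 nodes x 22 x 8 TB SSDs.
    # The 0.83 fill factor (headroom for recovery and imbalance) is an assumption.
    TB, TiB = 10**12, 2**40

    raw_tib = 3 * 22 * 8 * TB / TiB   # ~480 TiB raw
    fill = 0.83

    def usable_replicated(size: int) -> float:
        return raw_tib / size * fill

    def usable_ec(k: int, m: int) -> float:
        return raw_tib * k / (k + m) * fill

    print(f"size=3 : {usable_replicated(3):.0f} TiB")  # ~133 TiB
    print(f"EC 2,2 : {usable_ec(2, 2):.0f} TiB")       # ~199 TiB
    print(f"EC 3,2 : {usable_ec(3, 2):.0f} TiB")       # ~239 TiB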
>>> will be accessed by the clients. We need to have read performance as fast as possible.
>>
>> Hope your SSDs are enterprise NVMe.
>>
>>> We can't afford data loss and downtime.
>>
>> Then no size=2 for you.
>>
>>> So, we want to have a Ceph deployment which serves our purpose.
>>>
>>> So, please advise me if the plan that I have designed will serve our purpose. Or is there a better way? Please advise on that.
>>>
>>> Thanks,
>>> Gagan
>>>
>>> We have an HP storage server with 12 SSDs of 5T each and have set up hardware RAID6 on these disks.
>>>
>>> The HP storage server has 64G RAM and 18 cores.
>>>
>>> So, please advise how I should go about setting up Ceph on it to have the best read performance. We need the fastest read performance.
>>>
>>> Thanks,
>>> Gagan
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io