> We would start deploying Ceph with 4 hosts (HP ProLiant servers), each
> running Rocky Linux 9.
>
> One of the hosts, called ceph-adm, will be a smaller one and will have
> the following hardware:
>
> 2x4T SSD in RAID 1 to install the OS on.
>
> 8 cores at 3600 MHz.
>
> 64G RAM
>
> We are planning to run all Ceph daemons except the OSD daemons (monitor,
> metadata, etc.) on this host.
8 cores == 16 threads? Are you provisioning this node because you have it
lying around idle? Note that you will want *at least* 3 Monitor (mon)
daemons, which must be on different nodes. 5 is better, but at least 3.
You'll also have Grafana, Prometheus, and MDS (if you're going to use
CephFS vs. S3 object storage or RBD block). 8 cores is likely on the light
side for all of that. You would also benefit from not having that node be
a single point of failure. I would suggest, if you can, raising this node
to the spec of the planned 3x OSD nodes so you have 4x equivalent nodes,
and spreading the non-OSD daemons across them. Note also that your OSD
nodes will run node_exporter, crash, and other boilerplate daemons.

> We will have 3 hosts to run OSDs, which will store the actual data.
>
> Each OSD host will have the following hardware:
>
> 2x4T SSD in RAID 1 to install the OS on.
>
> 22x8T SSD to store data (OSDs). We will use the entire disk without
> partitions.

SAS, SATA, or NVMe SSDs? Which specific model? You really want to avoid
client (desktop) models for Ceph, but you likely do not need to pay for
higher-endurance mixed-use SKUs.

> Each OSD host will have 128G RAM (no swap space).

Thank you for skipping swap. Some people are really stuck in the past in
that regard.

> Each OSD host will have 16 cores.

So 32 threads total? That is very light for 22 OSDs plus the other
daemons. A common rule of thumb is at minimum 2 threads per HDD OSD, 4 per
SAS/SATA SSD OSD, and 6 per NVMe SSD OSD, plus margin for the OS and other
processes (see the back-of-the-envelope sketch at the bottom of this
mail).

> All 4 hosts will connect to each other via 10G NICs.

Two ports with bonding? Redundant switches?

> The 500T data

The specs you list above add up to 528 TB of *raw* space. Be advised that
with three OSD nodes you will necessarily be doing replication; for
safety, replication with size=3. Taking TB vs. TiB and headroom into
consideration, you're looking at roughly 133 TiB of usable space. You
could go with size=2 for roughly 200 TiB of usable space, but at increased
risk of data unavailability or loss when drives/hosts fail or reboot. With
at least 4 OSD nodes (even if they aren't fully populated with capacity
drives) you could do EC for a more favorable raw:usable ratio, at the
expense of slower writes and recovery. With 4 nodes you could in theory do
2+2 EC for roughly 200 TiB of usable space, with 5 you could do 3+2 for
roughly 240 TiB usable, etc. (again, see the sketch at the bottom of this
mail).

> will be accessed by the clients. We need to have read performance as
> fast as possible.

Hope your SSDs are enterprise NVMe.

> We can't afford data loss and downtime.

Then no size=2 for you.

> So, we want to have a Ceph deployment which serves our purpose.
>
> So, please advise me if the plan that I have designed will serve our
> purpose. Or if there is a better way, please advise on that.
>
> Thanks,
> Gagan
>
>
> We have an HP storage server with 12 SSDs of 5T each and have set up
> hardware RAID6 on these disks.
>
> The HP storage server has 64G RAM and 18 cores.
>
> So, please advise how I should go about setting up Ceph on it to have
> the best read performance. We need the fastest read performance.
>
> Thanks,
> Gagan
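
P.S. To make the CPU sizing concrete, here is a quick Python
back-of-the-envelope based on the per-OSD thread rule of thumb above. The
4-thread OS margin is my own assumption, not a Ceph requirement; adjust to
taste.

    # Rough minimum hardware threads per OSD node, using the rule of
    # thumb above: 2 threads per HDD OSD, 4 per SAS/SATA SSD OSD,
    # 6 per NVMe SSD OSD.
    THREADS_PER_OSD = {"hdd": 2, "sas/sata ssd": 4, "nvme ssd": 6}

    def needed_threads(num_osds: int, media: str, os_margin: int = 4) -> int:
        # os_margin threads reserved for the OS and boilerplate daemons
        # (node_exporter, crash, etc.) -- an assumed value.
        return num_osds * THREADS_PER_OSD[media] + os_margin

    for media in THREADS_PER_OSD:
        print(f"22 x {media:13s}: want >= {needed_threads(22, media)} threads, node has 32")

Even the HDD rule already wants roughly 48 threads per node, against the
32 you have planned, and SATA/SAS or NVMe SSDs want considerably more.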
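
Likewise, a sketch of the usable-capacity arithmetic for 3 nodes x 22 x
8 TB drives. The 0.83 fill factor is an assumption (roughly "stay safely
below nearfull"), chosen to match the rough figures above; Ceph does not
enforce that exact number.

    # Usable space from 528 TB raw (3 nodes x 22 x 8 TB), for replication
    # and a couple of EC profiles. fill=0.83 is an assumed headroom factor.
    TB, TiB = 10**12, 2**40
    raw_tib = 3 * 22 * 8 * TB / TiB  # ~480 TiB raw

    def usable_replicated(raw, size, fill=0.83):
        return raw / size * fill

    def usable_ec(raw, k, m, fill=0.83):
        return raw * k / (k + m) * fill

    print(f"raw:              {raw_tib:.0f} TiB")
    print(f"replica size=3:   {usable_replicated(raw_tib, 3):.0f} TiB usable")
    print(f"replica size=2:   {usable_replicated(raw_tib, 2):.0f} TiB usable")
    print(f"EC 2+2 (4 nodes): {usable_ec(raw_tib, 2, 2):.0f} TiB usable")
    print(f"EC 3+2 (5 nodes): {usable_ec(raw_tib, 3, 2):.0f} TiB usable")

That lines up with the ~133 / ~200 / ~240 TiB figures quoted above.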