Hi Anthony,

Based on your inputs and further digging into the Ceph documentation, I am now planning to go with 6 OSD nodes for a k=4, m=2 EC setup.

As I mentioned, we need maximum usable space, and we are most concerned about data safety and getting the best read performance from the cluster; write operations will go to a separate storage solution via NFS. With each OSD node holding 22 x 4T enterprise SSDs, we will have 88T x 6 = 528T of raw space. With 4+2 EC the usable fraction is k/(k+m) = 4/6, so that should give us roughly 352T of usable space before overhead, which is enough for us to start with.
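For reference, here is the back-of-the-envelope math I am working from. This is only a rough sketch under my own assumptions (crush-failure-domain=host, default min_size = k+1, and ignoring BlueStore overhead and full-ratio headroom), so please correct me if any of it is off:

# Rough capacity / fault-tolerance estimate for the proposed cluster.
# Assumed values: 6 OSD hosts, 22 x 4T SSDs per host, EC profile k=4, m=2,
# one chunk per host (crush-failure-domain=host).

hosts = 6
drives_per_host = 22
drive_size_t = 4                  # terabytes per drive
k, m = 4, 2                       # EC data / coding chunks

raw_t = hosts * drives_per_host * drive_size_t
usable_t = raw_t * k / (k + m)    # before BlueStore overhead and full-ratio headroom

print(f"raw capacity:     {raw_t} T")         # 528 T
print(f"usable (k/(k+m)): {usable_t:.0f} T")  # ~352 T

# With one chunk per host, each PG can lose up to m = 2 chunks without data
# loss, i.e. 2 failed OSDs in different hosts or 2 failed hosts.  As I
# understand it, with the default min_size = k + 1 = 5 a second concurrent
# failure would pause I/O on the affected PGs, and with exactly k + m = 6
# hosts there is no spare host to recover a lost host's chunks onto.
print(f"chunk losses tolerated per PG without data loss: {m}")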
Given that, I would like to confirm the data-safety level of the above setup (6 OSD nodes with 4+2 EC): how many OSD (disk) failures and how many node failures can it withstand?

Also, if we later need to add more OSD nodes to get more usable space, will we need to add the same disk size (4T), or can we add nodes with bigger disks (8T or 15T)? My rough understanding of how mixed sizes would balance is sketched below.
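Again, just a sketch based on my assumption that CRUSH weights default to the device capacity in TiB and that data is spread roughly in proportion to weight; the host names and the 8T expansion are hypothetical:

# Sketch: data share per host if we later add two hosts with 22 x 8T drives
# to the six existing hosts with 22 x 4T drives.  CRUSH weight is assumed to
# be the capacity in TiB, and each host's share of the data is assumed to be
# proportional to its total weight.

TIB_PER_TB = 1000**4 / 1024**4   # ~0.9095 TiB per TB

hosts_tb = {f"osd-node-{i}": 22 * 4 for i in range(1, 7)}      # existing hosts (TB)
hosts_tb.update({"osd-node-7": 22 * 8, "osd-node-8": 22 * 8})  # hypothetical new hosts

weights = {h: tb * TIB_PER_TB for h, tb in hosts_tb.items()}
total = sum(weights.values())

for host, w in sorted(weights.items()):
    print(f"{host}: weight {w:7.2f}  ->  {100 * w / total:4.1f}% of the data")

# If this is right, bigger drives just mean a proportionally bigger share of
# the data on those hosts, rather than a hard requirement that all drives be
# the same size.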
Besides the OSD servers, we are going to have three Dell servers, each with 8 cores and 64G of RAM, to run the 3 monitor daemons (one on each server), plus one 4-core, 64G RAM server with a high core frequency (4800 MHz) to run the MDS daemon.

Please advise.

Thanks,
Gagan

On Tue, Apr 15, 2025 at 8:14 PM Anthony D'Atri <anthony.da...@gmail.com> wrote:

> It's a function of your use-case.
>
> > On Apr 14, 2025, at 8:41 AM, Anthony Fecarotta <anth...@linehaul.ai> wrote:
> >
> >> MDS (if you're going to CephFS vs using S3 object storage or RBD block)
> >
> > Hi Anthony,
> >
> > Can you elaborate on this remark?
> >
> > Should one choose between using CephFS vs S3 storage (as it pertains to best practices)?
> >
> > On Proxmox, I am using both CephFS and RBD.
> >
> > Regards,
> > Anthony Fecarotta
> > Founder & President
> > anth...@linehaul.ai | 224-339-1182 | (855) 625-0300
> > 1 Mid America Plz Flr 3, Oakbrook Terrace, IL 60181
> > www.linehaul.ai | https://www.linkedin.com/in/anthony-fec/
> >
> > On Sun Apr 13, 2025, 04:28 PM GMT, Anthony D'Atri <anthony.da...@gmail.com> wrote:
> >>
> >>> On Apr 13, 2025, at 12:00 PM, Brendon Baumgartner <bren...@netcal.com> wrote:
> >>>
> >>>> On Apr 11, 2025, at 10:13, gagan tiwari <gagan.tiw...@mathisys-india.com> wrote:
> >>>>
> >>>> Hi Anthony,
> >>>> We will be using Samsung SSD 870 QVO 8TB disks on all OSD servers.
> >>>
> >>> I'm a newbie to Ceph, and I have a 4-node cluster that doesn't have a lot of users, so downtime is easily scheduled for tinkering. I started with consumer SSDs (SATA/NVMe) because they were free and lying around. Performance was bad. Then just the NVMe drives: still bad. Then enterprise SSDs: still bad (relative to DAS, anyway).
> >>
> >> Real enterprise SSDs? Enterprise NVMe, not enterprise SATA? Sellers can lie sometimes. Also be sure to update firmware to the latest; that can make a substantial difference.
> >>
> >> Other factors include:
> >>
> >> * Enough hosts and OSDs. Three hosts with one OSD each aren't going to deliver a great experience.
> >> * At least 6GB of available physmem per NVMe OSD.
> >> * How you measure - a 1K QD1 fsync workload is going to be more demanding than a buffered 64K QD32 workload.
> >>
> >>> Each step on the journey to enterprise SSDs made things faster. The problem with the consumer stuff is the latency. Enterprise SSDs are 0-2ms. Consumer SSDs are 15-300ms. As you can see, the latency difference is significant.
> >>
> >> Some client SSDs are "DRAMless": they don't have the ~1GB of onboard RAM per 1TB of capacity used for the LBA indirection table. This can be a substantial issue for enterprise workloads.
> >>
> >>> So from my experience, I would say Ceph is very slow in general compared to DAS. You need all the help you can get.
> >>>
> >>> If you want to use the consumer stuff, I would recommend making a slow tier (a 2nd pool with a different policy). Or I suppose just expect it to be slow in general. I still have my consumer drives installed, just configured as a 2nd tier, which is unused right now because we have an old JBOD for the 2nd tier that is much faster.
> >>
> >> How many drives in each?
> >>
> >>> Good luck!
> >>>
> >>> _BB

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io