> On Oct 2, 2024, at 2:19 PM, quag...@bol.com.br wrote:
>
> Hi Kyriazis,
> I work with a cluster similar to yours: 142 HDDs and 18 SSDs.
> I had a lot of performance gains when I made the following settings:
>
> 1-) For the pool that is configured on the HDDs (here, home directories are on HDDs), reduce the following replica settings (I don't know what your resilience requirement is):
> * size=2
> * min_size=1
>
> I have been doing this for at least 4 years with no problems (even when there is a need to change disks or reboot a server, this config has never gotten me into trouble).

It is nonetheless risky.  The wrong sequence of cascading or overlapping failures and you may lose data.
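If you do go this route, note that size and min_size are per-pool settings, so you can relax only the HDD-backed data pool and leave everything else at size=3. A rough sketch, assuming the HDD data pool is named cephfs_data (substitute your actual pool name):

    # Relax replication on the HDD data pool only; other pools keep their own settings
    ceph osd pool set cephfs_data size 2
    ceph osd pool set cephfs_data min_size 1

With min_size=1 the pool keeps accepting writes with a single surviving copy, which is exactly how the overlapping-failure scenario above turns into lost data.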
> 2-) Move the filesystem metadata pools to use at least SSD only.

Absolutely.  The CephFS docs suggest using size=4 for the MD pool.

> 3-) Increase server and client cache.
> Here I left it like this:
> osd_memory_target_autotune=true (each OSD always has more than 12G).
>
> For clients:
> client_cache_size=163840
> client_oc_max_dirty=1048576000
> client_oc_max_dirty_age=50
> client_oc_max_objects=10000
> client_oc_size=2097152000
> client_oc_target_dirty=838860800
>
> Evaluate, following the documentation, which of these variables makes sense for your cluster.
>
> For the backup scenario, I imagine that decreasing the size and min_size values will reduce the impact. However, you must evaluate your own requirements for these settings.
>
> Rafael.
>
>
> From: "Kyriazis, George" <george.kyria...@intel.com>
> Sent: 2024/10/02 13:06:09
> To: ebl...@nde.ag, ceph-users@ceph.io
> Subject: [ceph-users] Re: Question about speeding hdd based cluster
>
> Thank you all.
>
> The cluster is used mostly for backup of large files currently, but we are hoping to use it for home directories (compiles, etc.) soon. Most usage would be for large files, though.
>
> What I've observed with its current usage is that Ceph rebalances and Proxmox-initiated VM backups bring the storage to its knees.
>
> Would a safe approach be to move the metadata pool to SSD first, see how it goes (since it would be cheaper), and then add DB/WAL disks? How would Ceph behave if we are adding DB/WAL disks "slowly" (i.e., one node at a time)? We have about 100 OSDs (mix HDD/SSD) spread across about 25 hosts. Hosts are server-grade with plenty of memory and processing power.
>
> Thank you!
>
> George
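George, on your question above about moving the metadata pool to SSD first: that is usually the cheapest first step, and it can be done online. A rough sketch, assuming the metadata pool is named cephfs_metadata and your SSD OSDs carry the ssd device class (check with "ceph osd crush class ls" and "ceph osd df tree"):

    # Replicated CRUSH rule restricted to SSD-class OSDs, host failure domain
    ceph osd crush rule create-replicated replicated-ssd default host ssd

    # Repoint the metadata pool; its PGs will backfill onto the SSDs
    ceph osd pool set cephfs_metadata crush_rule replicated-ssd

The metadata pool is small, so the backfill is normally brief, but given how sensitive the cluster already is to recovery traffic, do it during a quiet window.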
> > -----Original Message-----
> > From: Eugen Block <ebl...@nde.ag>
> > Sent: Wednesday, October 2, 2024 2:18 AM
> > To: ceph-users@ceph.io
> > Subject: [ceph-users] Re: Question about speeding hdd based cluster
> >
> > Hi George,
> >
> > the docs [0] strongly recommend having dedicated SSD or NVMe OSDs for the metadata pool. You'll also benefit from dedicated DB/WAL devices. But as Joachim already stated, it depends on a couple of factors like the number of clients, the load they produce, file sizes, etc. There's no easy answer.
> >
> > Regards,
> > Eugen
> >
> > [0] https://docs.ceph.com/en/latest/cephfs/createfs/#creating-pools
> >
> > Quoting Joachim Kraftmayer <joachim.kraftma...@clyso.com>:
> >
> > > Hi Kyriazis,
> > >
> > > It depends on the workload.
> > > I would recommend adding SSD/NVMe DB/WAL to each OSD.
> > >
> > > Joachim Kraftmayer
> > >
> > > www.clyso.com <http://www.clyso.com/>
> > >
> > > Hohenzollernstr. 27, 80801 Munich
> > >
> > > Utting a. A. | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE2754306
> > >
> > > Kyriazis, George <george.kyria...@intel.com> wrote on Wed, Oct 2, 2024, 07:37:
> > >
> > >> Hello ceph-users,
> > >>
> > >> I've been wondering… I have a Proxmox HDD-based CephFS pool with no DB/WAL drives. I also have SSD drives in this setup used for other pools.
> > >>
> > >> What would increase the speed of the HDD-based CephFS more, and in what usage scenarios:
> > >>
> > >> 1. Adding SSD/NVMe DB/WAL drives for each node
> > >> 2. Moving the metadata pool for my CephFS to SSD
> > >> 3. Increasing the performance of the network. I currently have 10GbE links.
> > >>
> > >> It doesn't look like the network is currently saturated, so I'm thinking (3) is not a solution. However, if I choose any of the other options, would I need to also upgrade the network so that the network does not become a bottleneck?
> > >>
> > >> Thank you!
> > >>
> > >> George
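One more note on the DB/WAL question, since it comes up several times above: DB/WAL placement is per-OSD, so adding the devices one node at a time is fine, and HDD-only and HDD+NVMe OSDs can coexist indefinitely. As a rough sketch of what provisioning a fresh OSD with an external DB looks like (device and LV names below are placeholders, and on Proxmox the pveceph tooling wraps the same ceph-volume machinery):

    # Data on the HDD; DB (and WAL, which lives with the DB unless placed separately) on an NVMe logical volume
    ceph-volume lvm create --data /dev/sdc --block.db ceph-db-vg/db-osd12

For OSDs you don't want to rebuild, recent releases can attach a DB to an existing BlueStore OSD with ceph-volume lvm new-db (OSD stopped, noout set); try it on a single OSD before rolling it across a whole node.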