> This sounds interesting because this way the pressure wouldn't be too
> big if I go like 0.1, 0.2, OSD by OSD.
I used to do this as well, back before pg-upmap was a thing and while I
still had Jewel clients. It is, however, less efficient, because some
data ends up moving more than once. Upweighting a handful of OSDs at the
same time may spread the load and allow faster progress than going one
at a time, say one per host or one per failure domain. The PG remapping
tools allow fine-grained control with more efficiency, though any
clients that aren’t Luminous or later will have a really bad day.

> What I can see how ceph did it, when add the new OSDs, the complete
> host get the remapped pgs from other hosts also, so the old osds PG
> number increased by like +50% (which was already overloaded) and
> slowly rebalance to the newly added osds on the same host. This
> initial pressure is too big.

I don’t follow; adding new OSDs should on average decrease the number of
PG replicas on the existing OSDs. But imbalances during topology changes
are one reason I like to raise mon_max_pg_per_osd to 1000; otherwise you
can end up with PGs that won’t activate.

> This "misplaced ratio to 1%" I've never tried, let me read a bit,
> thank you.
>
> Istvan
>
> ________________________________
> From: Eugen Block <ebl...@nde.ag>
> Sent: Saturday, September 7, 2024 4:55:40 AM
> To: ceph-users@ceph.io <ceph-users@ceph.io>
> Subject: [ceph-users] Re: Somehow throotle recovery even further than
> basic options?
>
> I can’t say anything about the pgremapper, but have you tried
> increasing the crush weight gradually? Add new OSDs with crush initial
> weight 0 and then increase it in small steps. I haven’t used that
> approach for years, but maybe that can help here. Or are all OSDs
> already up and in? Or you could reduce the max misplaced ratio to 1%
> or even lower (default is 5%)?
>
> Zitat von "Szabo, Istvan (Agoda)" <istvan.sz...@agoda.com>:
>
>> Forgot to paste, somehow I want to reduce this recovery operation:
>> recovery: 0 B/s, 941.90k keys/s, 188 objects/s
>> To 2-300Keys/sec
>>
>> ________________________________
>> From: Szabo, Istvan (Agoda) <istvan.sz...@agoda.com>
>> Sent: Friday, September 6, 2024 11:18 PM
>> To: Ceph Users <ceph-users@ceph.io>
>> Subject: [ceph-users] Somehow throotle recovery even further than
>> basic options?
>>
>> Hi,
>>
>> 4 years ago we've created our cluster with all disks as 4 OSDs (SSDs
>> and NVMe disks) on Octopus.
>> The 15TB SSDs are still working properly with 4 OSDs, but the small
>> 1.8T NVMes with the index pool are not.
>> Each new NVMe OSD added to the existing nodes generates slow ops,
>> with scrub off, recovery_op_priority 1, backfill and recovery 1-1.
>> I even turned off all index pool heavy sync mechanisms, but the read
>> latency is still high, which means recovery ops push it even higher.
>>
>> I'm trying to somehow add resources to the cluster to spread the 2048
>> index pool PGs (in replica 3 that means a 6144-PG index pool), but I
>> can't make it more gentle.
>>
>> The balancer is working in upmap mode with max deviation 1.
>>
>> I have this script from DigitalOcean,
>> https://github.com/digitalocean/pgremapper, has anybody tried it
>> before, how is it, or could this actually help?
>>
>> Thank you for the ideas.
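
P.S. If it helps, the knobs discussed above translate roughly to the
commands below. This is only a sketch from memory, not verified against
your release, and osd.123 / the values are placeholders to adapt:

  # have new OSDs join with crush weight 0 instead of their size
  ceph config set osd osd_crush_initial_weight 0

  # then bring each one up in small steps, e.g. one per failure domain
  ceph osd crush reweight osd.123 0.2

  # let the balancer move at most ~1% of PGs at a time (default 5%)
  ceph config set mgr target_max_misplaced_ratio 0.01

  # headroom for transient PG-per-OSD spikes during topology changes
  ceph config set global mon_max_pg_per_osd 1000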
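
And to throttle recovery further than backfills/recovery = 1-1, the
recovery sleep options are the usual next step. Again just a sketch;
double-check the exact option names and defaults on Octopus, and treat
osd.123 as a placeholder:

  # inject a pause (seconds) between recovery ops on SSD/NVMe OSDs
  ceph config set osd osd_recovery_sleep_ssd 0.1

  # or target only the OSDs backing the index pool
  ceph tell osd.123 config set osd_recovery_sleep_ssd 0.1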
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io