I just checked the logs, and there are also laggy PGs when all others are active+clean.
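In case it helps with the original question of mapping PG performance to OSDs: something like the sketch below counts how often each OSD shows up in the acting set of not-clean PGs. It is rough and untested; it assumes 'ceph pg ls --format json' returns a "pg_stats" list with "state" and "acting" fields per PG (key names may differ between releases).

#!/usr/bin/env python3
# Rough, untested sketch: count how often each OSD appears in the acting
# set of PGs that are not active+clean, to spot a common culprit OSD.
# Assumes 'ceph pg ls --format json' exposes "state" and "acting" per PG.
import json
import subprocess
from collections import Counter

out = subprocess.run(
    ["ceph", "pg", "ls", "--format", "json"],
    check=True, capture_output=True, text=True,
).stdout

data = json.loads(out)
pg_stats = data["pg_stats"] if isinstance(data, dict) else data

per_osd = Counter()
for pg in pg_stats:
    if pg.get("state") != "active+clean":      # laggy, scrubbing, degraded, ...
        per_osd.update(pg.get("acting", []))   # OSD ids in the acting set

for osd, count in per_osd.most_common(10):
    print(f"osd.{osd}: {count} not-clean PGs")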
After adding 5x8 TB disks the laggy PGs stopped. ¯\_(ツ)_/¯
Maybe upping the PGs could do something to make deep scrubbing faster. But we have 2, 4 and 8 TB disks in this cluster, and they are only SATA SSDs, so not particularly fast ones. I always thought that too many PGs have an impact on disk IO. I guess this is wrong? So I could double the PGs in the pool and see if things become better.

And yes, removing that single OSD from the cluster stopped the flapping of "monitor marked osd.N down".

> On 15.08.2024 at 10:14, Frank Schilder <fr...@dtu.dk> wrote:
>
> The current Ceph recommendation is to use between 100-200 PGs/OSD. Therefore, a large PG is a PG that has more data than 0.5-1% of the disk capacity, and you should split PGs for the relevant pool.
>
> A huge PG is a PG for which deep-scrub takes much longer than 20 min on HDD and 4-5 min on SSD.
>
> Average deep-scrub times (the time it takes to deep-scrub) are actually a very good way of judging if PGs are too large. These times roughly correlate with the time it takes to copy a PG.
>
> On SSDs we aim for 200+ PGs/OSD and on HDDs for 150 PGs/OSD. For very large HDDs (>=16 TB) we consider raising this to 300 PGs/OSD due to excessively long deep-scrub times per PG.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Szabo, Istvan (Agoda) <istvan.sz...@agoda.com>
> Sent: Wednesday, August 14, 2024 12:00 PM
> To: Eugen Block; ceph-users@ceph.io
> Subject: [ceph-users] Re: Identify laggy PGs
>
> Just out of curiosity I've checked my PG size, which is around 150 GB. When are we talking about big PGs?
> ________________________________
> From: Eugen Block <ebl...@nde.ag>
> Sent: Wednesday, August 14, 2024 2:23 PM
> To: ceph-users@ceph.io <ceph-users@ceph.io>
> Subject: [ceph-users] Re: Identify laggy PGs
>
> Hi,
>
> how big are those PGs? If they're huge and get deep-scrubbed, for example, that can cause significant delays. I usually look at 'ceph pg ls-by-pool {pool}' and the "BYTES" column.
>
> Quoting Boris <b...@kervyn.de>:
>
>> Hi,
>>
>> currently we encounter laggy PGs and I would like to find out what is causing it.
>> I suspect it might be one or more failing OSDs. We had flapping OSDs and I synced one out, which helped with the flapping, but it doesn't help with the laggy ones.
>>
>> Any tooling to identify or count PG performance and map that to OSDs?
>>
>>
>> --
>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im groüen Saal.
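Following up on Eugen's BYTES column and Frank's 0.5-1% rule of thumb: an equally rough and untested sketch to eyeball PG sizes for a single pool could look like this (the pool name is a placeholder, and the JSON key names, in particular "stat_sum"/"num_bytes", are an assumption):

#!/usr/bin/env python3
# Rough, untested sketch: print the number of PGs plus the average and
# largest PG size for one pool (the BYTES column of 'ceph pg ls-by-pool'),
# to compare against the 0.5-1%-of-OSD-capacity rule of thumb.
import json
import subprocess
import sys

pool = sys.argv[1] if len(sys.argv) > 1 else "mypool"  # placeholder name

out = subprocess.run(
    ["ceph", "pg", "ls-by-pool", pool, "--format", "json"],
    check=True, capture_output=True, text=True,
).stdout

data = json.loads(out)
pgs = data["pg_stats"] if isinstance(data, dict) else data
sizes = [pg["stat_sum"]["num_bytes"] for pg in pgs]

if sizes:
    print(f"{pool}: {len(sizes)} PGs, "
          f"avg {sum(sizes) / len(sizes) / 2**30:.1f} GiB, "
          f"largest {max(sizes) / 2**30:.1f} GiB")

With our mixed 2/4/8 TB disks, 1% of the smallest one is only about 20 GB, so PGs well above that would be my candidates for doubling pg_num on that pool.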