I just checked the logs, and there are also laggy PGs when all others are active+clean.
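In case it helps with the original question of mapping PG performance to OSDs: something like the sketch below counts how often each OSD shows up in the acting set of not-clean PGs. It is rough and untested; it assumes 'ceph pg ls --format json' returns a "pg_stats" list with "state" and "acting" fields per PG (key names may differ between releases).

#!/usr/bin/env python3
# Rough, untested sketch: count how often each OSD appears in the acting
# set of PGs that are not active+clean, to spot a common culprit OSD.
# Assumes 'ceph pg ls --format json' exposes "state" and "acting" per PG.
import json
import subprocess
from collections import Counter

out = subprocess.run(
    ["ceph", "pg", "ls", "--format", "json"],
    check=True, capture_output=True, text=True,
).stdout

data = json.loads(out)
pg_stats = data["pg_stats"] if isinstance(data, dict) else data

per_osd = Counter()
for pg in pg_stats:
    if pg.get("state") != "active+clean":      # laggy, scrubbing, degraded, ...
        per_osd.update(pg.get("acting", []))   # OSD ids in the acting set

for osd, count in per_osd.most_common(10):
    print(f"osd.{osd}: {count} not-clean PGs")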
After adding 5x8 TB disks the laggy PGs stopped. ¯\_(ツ)_/¯
Maybe upping the PGs could do something to make deep scrubbing faster. But we have 2, 4 and 8 TB disks in this cluster, and they are only SATA SSDs, so not particularly fast ones. I always thought that too many PGs have an impact on disk IO. I guess this is wrong? So I could double the PGs in the pool and see if things become better.

And yes, removing that single OSD from the cluster stopped the flapping of "monitor marked osd.N down".

> On 15.08.2024 at 10:14, Frank Schilder <fr...@dtu.dk> wrote:
>
> The current Ceph recommendation is to use between 100-200 PGs/OSD. Therefore, a large PG is a PG that has more data than 0.5-1% of the disk capacity, and you should split PGs for the relevant pool.
>
> A huge PG is a PG for which deep-scrub takes much longer than 20 min on HDD and 4-5 min on SSD.
>
> Average deep-scrub times (the time it takes to deep-scrub) are actually a very good way of judging if PGs are too large. These times roughly correlate with the time it takes to copy a PG.
>
> On SSDs we aim for 200+ PGs/OSD and on HDDs for 150 PGs/OSD. For very large HDDs (>=16 TB) we consider raising this to 300 PGs/OSD due to excessively long deep-scrub times per PG.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Szabo, Istvan (Agoda) <istvan.sz...@agoda.com>
> Sent: Wednesday, August 14, 2024 12:00 PM
> To: Eugen Block; ceph-users@ceph.io
> Subject: [ceph-users] Re: Identify laggy PGs
>
> Just out of curiosity I've checked my PG size, which is around 150 GB. When are we talking about big PGs?
> ________________________________
> From: Eugen Block <ebl...@nde.ag>
> Sent: Wednesday, August 14, 2024 2:23 PM
> To: ceph-users@ceph.io <ceph-users@ceph.io>
> Subject: [ceph-users] Re: Identify laggy PGs
>
> Hi,
>
> how big are those PGs? If they're huge and get deep-scrubbed, for example, that can cause significant delays. I usually look at 'ceph pg ls-by-pool {pool}' and the "BYTES" column.
>
> Quoting Boris <b...@kervyn.de>:
>
>> Hi,
>>
>> currently we encounter laggy PGs and I would like to find out what is causing it.
>> I suspect it might be one or more failing OSDs. We had flapping OSDs and I synced one out, which helped with the flapping, but it doesn't help with the laggy ones.
>>
>> Any tooling to identify or count PG performance and map that to OSDs?
>>
>>
>> --
>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im groüen Saal.
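Following up on Eugen's BYTES column and Frank's 0.5-1% rule of thumb: an equally rough and untested sketch to eyeball PG sizes for a single pool could look like this (the pool name is a placeholder, and the JSON key names, in particular "stat_sum"/"num_bytes", are an assumption):

#!/usr/bin/env python3
# Rough, untested sketch: print the number of PGs plus the average and
# largest PG size for one pool (the BYTES column of 'ceph pg ls-by-pool'),
# to compare against the 0.5-1%-of-OSD-capacity rule of thumb.
import json
import subprocess
import sys

pool = sys.argv[1] if len(sys.argv) > 1 else "mypool"  # placeholder name

out = subprocess.run(
    ["ceph", "pg", "ls-by-pool", pool, "--format", "json"],
    check=True, capture_output=True, text=True,
).stdout

data = json.loads(out)
pgs = data["pg_stats"] if isinstance(data, dict) else data
sizes = [pg["stat_sum"]["num_bytes"] for pg in pgs]

if sizes:
    print(f"{pool}: {len(sizes)} PGs, "
          f"avg {sum(sizes) / len(sizes) / 2**30:.1f} GiB, "
          f"largest {max(sizes) / 2**30:.1f} GiB")

With our mixed 2/4/8 TB disks, 1% of the smallest one is only about 20 GB, so PGs well above that would be my candidates for doubling pg_num on that pool.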