> On 25 Jun 2025, at 07:59, Laurenz Albe <laurenz.a...@cybertec.at> wrote:
>
> On Wed, 2025-06-25 at 11:15 +0800, James Pang wrote:
>> pgv14, RHEL8, xfs , we suddenly see tens of sessions waiting on
>> "DataFileRead" and
>> "extend", it last about 2 seconds(based on pg_stat_activity query) , during
>> the
>> waiting time, "%sys" cpu increased to 80% , but from "iostat" , no high iops
>> and
>> io read/write latency increased either.
>
> Run "sar -P all 1" and see if "%iowait" is high.
I would (strongly) advise against the use of iowait as an indicator. It is a
kernel approximation of time spent in IO, which cannot be used in any sensible
way other than to say that you are possibly doing IO.
First of all, iowait is not a real kernel task state; it is carved out of idle
time. This means that if there is no, or too little, idle time, the iowait that
should be there is simply gone.
Second, idle time is only converted to iowait for synchronous IO calls. That is
currently not a problem for postgres, because synchronous IO is exactly what it
uses, but in the future it might be.
Very roughly put, what the kernel does is keep a counter of tasks currently in
certain IO-related system calls, and then try to express that as iowait. The
time in iowait cannot be used to calculate any IO facts.
In that sense, it sits in the same area as the load figure: indicative, but
mostly useless, because it doesn't give you any facts about what it is
expressing.
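If you want to look at actually blocked tasks rather than the iowait
approximation, something along these lines (just a sketch; exact columns depend
on your procps and sysstat versions) shows what is sitting in uninterruptible
sleep:

  # tasks currently in 'D' (uninterruptible sleep), typically waiting for IO
  ps -eo state,pid,comm,wchan:32 | awk '$1 == "D"'
  # or watch the 'b' (blocked) column over time
  vmstat 1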
>
> Check if you have transparent hugepages enabled:
>
> cat /sys/kernel/mm/transparent_hugepage/enabled
>
> If they are enabled, disable them and see if it makes a difference.
>
> I am only guessing here.
Absolutely. Anything that uses significant amounts of memory and is not written
to take advantage of transparent hugepages will probably see more downsides from
THP than benefits.
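For completeness, a quick way to check and temporarily (until reboot) disable
THP on RHEL 8, assuming you can experiment on that machine:

  cat /sys/kernel/mm/transparent_hugepage/enabled    # the bracketed value is the active setting
  echo never > /sys/kernel/mm/transparent_hugepage/enabled    # as root, not persistent across reboots
  grep AnonHugePages /proc/meminfo                   # how much anonymous memory is currently THP-backed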
>
>> many sessions were running same "DELETE FROM xxxx" in parallel waiting on
>> "extend"
>> and "DataFileRead", there are triggers in this table "After delete" to
>> insert/delete
>> other tables in the tigger.
>
> One thing that almost certainly would improve your situation is to run fewer
> concurrent statements, for example by using a reasonably sized connection
> pool.
This is true if the limits of the IO device, or of anything on the path towards
the IO device or devices, are hit.
And in general, a high "%sys", i.e. lots of time spent in kernel mode (system
time), indicates lots of time spent in system calls, which is what the read and
write calls in postgres are.
Therefore these figures suggest blocking on IO, for which Laurenz's advice to
lower the number of concurrent sessions doing IO in general makes sense.
A more nuanced analysis: if IO requests get queued, the tasks will wait in 'D'
state in linux, which by definition is off CPU, and thus does not spend CPU
(system/kernel) time.
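To see where that system time and the blocking sit per backend, sysstat's
pidstat can help (assuming sysstat is installed; adjust the interval to taste):

  # per-process CPU split; %system is time spent in system calls
  pidstat -u 1
  # per-process disk IO that actually reaches the block layer
  pidstat -d 1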
What sounds suspicious is that you indicate you see no significant change in
the amount of IO in iostat.
In order to understand this, you will first have to carefully find the actual
physical IO devices that you are using for postgres IO.
In current linux this can be tricky, depending on what the hardware or virtual
machine looks like, and on how the disks are arranged in linux.
What you need to determine is which actual disk devices are used, and what
their limits are.
The limits for any disk are IOPS (operations per second) and MBPS (megabytes
per second -> bandwidth).
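To find out which block devices actually sit underneath the postgres data
directory (LVM, md or multipath can hide this), something like the following
usually works; $PGDATA is of course your data directory:

  findmnt -T "$PGDATA"                      # which filesystem/mount the data directory lives on
  lsblk -o NAME,TYPE,SIZE,ROTA,MOUNTPOINT   # how that mount maps onto dm/md/physical devices
  iostat -x 1                               # then watch r/s, w/s, await and %util for those devices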
There is an additional thing to realize, which makes this really tricky:
postgres uses buffered IO for its regular IO.
Buffered IO means any read or write goes through the linux page cache, and
reads and writes can be served from that cache if possible.
So in your case, even if you managed to make the database perform identical
read and write requests, the amount of read and write IO served from the cache
could differ, which can make an enormous difference in how fast these requests
are served. If the operating system is forced to take the physical IO path, you
will see significant amounts of time spent on that, which shows up as IO
related wait events.
Not a simple answer, but this is how it works.
So I would suggest comparing the situation where the same workload is
considered well performing with the one where it performs badly.
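One rough way to do that is to compare postgres' own read counters with what
the devices report in both situations, for example (just a sketch):

  # postgres-level reads that missed shared_buffers (these may still be served
  # from the page cache rather than the disk)
  psql -c "SELECT blks_read, blks_hit FROM pg_stat_database WHERE datname = current_database();"
  # reads/writes that actually reached the block devices
  iostat -x 1 5
  # state of the page cache
  grep -E '^(Cached|Dirty|Writeback):' /proc/meminfo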
>
> Yours,
> Laurenz Albe
>
>