Thanks for your suggestions. We have %iowait from the sar command too, copied here;
checking with the infra team, we did not find any abnormal IO activity either.
02:00:01 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
02:00:03 PM  all   15.92    0.00   43.02    0.65    0.76    2.56    0.00    0.00    0.00   37.09
02:00:03 PM    0   17.59    0.00   46.73    1.01    0.50    0.50    0.00    0.00    0.00   33.67
02:00:03 PM    1    9.50    0.00   61.50    0.50    0.50    1.00    0.00    0.00    0.00   27.00
02:00:03 PM    2   20.71    0.00   44.44    1.01    0.51    0.51    0.00    0.00    0.00   32.83
02:00:03 PM    3   14.00    0.00   51.50    2.00    1.00    1.00    0.00    0.00    0.00   30.50
02:00:03 PM    4    6.57    0.00   52.53    0.51    0.51    3.54    0.00    0.00    0.00   36.36
02:00:03 PM    5   10.20    0.00   49.49    1.02    1.53    0.00    0.00    0.00    0.00   37.76
02:00:03 PM    6   27.64    0.00   41.21    0.50    0.50    0.50    0.00    0.00    0.00   29.65
02:00:03 PM    7    9.05    0.00   50.75    0.50    1.01    0.50    0.00    0.00    0.00   38.19
02:00:03 PM    8   12.18    0.00   49.75    0.51    0.51    0.51    0.00    0.00    0.00   36.55
02:00:03 PM    9   13.00    0.00    9.50    0.50    1.50   15.50    0.00    0.00    0.00   60.00
02:00:03 PM   10   15.58    0.00   46.23    0.00    0.50    0.50    0.00    0.00    0.00   37.19
02:00:03 PM   11   20.71    0.00   10.10    0.00    1.01   14.14    0.00    0.00    0.00   54.04
02:00:03 PM   12   21.00    0.00   37.00    0.50    1.00    1.00    0.00    0.00    0.00   39.50
02:00:03 PM   13   13.57    0.00   45.73    1.01    1.01    1.01    0.00    0.00    0.00   37.69
02:00:03 PM   14   18.18    0.00   39.39    1.01    0.51    0.51    0.00    0.00    0.00   40.40
02:00:03 PM   15   14.00    0.00   49.50    0.50    0.50    3.50    0.00    0.00    0.00   32.00
02:00:03 PM   16   19.39    0.00   39.80    1.02    1.53    0.51    0.00    0.00    0.00   37.76
02:00:03 PM   17   16.75    0.00   45.18    1.52    1.02    2.54    0.00    0.00    0.00   32.99
02:00:03 PM   18   12.63    0.00   50.00    0.00    1.01    0.00    0.00    0.00    0.00   36.36
02:00:03 PM   19    5.56    0.00   82.32    0.00    0.00    0.00    0.00    0.00    0.00   12.12
02:00:03 PM   20   15.08    0.00   48.24    0.50    0.50    3.52    0.00    0.00    0.00   32.16
02:00:03 PM   21   17.68    0.00    9.09    0.51    1.52   13.64    0.00    0.00    0.00   57.58
02:00:03 PM   22   13.13    0.00   43.94    0.51    0.51    0.51    0.00    0.00    0.00   41.41
02:00:03 PM   23   14.07    0.00   42.71    0.50    0.50    0.50    0.00    0.00    0.00   41.71
02:00:03 PM   24   13.13    0.00   41.92    1.01    0.51    0.51    0.00    0.00    0.00   42.93
02:00:03 PM   25   16.58    0.00   47.74    0.50    1.01    0.50    0.00    0.00    0.00   33.67
02:00:03 PM   26   16.58    0.00   46.73    0.50    1.01    0.50    0.00    0.00    0.00   34.67
02:00:03 PM   27   45.50    0.00   54.50    0.00    0.00    0.00    0.00    0.00    0.00    0.00
02:00:03 PM   28    6.06    0.00   32.32    0.00    0.51   13.13    0.00    0.00    0.00   47.98
02:00:03 PM   29   13.93    0.00   44.78    1.00    1.00    0.50    0.00    0.00    0.00   38.81
02:00:03 PM   30   11.56    0.00   57.79    0.00    0.50    1.01    0.00    0.00    0.00   29.15
02:00:03 PM   31   33.85    0.00    9.23    0.51    1.54    0.51    0.00    0.00    0.00   54.36
02:00:03 PM   32   30.15    0.00   41.71    0.50    0.50    1.51    0.00    0.00    0.00   25.63

Thanks,

James

Frits Hoogland <frits.hoogl...@gmail.com> wrote on Wed, 25 Jun 2025 at 10:27 PM:

>
>
> > On 25 Jun 2025, at 07:59, Laurenz Albe <laurenz.a...@cybertec.at> wrote:
> >
> > On Wed, 2025-06-25 at 11:15 +0800, James Pang wrote:
> >> pgv14, RHEL8, xfs: we suddenly see tens of sessions waiting on
> >> "DataFileRead" and "extend". It lasts about 2 seconds (based on a
> >> pg_stat_activity query). During the waiting time, "%sys" cpu increased
> >> to 80%, but "iostat" showed neither high iops nor increased io
> >> read/write latency.
> >
> > Run "sar -P all 1" and see if "%iowait" is high.
> I would (strongly) advise against the use of iowait as an indicator. It is
> a kernel approximation of time spent in IO which cannot be used in any
> sensible way, other than to say that you are possibly doing IO.
> First of all, iowait is not a kernel state; it is taken from idle time. This
> means that if there is no, or too little, idle time, the iowait that should
> be there is gone.
> Second, the calculation that converts idle time into iowait is done for
> synchronous IO calls only. That is currently not a problem for postgres,
> because it uses exactly that, but in the future it might be.
> Very roughly put, what the kernel does is keep a counter of tasks currently
> in certain system IO calls, and then try to express that as iowait. The time
> in iowait can't be used to calculate any IO facts.
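>
> For illustration only: the iowait that sar reports comes from the cpu lines
> in /proc/stat, and a more direct check for tasks actually blocked on IO is
> to look for processes in uninterruptible sleep ('D' state). A rough sketch
> using standard tools:
>
>   # aggregate and per-cpu counters; field order is:
>   # user nice system idle iowait irq softirq steal guest guest_nice
>   grep '^cpu' /proc/stat
>
>   # processes currently in uninterruptible sleep (usually waiting on IO)
>   ps -eo state,pid,comm | awk '$1 == "D"'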
>
> In that sense, that puts it in the same area as the load figure: indicative,
> but mostly useless, because it doesn't give you any facts about what it is
> expressing.
> >
> > Check if you have transparent hugepages enabled:
> >
> >  cat /sys/kernel/mm/transparent_hugepage/enabled
> >
> > If they are enabled, disable them and see if it makes a difference.
> >
> > I am only guessing here.
> Absolutely. Anything that uses significant amounts of memory and was not
> written to take advantage of transparent hugepages will probably experience
> more downsides from THP than benefits.
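>
> In case it helps, a rough sketch of how THP is usually disabled (runtime
> change lasts until reboot; test this on your own system first):
>
>   # runtime, as root
>   echo never > /sys/kernel/mm/transparent_hugepage/enabled
>   echo never > /sys/kernel/mm/transparent_hugepage/defrag
>
>   # persistent: add "transparent_hugepage=never" to the kernel command line,
>   # e.g. on RHEL8 with grubby, then reboot
>   grubby --update-kernel=ALL --args="transparent_hugepage=never"
>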
> >
> >> many sessions were running the same "DELETE FROM xxxx" in parallel,
> >> waiting on "extend" and "DataFileRead". There are "after delete" triggers
> >> on this table that insert into/delete from other tables.
> >
> > One thing that almost certainly would improve your situation is to run
> > fewer concurrent statements, for example by using a reasonably sized
> > connection pool.
> This is true if the limits of the IO device, or of anything on the path to
> the IO device or devices, are hit.
> And in general, high "%sys" (lots of time spent in kernel mode, also known
> as system time) indicates lots of time spent in system calls, which is what
> the read and write calls in postgres are.
> Therefore these figures suggest blocking for IO, for which Laurenz's advice
> to lower the number of concurrent sessions doing IO in general makes sense.
> A more nuanced analysis: if IO requests get queued, they will wait in 'D'
> state in linux, which by definition is off cpu, and thus they do not spend
> cpu (system/kernel) time.
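>
> Coming back to the pooling suggestion: a minimal pgbouncer setup in
> transaction pooling mode could look roughly like the sketch below. The
> database name, auth file path and pool sizes are placeholders, not values
> from your environment:
>
>   ; pgbouncer.ini
>   [databases]
>   mydb = host=127.0.0.1 port=5432 dbname=mydb
>
>   [pgbouncer]
>   listen_addr = 127.0.0.1
>   listen_port = 6432
>   auth_type = md5
>   auth_file = /etc/pgbouncer/userlist.txt
>   pool_mode = transaction
>   default_pool_size = 20
>   max_client_conn = 500
>
> The application then connects to port 6432 instead of 5432, so at most
> default_pool_size server connections run statements concurrently.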
>
> What sounds suspicious is that you indicate you see no significant change in
> the amount of IO in iostat.
>
> In order to understand this, you will first have to carefully find the
> actual physical IO devices that you are using for postgres IO.
> In current linux this can be tricky, depending on what the hardware or
> virtual machine looks like, and how the disks are arranged in linux.
> What you need to determine is which actual disk devices are used, and what
> their limits are.
> Limits for any disk are IOPS (IO operations per second) and MBPS (megabytes
> per second, i.e. bandwidth).
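>
> A rough way to map the data directory to the underlying block devices and
> watch their utilisation (the data directory path below is just an example):
>
>   # which filesystem/device holds the postgres data directory
>   df /var/lib/pgsql/14/data
>
>   # how that device is built up (partitions, LVM, multipath, ...)
>   lsblk -o NAME,TYPE,SIZE,MOUNTPOINT
>
>   # per-device IOPS (r/s, w/s), throughput (rkB/s, wkB/s), latency
>   # (r_await, w_await) and %util, refreshed every second
>   iostat -x 1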
>
> There is an additional thing to realize, which makes this really tricky: for
> common IO, postgres uses buffered IO.
> Buffered IO means any read or write goes through the linux buffer cache, and
> reads or writes can be served from the buffer cache if possible.
>
> So in your case, even if you made the database perform identical read or
> write requests, there could be a difference in the number of read and write
> IOs served from the cache, which can make an enormous difference in how fast
> these requests are served. If somehow the operating system ends up choosing
> the physical IO path, you will see significant amounts of time spent on
> that, which will show up as IO related wait events.
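>
> One way to see part of this from the database side: the pg_statio views
> distinguish blocks found in shared_buffers (heap_blks_hit) from blocks that
> had to be requested from the operating system (heap_blks_read); note that a
> "read" here may still be served from the linux page cache without touching
> the disk. A rough check, sorted by OS-level reads:
>
>   SELECT relname, heap_blks_read, heap_blks_hit
>   FROM pg_statio_user_tables
>   ORDER BY heap_blks_read DESC
>   LIMIT 10;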
>
> Not a simple answer, but this is how it works.
>
> So I would suggest checking the difference between the situation where it is
> doing the same work and performing well, versus the one where it is
> performing badly.
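>
> For that comparison, sampling the wait events during both periods is an easy
> starting point, for example (run repeatedly during each period):
>
>   SELECT wait_event_type, wait_event, count(*)
>   FROM pg_stat_activity
>   WHERE state = 'active'
>   GROUP BY 1, 2
>   ORDER BY count(*) DESC;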
>
>
> >
> > Yours,
> > Laurenz Albe
> >
> >
>
>
