>Original
>From: Jakub Wartak <[email protected]>
>Date: 2026-04-30 19:14
>To: PostgreSQL Hackers <[email protected]>
>Subject: md: measure just FileSync() for pgstat_io without FileClose()
>
>
>Hi,
>
>As?per?the?attachment,?shouldn't?we?exclude?the?timing?of?close()?from
>fsync()'s?duration?
>
>-J.
Hi Jakub,
Thanks a lot for your patch!
I agree that ideally the call to `pgstat_count_io_op_time(..., IOOP_FSYNC,
...)` inside `mdsyncfiletag()`
should only measure and accurately reflect the actual execution time of
`FileSync()`.
However, I haven’t figured out a clean way to capture and retain the required
timing metrics within
the existing logic of `mdsyncfiletag()`, so that we can feed them into
`pgstat_count_io_op_time()` (or
its variants) for proper IO statistics tracking. Conceptually, the ideal flow
would look like this:
```c
io_start = pgstat_prepare_io_time(track_io_timing);
result = FileSync(file, WAIT_EVENT_DATA_FILE_SYNC);
io_end = pgstat_get_io_time???();
if (need_to_close)
FileClose(file);
pgstat_count_io_op_time_v2(..., io_start, io_end, ...);
```
I’ve been thinking through the logic of `PathNameOpenFile()`, the subsequent
`FileClose()` call,
and how this sequence impacts the fsync timing measurement inside
`mdsyncfiletag()`.
Let’s break down all heavy or complex work done inside `FileClose()` for
temporary file handles
opened and closed within `mdsyncfiletag()`:
1. AIO-related logic
Files opened via `PathNameOpenFile()` get flags `O_RDWR | PG_BINARY`
(potentially combined
with `PG_O_DIRECT`). When we call `FileClose()`, it invokes
`pgaio_closing_fd(vfdP->fd)`.
I’ll admit I haven’t fully wrapped my head around the full AIO subsystem, which
is quite complex.
For these data file handles, we only pass `PG_O_DIRECT` at open time and do not
leverage AIO
or io_uring for the I/O path. That leads me to suspect `pgaio_closing_fd()`
will trigger no meaningful
AIO cleanup work here. This is especially relevant given the `while` loop
inside `pgaio_closing_fd()` that
may invoke the potentially slow blocking wait `pgaio_io_wait()`.
My analysis on this part is tentative, and I welcome any corrections or further
insights from the community.
2. Logic tied to `vfdP->fdstate`
When opening a file via `PathNameOpenFile()`, `vfdP->fdstate` is initialized to
`0x0`.
During `FileClose()`, the checks `(vfdP->fdstate & FD_TEMP_FILE_LIMIT)` and
`(vfdP->fdstate & FD_DELETE_AT_CLOSE)`
will both evaluate to false, so none of the associated cleanup routines for
files will run.
3. Logic tied to `vfdP->resowner`
`PathNameOpenFile()` sets `vfdP->resowner = NULL`.
In turn, the conditional `if (vfdP->resowner)` inside `FileClose()` will not be
satisfied,
meaning no resource owner cleanup logic will execute either.
To sum up: the file handles we temporarily open and close locally within
`mdsyncfiletag()`
do not trigger any expensive cleanup paths in `FileClose()`. The overhead
introduced by
closing these file descriptors should be negligible compared to the latency of
the preceding `FileSync()` call.
I should note I have not yet run comprehensive benchmark tests to quantify this
overhead
, so this is only a qualitative assessment.
From a broader system design perspective, closing the file descriptor ahead of
emitting
IO statistics via `pgstat_count_io_op_time()` is also a reasonable performance
optimization:
we release unused file resources as early as possible.
Happy to discuss this further, and I’m open to any feedback, alternative ideas
or corrections you might have.
regards,
--
ZizhuanLiu (X-MAN)
[email protected]