>Original
>From: Jakub Wartak <[email protected]>
>Date: 2026-04-30 19:14
>To: PostgreSQL Hackers <[email protected]>
>Subject: md: measure just FileSync() for pgstat_io without FileClose()
>
>
>Hi,
>
>As?per?the?attachment,?shouldn't?we?exclude?the?timing?of?close()?from
>fsync()'s?duration?
>
>-J.

Hi Jakub,

Thanks a lot for your patch!

I agree that ideally the call to `pgstat_count_io_op_time(..., IOOP_FSYNC, 
...)` inside `mdsyncfiletag()` 
should only measure and accurately reflect the actual execution time of 
`FileSync()`.

However, I haven’t figured out a clean way to capture and retain the required 
timing metrics within 
the existing logic of `mdsyncfiletag()`, so that we can feed them into 
`pgstat_count_io_op_time()` (or 
its variants) for proper IO statistics tracking. Conceptually, the ideal flow 
would look like this:
```c
io_start = pgstat_prepare_io_time(track_io_timing);
result = FileSync(file, WAIT_EVENT_DATA_FILE_SYNC);
io_end = pgstat_get_io_time???();

if (need_to_close)
    FileClose(file);
pgstat_count_io_op_time_v2(..., io_start, io_end, ...);
```

I’ve been thinking through the logic of `PathNameOpenFile()`, the subsequent 
`FileClose()` call, 
and how this sequence impacts the fsync timing measurement inside 
`mdsyncfiletag()`. 

Let’s break down all heavy or complex work done inside `FileClose()` for 
temporary file handles 
opened and closed within `mdsyncfiletag()`:

1. AIO-related logic
Files opened via `PathNameOpenFile()` get flags `O_RDWR | PG_BINARY` 
(potentially combined 
with `PG_O_DIRECT`). When we call `FileClose()`, it invokes 
`pgaio_closing_fd(vfdP->fd)`.

I’ll admit I haven’t fully wrapped my head around the full AIO subsystem, which 
is quite complex. 
For these data file handles, we only pass `PG_O_DIRECT` at open time and do not 
leverage AIO 
or io_uring for the I/O path. That leads me to suspect `pgaio_closing_fd()` 
will trigger no meaningful 
AIO cleanup work here. This is especially relevant given the `while` loop 
inside `pgaio_closing_fd()` that 
may invoke the potentially slow blocking wait `pgaio_io_wait()`. 

My analysis on this part is tentative, and I welcome any corrections or further 
insights from the community.

2. Logic tied to `vfdP->fdstate`
When opening a file via `PathNameOpenFile()`, `vfdP->fdstate` is initialized to 
`0x0`.
During `FileClose()`, the checks `(vfdP->fdstate & FD_TEMP_FILE_LIMIT)` and 
`(vfdP->fdstate & FD_DELETE_AT_CLOSE)` 
will both evaluate to false, so none of the associated cleanup routines for 
files will run.

3. Logic tied to `vfdP->resowner`
`PathNameOpenFile()` sets `vfdP->resowner = NULL`.
In turn, the conditional `if (vfdP->resowner)` inside `FileClose()` will not be 
satisfied, 
meaning no resource owner cleanup logic will execute either.

To sum up: the file handles we temporarily open and close locally within 
`mdsyncfiletag()` 
do not trigger any expensive cleanup paths in `FileClose()`. The overhead 
introduced by 
closing these file descriptors should be negligible compared to the latency of 
the preceding `FileSync()` call. 
I should note I have not yet run comprehensive benchmark tests to quantify this 
overhead
, so this is only a qualitative assessment.

From a broader system design perspective, closing the file descriptor ahead of 
emitting 
IO statistics via `pgstat_count_io_op_time()` is also a reasonable performance 
optimization: 
we release unused file resources as early as possible.

Happy to discuss this further, and I’m open to any feedback, alternative ideas 
or corrections you might have.


regards,
--
ZizhuanLiu (X-MAN) 
[email protected]

Reply via email to