On 2021-01-25 10:36, Masahiko Sawada wrote:
On Fri, Jan 22, 2021 at 10:05 PM Masahiro Ikeda
<ikeda...@oss.nttdata.com> wrote:
On 2021-01-22 14:50, Masahiko Sawada wrote:
> On Fri, Dec 25, 2020 at 6:46 PM Masahiro Ikeda
> <ikeda...@oss.nttdata.com> wrote:
>>
>> Hi,
>>
>> I rebased the patch to the master branch.
>
> Thank you for working on this. I've read the latest patch. Here are
> comments:
>
> ---
> + if (track_wal_io_timing)
> + {
> + INSTR_TIME_SET_CURRENT(duration);
> + INSTR_TIME_SUBTRACT(duration, start);
> + WalStats.m_wal_write_time +=
> INSTR_TIME_GET_MILLISEC(duration);
> + }
>
> * I think it should add the time in micro sec.
> After running pgbench with track_wal_io_timing = on for 30 sec,
> pg_stat_wal showed the following on my environment:
>
> postgres(1:61569)=# select * from pg_stat_wal;
> -[ RECORD 1 ]----+-----------------------------
> wal_records | 285947
> wal_fpi | 53285
> wal_bytes | 442008213
> wal_buffers_full | 0
> wal_write | 25516
> wal_write_time | 0
> wal_sync | 25437
> wal_sync_time | 14490
> stats_reset | 2021-01-22 10:56:13.29464+09
>
> Since writes can complete less than a millisecond, wal_write_time
> didn't increase. I think sync_time could also have the same problem.
Thanks for your comments. I didn't notice that.
I changed the unit from milliseconds to microseconds.
> ---
> + /*
> + * Measure i/o timing to fsync WAL data.
> + *
> + * The wal receiver skip to collect it to avoid performance
> degradation of standy servers.
> + * If sync_method doesn't have its fsync method, to skip too.
> + */
> + if (!AmWalReceiverProcess() && track_wal_io_timing &&
> fsyncMethodCalled())
> + INSTR_TIME_SET_CURRENT(start);
>
> * Why does only the wal receiver skip it even if track_wal_io_timinig
> is true? I think the performance degradation is also true for backend
> processes. If there is another reason for that, I think it's better to
> mention in both the doc and comment.
> * How about checking track_wal_io_timing first?
> * s/standy/standby/
I fixed it.
As kuroda-san mentioned too, the skip is no need to be considered.
I think you also removed the code to have the wal receiver report the
stats. So with the latest patch, the wal receiver tracks those
statistics but doesn't report.
And maybe XLogWalRcvWrite() also needs to track I/O?
Thanks, I forgot to add them.
I'll fix it.
> ---
> + /* increment the i/o timing and the number of times to fsync WAL
> data */
> + if (fsyncMethodCalled())
> + {
> + if (!AmWalReceiverProcess() && track_wal_io_timing)
> + {
> + INSTR_TIME_SET_CURRENT(duration);
> + INSTR_TIME_SUBTRACT(duration, start);
> + WalStats.m_wal_sync_time +=
> INSTR_TIME_GET_MILLISEC(duration);
> + }
> +
> + WalStats.m_wal_sync++;
> + }
>
> * I'd avoid always calling fsyncMethodCalled() in this path. How about
> incrementing m_wal_sync after each sync operation?
I think if syncing the disk does not occur, m_wal_sync should not be
incremented.
It depends enableFsync and sync_method.
enableFsync is checked in each fsync method like
pg_fsync_no_writethrough(),
so if incrementing m_wal_sync after each sync operation, it should be
implemented
in each fsync method. It leads to many duplicated codes.
Right. I missed that each fsync function checks enableFsync.
So, why don't you change the function to a flag whether to
sync data to the disk will be occurred or not in issue_xlog_fsync()?
Looks better. Since we don't necessarily need to increment m_wal_sync
after doing fsync we can write the code without an additional variable
as follows:
if (enableFsync)
{
switch (sync_method)
{
case SYNC_METHOD_FSYNC:
#ifdef HAVE_FSYNC_WRITETHROUGH
case SYNC_METHOD_FSYNC_WRITETHROUGH:
#endif
#ifdef HAVE_FDATASYNC
case SYNC_METHOD_FDATASYNC:
#endif
WalStats.m_wal_sync++;
if (track_wal_io_timing)
INSTR_TIME_SET_CURRENT(start);
break;
default:
break;
}
}
(do fsync and error handling here)
/* increment the i/o timing and the number of times to fsync WAL
data */
if (track_wal_io_timing)
{
INSTR_TIME_SET_CURRENT(duration);
INSTR_TIME_SUBTRACT(duration, start);
WalStats.m_wal_sync_time = INSTR_TIME_GET_MICROSEC(duration);
}
IIUC, I think we can't handle the following case.
When "sync_method" is SYNC_METHOD_OPEN or SYNC_METHOD_OPEN_DSYNC and
"track_wal_io_timing" is enabled, "start" doesn't be initialized.
My understanding is something wrong, isn't it?
I think we can change the first switch-case to an if statement.
Yes, I'll change it.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION