Hi, Masahiro
Thanks for you update the v4 patch. Here are some comments: (1) + char *msg = NULL; + bool sync_called; /* whether to sync data to the disk. */ + instr_time start; + instr_time duration; + + /* check whether to sync data to the disk is really occurred. */ + sync_called = false; Maybe we can initialize the "sync_called" variable when declare it. (2) + if (sync_called) + { + /* increment the i/o timing and the number of times to fsync WAL data */ + if (track_wal_io_timing) + { + INSTR_TIME_SET_CURRENT(duration); + INSTR_TIME_SUBTRACT(duration, start); + WalStats.m_wal_sync_time = INSTR_TIME_GET_MICROSEC(duration); + } + + WalStats.m_wal_sync++; + } There is an extra space before INSTR_TIME_GET_MICROSEC(duration). In the issue_xlog_fsync(), the comment says that if sync_method is SYNC_METHOD_OPEN or SYNC_METHOD_OPEN_DSYNC, it already write synced. Does that mean it synced when write the WAL data? And for those cases, we cannot get accurate write/sync timing and number of write/sync times, right? case SYNC_METHOD_OPEN: case SYNC_METHOD_OPEN_DSYNC: /* write synced it already */ break; On Fri, 22 Jan 2021 at 21:05, Masahiro Ikeda <ikeda...@oss.nttdata.com> wrote: > On 2021-01-22 14:50, Masahiko Sawada wrote: >> On Fri, Dec 25, 2020 at 6:46 PM Masahiro Ikeda >> <ikeda...@oss.nttdata.com> wrote: >>> >>> Hi, >>> >>> I rebased the patch to the master branch. >> >> Thank you for working on this. I've read the latest patch. Here are >> comments: >> >> --- >> + if (track_wal_io_timing) >> + { >> + INSTR_TIME_SET_CURRENT(duration); >> + INSTR_TIME_SUBTRACT(duration, start); >> + WalStats.m_wal_write_time += >> INSTR_TIME_GET_MILLISEC(duration); >> + } >> >> * I think it should add the time in micro sec. >> After running pgbench with track_wal_io_timing = on for 30 sec, >> pg_stat_wal showed the following on my environment: >> >> postgres(1:61569)=# select * from pg_stat_wal; >> -[ RECORD 1 ]----+----------------------------- >> wal_records | 285947 >> wal_fpi | 53285 >> wal_bytes | 442008213 >> wal_buffers_full | 0 >> wal_write | 25516 >> wal_write_time | 0 >> wal_sync | 25437 >> wal_sync_time | 14490 >> stats_reset | 2021-01-22 10:56:13.29464+09 >> >> Since writes can complete less than a millisecond, wal_write_time >> didn't increase. I think sync_time could also have the same problem. > > Thanks for your comments. I didn't notice that. > I changed the unit from milliseconds to microseconds. > >> --- >> + /* >> + * Measure i/o timing to fsync WAL data. >> + * >> + * The wal receiver skip to collect it to avoid performance >> degradation of standy servers. >> + * If sync_method doesn't have its fsync method, to skip too. >> + */ >> + if (!AmWalReceiverProcess() && track_wal_io_timing && >> fsyncMethodCalled()) >> + INSTR_TIME_SET_CURRENT(start); >> >> * Why does only the wal receiver skip it even if track_wal_io_timinig >> is true? I think the performance degradation is also true for backend >> processes. If there is another reason for that, I think it's better to >> mention in both the doc and comment. >> * How about checking track_wal_io_timing first? >> * s/standy/standby/ > > I fixed it. > As kuroda-san mentioned too, the skip is no need to be considered. > >> --- >> + /* increment the i/o timing and the number of times to fsync WAL >> data */ >> + if (fsyncMethodCalled()) >> + { >> + if (!AmWalReceiverProcess() && track_wal_io_timing) >> + { >> + INSTR_TIME_SET_CURRENT(duration); >> + INSTR_TIME_SUBTRACT(duration, start); >> + WalStats.m_wal_sync_time += >> INSTR_TIME_GET_MILLISEC(duration); >> + } >> + >> + WalStats.m_wal_sync++; >> + } >> >> * I'd avoid always calling fsyncMethodCalled() in this path. How about >> incrementing m_wal_sync after each sync operation? > > I think if syncing the disk does not occur, m_wal_sync should not be > incremented. > It depends enableFsync and sync_method. > > enableFsync is checked in each fsync method like > pg_fsync_no_writethrough(), > so if incrementing m_wal_sync after each sync operation, it should be > implemented > in each fsync method. It leads to many duplicated codes. > > So, why don't you change the function to a flag whether to > sync data to the disk will be occurred or not in issue_xlog_fsync()? > > >> --- >> +/* >> + * Check if fsync mothod is called. >> + */ >> +static bool >> +fsyncMethodCalled() >> +{ >> + if (!enableFsync) >> + return false; >> + >> + switch (sync_method) >> + { >> + case SYNC_METHOD_FSYNC: >> + case SYNC_METHOD_FSYNC_WRITETHROUGH: >> + case SYNC_METHOD_FDATASYNC: >> + return true; >> + default: >> + /* others don't have a specific fsync method */ >> + return false; >> + } >> +} >> >> * I'm concerned that the function name could confuse the reader >> because it's called even before the fsync method is called. As I >> commented above, calling to fsyncMethodCalled() can be eliminated. >> That way, this function is called at only once. So do we really need >> this function? > > Thanks to your comments, I removed them. > > >> * As far as I read the code, issue_xlog_fsync() seems to do fsync even >> if enableFsync is false. Why does the function return false in that >> case? I might be missing something. > > IIUC, the reason is that I thought that each fsync functions like > pg_fsync_no_writethrough() check enableFsync. > > If this code doesn't check, m_wal_sync_time may be incremented > even though some sync methods like SYNC_METHOD_OPEN don't call to sync > some data to the disk at the time. > >> * void is missing as argument? >> >> * s/mothod/method/ > > I removed them. > > > Regards, -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.