On 2021/05/17 22:34, Fujii Masao wrote: > > > On 2021/05/17 16:39, Masahiro Ikeda wrote: >> >> Thanks for your comments! >> >>>> + * is executed, wal records aren't nomally generated (although HOT >>>> makes >>> >>> nomally -> normally? >> >> Yes, fixed. >> >>>> + * It's not enough to check the number of generated wal records, for >>>> + * example the walwriter may write/sync the WAL although it doesn't >>> >>> You use both 'wal' and 'WAL' in the comments, but are there any intension? >> >> No, I changed to use 'WAL' only. Although some other comments have 'wal' and >> 'WAL', it seems that 'WAL' is often used. > > Thanks for updating the patch!
Thanks a lot of comments! > + * Buffer and generated WAL usage counters. > + * > + * The counters are accumulated values. There are infrastructures > + * to add the values and calculate the difference within a specific period. > > Is it really worth adding these comments here? BufferUsage has almost same comments. So, I removed it. > + * Note: regarding to WAL statistics counters, it's not enough to check > + * only whether the WAL record is generated or not. If a read transaction > + * is executed, WAL records aren't normally generated (although HOT makes > + * WAL records). But, just writes and syncs the WAL data may happen when > + * to flush buffers. > > Aren't the following comments better? > > ------------------------------ > To determine whether any WAL activity has occurred since last time, not only > the number of generated WAL records but also the numbers of WAL writes and > syncs need to be checked. Because even transaction that generates no WAL > records can write or sync WAL data when flushing the data pages. > ------------------------------ Thanks. Yes, I fixed it. > - * This function can be called even if nothing at all has happened. In > - * this case, avoid sending a completely empty message to the stats > - * collector. > > I think that it's better to leave this comment as it is. OK. I leave it. > + * First, to check the WAL stats counters were updated. > + * > + * Even if the 'force' is true, we don't need to send the stats if the > + * counters were not updated. > + * > + * We can know whether the counters were updated or not to check only > + * three counters. So, for performance, we don't allocate another memory > + * spaces and check the all stats like pgstat_send_slru(). > > Is it really worth adding these comments here? I removed them because the following comments are enough. * This function can be called even if nothing at all has happened. In * this case, avoid sending a completely empty message to the stats * collector. > + * The current 'wal_records' is the same as the previous one means that > + * WAL records weren't generated. So, the counters of 'wal_fpi', > + * 'wal_bytes', 'm_wal_buffers_full' are not updated neither. > + * > + * It's not enough to check the number of generated WAL records, for > + * example the walwriter may write/sync the WAL although it doesn't > + * generate WAL records. 'm_wal_write' and 'm_wal_sync' are zero means > the > + * counters of time spent are zero too. > > Aren't the following comments better? > > ------------------------ > Check wal_records counter to determine whether any WAL activity has happened > since last time. Note that other WalUsage counters don't need to be checked > because they are incremented always together with wal_records counter. > > m_wal_buffers_full also doesn't need to be checked because it's incremented > only when at least one WAL record is generated (i.e., wal_records counter is > incremented). But for safely, we assert that m_wal_buffers_full is always zero > when no WAL record is generated > > This function can be called by a process like walwriter that normally > generates no WAL records. To determine whether any WAL activity has happened > at that process since the last time, the numbers of WAL writes and syncs are > also checked. > ------------------------ Yes, I modified them. > + * The accumulated counters for buffer usage. > + * > + * The reason the counters are accumulated values is to avoid unexpected > + * reset because many callers may use them. > > Aren't the following comments better? > > ------------------------ > These counters keep being incremented infinitely, i.e., must never be reset to > zero, so that we can calculate how much the counters are incremented in an > arbitrary period. > ------------------------ Yes, thanks. I fixed it. Regards, -- Masahiro Ikeda NTT DATA CORPORATION
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c index cdd07770a0..75a95f3de7 100644 --- a/src/backend/postmaster/checkpointer.c +++ b/src/backend/postmaster/checkpointer.c @@ -505,7 +505,7 @@ CheckpointerMain(void) pgstat_send_bgwriter(); /* Send WAL statistics to the stats collector. */ - pgstat_report_wal(); + pgstat_send_wal(true); /* * If any checkpoint flags have been set, redo the loop to handle the @@ -583,7 +583,7 @@ HandleCheckpointerInterrupts(void) BgWriterStats.m_requested_checkpoints++; ShutdownXLOG(0, 0); pgstat_send_bgwriter(); - pgstat_report_wal(); + pgstat_send_wal(true); /* Normal exit from the checkpointer is here */ proc_exit(0); /* done */ diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 249b17c92b..5dceccd2fa 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -133,8 +133,8 @@ PgStat_MsgWal WalStats; /* * WAL usage counters saved from pgWALUsage at the previous call to - * pgstat_report_wal(). This is used to calculate how much WAL usage - * happens between pgstat_report_wal() calls, by substracting + * pgstat_send_wal(). This is used to calculate how much WAL usage + * happens between pgstat_send_wal() calls, by substracting * the previous counters from the current ones. */ static WalUsage prevWalUsage; @@ -852,9 +852,18 @@ pgstat_report_stat(bool disconnect) TabStatusArray *tsa; int i; - /* Don't expend a clock check if nothing to do */ + /* + * Don't expend a clock check if nothing to do. + * + * To determine whether any WAL activity has occurred since last time, not only + * the number of generated WAL records but also the numbers of WAL writes and + * syncs need to be checked. Because even transaction that generates no WAL + * records can write or sync WAL data when flushing the data pages. + */ if ((pgStatTabList == NULL || pgStatTabList->tsa_used == 0) && pgStatXactCommit == 0 && pgStatXactRollback == 0 && + pgWalUsage.wal_records == prevWalUsage.wal_records && + WalStats.m_wal_write == 0 && WalStats.m_wal_sync == 0 && !have_function_stats && !disconnect) return; @@ -948,7 +957,7 @@ pgstat_report_stat(bool disconnect) pgstat_send_funcstats(); /* Send WAL statistics */ - pgstat_report_wal(); + pgstat_send_wal(true); /* Finally send SLRU statistics */ pgstat_send_slru(); @@ -2918,7 +2927,7 @@ void pgstat_initialize(void) { /* - * Initialize prevWalUsage with pgWalUsage so that pgstat_report_wal() can + * Initialize prevWalUsage with pgWalUsage so that pgstat_send_wal() can * calculate how much pgWalUsage counters are increased by substracting * prevWalUsage from pgWalUsage. */ @@ -3030,44 +3039,6 @@ pgstat_send_bgwriter(void) MemSet(&BgWriterStats, 0, sizeof(BgWriterStats)); } -/* ---------- - * pgstat_report_wal() - - * - * Calculate how much WAL usage counters are increased and send - * WAL statistics to the collector. - * - * Must be called by processes that generate WAL. - * ---------- - */ -void -pgstat_report_wal(void) -{ - WalUsage walusage; - - /* - * Calculate how much WAL usage counters are increased by substracting the - * previous counters from the current ones. Fill the results in WAL stats - * message. - */ - MemSet(&walusage, 0, sizeof(WalUsage)); - WalUsageAccumDiff(&walusage, &pgWalUsage, &prevWalUsage); - - WalStats.m_wal_records = walusage.wal_records; - WalStats.m_wal_fpi = walusage.wal_fpi; - WalStats.m_wal_bytes = walusage.wal_bytes; - - /* - * Send WAL stats message to the collector. - */ - if (!pgstat_send_wal(true)) - return; - - /* - * Save the current counters for the subsequent calculation of WAL usage. - */ - prevWalUsage = pgWalUsage; -} - /* ---------- * pgstat_send_wal() - * @@ -3075,24 +3046,38 @@ pgstat_report_wal(void) * * If 'force' is not set, WAL stats message is only sent if enough time has * passed since last one was sent to reach PGSTAT_STAT_INTERVAL. - * - * Return true if the message is sent, and false otherwise. * ---------- */ -bool +void pgstat_send_wal(bool force) { - /* We assume this initializes to zeroes */ - static const PgStat_MsgWal all_zeroes; static TimestampTz sendTime = 0; /* * This function can be called even if nothing at all has happened. In * this case, avoid sending a completely empty message to the stats * collector. + * + * Check wal_records counter to determine whether any WAL activity has happened + * since last time. Note that other WalUsage counters don't need to be checked + * because they are incremented always together with wal_records counter. + * + * m_wal_buffers_full also doesn't need to be checked because it's incremented + * only when at least one WAL record is generated (i.e., wal_records counter is + * incremented). But for safely, we assert that m_wal_buffers_full is always zero + * when no WAL record is generated + * + * This function can be called by a process like walwriter that normally + * generates no WAL records. To determine whether any WAL activity has happened + * at that process since the last time, the numbers of WAL writes and syncs are + * also checked. */ - if (memcmp(&WalStats, &all_zeroes, sizeof(PgStat_MsgWal)) == 0) - return false; + if (pgWalUsage.wal_records == prevWalUsage.wal_records && + WalStats.m_wal_write == 0 && WalStats.m_wal_sync == 0) + { + Assert(WalStats.m_wal_buffers_full == 0); + return; + } if (!force) { @@ -3100,13 +3085,40 @@ pgstat_send_wal(bool force) /* * Don't send a message unless it's been at least PGSTAT_STAT_INTERVAL - * msec since we last sent one. + * msec since we last sent one to avoid overloading the stats + * collector. */ if (!TimestampDifferenceExceeds(sendTime, now, PGSTAT_STAT_INTERVAL)) - return false; + return; sendTime = now; } + /* + * Set the counters related to generated WAL data if the counters were + * updated. + */ + if (pgWalUsage.wal_records != prevWalUsage.wal_records) + { + WalUsage walusage; + + /* + * Calculate how much WAL usage counters were increased by substracting + * the previous counters from the current ones. Fill the results in + * WAL stats message. + */ + MemSet(&walusage, 0, sizeof(WalUsage)); + WalUsageAccumDiff(&walusage, &pgWalUsage, &prevWalUsage); + + WalStats.m_wal_records = walusage.wal_records; + WalStats.m_wal_fpi = walusage.wal_fpi; + WalStats.m_wal_bytes = walusage.wal_bytes; + + /* + * Save the current counters for the subsequent calculation of WAL usage. + */ + prevWalUsage = pgWalUsage; + } + /* * Prepare and send the message */ @@ -3117,8 +3129,6 @@ pgstat_send_wal(bool force) * Clear out the statistics buffer, so it can be re-used. */ MemSet(&WalStats, 0, sizeof(WalStats)); - - return true; } /* ---------- diff --git a/src/include/executor/instrument.h b/src/include/executor/instrument.h index fc87eed4fb..2a7b3f1233 100644 --- a/src/include/executor/instrument.h +++ b/src/include/executor/instrument.h @@ -16,6 +16,11 @@ #include "portability/instr_time.h" +/* + * These counters keep being incremented infinitely, i.e., must never be reset to + * zero, so that we can calculate how much the counters are incremented in an + * arbitrary period. + */ typedef struct BufferUsage { int64 shared_blks_hit; /* # of shared buffer hits */ @@ -32,6 +37,15 @@ typedef struct BufferUsage instr_time blk_write_time; /* time spent writing */ } BufferUsage; +/* + * The accumulated counters for generated WAL usage. + * + * The reason the counters are accumulated values is the same as BufferUsage's one. + * And the reason to store only generated WAL usage and doesn't store WAL I/O + * activity, is that this is assumed for knowing the WAL usage in per query or + * transaction. So, common resources for the cluster like WAL I/O activity is + * not stored. + */ typedef struct WalUsage { int64 wal_records; /* # of WAL records produced */ diff --git a/src/include/pgstat.h b/src/include/pgstat.h index 5fbd3a05ba..9612c0a6c2 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -1091,8 +1091,7 @@ extern void pgstat_twophase_postabort(TransactionId xid, uint16 info, extern void pgstat_send_archiver(const char *xlog, bool failed); extern void pgstat_send_bgwriter(void); -extern void pgstat_report_wal(void); -extern bool pgstat_send_wal(bool force); +extern void pgstat_send_wal(bool force); /* ---------- * Support functions for the SQL-callable functions to