On 2020/02/18 14:20, Kyotaro Horiguchi wrote:
Hello.

At Tue, 18 Feb 2020 12:25:51 +0900, Fujii Masao <masao.fu...@oss.nttdata.com> 
wrote in
Hi,

RecoveryWalAll and RecoveryWalStream wait events are documented as
follows.

     RecoveryWalAll
     Waiting for WAL from any kind of source (local, archive or stream) at
     recovery.

     RecoveryWalStream
     Waiting for WAL from a stream at recovery.

But as far as I read the code, RecoveryWalAll is reported only when
waiting
for WAL from a stream. So the current description looks
incorrect. What's
described now for RecoveryWalStream seems rather fit to
RecoveryWalAll.
I'd like to change the description of RecoveryWalAll to "Waiting for
WAL
  from a stream at recovery".

Good catch!

Regarding RecoveryWalStream, as far as I read the code, while this
event is
being reported, the startup process is waiting for next trial to
retrieve
WAL data when WAL data is not available from any sources, based on
wal_retrieve_retry_interval. So this current description looks also
incorrect. I'd like to change it to "Waiting when WAL data is not
available
  from any kind of sources (local, archive or stream) before trying
  again
  to retrieve WAL data".

Thought?

I agree that the corrected description sound correct in meaning. The
latter seems a bit lengthy, though.

Yeah, so better idea?

Anyway, attached is the patch (fix_recovery_wait_event_doc_v1.patch)
that fixes the descriptions of those wait events. This should be
back-patched to v9.5.

Also the current names of these wait events sound confusing. I think
that RecoveryWalAll should be changed to RecoveryWalStream.
RecoveryWalStream should be RecoveryRetrieveRetryInterval or
something.

I agree to the former, I think RecoveryWalInterval works well enough.

RecoveryWalInterval sounds confusing to me...

Attached is the patch (improve_recovery_wait_event_for_master_v1.patch) that
changes the names and types of wait events. This patch uses
RecoveryRetrieveRetryInterval, but if there is better name,
I will adopt that.

Note that this patch needs to be applied after
fix_recovery_wait_event_doc_v1.patch is applied.

Another problem is that the current wait event types of them also look
strange. Currently the type of them is Activity, but IMO it's better
to
use IPC for RecoveryWalAll because it's waiting for walreceiver to
receive new WAL. Also it's better to use Timeout for RecoveryWalStream
because it's waiting depending on wal_retrieve_retry_interval.

Do you mean condition variable by the "IPC"? But the WaitLatch waits
not only for new WAL but also for trigger, SIGHUP, shutdown and
walreceiver events other than new WAL. I'm not sure that condition
variable fits for the purpose.

OK, I didn't change the type of RecoveryWalStream to IPC, in the patch.
Its type is still Activity.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 87586a7b06..c641361418 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1270,12 +1270,16 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   
11:34   0:00 postgres: ser
         </row>
         <row>
          <entry><literal>RecoveryWalAll</literal></entry>
-         <entry>Waiting for WAL from any kind of source (local, archive or 
stream) at recovery.</entry>
-        </row>
-        <row>
-         <entry><literal>RecoveryWalStream</literal></entry>
          <entry>Waiting for WAL from a stream at recovery.</entry>
         </row>
+        <row>
+         <entry><literal>RecoveryWalStream</literal></entry>
+         <entry>
+          Waiting when WAL data is not available from any kind of sources
+          (local, archive or stream) before trying again to retrieve WAL data,
+          at recovery.
+         </entry>
+        </row>
         <row>
          <entry><literal>SysLoggerMain</literal></entry>
          <entry>Waiting in main loop of syslogger process.</entry>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index c641361418..8fbaad197b 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1236,7 +1236,7 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   
11:34   0:00 postgres: ser
          <entry>Waiting to acquire a pin on a buffer.</entry>
         </row>
         <row>
-         <entry morerows="13"><literal>Activity</literal></entry>
+         <entry morerows="12"><literal>Activity</literal></entry>
          <entry><literal>ArchiverMain</literal></entry>
          <entry>Waiting in main loop of the archiver process.</entry>
         </row>
@@ -1268,17 +1268,9 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   
11:34   0:00 postgres: ser
          <entry><literal>PgStatMain</literal></entry>
          <entry>Waiting in main loop of the statistics collector 
process.</entry>
         </row>
-        <row>
-         <entry><literal>RecoveryWalAll</literal></entry>
-         <entry>Waiting for WAL from a stream at recovery.</entry>
-        </row>
         <row>
          <entry><literal>RecoveryWalStream</literal></entry>
-         <entry>
-          Waiting when WAL data is not available from any kind of sources
-          (local, archive or stream) before trying again to retrieve WAL data,
-          at recovery.
-         </entry>
+         <entry>Waiting for WAL from a stream at recovery.</entry>
         </row>
         <row>
          <entry><literal>SysLoggerMain</literal></entry>
@@ -1488,7 +1480,7 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   
11:34   0:00 postgres: ser
          <entry>Waiting for confirmation from remote server during synchronous 
replication.</entry>
         </row>
         <row>
-         <entry morerows="2"><literal>Timeout</literal></entry>
+         <entry morerows="3"><literal>Timeout</literal></entry>
          <entry><literal>BaseBackupThrottle</literal></entry>
          <entry>Waiting during base backup when throttling activity.</entry>
         </row>
@@ -1500,6 +1492,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   
11:34   0:00 postgres: ser
          <entry><literal>RecoveryApplyDelay</literal></entry>
          <entry>Waiting to apply WAL at recovery because it is delayed.</entry>
         </row>
+        <row>
+         <entry><literal>RecoveryRetrieveRetryInterval</literal></entry>
+         <entry>
+          Waiting when WAL data is not available from any kind of sources
+          (local, archive or stream) before trying again to retrieve WAL data,
+          at recovery.
+         </entry>
+        </row>
         <row>
          <entry morerows="68"><literal>IO</literal></entry>
          <entry><literal>BufFileRead</literal></entry>
diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index 3813eadfb4..a9659223e0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -11981,7 +11981,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool 
randAccess,
                                                                                
 WL_LATCH_SET | WL_TIMEOUT |
                                                                                
 WL_EXIT_ON_PM_DEATH,
                                                                                
 wait_time,
-                                                                               
 WAIT_EVENT_RECOVERY_WAL_STREAM);
+                                                                               
 WAIT_EVENT_RECOVERY_RETRIEVE_RETRY_INTERVAL);
                                                
ResetLatch(&XLogCtl->recoveryWakeupLatch);
                                                now = GetCurrentTimestamp();
                                        }
@@ -12165,7 +12165,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool 
randAccess,
                                        (void) 
WaitLatch(&XLogCtl->recoveryWakeupLatch,
                                                                         
WL_LATCH_SET | WL_TIMEOUT |
                                                                         
WL_EXIT_ON_PM_DEATH,
-                                                                        5000L, 
WAIT_EVENT_RECOVERY_WAL_ALL);
+                                                                        5000L, 
WAIT_EVENT_RECOVERY_WAL_STREAM);
                                        
ResetLatch(&XLogCtl->recoveryWakeupLatch);
                                        break;
                                }
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 59dc4f31ab..aba23f5d23 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3654,9 +3654,6 @@ pgstat_get_wait_activity(WaitEventActivity w)
                case WAIT_EVENT_PGSTAT_MAIN:
                        event_name = "PgStatMain";
                        break;
-               case WAIT_EVENT_RECOVERY_WAL_ALL:
-                       event_name = "RecoveryWalAll";
-                       break;
                case WAIT_EVENT_RECOVERY_WAL_STREAM:
                        event_name = "RecoveryWalStream";
                        break;
@@ -3876,6 +3873,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
                case WAIT_EVENT_RECOVERY_APPLY_DELAY:
                        event_name = "RecoveryApplyDelay";
                        break;
+               case WAIT_EVENT_RECOVERY_RETRIEVE_RETRY_INTERVAL:
+                       event_name = "RecoveryRetrieveRetryInterval";
+                       break;
                        /* no default case, so that compiler will warn */
        }
 
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 3a65a51696..45496738f0 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -779,7 +779,6 @@ typedef enum
        WAIT_EVENT_LOGICAL_APPLY_MAIN,
        WAIT_EVENT_LOGICAL_LAUNCHER_MAIN,
        WAIT_EVENT_PGSTAT_MAIN,
-       WAIT_EVENT_RECOVERY_WAL_ALL,
        WAIT_EVENT_RECOVERY_WAL_STREAM,
        WAIT_EVENT_SYSLOGGER_MAIN,
        WAIT_EVENT_WAL_RECEIVER_MAIN,
@@ -866,7 +865,8 @@ typedef enum
 {
        WAIT_EVENT_BASE_BACKUP_THROTTLE = PG_WAIT_TIMEOUT,
        WAIT_EVENT_PG_SLEEP,
-       WAIT_EVENT_RECOVERY_APPLY_DELAY
+       WAIT_EVENT_RECOVERY_APPLY_DELAY,
+       WAIT_EVENT_RECOVERY_RETRIEVE_RETRY_INTERVAL
 } WaitEventTimeout;
 
 /* ----------

Reply via email to