[HACKERS] PATCH: regular logging of checkpoint progress

Tomas Vondra Thu, 25 Aug 2011 13:58:05 -0700

Hello,

I'd like to propose a small patch that allows better checkpoint progress
monitoring. The patch is quite simple: it adds a new integer GUC
"checkpoint_update_limit", and every time a checkpoint writes this number
of buffers, it does two things:


(a) logs a "checkpoint status" message into the server log, with info
about total number of buffers to write, number of already written buffers,
current and average write speed and estimate of remaining time

(b) sends bgwriter stats (so that the buffers_checkpoint is updated)

I believe this will make checkpoint tuning easier, especially with large
shared buffers and when there's other write activity (which makes it
difficult to see checkpoint I/O).

The default value (0) means this continuous logging is disabled.
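
To make the reported figures concrete, here is a minimal standalone sketch of
the arithmetic behind the status message. It is not the patch code itself: the
struct and function names are hypothetical, BLCKSZ is assumed to be the default
8192 bytes, and unlike the patch it guards against a zero-length interval.

```c
#include <assert.h>
#include <stdio.h>

#define BLCKSZ 8192		/* assumption: default 8 kB block size */

/* Hypothetical container for the figures shown in the status message. */
typedef struct CheckpointProgress
{
	double		percent_done;	/* share of buffers already written */
	double		avg_mb_per_sec; /* speed since the checkpoint started */
	double		cur_mb_per_sec; /* speed since the previous update */
	double		remaining_secs; /* estimate based on the average speed */
} CheckpointProgress;

/* Convert a number of buffers written over "secs" seconds into MB/s. */
double
buffers_to_mb_per_sec(int nbuffers, double secs)
{
	if (secs <= 0.0)
		return 0.0;		/* guard added in this sketch only */
	return (double) BLCKSZ * nbuffers / 1024.0 / 1024.0 / secs;
}

/* Compute the same quantities the patch prints, from plain numbers. */
CheckpointProgress
checkpoint_progress(int num_to_write, int num_written, int num_since_update,
					double total_secs, double cur_secs)
{
	CheckpointProgress p = {0.0, 0.0, 0.0, 0.0};

	assert(num_written >= 0 && num_written <= num_to_write);

	if (num_to_write > 0)
		p.percent_done = 100.0 * num_written / num_to_write;

	p.avg_mb_per_sec = buffers_to_mb_per_sec(num_written, total_secs);
	p.cur_mb_per_sec = buffers_to_mb_per_sec(num_since_update, cur_secs);

	/* Remaining time: buffers left times the average time per buffer. */
	if (num_written > 0)
		p.remaining_secs = (double) (num_to_write - num_written)
			* total_secs / num_written;

	return p;
}

/* Print a line loosely modelled on the proposed log message. */
void
print_checkpoint_status(const CheckpointProgress *p)
{
	printf("checkpoint status: %.1f%% done; average %.1f MB/s, "
		   "current %.1f MB/s, remaining %.1f s\n",
		   p->percent_done, p->avg_mb_per_sec,
		   p->cur_mb_per_sec, p->remaining_secs);
}
```

For example, with 4096 buffers to write, 1024 already written in 4 seconds and
512 of them written in the last second, checkpoint_progress(4096, 1024, 512,
4.0, 1.0) gives 25% done, an average of 2 MB/s, a current speed of 4 MB/s and
about 12 seconds remaining.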

Tomas

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index 67e722f..64d84b0
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*************** SET ENABLE_SEQSCAN TO OFF;
*** 1863,1868 ****
--- 1863,1885 ----
        </listitem>
       </varlistentry>
  
+      <varlistentry id="guc-checkpoint-update-limit" xreflabel="checkpoint_update_limit">
+       <term><varname>checkpoint_update_limit</varname> (<type>integer</type>)</term>
+       <indexterm>
+        <primary><varname>checkpoint_update_limit</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         Number of buffers a checkpoint writes between progress updates. At each update the
+         checkpoint logs a status message (total number of buffers to write, number of buffers
+         already written, average and current write speed, and an estimate of the remaining
+         time) and sends the bgwriter stats. The default value 0 disables these updates, so the
+         stats are updated only at the end of the checkpoint. This parameter can only be set in
+         the <filename>postgresql.conf</> file or on the server command line.
+        </para>
+       </listitem>
+      </varlistentry>
+ 
       </variablelist>
       </sect2>
       <sect2 id="runtime-config-wal-archiving">
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
new file mode 100644
index 4c7cfb0..b24ec93
*** a/src/backend/storage/buffer/bufmgr.c
--- b/src/backend/storage/buffer/bufmgr.c
***************
*** 66,71 ****
--- 66,72 ----
  bool          zero_damaged_pages = false;
  int                   bgwriter_lru_maxpages = 100;
  double                bgwriter_lru_multiplier = 2.0;
+ int           checkpoint_update_limit = 0;
  
  /*
   * How many buffers PrefetchBuffer callers should try to stay ahead of their
*************** BufferSync(int flags)
*** 1175,1180 ****
--- 1176,1192 ----
        int                     num_to_write;
        int                     num_written;
        int                     mask = BM_DIRTY;
+       
+       int                     num_since_update;
+       
+       long            curr_secs,
+                               total_secs;
+       int                     curr_usecs,
+                               total_usecs;
+       float           curr_time,
+                               total_time;
+       
+       TimestampTz             startTimestamp, lastTimestamp;
  
        /* Make sure we can handle the pin inside SyncOneBuffer */
        ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
*************** BufferSync(int flags)
*** 1238,1243 ****
--- 1250,1260 ----
        buf_id = StrategySyncStart(NULL, NULL);
        num_to_scan = NBuffers;
        num_written = 0;
+       num_since_update = 0;
+       
+       startTimestamp = GetCurrentTimestamp();
+       lastTimestamp = startTimestamp;
+       
        while (num_to_scan-- > 0)
        {
                volatile BufferDesc *bufHdr = &BufferDescriptors[buf_id];
*************** BufferSync(int flags)
*** 1261,1266 ****
--- 1278,1327 ----
                                TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id);
                                BgWriterStats.m_buf_written_checkpoints++;
                                num_written++;
+                               num_since_update++;
+                               
+                               /*
+                                * Every time we write enough buffers (checkpoint_update_limit),
+                                * we log a checkpoint status message and update the bgwriter
+                                * stats (so that the pg_stat_bgwriter view may be updated).
+                                *
+                                * The log message contains info about total number of buffers to
+                                * write, how many buffers are already written, average and current
+                                * write speed and an estimate of the remaining time.
+                                */
+                               if ((checkpoint_update_limit > 0) && (num_since_update >= checkpoint_update_limit))
+                               {
+                                 
+                                       TimestampDifference(lastTimestamp,
+                                               GetCurrentTimestamp(),
+                                               &curr_secs, &curr_usecs);
+                                       
+                                       TimestampDifference(startTimestamp,
+                                               GetCurrentTimestamp(),
+                                               &total_secs, &total_usecs);
+                                       
+                                       curr_time = curr_secs + (float)curr_usecs / 1000000;
+                                       total_time = total_secs + (float)total_usecs / 1000000;
+                                       
+                                       elog(LOG, "checkpoint status: wrote %d buffers of %d (%.1f%%) in %.1f s; "
+                                               "average %.1f MB/s (%d buffers, %ld.%03d s), "
+                                               "current %.1f MB/s (%d buffers, %ld.%03d s), "
+                                               "remaining %.1f s",
+                                               num_written, num_to_write, ((float) num_written * 100 / num_to_write),
+                                               total_time,
+                                               ((float)BLCKSZ * num_written / 1024 / 1024 / total_time),
+                                               num_written, total_secs, total_usecs/1000,
+                                               ((float)BLCKSZ * num_since_update / 1024 / 1024 / curr_time),
+                                               num_since_update, curr_secs, curr_usecs/1000,
+                                               (float)(num_to_write - num_written) * total_time / (num_written));
+                                       
+                                       pgstat_send_bgwriter();
+                                       
+                                       /* reset the counter and timestamp */
+                                       num_since_update = 0;
+                                       lastTimestamp = GetCurrentTimestamp();
+ 
+                               }
  
                                /*
                                 * We know there are at most num_to_write buffers with
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
new file mode 100644
index 6670997..2176933
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** static struct config_int ConfigureNamesI
*** 1965,1970 ****
--- 1965,1981 ----
        },
  
        {
+               {"checkpoint_update_limit", PGC_SIGHUP, WAL_CHECKPOINTS,
+                       gettext_noop("Sets number of buffers written between stats updates during a checkpoint."),
+                       gettext_noop("Every time the checkpoint writes this number of buffers, it updates "
+                       "the stats in pg_stat_bgwriter and logs a message with the current checkpoint progress. "
+                       "Zero means the stats are updated only at the end of the checkpoint."),
+               },
+               &checkpoint_update_limit,
+               0, 0, INT_MAX, NULL, NULL
+       },
+ 
+       {
                {"wal_buffers", PGC_POSTMASTER, WAL_SETTINGS,
                        gettext_noop("Sets the number of disk-page buffers in shared memory for WAL."),
                        NULL,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
new file mode 100644
index 65fd126..284ddd1
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 180,185 ****
--- 180,187 ----
  #checkpoint_timeout = 5min            # range 30s-1h
  #checkpoint_completion_target = 0.5   # checkpoint target duration, 0.0 - 1.0
  #checkpoint_warning = 30s             # 0 disables
+ #checkpoint_update_limit = 0  # buffers written between logging checkpoint
+                               # status and updating bgwriter stats, 0 disables
  
  # - Archiving -
  
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
new file mode 100644
index b8fc87e..b36a87a
*** a/src/include/storage/bufmgr.h
--- b/src/include/storage/bufmgr.h
*************** extern bool zero_damaged_pages;
*** 49,54 ****
--- 49,55 ----
  extern int    bgwriter_lru_maxpages;
  extern double bgwriter_lru_multiplier;
  extern int    target_prefetch_pages;
+ extern int    checkpoint_update_limit;
  
  /* in buf_init.c */
  extern PGDLLIMPORT char *BufferBlocks;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
