On Thu, Apr 02, 2020 at 11:07:29AM +0530, Amit Kapila wrote:
> On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju...@gmail.com> wrote:
> >
> > On Wed, Apr 01, 2020 at 04:29:16PM +0530, Amit Kapila wrote:
> > > 3. Doing some testing with and without parallelism to ensure WAL usage
> > > data is correct would be great and if possible, share the results?
> >
> >
> > I just saw that Dilip did some testing, but just in case here is some
> > additional one
> >
> > - vacuum, after a truncate, loading 1M row and a "UPDATE t1 SET id = id"
> >
> > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from 
> > pg_stat_statements where query ilike '%vacuum%';
> >          query          | calls | wal_bytes | wal_records | wal_num_fpw
> > ------------------------+-------+-----------+-------------+-------------
> >  vacuum (parallel 3) t1 |     1 |  20098962 |       34104 |           2
> >  vacuum (parallel 0) t1 |     1 |  20098962 |       34104 |           2
> > (2 rows)
> >
> > - create index, overload t1's parallel_workers, using the 1M line just
> >   vacuumed:
> >
> > =# alter table t1 set (parallel_workers = 2);
> > ALTER TABLE
> >
> > =# create index t1_parallel_2 on t1(id);
> > CREATE INDEX
> >
> > =# alter table t1 set (parallel_workers = 0);
> > ALTER TABLE
> >
> > =# create index t1_parallel_0 on t1(id);
> > CREATE INDEX
> >
> > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from 
> > pg_stat_statements where query ilike '%create index%';
> >                 query                 | calls | wal_bytes | wal_records | 
> > wal_num_fpw
> > --------------------------------------+-------+-----------+-------------+-------------
> >  create index t1_parallel_0 on t1(id) |     1 |  20355540 |        2762 |   
> >      2745
> >  create index t1_parallel_2 on t1(id) |     1 |  20406811 |        2762 |   
> >      2758
> > (2 rows)
> >
> > It all looks good to me.
> >
> 
> Here the wal_num_fpw and wal_bytes are different between parallel and
> non-parallel versions.  Is it due to checkpoint or something else?  We
> can probably rule out checkpoint by increasing checkpoint_timeout and
> other checkpoint related parameters.

I think this is because I did a checkpoint after the VACUUM tests, so the 1st
CREATE INDEX (with parallelism) induced some FPW on the catalog blocks.  I
didn't try to investigate more since:

On Thu, Apr 02, 2020 at 11:22:16AM +0530, Amit Kapila wrote:
>
> Also, I forgot to mention that let's not base this on buffer usage
> patch for create index
> (v10-0002-Allow-parallel-index-creation-to-accumulate-buff) because as
> per recent discussion I am not sure about its usefulness.  I think we
> can proceed with this patch without
> v10-0002-Allow-parallel-index-creation-to-accumulate-buff as well.


Which is done in attached v11.


> > > 5.
> > > -SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE 
> > > "C";
> > > -               query               | calls | rows
> > > ------------------------------------+-------+------
> > > - SELECT $1::TEXT                   |     1 |    1
> > > - SELECT PLUS_ONE($1)               |     2 |    2
> > > - SELECT PLUS_TWO($1)               |     2 |    2
> > > - SELECT pg_stat_statements_reset() |     1 |    1
> > > +SELECT query, calls, rows, wal_bytes, wal_records FROM
> > > pg_stat_statements ORDER BY query COLLATE "C";
> > > +               query               | calls | rows | wal_bytes | 
> > > wal_records
> > > +-----------------------------------+-------+------+-----------+-------------
> > > + SELECT $1::TEXT                   |     1 |    1 |         0 |          
> > >  0
> > > + SELECT PLUS_ONE($1)               |     2 |    2 |         0 |          
> > >  0
> > > + SELECT PLUS_TWO($1)               |     2 |    2 |         0 |          
> > >  0
> > > + SELECT pg_stat_statements_reset() |     1 |    1 |         0 |          
> > >  0
> > >  (4 rows)
> > >
> > > Again, I am not sure if these modifications make much sense?
> >
> >
> > Those are queries that were previously executed.  As those are read-only 
> > query,
> > that are pretty much guaranteed to not cause any WAL activity, I don't see 
> > how
> > it hurts to test at the same time that that's we indeed record with
> > pg_stat_statements, just to be safe.
> >
> 
> On a similar theory, one could have checked bufferusage stats as well.
> The statements are using some expressions so don't see any value in
> check all usage data for such statements.


Dropped.


> Right now, that particular patch is not getting applied (probably due
> to recent commit 17e0328224).  Can you rebase it?


Done.


> > > v9-0004-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut
> > >
> > > 3.
> > > + if (usage->wal_num_fpw > 0)
> > > + appendStringInfo(es->str, " full page records=%ld",
> > > +    usage->wal_num_fpw);
> > > + if (usage->wal_bytes > 0)
> > > + appendStringInfo(es->str, " bytes=" UINT64_FORMAT,
> > > +    usage->wal_bytes);
> > >
> > > Shall we change to 'full page writes' or 'full page image' instead of
> > > full page records?
> >
> >
> > Indeed, I changed it in the (auto)vacuum output but missed this one.  Fixed.
> >
> 
> I don't see this change in the patch.


Yes, as Dilip reported I fixuped the wrong commit, sorry about that.  This
version should now be ok.


On Thu, Apr 02, 2020 at 12:04:32PM +0530, Dilip Kumar wrote:
> On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju...@gmail.com> wrote:
> >
> > > By tomorrow, I will try to finish reviewing 0005 and 0006.
>
> I have reviewed these patches and I have a few cosmetic comments.
> v10-0005-Keep-track-of-WAL-usage-in-pg_stat_statements
>
> 1.
> + uint64 wal_bytes; /* total amount of wal bytes written */
> + int64 wal_records; /* # of wal records written */
> + int64 wal_num_fpw; /* # of full page wal records written */
>
>
> /s/# of full page wal records written / /* # of WAL full page image produced 
> */


Done, I also consistently s/wal/WAL/.

>
> 2.
>  static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
>   ProcessUtilityContext context, ParamListInfo params,
>   QueryEnvironment *queryEnv,
> - DestReceiver *dest, QueryCompletion *qc);
> + DestReceiver *dest, QueryCompletion * qc);
>
> Useless hunk.


Oops, leftover of a pgindent as QueryCompletion isn't in the typedefs yet.  I
thought I discarded all the useless hunks but missed this one.  Thanks, fixed.


>
> 3.
>
> v10-0006-Expose-WAL-usage-counters-in-verbose-auto-vacuum
>
> @@ -3105,7 +3105,7 @@ show_wal_usage(ExplainState *es, const WalUsage *usage)
>   {
>   ExplainPropertyInteger("WAL records", NULL,
>      usage->wal_records, es);
> - ExplainPropertyInteger("WAL full page records", NULL,
> + ExplainPropertyInteger("WAL full page writes", NULL,
>      usage->wal_num_fpw, es);
> Just noticed that in 0004 you have first added "WAL full page
> records", which is later corrected to "WAL full page writes" in 0006.
> I think we better keep this proper in 0004 itself and avoid this hunk
> in 0006, otherwise, it creates confusion while reviewing.


Oh, I didn't realized that I fixuped the wrong commit.  Fixed.


I also adapted the documentation that mentioned full page records instead of
full page images, and integrated Justin's comment:

> In 0003:
> +       /* Provide WAL update data to the instrumentation */
> Remove "data" ??

so changed to "Report WAL traffic to the instrumentation."

I didn't change the (auto)vacuum output yet (except fixing the s/full page
records/full page writes/ that I previously missed), as it's not clear what the
consensus is yet.  I'll take care of that as soon as we reach to a consensus.
>From 05776f11ad1fac45dc390cda4df03f5402e214be Mon Sep 17 00:00:00 2001
From: Kirill Bychik <kirill.byc...@gmail.com>
Date: Tue, 17 Mar 2020 14:41:50 +0100
Subject: [PATCH v11 1/4] Add infrastructure to track WAL usage.

Author: Kirill Bychik, Julien Rouhaud
Reviewed-by: Fuji Masao
Discussion: 
https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4zlxws_cza3...@mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c    | 37 ++++++++++++++-----
 src/backend/access/transam/xlog.c       | 12 +++++-
 src/backend/access/transam/xloginsert.c | 13 +++++--
 src/backend/executor/execParallel.c     | 38 ++++++++++++++-----
 src/backend/executor/instrument.c       | 49 ++++++++++++++++++++++---
 src/include/access/xlog.h               |  3 +-
 src/include/executor/execParallel.h     |  1 +
 src/include/executor/instrument.h       | 18 ++++++++-
 src/tools/pgindent/typedefs.list        |  1 +
 9 files changed, 139 insertions(+), 33 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c 
b/src/backend/access/heap/vacuumlazy.c
index 9f9596c718..cc7e8521a5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -139,6 +139,7 @@
 #define PARALLEL_VACUUM_KEY_DEAD_TUPLES                2
 #define PARALLEL_VACUUM_KEY_QUERY_TEXT         3
 #define PARALLEL_VACUUM_KEY_BUFFER_USAGE       4
+#define PARALLEL_VACUUM_KEY_WAL_USAGE          5
 
 /*
  * Macro to check if we are in a parallel vacuum.  If true, we are in the
@@ -275,6 +276,9 @@ typedef struct LVParallelState
        /* Points to buffer usage area in DSM */
        BufferUsage *buffer_usage;
 
+       /* Points to WAL usage area in DSM */
+       WalUsage   *wal_usage;
+
        /*
         * The number of indexes that support parallel index bulk-deletion and
         * parallel index cleanup respectively.
@@ -2143,8 +2147,8 @@ lazy_parallel_vacuum_indexes(Relation *Irel, 
IndexBulkDeleteResult **stats,
                                                  vacrelstats->dead_tuples, 
nindexes, vacrelstats);
 
        /*
-        * Next, accumulate buffer usage.  (This must wait for the workers to
-        * finish, or we might get incomplete data.)
+        * Next, accumulate buffer and WAL usage.  (This must wait for the 
workers
+        * to finish, or we might get incomplete data.)
         */
        if (nworkers > 0)
        {
@@ -2154,7 +2158,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, 
IndexBulkDeleteResult **stats,
                WaitForParallelWorkersToFinish(lps->pcxt);
 
                for (i = 0; i < lps->pcxt->nworkers_launched; i++)
-                       InstrAccumParallelQuery(&lps->buffer_usage[i]);
+                       InstrAccumParallelQuery(&lps->buffer_usage[i], 
&lps->wal_usage[i]);
        }
 
        /*
@@ -3171,6 +3175,7 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, 
LVRelStats *vacrelstats,
        LVShared   *shared;
        LVDeadTuples *dead_tuples;
        BufferUsage *buffer_usage;
+       WalUsage   *wal_usage;
        bool       *can_parallel_vacuum;
        long            maxtuples;
        char       *sharedquery;
@@ -3255,15 +3260,19 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, 
LVRelStats *vacrelstats,
        shm_toc_estimate_keys(&pcxt->estimator, 1);
 
        /*
-        * Estimate space for BufferUsage -- PARALLEL_VACUUM_KEY_BUFFER_USAGE.
+        * Estimate space for BufferUsage and WalUsage --
+        * PARALLEL_VACUUM_KEY_BUFFER_USAGE and PARALLEL_VACUUM_KEY_WAL_USAGE.
         *
         * If there are no extensions loaded that care, we could skip this.  We
-        * have no way of knowing whether anyone's looking at pgBufferUsage, so 
do
-        * it unconditionally.
+        * have no way of knowing whether anyone's looking at pgBufferUsage or
+        * pgWalUsage, so do it unconditionally.
         */
        shm_toc_estimate_chunk(&pcxt->estimator,
                                                   
mul_size(sizeof(BufferUsage), pcxt->nworkers));
        shm_toc_estimate_keys(&pcxt->estimator, 1);
+       shm_toc_estimate_chunk(&pcxt->estimator,
+                                                  mul_size(sizeof(WalUsage), 
pcxt->nworkers));
+       shm_toc_estimate_keys(&pcxt->estimator, 1);
 
        /* Finally, estimate PARALLEL_VACUUM_KEY_QUERY_TEXT space */
        querylen = strlen(debug_query_string);
@@ -3299,11 +3308,18 @@ begin_parallel_vacuum(Oid relid, Relation *Irel, 
LVRelStats *vacrelstats,
        shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_DEAD_TUPLES, dead_tuples);
        vacrelstats->dead_tuples = dead_tuples;
 
-       /* Allocate space for each worker's BufferUsage; no need to initialize 
*/
+       /*
+        * Allocate space for each worker's BufferUsage and WalUsage; no need to
+        * initialize
+        */
        buffer_usage = shm_toc_allocate(pcxt->toc,
                                                                        
mul_size(sizeof(BufferUsage), pcxt->nworkers));
        shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, 
buffer_usage);
        lps->buffer_usage = buffer_usage;
+       wal_usage = shm_toc_allocate(pcxt->toc,
+                                                                       
mul_size(sizeof(WalUsage), pcxt->nworkers));
+       shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_WAL_USAGE, wal_usage);
+       lps->wal_usage = wal_usage;
 
        /* Store query string for workers */
        sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
@@ -3435,6 +3451,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
        LVShared   *lvshared;
        LVDeadTuples *dead_tuples;
        BufferUsage *buffer_usage;
+       WalUsage   *wal_usage;
        int                     nindexes;
        char       *sharedquery;
        IndexBulkDeleteResult **stats;
@@ -3511,9 +3528,11 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
        parallel_vacuum_index(indrels, stats, lvshared, dead_tuples, nindexes,
                                                  &vacrelstats);
 
-       /* Report buffer usage during parallel execution */
+       /* Report buffer/WAL usage during parallel execution */
        buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, 
false);
-       InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber]);
+       wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
+       InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber],
+                                                 
&wal_usage[ParallelWorkerNumber]);
 
        /* Pop the error context stack */
        error_context_stack = errcallback.previous;
diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index 977d448f50..50b78f3143 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -43,6 +43,7 @@
 #include "commands/progress.h"
 #include "commands/tablespace.h"
 #include "common/controldata_utils.h"
+#include "executor/instrument.h"
 #include "miscadmin.h"
 #include "pg_trace.h"
 #include "pgstat.h"
@@ -996,7 +997,8 @@ static void WALInsertLockUpdateInsertingAt(XLogRecPtr 
insertingAt);
 XLogRecPtr
 XLogInsertRecord(XLogRecData *rdata,
                                 XLogRecPtr fpw_lsn,
-                                uint8 flags)
+                                uint8 flags,
+                                int num_fpw)
 {
        XLogCtlInsert *Insert = &XLogCtl->Insert;
        pg_crc32c       rdata_crc;
@@ -1252,6 +1254,14 @@ XLogInsertRecord(XLogRecData *rdata,
        ProcLastRecPtr = StartPos;
        XactLastRecEnd = EndPos;
 
+       /* Report WAL traffic to the instrumentation. */
+       if (inserted)
+       {
+               pgWalUsage.wal_bytes += rechdr->xl_tot_len;
+               pgWalUsage.wal_records++;
+               pgWalUsage.wal_num_fpw += num_fpw;
+       }
+
        return EndPos;
 }
 
diff --git a/src/backend/access/transam/xloginsert.c 
b/src/backend/access/transam/xloginsert.c
index a618dec776..5e032e7042 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -25,6 +25,7 @@
 #include "access/xloginsert.h"
 #include "catalog/pg_control.h"
 #include "common/pg_lzcompress.h"
+#include "executor/instrument.h"
 #include "miscadmin.h"
 #include "pg_trace.h"
 #include "replication/origin.h"
@@ -108,7 +109,7 @@ static MemoryContext xloginsert_cxt;
 
 static XLogRecData *XLogRecordAssemble(RmgrId rmid, uint8 info,
                                                                           
XLogRecPtr RedoRecPtr, bool doPageWrites,
-                                                                          
XLogRecPtr *fpw_lsn);
+                                                                          
XLogRecPtr *fpw_lsn, int *num_fpw);
 static bool XLogCompressBackupBlock(char *page, uint16 hole_offset,
                                                                        uint16 
hole_length, char *dest, uint16 *dlen);
 
@@ -448,6 +449,7 @@ XLogInsert(RmgrId rmid, uint8 info)
                bool            doPageWrites;
                XLogRecPtr      fpw_lsn;
                XLogRecData *rdt;
+               int                     num_fpw = 0;
 
                /*
                 * Get values needed to decide whether to do full-page writes. 
Since
@@ -457,9 +459,9 @@ XLogInsert(RmgrId rmid, uint8 info)
                GetFullPageWriteInfo(&RedoRecPtr, &doPageWrites);
 
                rdt = XLogRecordAssemble(rmid, info, RedoRecPtr, doPageWrites,
-                                                                &fpw_lsn);
+                                                                &fpw_lsn, 
&num_fpw);
 
-               EndPos = XLogInsertRecord(rdt, fpw_lsn, curinsert_flags);
+               EndPos = XLogInsertRecord(rdt, fpw_lsn, curinsert_flags, 
num_fpw);
        } while (EndPos == InvalidXLogRecPtr);
 
        XLogResetInsertion();
@@ -482,7 +484,7 @@ XLogInsert(RmgrId rmid, uint8 info)
 static XLogRecData *
 XLogRecordAssemble(RmgrId rmid, uint8 info,
                                   XLogRecPtr RedoRecPtr, bool doPageWrites,
-                                  XLogRecPtr *fpw_lsn)
+                                  XLogRecPtr *fpw_lsn, int *num_fpw)
 {
        XLogRecData *rdt;
        uint32          total_len = 0;
@@ -635,6 +637,9 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
                         */
                        bkpb.fork_flags |= BKPBLOCK_HAS_IMAGE;
 
+                       /* Report a full page image constructed for the WAL 
record */
+                       *num_fpw += 1;
+
                        /*
                         * Construct XLogRecData entries for the page content.
                         */
diff --git a/src/backend/executor/execParallel.c 
b/src/backend/executor/execParallel.c
index a753d6efa0..7d9ca66fc8 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -12,7 +12,7 @@
  * workers and ensuring that their state generally matches that of the
  * leader; see src/backend/access/transam/README.parallel for details.
  * However, we must save and restore relevant executor state, such as
- * any ParamListInfo associated with the query, buffer usage info, and
+ * any ParamListInfo associated with the query, buffer/WAL usage info, and
  * the actual plan to be passed down to the worker.
  *
  * IDENTIFICATION
@@ -62,6 +62,7 @@
 #define PARALLEL_KEY_DSA                               
UINT64CONST(0xE000000000000007)
 #define PARALLEL_KEY_QUERY_TEXT                UINT64CONST(0xE000000000000008)
 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
+#define PARALLEL_KEY_WAL_USAGE                 UINT64CONST(0xE00000000000000A)
 
 #define PARALLEL_TUPLE_QUEUE_SIZE              65536
 
@@ -573,6 +574,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
        char       *pstmt_space;
        char       *paramlistinfo_space;
        BufferUsage *bufusage_space;
+       WalUsage        *walusage_space;
        SharedExecutorInstrumentation *instrumentation = NULL;
        SharedJitInstrumentation *jit_instrumentation = NULL;
        int                     pstmt_len;
@@ -646,6 +648,13 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
                                                   
mul_size(sizeof(BufferUsage), pcxt->nworkers));
        shm_toc_estimate_keys(&pcxt->estimator, 1);
 
+       /*
+        * Same thing for WalUsage.
+        */
+       shm_toc_estimate_chunk(&pcxt->estimator,
+                                                  mul_size(sizeof(WalUsage), 
pcxt->nworkers));
+       shm_toc_estimate_keys(&pcxt->estimator, 1);
+
        /* Estimate space for tuple queues. */
        shm_toc_estimate_chunk(&pcxt->estimator,
                                                   
mul_size(PARALLEL_TUPLE_QUEUE_SIZE, pcxt->nworkers));
@@ -728,6 +737,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
        shm_toc_insert(pcxt->toc, PARALLEL_KEY_BUFFER_USAGE, bufusage_space);
        pei->buffer_usage = bufusage_space;
 
+       /* Same for WalUsage. */
+       walusage_space = shm_toc_allocate(pcxt->toc,
+                                                                         
mul_size(sizeof(WalUsage), pcxt->nworkers));
+       shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, walusage_space);
+       pei->wal_usage = walusage_space;
+
        /* Set up the tuple queues that the workers will write into. */
        pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false);
 
@@ -1069,7 +1084,7 @@ ExecParallelRetrieveJitInstrumentation(PlanState 
*planstate,
 
 /*
  * Finish parallel execution.  We wait for parallel workers to finish, and
- * accumulate their buffer usage.
+ * accumulate their buffer/WAL usage.
  */
 void
 ExecParallelFinish(ParallelExecutorInfo *pei)
@@ -1109,11 +1124,11 @@ ExecParallelFinish(ParallelExecutorInfo *pei)
        WaitForParallelWorkersToFinish(pei->pcxt);
 
        /*
-        * Next, accumulate buffer usage.  (This must wait for the workers to
-        * finish, or we might get incomplete data.)
+        * Next, accumulate buffer/WAL usage.  (This must wait for the workers
+        * to finish, or we might get incomplete data.)
         */
        for (i = 0; i < nworkers; i++)
-               InstrAccumParallelQuery(&pei->buffer_usage[i]);
+               InstrAccumParallelQuery(&pei->buffer_usage[i], 
&pei->wal_usage[i]);
 
        pei->finished = true;
 }
@@ -1333,6 +1348,7 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
 {
        FixedParallelExecutorState *fpes;
        BufferUsage *buffer_usage;
+       WalUsage *wal_usage;
        DestReceiver *receiver;
        QueryDesc  *queryDesc;
        SharedExecutorInstrumentation *instrumentation;
@@ -1386,11 +1402,11 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
        ExecSetTupleBound(fpes->tuples_needed, queryDesc->planstate);
 
        /*
-        * Prepare to track buffer usage during query execution.
+        * Prepare to track buffer/WAL usage during query execution.
         *
         * We do this after starting up the executor to match what happens in 
the
-        * leader, which also doesn't count buffer accesses that occur during
-        * executor startup.
+        * leader, which also doesn't count buffer accesses and WAL activity 
that
+        * occur during executor startup.
         */
        InstrStartParallelQuery();
 
@@ -1406,9 +1422,11 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
        /* Shut down the executor */
        ExecutorFinish(queryDesc);
 
-       /* Report buffer usage during parallel execution. */
+       /* Report buffer/WAL usage during parallel execution. */
        buffer_usage = shm_toc_lookup(toc, PARALLEL_KEY_BUFFER_USAGE, false);
-       InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber]);
+       wal_usage = shm_toc_lookup (toc, PARALLEL_KEY_WAL_USAGE, false);
+       InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber],
+                                                 
&wal_usage[ParallelWorkerNumber]);
 
        /* Report instrumentation data if any instrumentation options are set. 
*/
        if (instrumentation != NULL)
diff --git a/src/backend/executor/instrument.c 
b/src/backend/executor/instrument.c
index 042e10f96b..dd615581ac 100644
--- a/src/backend/executor/instrument.c
+++ b/src/backend/executor/instrument.c
@@ -19,8 +19,11 @@
 
 BufferUsage pgBufferUsage;
 static BufferUsage save_pgBufferUsage;
+WalUsage pgWalUsage;
+static WalUsage save_pgWalUsage;
 
 static void BufferUsageAdd(BufferUsage *dst, const BufferUsage *add);
+static void WalUsageAdd(WalUsage *dst, WalUsage *add);
 
 
 /* Allocate new instrumentation structure(s) */
@@ -31,15 +34,17 @@ InstrAlloc(int n, int instrument_options)
 
        /* initialize all fields to zeroes, then modify as needed */
        instr = palloc0(n * sizeof(Instrumentation));
-       if (instrument_options & (INSTRUMENT_BUFFERS | INSTRUMENT_TIMER))
+       if (instrument_options & (INSTRUMENT_BUFFERS | INSTRUMENT_TIMER | 
INSTRUMENT_WAL))
        {
                bool            need_buffers = (instrument_options & 
INSTRUMENT_BUFFERS) != 0;
+               bool            need_wal = (instrument_options & 
INSTRUMENT_WAL) != 0;
                bool            need_timer = (instrument_options & 
INSTRUMENT_TIMER) != 0;
                int                     i;
 
                for (i = 0; i < n; i++)
                {
                        instr[i].need_bufusage = need_buffers;
+                       instr[i].need_walusage = need_wal;
                        instr[i].need_timer = need_timer;
                }
        }
@@ -53,6 +58,7 @@ InstrInit(Instrumentation *instr, int instrument_options)
 {
        memset(instr, 0, sizeof(Instrumentation));
        instr->need_bufusage = (instrument_options & INSTRUMENT_BUFFERS) != 0;
+       instr->need_walusage = (instrument_options & INSTRUMENT_WAL) != 0;
        instr->need_timer = (instrument_options & INSTRUMENT_TIMER) != 0;
 }
 
@@ -67,6 +73,9 @@ InstrStartNode(Instrumentation *instr)
        /* save buffer usage totals at node entry, if needed */
        if (instr->need_bufusage)
                instr->bufusage_start = pgBufferUsage;
+
+       if (instr->need_walusage)
+               instr->walusage_start = pgWalUsage;
 }
 
 /* Exit from a plan node */
@@ -95,6 +104,10 @@ InstrStopNode(Instrumentation *instr, double nTuples)
                BufferUsageAccumDiff(&instr->bufusage,
                                                         &pgBufferUsage, 
&instr->bufusage_start);
 
+       if (instr->need_walusage)
+               WalUsageAccumDiff(&instr->walusage,
+                                                 &pgWalUsage, 
&instr->walusage_start);
+
        /* Is this the first tuple of this cycle? */
        if (!instr->running)
        {
@@ -158,6 +171,9 @@ InstrAggNode(Instrumentation *dst, Instrumentation *add)
        /* Add delta of buffer usage since entry to node's totals */
        if (dst->need_bufusage)
                BufferUsageAdd(&dst->bufusage, &add->bufusage);
+
+       if (dst->need_walusage)
+               WalUsageAdd(&dst->walusage, &add->walusage);
 }
 
 /* note current values during parallel executor startup */
@@ -165,21 +181,25 @@ void
 InstrStartParallelQuery(void)
 {
        save_pgBufferUsage = pgBufferUsage;
+       save_pgWalUsage = pgWalUsage;
 }
 
 /* report usage after parallel executor shutdown */
 void
-InstrEndParallelQuery(BufferUsage *result)
+InstrEndParallelQuery(BufferUsage *bufusage, WalUsage *walusage)
 {
-       memset(result, 0, sizeof(BufferUsage));
-       BufferUsageAccumDiff(result, &pgBufferUsage, &save_pgBufferUsage);
+       memset(bufusage, 0, sizeof(BufferUsage));
+       memset(walusage, 0, sizeof(WalUsage));
+       BufferUsageAccumDiff(bufusage, &pgBufferUsage, &save_pgBufferUsage);
+       WalUsageAccumDiff(walusage, &pgWalUsage, &save_pgWalUsage);
 }
 
 /* accumulate work done by workers in leader's stats */
 void
-InstrAccumParallelQuery(BufferUsage *result)
+InstrAccumParallelQuery(BufferUsage *bufusage, WalUsage *walusage)
 {
-       BufferUsageAdd(&pgBufferUsage, result);
+       BufferUsageAdd(&pgBufferUsage, bufusage);
+       WalUsageAdd(&pgWalUsage, walusage);
 }
 
 /* dst += add */
@@ -221,3 +241,20 @@ BufferUsageAccumDiff(BufferUsage *dst,
        INSTR_TIME_ACCUM_DIFF(dst->blk_write_time,
                                                  add->blk_write_time, 
sub->blk_write_time);
 }
+
+/* helper functions for WAL usage accumulation */
+static void
+WalUsageAdd(WalUsage *dst, WalUsage *add)
+{
+       dst->wal_bytes += add->wal_bytes;
+       dst->wal_records += add->wal_records;
+       dst->wal_num_fpw += add->wal_num_fpw;
+}
+
+void
+WalUsageAccumDiff(WalUsage *dst, const WalUsage *add, const WalUsage *sub)
+{
+       dst->wal_bytes += add->wal_bytes - sub->wal_bytes;
+       dst->wal_records += add->wal_records - sub->wal_records;
+       dst->wal_num_fpw += add->wal_num_fpw - sub->wal_num_fpw;
+}
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 9ec7b31cce..b91e724b2d 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -259,7 +259,8 @@ struct XLogRecData;
 
 extern XLogRecPtr XLogInsertRecord(struct XLogRecData *rdata,
                                                                   XLogRecPtr 
fpw_lsn,
-                                                                  uint8 flags);
+                                                                  uint8 flags,
+                                                                  int num_fpw);
 extern void XLogFlush(XLogRecPtr RecPtr);
 extern bool XLogBackgroundFlush(void);
 extern bool XLogNeedsFlush(XLogRecPtr RecPtr);
diff --git a/src/include/executor/execParallel.h 
b/src/include/executor/execParallel.h
index 17d07cf020..1cc5b524fd 100644
--- a/src/include/executor/execParallel.h
+++ b/src/include/executor/execParallel.h
@@ -26,6 +26,7 @@ typedef struct ParallelExecutorInfo
        PlanState  *planstate;          /* plan subtree we're running in 
parallel */
        ParallelContext *pcxt;          /* parallel context we're using */
        BufferUsage *buffer_usage;      /* points to bufusage area in DSM */
+       WalUsage *wal_usage;            /* walusage area in DSM */
        SharedExecutorInstrumentation *instrumentation; /* optional */
        struct SharedJitInstrumentation *jit_instrumentation;   /* optional */
        dsa_area   *area;                       /* points to DSA area in DSM */
diff --git a/src/include/executor/instrument.h 
b/src/include/executor/instrument.h
index 3825a5ac1f..e8875a8e9b 100644
--- a/src/include/executor/instrument.h
+++ b/src/include/executor/instrument.h
@@ -32,12 +32,20 @@ typedef struct BufferUsage
        instr_time      blk_write_time; /* time spent writing */
 } BufferUsage;
 
+typedef struct WalUsage
+{
+       long            wal_records;    /* # of WAL records produced */
+       long            wal_num_fpw;    /* # of WAL full page image produced */
+       uint64          wal_bytes;              /* size of WAL records produced 
*/
+} WalUsage;
+
 /* Flag bits included in InstrAlloc's instrument_options bitmask */
 typedef enum InstrumentOption
 {
        INSTRUMENT_TIMER = 1 << 0,      /* needs timer (and row counts) */
        INSTRUMENT_BUFFERS = 1 << 1,    /* needs buffer usage */
        INSTRUMENT_ROWS = 1 << 2,       /* needs row count */
+       INSTRUMENT_WAL = 1 << 3,        /* needs WAL usage */
        INSTRUMENT_ALL = PG_INT32_MAX
 } InstrumentOption;
 
@@ -46,6 +54,7 @@ typedef struct Instrumentation
        /* Parameters set at node creation: */
        bool            need_timer;             /* true if we need timer data */
        bool            need_bufusage;  /* true if we need buffer usage data */
+       bool            need_walusage;  /* true if we need WAL usage data */
        /* Info about current plan cycle: */
        bool            running;                /* true if we've completed 
first tuple */
        instr_time      starttime;              /* start time of current 
iteration of node */
@@ -53,6 +62,7 @@ typedef struct Instrumentation
        double          firsttuple;             /* time for first tuple of this 
cycle */
        double          tuplecount;             /* # of tuples emitted so far 
this cycle */
        BufferUsage bufusage_start; /* buffer usage at start */
+       WalUsage        walusage_start; /* WAL usage at start */
        /* Accumulated statistics across all completed cycles: */
        double          startup;                /* total startup time (in 
seconds) */
        double          total;                  /* total time (in seconds) */
@@ -62,6 +72,7 @@ typedef struct Instrumentation
        double          nfiltered1;             /* # of tuples removed by 
scanqual or joinqual */
        double          nfiltered2;             /* # of tuples removed by 
"other" quals */
        BufferUsage bufusage;           /* total buffer usage */
+       WalUsage        walusage;               /* total WAL usage */
 } Instrumentation;
 
 typedef struct WorkerInstrumentation
@@ -71,6 +82,7 @@ typedef struct WorkerInstrumentation
 } WorkerInstrumentation;
 
 extern PGDLLIMPORT BufferUsage pgBufferUsage;
+extern PGDLLIMPORT WalUsage pgWalUsage;
 
 extern Instrumentation *InstrAlloc(int n, int instrument_options);
 extern void InstrInit(Instrumentation *instr, int instrument_options);
@@ -79,9 +91,11 @@ extern void InstrStopNode(Instrumentation *instr, double 
nTuples);
 extern void InstrEndLoop(Instrumentation *instr);
 extern void InstrAggNode(Instrumentation *dst, Instrumentation *add);
 extern void InstrStartParallelQuery(void);
-extern void InstrEndParallelQuery(BufferUsage *result);
-extern void InstrAccumParallelQuery(BufferUsage *result);
+extern void InstrEndParallelQuery(BufferUsage *bufusage, WalUsage *walusage);
+extern void InstrAccumParallelQuery(BufferUsage *bufusage, WalUsage *walusage);
 extern void BufferUsageAccumDiff(BufferUsage *dst,
                                                                 const 
BufferUsage *add, const BufferUsage *sub);
+extern void WalUsageAccumDiff(WalUsage *dst, const WalUsage *add,
+                                                         const WalUsage *sub);
 
 #endif                                                 /* INSTRUMENT_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 939de985d3..34623523a7 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2643,6 +2643,7 @@ WalSndCtlData
 WalSndSendDataCallback
 WalSndState
 WalTimeSample
+WalUsage
 WalWriteMethod
 Walfile
 WindowAgg
-- 
2.20.1

>From 14bb2d37bf74f1cdc0d78fb0cab4b49525d34e20 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouh...@free.fr>
Date: Sun, 29 Mar 2020 12:38:14 +0200
Subject: [PATCH v11 2/4] Add option to report WAL usage in EXPLAIN and
 auto_explain.

Author: Julien Rouhaud
Reviewed-by:
Discussion: 
https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4zlxws_cza3...@mail.gmail.com
---
 contrib/auto_explain/auto_explain.c | 15 ++++++
 doc/src/sgml/auto-explain.sgml      | 20 ++++++++
 doc/src/sgml/ref/explain.sgml       | 14 ++++++
 src/backend/commands/explain.c      | 74 +++++++++++++++++++++++++++--
 src/bin/psql/tab-complete.c         |  4 +-
 src/include/commands/explain.h      |  3 ++
 6 files changed, 124 insertions(+), 6 deletions(-)

diff --git a/contrib/auto_explain/auto_explain.c 
b/contrib/auto_explain/auto_explain.c
index f69dde876c..56c549d84c 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -27,6 +27,7 @@ static int    auto_explain_log_min_duration = -1; /* msec or 
-1 */
 static bool auto_explain_log_analyze = false;
 static bool auto_explain_log_verbose = false;
 static bool auto_explain_log_buffers = false;
+static bool auto_explain_log_wal = false;
 static bool auto_explain_log_triggers = false;
 static bool auto_explain_log_timing = true;
 static bool auto_explain_log_settings = false;
@@ -148,6 +149,17 @@ _PG_init(void)
                                                         NULL,
                                                         NULL);
 
+       DefineCustomBoolVariable("auto_explain.log_wal",
+                                                        "Log WAL usage.",
+                                                        NULL,
+                                                        &auto_explain_log_wal,
+                                                        false,
+                                                        PGC_SUSET,
+                                                        0,
+                                                        NULL,
+                                                        NULL,
+                                                        NULL);
+
        DefineCustomBoolVariable("auto_explain.log_triggers",
                                                         "Include trigger 
statistics in plans.",
                                                         "This has no effect 
unless log_analyze is also set.",
@@ -280,6 +292,8 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
                                queryDesc->instrument_options |= 
INSTRUMENT_ROWS;
                        if (auto_explain_log_buffers)
                                queryDesc->instrument_options |= 
INSTRUMENT_BUFFERS;
+                       if (auto_explain_log_wal)
+                               queryDesc->instrument_options |= INSTRUMENT_WAL;
                }
        }
 
@@ -374,6 +388,7 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
                        es->analyze = (queryDesc->instrument_options && 
auto_explain_log_analyze);
                        es->verbose = auto_explain_log_verbose;
                        es->buffers = (es->analyze && auto_explain_log_buffers);
+                       es->wal = (es->analyze && auto_explain_log_wal);
                        es->timing = (es->analyze && auto_explain_log_timing);
                        es->summary = es->analyze;
                        es->format = auto_explain_log_format;
diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index 3d619d4a3d..d4d29c4a64 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -109,6 +109,26 @@ LOAD 'auto_explain';
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <varname>auto_explain.log_wal</varname> (<type>boolean</type>)
+     <indexterm>
+      <primary><varname>auto_explain.log_wal</varname> configuration 
parameter</primary>
+     </indexterm>
+    </term>
+    <listitem>
+     <para>
+      <varname>auto_explain.log_wal</varname> controls whether WAL
+      usage statistics are printed when an execution plan is logged; it's
+      equivalent to the <literal>WAL</literal> option of 
<command>EXPLAIN</command>.
+      This parameter has no effect
+      unless <varname>auto_explain.log_analyze</varname> is enabled.
+      This parameter is off by default.
+      Only superusers can change this setting.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <varname>auto_explain.log_timing</varname> (<type>boolean</type>)
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 385d10411f..494e60ecc9 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -41,6 +41,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] <replaceable 
class="parameter">statement</replac
     COSTS [ <replaceable class="parameter">boolean</replaceable> ]
     SETTINGS [ <replaceable class="parameter">boolean</replaceable> ]
     BUFFERS [ <replaceable class="parameter">boolean</replaceable> ]
+    WAL [ <replaceable class="parameter">boolean</replaceable> ]
     TIMING [ <replaceable class="parameter">boolean</replaceable> ]
     SUMMARY [ <replaceable class="parameter">boolean</replaceable> ]
     FORMAT { TEXT | XML | JSON | YAML }
@@ -192,6 +193,19 @@ ROLLBACK;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>WAL</literal></term>
+    <listitem>
+     <para>
+      Include information on WAL record generation. Specifically, include the
+      number of records, full page images and bytes generated.  In text
+      format, only non-zero values are printed.  This parameter may only be
+      used when <literal>ANALYZE</literal> is also enabled.  It defaults to
+      <literal>FALSE</literal>.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>TIMING</literal></term>
     <listitem>
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index ee0e638f33..cefe2144e5 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -113,6 +113,7 @@ static void show_foreignscan_info(ForeignScanState 
*fsstate, ExplainState *es);
 static void show_eval_params(Bitmapset *bms_params, ExplainState *es);
 static const char *explain_get_index_name(Oid indexId);
 static void show_buffer_usage(ExplainState *es, const BufferUsage *usage);
+static void show_wal_usage(ExplainState *es, const WalUsage *usage);
 static void ExplainIndexScanDetails(Oid indexid, ScanDirection indexorderdir,
                                                                        
ExplainState *es);
 static void ExplainScanTarget(Scan *plan, ExplainState *es);
@@ -175,6 +176,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
                        es->costs = defGetBoolean(opt);
                else if (strcmp(opt->defname, "buffers") == 0)
                        es->buffers = defGetBoolean(opt);
+               else if (strcmp(opt->defname, "wal") == 0)
+                       es->wal = defGetBoolean(opt);
                else if (strcmp(opt->defname, "settings") == 0)
                        es->settings = defGetBoolean(opt);
                else if (strcmp(opt->defname, "timing") == 0)
@@ -219,6 +222,11 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
                                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                                 errmsg("EXPLAIN option BUFFERS requires 
ANALYZE")));
 
+       if (es->wal && !es->analyze)
+               ereport(ERROR,
+                               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                errmsg("EXPLAIN option WAL requires 
ANALYZE")));
+
        /* if the timing was not set explicitly, set default value */
        es->timing = (timing_set) ? es->timing : es->analyze;
 
@@ -494,6 +502,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, 
ExplainState *es,
 
        if (es->buffers)
                instrument_option |= INSTRUMENT_BUFFERS;
+       if (es->wal)
+               instrument_option |= INSTRUMENT_WAL;
 
        /*
         * We always collect timing for the entire statement, even when 
node-level
@@ -1942,12 +1952,14 @@ ExplainNode(PlanState *planstate, List *ancestors,
                }
        }
 
-       /* Show buffer usage */
+       /* Show buffer/WAL usage */
        if (es->buffers && planstate->instrument)
                show_buffer_usage(es, &planstate->instrument->bufusage);
+       if (es->wal && planstate->instrument)
+               show_wal_usage(es, &planstate->instrument->walusage);
 
-       /* Prepare per-worker buffer usage */
-       if (es->workers_state && es->buffers && es->verbose)
+       /* Prepare per-worker buffer/WAL usage */
+       if (es->workers_state && (es->buffers || es->wal) && es->verbose)
        {
                WorkerInstrumentation *w = planstate->worker_instrument;
 
@@ -1960,7 +1972,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
                                continue;
 
                        ExplainOpenWorker(n, es);
-                       show_buffer_usage(es, &instrument->bufusage);
+                       if (es->buffers)
+                               show_buffer_usage(es, &instrument->bufusage);
+                       if (es->wal)
+                               show_wal_usage(es, &instrument->walusage);
                        ExplainCloseWorker(n, es);
                }
        }
@@ -3059,6 +3074,44 @@ show_buffer_usage(ExplainState *es, const BufferUsage 
*usage)
        }
 }
 
+/*
+ * Show WAL usage details.
+ */
+static void
+show_wal_usage(ExplainState *es, const WalUsage *usage)
+{
+       if (es->format == EXPLAIN_FORMAT_TEXT)
+       {
+               /* Show only positive counter values. */
+               if ((usage->wal_records > 0) || (usage->wal_num_fpw > 0) ||
+                               (usage->wal_bytes > 0))
+               {
+                       ExplainIndentText(es);
+                       appendStringInfoString(es->str, "WAL:");
+
+                       if (usage->wal_records > 0)
+                               appendStringInfo(es->str, " records=%ld",
+                                                                          
usage->wal_records);
+                       if (usage->wal_num_fpw > 0)
+                               appendStringInfo(es->str, " full page 
writes=%ld",
+                                                                          
usage->wal_num_fpw);
+                       if (usage->wal_bytes > 0)
+                               appendStringInfo(es->str, " bytes=" 
UINT64_FORMAT,
+                                                                          
usage->wal_bytes);
+                       appendStringInfoChar(es->str, '\n');
+               }
+       }
+       else
+       {
+               ExplainPropertyInteger("WAL records", NULL,
+                                                          usage->wal_records, 
es);
+               ExplainPropertyInteger("WAL full page writes", NULL,
+                                                          usage->wal_num_fpw, 
es);
+               ExplainPropertyUInteger("WAL bytes", NULL,
+                                                          usage->wal_bytes, 
es);
+       }
+}
+
 /*
  * Add some additional details about an IndexScan or IndexOnlyScan
  */
@@ -3843,6 +3896,19 @@ ExplainPropertyInteger(const char *qlabel, const char 
*unit, int64 value,
        ExplainProperty(qlabel, unit, buf, true, es);
 }
 
+/*
+ * Explain an unsigned integer-valued property.
+ */
+void
+ExplainPropertyUInteger(const char *qlabel, const char *unit, uint64 value,
+                                               ExplainState *es)
+{
+       char            buf[32];
+
+       snprintf(buf, sizeof(buf), UINT64_FORMAT, value);
+       ExplainProperty(qlabel, unit, buf, true, es);
+}
+
 /*
  * Explain a float-valued property, using the specified number of
  * fractional digits.
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5fec59723c..0e7a373caf 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3045,8 +3045,8 @@ psql_completion(const char *text, int start, int end)
                 */
                if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
                        COMPLETE_WITH("ANALYZE", "VERBOSE", "COSTS", "SETTINGS",
-                                                 "BUFFERS", "TIMING", 
"SUMMARY", "FORMAT");
-               else if 
(TailMatches("ANALYZE|VERBOSE|COSTS|SETTINGS|BUFFERS|TIMING|SUMMARY"))
+                                                 "BUFFERS", "WAL", "TIMING", 
"SUMMARY", "FORMAT");
+               else if 
(TailMatches("ANALYZE|VERBOSE|COSTS|SETTINGS|BUFFERS|WAL|TIMING|SUMMARY"))
                        COMPLETE_WITH("ON", "OFF");
                else if (TailMatches("FORMAT"))
                        COMPLETE_WITH("TEXT", "XML", "JSON", "YAML");
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 54f6240e5e..7b0b0a94a6 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -42,6 +42,7 @@ typedef struct ExplainState
        bool            analyze;                /* print actual times */
        bool            costs;                  /* print estimated costs */
        bool            buffers;                /* print buffer usage */
+       bool            wal;                    /* print WAL usage */
        bool            timing;                 /* print detailed node timing */
        bool            summary;                /* print total planning and 
execution timing */
        bool            settings;               /* print modified settings */
@@ -110,6 +111,8 @@ extern void ExplainPropertyText(const char *qlabel, const 
char *value,
                                                                ExplainState 
*es);
 extern void ExplainPropertyInteger(const char *qlabel, const char *unit,
                                                                   int64 value, 
ExplainState *es);
+extern void ExplainPropertyUInteger(const char *qlabel, const char *unit,
+                                                                       uint64 
value, ExplainState *es);
 extern void ExplainPropertyFloat(const char *qlabel, const char *unit,
                                                                 double value, 
int ndigits, ExplainState *es);
 extern void ExplainPropertyBool(const char *qlabel, bool value,
-- 
2.20.1

>From 4fe82b4d09b068041bdca91628733b48845acc16 Mon Sep 17 00:00:00 2001
From: Kirill Bychik <kirill.byc...@gmail.com>
Date: Tue, 17 Mar 2020 14:42:02 +0100
Subject: [PATCH v11 3/4] Keep track of WAL usage in pg_stat_statements.

Author: Kirill Bychik
Reviewed-by: Julien Rouhaud, Fuji Masao
Discussion: 
https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4zlxws_cza3...@mail.gmail.com
---
 .../expected/pg_stat_statements.out           | 39 +++++++++++++
 .../pg_stat_statements--1.7--1.8.sql          |  7 ++-
 .../pg_stat_statements/pg_stat_statements.c   | 55 +++++++++++++++++--
 .../sql/pg_stat_statements.sql                | 23 ++++++++
 doc/src/sgml/pgstatstatements.sgml            | 27 +++++++++
 5 files changed, 144 insertions(+), 7 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out 
b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 45dbe9e677..02da7245b4 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -211,6 +211,45 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY 
query COLLATE "C";
  UPDATE test SET b = $1 WHERE a > $2                                          
|     1 |    3
 (10 rows)
 
+--
+-- CRUD: INSERT SELECT UPDATE DELETE on test non-temp table to validate WAL 
generation metrics
+--
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--------------------------
+ 
+(1 row)
+
+-- utility "create table" should not be shown
+CREATE TABLE pgss_test (a int, b char(20));
+INSERT INTO pgss_test VALUES(generate_series(1, 10), 'aaa');
+UPDATE pgss_test SET b = 'bbb' WHERE a > 7;
+DELETE FROM pgss_test WHERE a > 9;
+-- DROP test table
+SET pg_stat_statements.track_utility = TRUE;
+DROP TABLE pgss_test;
+SET pg_stat_statements.track_utility = FALSE;
+-- SELECT usage data, check WAL usage is reported, wal_records equal rows 
count for INSERT/UPDATE/DELETE
+SELECT query, calls, rows,
+wal_bytes > 0 as wal_bytes_generated,
+wal_records > 0 as wal_records_generated,
+wal_records = rows as wal_records_as_rows
+FROM pg_stat_statements ORDER BY query COLLATE "C";
+                           query                           | calls | rows | 
wal_bytes_generated | wal_records_generated | wal_records_as_rows 
+-----------------------------------------------------------+-------+------+---------------------+-----------------------+---------------------
+ DELETE FROM pgss_test WHERE a > $1                        |     1 |    1 | t  
                 | t                     | t
+ DROP TABLE pgss_test                                      |     1 |    0 | t  
                 | t                     | f
+ INSERT INTO pgss_test VALUES(generate_series($1, $2), $3) |     1 |   10 | t  
                 | t                     | t
+ SELECT pg_stat_statements_reset()                         |     1 |    1 | f  
                 | f                     | f
+ SELECT query, calls, rows,                               +|     0 |    0 | f  
                 | f                     | t
+ wal_bytes > $1 as wal_bytes_generated,                   +|       |      |    
                 |                       | 
+ wal_records > $2 as wal_records_generated,               +|       |      |    
                 |                       | 
+ wal_records = rows as wal_records_as_rows                +|       |      |    
                 |                       | 
+ FROM pg_stat_statements ORDER BY query COLLATE "C"        |       |      |    
                 |                       | 
+ SET pg_stat_statements.track_utility = FALSE              |     1 |    0 | f  
                 | f                     | t
+ UPDATE pgss_test SET b = $1 WHERE a > $2                  |     1 |    3 | t  
                 | t                     | t
+(7 rows)
+
 --
 -- pg_stat_statements.track = none
 --
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql 
b/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql
index aa5cc3c77b..2fcf7aee01 100644
--- a/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql
@@ -41,7 +41,10 @@ CREATE FUNCTION pg_stat_statements(IN showtext boolean,
     OUT temp_blks_read int8,
     OUT temp_blks_written int8,
     OUT blk_read_time float8,
-    OUT blk_write_time float8
+    OUT blk_write_time float8,
+    OUT wal_bytes numeric,
+    OUT wal_records int8,
+    OUT wal_num_fpw int8
 )
 RETURNS SETOF record
 AS 'MODULE_PATHNAME', 'pg_stat_statements_1_8'
@@ -51,5 +54,3 @@ CREATE VIEW pg_stat_statements AS
   SELECT * FROM pg_stat_statements(true);
 
 GRANT SELECT ON pg_stat_statements TO PUBLIC;
-
-
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
b/contrib/pg_stat_statements/pg_stat_statements.c
index 942922b01f..f8bf4f852a 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -185,6 +185,9 @@ typedef struct Counters
        int64           local_blks_written; /* # of local disk blocks written */
        int64           temp_blks_read; /* # of temp blocks read */
        int64           temp_blks_written;      /* # of temp blocks written */
+       uint64          wal_bytes;              /* total amount of WAL bytes 
generated */
+       int64           wal_records;    /* # of WAL records generated */
+       int64           wal_num_fpw;    /* # of WAL full page image generated */
        double          blk_read_time;  /* time spent reading, in msec */
        double          blk_write_time; /* time spent writing, in msec */
        double          usage;                  /* usage factor */
@@ -348,6 +351,7 @@ static void pgss_store(const char *query, uint64 queryId,
                                           pgssStoreKind kind,
                                           double total_time, uint64 rows,
                                           const BufferUsage *bufusage,
+                                          const WalUsage *walusage,
                                           pgssJumbleState *jstate);
 static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
                                                                                
pgssVersion api_version,
@@ -891,6 +895,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
                                   0,
                                   0,
                                   NULL,
+                                  NULL,
                                   &jstate);
 }
 
@@ -926,9 +931,16 @@ pgss_planner(Query *parse,
                instr_time      duration;
                BufferUsage bufusage_start,
                                        bufusage;
+               WalUsage        walusage_start,
+                                       walusage;
 
                /* We need to track buffer usage as the planner can access 
them. */
                bufusage_start = pgBufferUsage;
+               /*
+                * Similarly the planner could write some WAL records in some 
cases
+                * (e.g. setting a hint bit with those being WAL-logged)
+                */
+               walusage_start = pgWalUsage;
                INSTR_TIME_SET_CURRENT(start);
 
                plan_nested_level++;
@@ -954,6 +966,10 @@ pgss_planner(Query *parse,
                memset(&bufusage, 0, sizeof(BufferUsage));
                BufferUsageAccumDiff(&bufusage, &pgBufferUsage, 
&bufusage_start);
 
+               /* calc differences of WAL counters. */
+               memset(&walusage, 0, sizeof(WalUsage));
+               WalUsageAccumDiff(&walusage, &pgWalUsage, &walusage_start);
+
                pgss_store(query_string,
                                   parse->queryId,
                                   parse->stmt_location,
@@ -962,6 +978,7 @@ pgss_planner(Query *parse,
                                   INSTR_TIME_GET_MILLISEC(duration),
                                   0,
                                   &bufusage,
+                                  &walusage,
                                   NULL);
        }
        else
@@ -1079,6 +1096,7 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
                                   queryDesc->totaltime->total * 1000.0,        
/* convert to msec */
                                   queryDesc->estate->es_processed,
                                   &queryDesc->totaltime->bufusage,
+                                  &queryDesc->totaltime->walusage,
                                   NULL);
        }
 
@@ -1123,8 +1141,10 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char 
*queryString,
                uint64          rows;
                BufferUsage bufusage_start,
                                        bufusage;
-
+               WalUsage        walusage_start,
+                                       walusage;
                bufusage_start = pgBufferUsage;
+               walusage_start = pgWalUsage;
                INSTR_TIME_SET_CURRENT(start);
 
                exec_nested_level++;
@@ -1154,6 +1174,10 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char 
*queryString,
                memset(&bufusage, 0, sizeof(BufferUsage));
                BufferUsageAccumDiff(&bufusage, &pgBufferUsage, 
&bufusage_start);
 
+               /* calc differences of WAL counters. */
+               memset(&walusage, 0, sizeof(WalUsage));
+               WalUsageAccumDiff(&walusage, &pgWalUsage, &walusage_start);
+
                pgss_store(queryString,
                                   0,                   /* signal that it's a 
utility stmt */
                                   pstmt->stmt_location,
@@ -1162,6 +1186,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char 
*queryString,
                                   INSTR_TIME_GET_MILLISEC(duration),
                                   rows,
                                   &bufusage,
+                                  &walusage,
                                   NULL);
        }
        else
@@ -1197,7 +1222,8 @@ pgss_hash_string(const char *str, int len)
  *
  * If jstate is not NULL then we're trying to create an entry for which
  * we have no statistics as yet; we just want to record the normalized
- * query string.  total_time, rows, bufusage are ignored in this case.
+ * query string.  total_time, rows, bufusage and walusage are ignored in this
+ * case.
  *
  * If kind is PGSS_PLAN or PGSS_EXEC, its value is used as the array position
  * for the arrays in the Counters field.
@@ -1208,6 +1234,7 @@ pgss_store(const char *query, uint64 queryId,
                   pgssStoreKind kind,
                   double total_time, uint64 rows,
                   const BufferUsage *bufusage,
+                  const WalUsage *walusage,
                   pgssJumbleState *jstate)
 {
        pgssHashKey key;
@@ -1402,6 +1429,9 @@ pgss_store(const char *query, uint64 queryId,
                e->counters.blk_read_time += 
INSTR_TIME_GET_MILLISEC(bufusage->blk_read_time);
                e->counters.blk_write_time += 
INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
                e->counters.usage += USAGE_EXEC(total_time);
+               e->counters.wal_bytes += walusage->wal_bytes;
+               e->counters.wal_records += walusage->wal_records;
+               e->counters.wal_num_fpw += walusage->wal_num_fpw;
 
                SpinLockRelease(&e->mutex);
        }
@@ -1449,8 +1479,8 @@ pg_stat_statements_reset(PG_FUNCTION_ARGS)
 #define PG_STAT_STATEMENTS_COLS_V1_1   18
 #define PG_STAT_STATEMENTS_COLS_V1_2   19
 #define PG_STAT_STATEMENTS_COLS_V1_3   23
-#define PG_STAT_STATEMENTS_COLS_V1_8   29
-#define PG_STAT_STATEMENTS_COLS                        29      /* maximum of 
above */
+#define PG_STAT_STATEMENTS_COLS_V1_8   32
+#define PG_STAT_STATEMENTS_COLS                        32      /* maximum of 
above */
 
 /*
  * Retrieve statement statistics.
@@ -1786,6 +1816,23 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
                        values[i++] = Float8GetDatumFast(tmp.blk_read_time);
                        values[i++] = Float8GetDatumFast(tmp.blk_write_time);
                }
+               if (api_version >= PGSS_V1_8)
+               {
+                       char            buf[256];
+                       Datum           wal_bytes;
+
+                       snprintf(buf, sizeof buf, UINT64_FORMAT, tmp.wal_bytes);
+
+                       /* Convert to numeric. */
+                       wal_bytes = DirectFunctionCall3(numeric_in,
+                                                                               
        CStringGetDatum(buf),
+                                                                               
        ObjectIdGetDatum(0),
+                                                                               
        Int32GetDatum(-1));
+
+                       values[i++] = wal_bytes;
+                       values[i++] = Int64GetDatumFast(tmp.wal_records);
+                       values[i++] = Int64GetDatumFast(tmp.wal_num_fpw);
+               }
 
                Assert(i == (api_version == PGSS_V1_0 ? 
PG_STAT_STATEMENTS_COLS_V1_0 :
                                         api_version == PGSS_V1_1 ? 
PG_STAT_STATEMENTS_COLS_V1_1 :
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql 
b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index 435d51008f..eaacd4021a 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -101,6 +101,29 @@ SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5);
 
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 
+--
+-- CRUD: INSERT SELECT UPDATE DELETE on test non-temp table to validate WAL 
generation metrics
+--
+SELECT pg_stat_statements_reset();
+
+-- utility "create table" should not be shown
+CREATE TABLE pgss_test (a int, b char(20));
+
+INSERT INTO pgss_test VALUES(generate_series(1, 10), 'aaa');
+UPDATE pgss_test SET b = 'bbb' WHERE a > 7;
+DELETE FROM pgss_test WHERE a > 9;
+-- DROP test table
+SET pg_stat_statements.track_utility = TRUE;
+DROP TABLE pgss_test;
+SET pg_stat_statements.track_utility = FALSE;
+
+-- SELECT usage data, check WAL usage is reported, wal_records equal rows 
count for INSERT/UPDATE/DELETE
+SELECT query, calls, rows,
+wal_bytes > 0 as wal_bytes_generated,
+wal_records > 0 as wal_records_generated,
+wal_records = rows as wal_records_as_rows
+FROM pg_stat_statements ORDER BY query COLLATE "C";
+
 --
 -- pg_stat_statements.track = none
 --
diff --git a/doc/src/sgml/pgstatstatements.sgml 
b/doc/src/sgml/pgstatstatements.sgml
index b4df84c60b..3d26108649 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -264,6 +264,33 @@
       </entry>
      </row>
 
+     <row>
+      <entry><structfield>wal_bytes</structfield></entry>
+      <entry><type>numeric</type></entry>
+      <entry></entry>
+      <entry>
+        Total amount of WAL bytes generated by the statement
+      </entry>
+     </row>
+
+     <row>
+      <entry><structfield>wal_records</structfield></entry>
+      <entry><type>bigint</type></entry>
+      <entry></entry>
+      <entry>
+        Total count of WAL records generated by the statement
+      </entry>
+     </row>
+
+     <row>
+      <entry><structfield>wal_num_fpw</structfield></entry>
+      <entry><type>bigint</type></entry>
+      <entry></entry>
+      <entry>
+        Total count of WAL full page writes generated by the statement
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
-- 
2.20.1

>From aff0b78f60bfe182d6795531f74615810631f0e0 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouh...@free.fr>
Date: Thu, 19 Mar 2020 16:08:47 +0100
Subject: [PATCH v11 4/4] Expose WAL usage counters in verbose (auto)vacuum
 output.

Author: Julien Rouhaud
Reviewed-by: Fuji Masao
Discussion: 
https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4zlxws_cza3...@mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/heap/vacuumlazy.c 
b/src/backend/access/heap/vacuumlazy.c
index cc7e8521a5..735087dd74 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -410,6 +410,8 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
        int                     nindexes;
        PGRUsage        ru0;
        TimestampTz starttime = 0;
+       WalUsage walusage_start = pgWalUsage;
+       WalUsage walusage = {0, 0, 0};
        long            secs;
        int                     usecs;
        double          read_rate,
@@ -614,6 +616,9 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 
                        TimestampDifference(starttime, endtime, &secs, &usecs);
 
+                       memset(&walusage, 0, sizeof(WalUsage));
+                       WalUsageAccumDiff(&walusage, &pgWalUsage, 
&walusage_start);
+
                        read_rate = 0;
                        write_rate = 0;
                        if ((secs > 0) || (usecs > 0))
@@ -666,7 +671,13 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
                                                         (long long) 
VacuumPageDirty);
                        appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg 
write rate: %.3f MB/s\n"),
                                                         read_rate, write_rate);
-                       appendStringInfo(&buf, _("system usage: %s"), 
pg_rusage_show(&ru0));
+                       appendStringInfo(&buf, _("system usage: %s\n"), 
pg_rusage_show(&ru0));
+                       appendStringInfo(&buf,
+                                                        _("WAL usage: %ld 
records, %ld full page writes, "
+                                                        UINT64_FORMAT " 
bytes"),
+                                                        walusage.wal_records,
+                                                        walusage.wal_num_fpw,
+                                                        walusage.wal_bytes);
 
                        ereport(LOG,
                                        (errmsg_internal("%s", buf.data)));
@@ -758,6 +769,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, 
LVRelStats *vacrelstats,
        IndexBulkDeleteResult **indstats;
        int                     i;
        PGRUsage        ru0;
+       WalUsage walusage_start = pgWalUsage;
+       WalUsage walusage = {0, 0, 0};
        Buffer          vmbuffer = InvalidBuffer;
        BlockNumber next_unskippable_block;
        bool            skipping_blocks;
@@ -1727,6 +1740,15 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, 
LVRelStats *vacrelstats,
                                                                        "%u 
pages are entirely empty.\n",
                                                                        
empty_pages),
                                         empty_pages);
+
+       memset(&walusage, 0, sizeof(WalUsage));
+       WalUsageAccumDiff(&walusage, &pgWalUsage, &walusage_start);
+       appendStringInfo(&buf, _("%ld WAL records, %ld WAL full page writes, "
+                                                       UINT64_FORMAT " WAL 
bytes\n"),
+                                                       walusage.wal_records,
+                                                       walusage.wal_num_fpw,
+                                                       walusage.wal_bytes);
+
        appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
 
        ereport(elevel,
-- 
2.20.1

Reply via email to