Hi,

2012/08/12 7:11, Jeff Janes wrote:
On Sat, Jul 28, 2012 at 3:33 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
On Sat, Jul 7, 2012 at 9:17 PM, Satoshi Nagayasu <sn...@uptime.jp> wrote:
Hi,

Jeff Janes has pointed out that my previous patch could hold
a number of the dirty writes only in single local backend, and
it could not hold all over the cluster, because the counter
was allocated in the local process memory.

That's true, and I have fixed it with moving the counter into
the shared memory, as a member of XLogCtlWrite, to keep total
dirty writes in the cluster.


...

The comment "XLogCtrlWrite must be protected with WALWriteLock"
mis-spells XLogCtlWrite.

The final patch will need to add a sections to the documentation.

Thanks to Robert and Tom for addressing my concerns about the pointer
volatility.

I think there is enough consensus that this is useful without adding
more things to it, like histograms or high water marks.

However, I do think we will want to add a way to query for the time of
the last reset, as other monitoring features are going that way.

Is it OK that the count is reset upon a server restart?
pg_stat_bgwriter, for example, does not do that.  Unfortunately I
think fixing this in an acceptable way will be harder than the entire
rest of the patch was.


The coding looks OK to me, it applies and builds, and passes make
check, and does what it says.  I didn't do performance testing, as it
is hard to believe it would have a meaningful effect.

I'll marked it as waiting on author, for the documentation and reset
time.  I'd ask a more senior hacker to comment on the durability over
restarts.

I have rewritten the patch to deal with dirty write statistics
through pgstat collector as bgwriter does.
Yeah, it's a bit bigger rewrite.

With this patch, walwriter process and each backend process
would sum up dirty writes, and send it to the stat collector.
So, the value could be saved in the stat file, and could be
kept on restarting.

The statistics could be retreive with using
pg_stat_get_xlog_dirty_writes() function, and could be reset
with calling pg_stat_reset_shared('walwriter').

Now, I have one concern.

The reset time could be captured in globalStats.stat_reset_timestamp,
but this value is the same with the bgwriter one.

So, once pg_stat_reset_shared('walwriter') is called,
stats_reset column in pg_stat_bgwriter does represent
the reset time for walwriter, not for bgwriter.

How should we handle this?  Should we split this value?
And should we have new system view for walwriter?

Of course, I will work on documentation next.

Regards,


Cheers,

Jeff

--
Satoshi Nagayasu <sn...@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index ff56c26..234d568 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1518,6 +1518,7 @@ AdvanceXLInsertBuffer(bool new_segment)
                                WriteRqst.Write = OldPageRqstPtr;
                                WriteRqst.Flush = 0;
                                XLogWrite(WriteRqst, false, false);
+                               WalWriterStats.m_xlog_dirty_writes++;
                                LWLockRelease(WALWriteLock);
                                TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_DONE();
                        }
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8389d5c..f031be1 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -125,6 +125,15 @@ char          *pgstat_stat_tmpname = NULL;
  */
 PgStat_MsgBgWriter BgWriterStats;
 
+/*
+ * WalWriter global statistics counter.
+ * Despite its name, this counter is actually used not only in walwriter,
+ * but also in each backend process to sum up xlog dirty writes.
+ * Those processes would increment this counter in each XLogWrite call,
+ * then send it to the stat collector process.
+ */
+PgStat_MsgWalWriter WalWriterStats;
+
 /* ----------
  * Local data
  * ----------
@@ -279,6 +288,7 @@ static void pgstat_recv_autovac(PgStat_MsgAutovacStart 
*msg, int len);
 static void pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len);
 static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len);
 static void pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len);
+static void pgstat_recv_walwriter(PgStat_MsgWalWriter *msg, int len);
 static void pgstat_recv_funcstat(PgStat_MsgFuncstat *msg, int len);
 static void pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len);
 static void pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int 
len);
@@ -1188,6 +1198,8 @@ pgstat_reset_shared_counters(const char *target)
 
        if (strcmp(target, "bgwriter") == 0)
                msg.m_resettarget = RESET_BGWRITER;
+       else if (strcmp(target, "walwriter") == 0)
+               msg.m_resettarget = RESET_WALWRITER;
        else
                ereport(ERROR,
                                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -2988,6 +3000,38 @@ pgstat_send_bgwriter(void)
        MemSet(&BgWriterStats, 0, sizeof(BgWriterStats));
 }
 
+/* ----------
+ * pgstat_send_walwriter() -
+ *
+ *             Send walwriter statistics to the collector
+ * ----------
+ */
+void
+pgstat_send_walwriter(void)
+{
+       /* We assume this initializes to zeroes */
+       static const PgStat_MsgBgWriter all_zeroes;
+
+       /*
+        * This function can be called even if nothing at all has happened. In
+        * this case, avoid sending a completely empty message to the stats
+        * collector.
+        */
+       if (memcmp(&WalWriterStats, &all_zeroes, sizeof(PgStat_MsgWalWriter)) 
== 0)
+               return;
+
+       /*
+        * Prepare and send the message
+        */
+       pgstat_setheader(&WalWriterStats.m_hdr, PGSTAT_MTYPE_WALWRITER);
+       pgstat_send(&WalWriterStats, sizeof(WalWriterStats));
+
+       /*
+        * Clear out the statistics buffer, so it can be re-used.
+        */
+       MemSet(&WalWriterStats, 0, sizeof(WalWriterStats));
+}
+
 
 /* ----------
  * PgstatCollectorMain() -
@@ -3207,6 +3251,10 @@ PgstatCollectorMain(int argc, char *argv[])
                                        
pgstat_recv_bgwriter((PgStat_MsgBgWriter *) &msg, len);
                                        break;
 
+                               case PGSTAT_MTYPE_WALWRITER:
+                                       
pgstat_recv_walwriter((PgStat_MsgWalWriter *) &msg, len);
+                                       break;
+
                                case PGSTAT_MTYPE_FUNCSTAT:
                                        
pgstat_recv_funcstat((PgStat_MsgFuncstat *) &msg, len);
                                        break;
@@ -4382,6 +4430,12 @@ 
pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, int len)
                memset(&globalStats, 0, sizeof(globalStats));
                globalStats.stat_reset_timestamp = GetCurrentTimestamp();
        }
+       else if (msg->m_resettarget == RESET_WALWRITER)
+       {
+               /* Reset the global walwriter statistics for the cluster. */
+               memset(&globalStats, 0, sizeof(globalStats));
+               globalStats.stat_reset_timestamp = GetCurrentTimestamp();
+       }
 
        /*
         * Presumably the sender of this message validated the target, don't
@@ -4534,6 +4588,18 @@ pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len)
 }
 
 /* ----------
+ * pgstat_recv_walwriter() -
+ *
+ *     Process a WALWRITER message.
+ * ----------
+ */
+static void
+pgstat_recv_walwriter(PgStat_MsgWalWriter *msg, int len)
+{
+       globalStats.xlog_dirty_writes += msg->m_xlog_dirty_writes;
+}
+
+/* ----------
  * pgstat_recv_recoveryconflict() -
  *
  *     Process a RECOVERYCONFLICT message.
diff --git a/src/backend/postmaster/walwriter.c 
b/src/backend/postmaster/walwriter.c
index 4313901..6fbc5f9 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -289,6 +289,8 @@ WalWriterMain(void)
                else if (left_till_hibernate > 0)
                        left_till_hibernate--;
 
+               pgstat_send_walwriter();
+
                /*
                 * Sleep until we are signaled or WalWriterDelay has elapsed.  
If we
                 * haven't done anything useful for quite some time, lengthen 
the
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f1248a8..1ed2c0a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3918,6 +3918,8 @@ PostgresMain(int argc, char *argv[], const char *username)
                                pgstat_report_activity(STATE_IDLE, NULL);
                        }
 
+                       pgstat_send_walwriter();
+
                        ReadyForQuery(whereToSendOutput);
                        send_ready_for_query = false;
                }
diff --git a/src/backend/utils/adt/pgstatfuncs.c 
b/src/backend/utils/adt/pgstatfuncs.c
index 7d4059f..f707c9a 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -118,6 +118,8 @@ extern Datum pg_stat_reset_shared(PG_FUNCTION_ARGS);
 extern Datum pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS);
 extern Datum pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS);
 
+extern Datum pg_stat_get_xlog_dirty_writes(PG_FUNCTION_ARGS);
+
 /* Global bgwriter statistics, from bgwriter.c */
 extern PgStat_MsgBgWriter bgwriterStats;
 
@@ -1701,3 +1703,9 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
 
        PG_RETURN_VOID();
 }
+
+Datum
+pg_stat_get_xlog_dirty_writes(PG_FUNCTION_ARGS)
+{
+       PG_RETURN_INT64(pgstat_fetch_global()->xlog_dirty_writes);
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 77a3b41..da2225c 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2740,6 +2740,9 @@ DESCR("statistics: reset collected statistics for a 
single table or index in the
 DATA(insert OID = 3777 (  pg_stat_reset_single_function_counters       PGNSP 
PGUID 12 1 0 0 0 f f f f f f v 1 0 2278 "26" _null_ _null_ _null_ _null_  
pg_stat_reset_single_function_counters _null_ _null_ _null_ ));
 DESCR("statistics: reset collected statistics for a single function in the 
current database");
 
+DATA(insert OID = 3766 (  pg_stat_get_xlog_dirty_writes  PGNSP PGUID 12 1 0 0 
0 f f f f f f v 0 0 20 "" _null_ _null_ _null_ _null_ 
pg_stat_get_xlog_dirty_writes _null_ _null_ _null_ ));
+DESCR("statistics: get xlog dirty buffer write statistics");
+
 DATA(insert OID = 3163 (  pg_trigger_depth                             PGNSP 
PGUID 12 1 0 0 0 f f f f t f s 0 0 23 "" _null_ _null_ _null_ _null_ 
pg_trigger_depth _null_ _null_ _null_ ));
 DESCR("current trigger depth");
 
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 613c1c2..59aa1ba 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -45,6 +45,7 @@ typedef enum StatMsgType
        PGSTAT_MTYPE_VACUUM,
        PGSTAT_MTYPE_ANALYZE,
        PGSTAT_MTYPE_BGWRITER,
+       PGSTAT_MTYPE_WALWRITER,
        PGSTAT_MTYPE_FUNCSTAT,
        PGSTAT_MTYPE_FUNCPURGE,
        PGSTAT_MTYPE_RECOVERYCONFLICT,
@@ -102,7 +103,8 @@ typedef struct PgStat_TableCounts
 /* Possible targets for resetting cluster-wide shared values */
 typedef enum PgStat_Shared_Reset_Target
 {
-       RESET_BGWRITER
+       RESET_BGWRITER,
+       RESET_WALWRITER
 } PgStat_Shared_Reset_Target;
 
 /* Possible object types for resetting single counters */
@@ -372,6 +374,17 @@ typedef struct PgStat_MsgBgWriter
 } PgStat_MsgBgWriter;
 
 /* ----------
+ * PgStat_MsgWalWriter                 Sent by the walwriter to update 
statistics.
+ * ----------
+ */
+typedef struct PgStat_MsgWalWriter
+{
+       PgStat_MsgHdr m_hdr;
+
+       PgStat_Counter m_xlog_dirty_writes;
+} PgStat_MsgWalWriter;
+
+/* ----------
  * PgStat_MsgRecoveryConflict  Sent by the backend upon recovery conflict
  * ----------
  */
@@ -499,6 +512,7 @@ typedef union PgStat_Msg
        PgStat_MsgVacuum msg_vacuum;
        PgStat_MsgAnalyze msg_analyze;
        PgStat_MsgBgWriter msg_bgwriter;
+       PgStat_MsgWalWriter msg_walwriter;
        PgStat_MsgFuncstat msg_funcstat;
        PgStat_MsgFuncpurge msg_funcpurge;
        PgStat_MsgRecoveryConflict msg_recoveryconflict;
@@ -622,6 +636,7 @@ typedef struct PgStat_GlobalStats
        PgStat_Counter buf_written_backend;
        PgStat_Counter buf_fsync_backend;
        PgStat_Counter buf_alloc;
+       PgStat_Counter xlog_dirty_writes;
        TimestampTz stat_reset_timestamp;
 } PgStat_GlobalStats;
 
@@ -730,6 +745,8 @@ extern char *pgstat_stat_filename;
  */
 extern PgStat_MsgBgWriter BgWriterStats;
 
+extern PgStat_MsgWalWriter WalWriterStats;
+
 /*
  * Updated by pgstat_count_buffer_*_time macros
  */
@@ -858,6 +875,7 @@ extern void pgstat_twophase_postabort(TransactionId xid, 
uint16 info,
                                                  void *recdata, uint32 len);
 
 extern void pgstat_send_bgwriter(void);
+extern void pgstat_send_walwriter(void);
 
 /* ----------
  * Support functions for the SQL-callable functions to
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to