[HACKERS] Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

Noah Misch Tue, 22 Dec 2015 06:51:29 -0800

On Mon, Feb 18, 2013 at 06:19:12PM -0300, Alvaro Herrera wrote:
> I have pushed it now.  Further testing, of course, is always welcome.


While investigating stats.sql buildfarm failures, mostly on animals axolotl
and shearwater, I found that this patch (commit 187492b) inadvertently removed
the collector's ability to coalesce inquiries.  Every PGSTAT_MTYPE_INQUIRY
received now causes one stats file write.  Before, pgstat_recv_inquiry() did:

        if (msg->inquiry_time > last_statrequest)
                last_statrequest = msg->inquiry_time;

and pgstat_write_statsfile() did:

        globalStats.stats_timestamp = GetCurrentTimestamp();
... (work of writing a stats file) ...
                last_statwrite = globalStats.stats_timestamp;
                last_statrequest = last_statwrite;

If the collector entered pgstat_write_statsfile() with more inquiries waiting
in its socket receive buffer, it would ignore them as being too old once it
finished the write and resumed message processing.  Commit 187492b converted
last_statrequest to a "last_statrequests" list that we wipe after each write.

I modeled a machine with slow stats writes using the attached diagnostic patch
(not for commit).  It has pgstat_write_statsfiles() sleep just before renaming
the temporary file, and it logs each stats message received.  A three second
delay causes stats.sql to fail the way it did on shearwater[1] and on
axolotl[2].  Inquiries accumulate until the socket receive buffer overflows,
at which point the socket drops stats messages whose effects we later test
for.  The 3s delay makes us drop some 85% of installcheck stats messages.
Logs from a single VACUUM show receipt of five inquiries ("stats 1:") with 3s
between them:

24239 2015-12-10 04:21:03.439 GMT LOG:  connection authorized: user=nmisch 
database=test
24236 2015-12-10 04:21:03.454 GMT LOG:  stats 2: 1888 + 936 = 2824
24236 2015-12-10 04:21:03.454 GMT LOG:  stats 2: 2824 + 376 = 3200
24236 2015-12-10 04:21:03.454 GMT LOG:  stats 2: 3200 + 824 = 4024
24239 2015-12-10 04:21:03.455 GMT LOG:  statement: vacuum pg_class
24236 2015-12-10 04:21:03.455 GMT LOG:  stats 1: 4024 + 32 = 4056
24236 2015-12-10 04:21:06.458 GMT LOG:  stats 12: 4056 + 88 = 4144
24236 2015-12-10 04:21:06.458 GMT LOG:  stats 1: 4144 + 32 = 4176
24239 2015-12-10 04:21:06.463 GMT LOG:  disconnection: session time: 
0:00:03.025 user=nmisch database=test host=[local]
24236 2015-12-10 04:21:09.486 GMT LOG:  stats 1: 4176 + 32 = 4208
24236 2015-12-10 04:21:12.503 GMT LOG:  stats 1: 4208 + 32 = 4240
24236 2015-12-10 04:21:15.519 GMT LOG:  stats 1: 4240 + 32 = 4272
24236 2015-12-10 04:21:18.535 GMT LOG:  stats 9: 4272 + 48 = 4320
24236 2015-12-10 04:21:18.535 GMT LOG:  stats 2: 4320 + 936 = 5256
24236 2015-12-10 04:21:18.535 GMT LOG:  stats 2: 5256 + 376 = 5632
24236 2015-12-10 04:21:18.535 GMT LOG:  stats 2: 5632 + 264 = 5896
24236 2015-12-10 04:21:18.535 GMT LOG:  stats 12: 5896 + 88 = 5984


As for how to fix this, the most direct translation of the old logic is to
retain last_statrequests entries that could help coalesce inquiries.  I lean
toward that for an initial, back-patched fix.  It would be good, though, to
process two closely-spaced, different-database inquiries in one
pgstat_write_statsfiles() call.  We do one-database writes and all-databases
writes, but we never write "1 < N < all" databases despite the code prepared
to do so.  I tried calling pgstat_write_statsfiles() only when the socket
receive buffer empties.  That's dead simple to implement and aggressively
coalesces inquiries (even a 45s sleep did not break stats.sql), but it starves
inquirers if the socket receive buffer stays full persistently.  Ideally, I'd
want to process inquiries when the buffer empties _or_ when the oldest inquiry
is X seconds old.  I don't have a more-specific design in mind, though.

Thanks,
nm

[1] 
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=shearwater&dt=2015-09-23%2002%3A08%3A31
[2] 
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-08-04%2019%3A31%3A22

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index ab018c4..e6f04b0 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3314,6 +3314,7 @@ pgstat_send_bgwriter(void)
 NON_EXEC_STATIC void
 PgstatCollectorMain(int argc, char *argv[])
 {
+       unsigned total = 0;
        int                     len;
        PgStat_Msg      msg;
        int                     wr;
@@ -3425,6 +3426,10 @@ PgstatCollectorMain(int argc, char *argv[])
                                                 errmsg("could not read 
statistics message: %m")));
                        }
 
+                       elog(LOG, "stats %d: %u + %u = %u",
+                                msg.msg_hdr.m_type, total, len, total + len);
+                       total += len;
+
                        /*
                         * We ignore messages that are smaller than our common 
header
                         */
@@ -3817,6 +3822,13 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
         */
        fputc('E', fpout);
 
+       if (1)
+       {
+               PG_SETMASK(&BlockSig);
+               pg_usleep(3 * 1000000L);
+               PG_SETMASK(&UnBlockSig);
+       }
+
        if (ferror(fpout))
        {
                ereport(LOG,
diff --git a/src/test/regress/expected/stats.out 
b/src/test/regress/expected/stats.out
index f5be70f..b042062 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -47,6 +47,9 @@ begin
       FROM pg_stat_user_tables AS st, pg_class AS cl, prevstats AS pr
      WHERE st.relname='tenk2' AND cl.relname='tenk2';
 
+    raise log 'stats updated as of % snapshot: %',
+      pg_stat_get_snapshot_timestamp(), updated;
+
     exit when updated;
 
     -- wait a little
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index cd2d592..e87454d 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -42,6 +42,9 @@ begin
       FROM pg_stat_user_tables AS st, pg_class AS cl, prevstats AS pr
      WHERE st.relname='tenk2' AND cl.relname='tenk2';
 
+    raise log 'stats updated as of % snapshot: %',
+      pg_stat_get_snapshot_timestamp(), updated;
+
     exit when updated;
 
     -- wait a little

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

Reply via email to