On Mon, Feb 18, 2013 at 06:19:12PM -0300, Alvaro Herrera wrote: > I have pushed it now. Further testing, of course, is always welcome.
While investigating stats.sql buildfarm failures, mostly on animals axolotl and shearwater, I found that this patch (commit 187492b) inadvertently removed the collector's ability to coalesce inquiries. Every PGSTAT_MTYPE_INQUIRY received now causes one stats file write. Before, pgstat_recv_inquiry() did: if (msg->inquiry_time > last_statrequest) last_statrequest = msg->inquiry_time; and pgstat_write_statsfile() did: globalStats.stats_timestamp = GetCurrentTimestamp(); ... (work of writing a stats file) ... last_statwrite = globalStats.stats_timestamp; last_statrequest = last_statwrite; If the collector entered pgstat_write_statsfile() with more inquiries waiting in its socket receive buffer, it would ignore them as being too old once it finished the write and resumed message processing. Commit 187492b converted last_statrequest to a "last_statrequests" list that we wipe after each write. I modeled a machine with slow stats writes using the attached diagnostic patch (not for commit). It has pgstat_write_statsfiles() sleep just before renaming the temporary file, and it logs each stats message received. A three second delay causes stats.sql to fail the way it did on shearwater[1] and on axolotl[2]. Inquiries accumulate until the socket receive buffer overflows, at which point the socket drops stats messages whose effects we later test for. The 3s delay makes us drop some 85% of installcheck stats messages. Logs from a single VACUUM show receipt of five inquiries ("stats 1:") with 3s between them: 24239 2015-12-10 04:21:03.439 GMT LOG: connection authorized: user=nmisch database=test 24236 2015-12-10 04:21:03.454 GMT LOG: stats 2: 1888 + 936 = 2824 24236 2015-12-10 04:21:03.454 GMT LOG: stats 2: 2824 + 376 = 3200 24236 2015-12-10 04:21:03.454 GMT LOG: stats 2: 3200 + 824 = 4024 24239 2015-12-10 04:21:03.455 GMT LOG: statement: vacuum pg_class 24236 2015-12-10 04:21:03.455 GMT LOG: stats 1: 4024 + 32 = 4056 24236 2015-12-10 04:21:06.458 GMT LOG: stats 12: 4056 + 88 = 4144 24236 2015-12-10 04:21:06.458 GMT LOG: stats 1: 4144 + 32 = 4176 24239 2015-12-10 04:21:06.463 GMT LOG: disconnection: session time: 0:00:03.025 user=nmisch database=test host=[local] 24236 2015-12-10 04:21:09.486 GMT LOG: stats 1: 4176 + 32 = 4208 24236 2015-12-10 04:21:12.503 GMT LOG: stats 1: 4208 + 32 = 4240 24236 2015-12-10 04:21:15.519 GMT LOG: stats 1: 4240 + 32 = 4272 24236 2015-12-10 04:21:18.535 GMT LOG: stats 9: 4272 + 48 = 4320 24236 2015-12-10 04:21:18.535 GMT LOG: stats 2: 4320 + 936 = 5256 24236 2015-12-10 04:21:18.535 GMT LOG: stats 2: 5256 + 376 = 5632 24236 2015-12-10 04:21:18.535 GMT LOG: stats 2: 5632 + 264 = 5896 24236 2015-12-10 04:21:18.535 GMT LOG: stats 12: 5896 + 88 = 5984 As for how to fix this, the most direct translation of the old logic is to retain last_statrequests entries that could help coalesce inquiries. I lean toward that for an initial, back-patched fix. It would be good, though, to process two closely-spaced, different-database inquiries in one pgstat_write_statsfiles() call. We do one-database writes and all-databases writes, but we never write "1 < N < all" databases despite the code prepared to do so. I tried calling pgstat_write_statsfiles() only when the socket receive buffer empties. That's dead simple to implement and aggressively coalesces inquiries (even a 45s sleep did not break stats.sql), but it starves inquirers if the socket receive buffer stays full persistently. Ideally, I'd want to process inquiries when the buffer empties _or_ when the oldest inquiry is X seconds old. I don't have a more-specific design in mind, though. Thanks, nm [1] http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=shearwater&dt=2015-09-23%2002%3A08%3A31 [2] http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-08-04%2019%3A31%3A22
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index ab018c4..e6f04b0 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -3314,6 +3314,7 @@ pgstat_send_bgwriter(void) NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) { + unsigned total = 0; int len; PgStat_Msg msg; int wr; @@ -3425,6 +3426,10 @@ PgstatCollectorMain(int argc, char *argv[]) errmsg("could not read statistics message: %m"))); } + elog(LOG, "stats %d: %u + %u = %u", + msg.msg_hdr.m_type, total, len, total + len); + total += len; + /* * We ignore messages that are smaller than our common header */ @@ -3817,6 +3822,13 @@ pgstat_write_statsfiles(bool permanent, bool allDbs) */ fputc('E', fpout); + if (1) + { + PG_SETMASK(&BlockSig); + pg_usleep(3 * 1000000L); + PG_SETMASK(&UnBlockSig); + } + if (ferror(fpout)) { ereport(LOG, diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out index f5be70f..b042062 100644 --- a/src/test/regress/expected/stats.out +++ b/src/test/regress/expected/stats.out @@ -47,6 +47,9 @@ begin FROM pg_stat_user_tables AS st, pg_class AS cl, prevstats AS pr WHERE st.relname='tenk2' AND cl.relname='tenk2'; + raise log 'stats updated as of % snapshot: %', + pg_stat_get_snapshot_timestamp(), updated; + exit when updated; -- wait a little diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql index cd2d592..e87454d 100644 --- a/src/test/regress/sql/stats.sql +++ b/src/test/regress/sql/stats.sql @@ -42,6 +42,9 @@ begin FROM pg_stat_user_tables AS st, pg_class AS cl, prevstats AS pr WHERE st.relname='tenk2' AND cl.relname='tenk2'; + raise log 'stats updated as of % snapshot: %', + pg_stat_get_snapshot_timestamp(), updated; + exit when updated; -- wait a little
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers