Thanks for the --- as always --- valuable review! On Tue, 2020-10-13 at 17:55 -0500, Justin Pryzby wrote: > On Tue, Oct 13, 2020 at 01:44:41PM +0200, Laurenz Albe wrote: > > Attached is v3 with improvements. > > + <para> > + Time spent in database sessions in this database, in milliseconds. > + </para></entry> > > Should say "Total time spent *by* DB sessions..." ?
That is indeed better. Fixed. > I think these counters are only accurate as of the last state change, right? > So a session which has been idle for 1hr, that 1hr is not included. I think > the documentation should explain that, or (ideally) the implementation would > be > more precise. Maybe the timestamps should only be updated after a session > terminates (and the docs should say so). I agree, and I have added an explanation that the value doesn't include the duration of the current state. Of course it would be nice to have totally accurate values, but I think that the statistics are by nature inaccurate (datagrams can get lost), and more frequent statistics updates increase the work load. I don't think that is worth the effort. > + <entry role="catalog_table_entry"><para role="column_definition"> > + <structfield>connections</structfield> <type>bigint</type> > + </para> > + <para> > + Number of connections established to this database. > > *Total* number of connections established, otherwise it sounds like it might > mean "the number of sessions [currently] established". Fixed like that. > + Number of database sessions to this database that did not end > + with a regular client disconnection. > > Does that mean "sessions which ended irregularly" ? Or does it also include > "sessions which have not ended" ? I have added an explanation for that. > + msg.m_aborted = (!disconnect || pgStatSessionDisconnected) ? 0 : 1; > > I think this can be just: > msg.m_aborted = (bool) (disconnect && !pgStatSessionDisconnected); I mulled over this and finally decided to leave it as it is. Since "m_aborted" gets added to the total counter, I'd prefer to have it be an "int". Your proposed code works (the cast is actually not necessary, right?). But I think that my version is more readable if you think of "m_aborted" as a counter rather than a flag. > + if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL) > + result = 0; > + else > + result = ((double) dbentry->n_session_time) / 1000.0; > > I think these can say: > > double result = 0; > > if ((dbentry=..) != NULL) > > result = (double) ..; > > That not only uses fewer LOC, but also the assignment to zero is (known to be) > done at compile time (BSS) rather than runtime. I didn't know about the performance difference. Concise code (if readable) is good, so I changed the code like you propose. The code pattern is actually copied from neighboring functions, which then should also be changed like this, but that is outside the scope of this patch. Attached is v4 of the patch. Yours, Laurenz Albe
From 9e8bf3efd984306c73243736d0b4a4023cdd5f3a Mon Sep 17 00:00:00 2001 From: Laurenz Albe <laurenz.a...@cybertec.at> Date: Wed, 14 Oct 2020 11:08:20 +0200 Subject: [PATCH] Add session statistics to pg_stat_database If "track_counts" is active, track the following per database: - total number of connections - number of sessions that ended other than with a client disconnect - total time spent in database sessions - total time spent executing queries - total idle in transaction time This is useful to check if connection pooling is working. It also helps to estimate the size of the connection pool required to keep the database busy, which depends on the percentage of the transaction time that is spent idling. Discussion: https://postgr.es/m/b07e1f9953701b90c66ed368656f2aef40cac4fb.ca...@cybertec.at Reviewed-By: Soumyadeep Chakraborty, Justin Pryzby, Masahiro Ikeda (This requires a catversion bump, as well as an update to PGSTAT_FILE_FORMAT_ID) --- doc/src/sgml/monitoring.sgml | 49 +++++++++++++ src/backend/catalog/system_views.sql | 5 ++ src/backend/postmaster/pgstat.c | 105 ++++++++++++++++++++++++++- src/backend/tcop/postgres.c | 5 ++ src/backend/utils/adt/pgstatfuncs.c | 68 +++++++++++++++++ src/include/catalog/pg_proc.dat | 20 +++++ src/include/pgstat.h | 27 +++++++ src/test/regress/expected/rules.out | 5 ++ 8 files changed, 283 insertions(+), 1 deletion(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 66566765f0..a50fc025d5 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -3663,6 +3663,55 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i </para></entry> </row> + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>session_time</structfield> <type>double precision</type> + </para> + <para> + Time spent by database sessions in this database, in milliseconds + (note that statistics are only updated when the state of a session + changes, so if sessions have been idle for a long time, this idle time + won't be included) + </para></entry> + </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>active_time</structfield> <type>double precision</type> + </para> + <para> + Time spent executing SQL statements in this database, in milliseconds + </para></entry> + </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>idle_in_transaction_time</structfield> <type>double precision</type> + </para> + <para> + Time spent idling while in a transaction in this database, in milliseconds + </para></entry> + </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>connections</structfield> <type>bigint</type> + </para> + <para> + Total number of connections established to this database + </para></entry> + </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>aborted_sessions</structfield> <type>bigint</type> + </para> + <para> + Number of database sessions to this database that were terminated + by something else than a regular client disconnection + </para></entry> + </row> + <row> <entry role="catalog_table_entry"><para role="column_definition"> <structfield>stats_reset</structfield> <type>timestamp with time zone</type> diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index c29390760f..8db677948c 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -921,6 +921,11 @@ CREATE VIEW pg_stat_database AS pg_stat_get_db_checksum_last_failure(D.oid) AS checksum_last_failure, pg_stat_get_db_blk_read_time(D.oid) AS blk_read_time, pg_stat_get_db_blk_write_time(D.oid) AS blk_write_time, + pg_stat_get_db_session_time(D.oid) AS session_time, + pg_stat_get_db_active_time(D.oid) AS active_time, + pg_stat_get_db_idle_in_transaction_time(D.oid) AS idle_in_transaction_time, + pg_stat_get_db_connections(D.oid) AS connections, + pg_stat_get_db_aborted_sessions(D.oid) AS aborted_sessions, pg_stat_get_db_stat_reset_time(D.oid) AS stats_reset FROM ( SELECT 0 AS oid, NULL::name AS datname diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 822f0ebc62..37822fe49a 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -249,6 +249,9 @@ static int pgStatXactCommit = 0; static int pgStatXactRollback = 0; PgStat_Counter pgStatBlockReadTime = 0; PgStat_Counter pgStatBlockWriteTime = 0; +static PgStat_Counter pgStatActiveTime = 0; +static PgStat_Counter pgStatTransactionIdleTime = 0; +bool pgStatSessionDisconnected = false; /* Record that's written to 2PC state file when pgstat state is persisted */ typedef struct TwoPhasePgStatRecord @@ -334,6 +337,7 @@ static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg); static void pgstat_send_funcstats(void); static void pgstat_send_slru(void); static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid); +static void pgstat_send_connstats(bool disconnect, TimestampTz last_report); static PgStat_TableStatus *get_tabstat_entry(Oid rel_id, bool isshared); @@ -369,6 +373,7 @@ static void pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len); static void pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int len); static void pgstat_recv_deadlock(PgStat_MsgDeadlock *msg, int len); static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len); +static void pgstat_recv_connection(PgStat_MsgConn *msg, int len); static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len); static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len); @@ -862,7 +867,7 @@ pgstat_report_stat(bool force) /* Don't expend a clock check if nothing to do */ if ((pgStatTabList == NULL || pgStatTabList->tsa_used == 0) && pgStatXactCommit == 0 && pgStatXactRollback == 0 && - !have_function_stats) + !have_function_stats && !force) return; /* @@ -873,6 +878,11 @@ pgstat_report_stat(bool force) if (!force && !TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL)) return; + + /* for backends, send connection statistics */ + if (MyBackendType == B_BACKEND) + pgstat_send_connstats(force, last_report); + last_report = now; /* @@ -1341,6 +1351,48 @@ pgstat_drop_relation(Oid relid) #endif /* NOT_USED */ +/* ---------- + * pgstat_send_connstats() - + * + * Tell the collector about session statistics. + * The parameter "disconnect" will be true when the session ends. + * "last_report" is the last time we were called (0 if never). + * ---------- + */ +static void +pgstat_send_connstats(bool disconnect, TimestampTz last_report) +{ + PgStat_MsgConn msg; + long secs; + int usecs; + + if (pgStatSock == PGINVALID_SOCKET || !pgstat_track_counts) + return; + + pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_CONNECTION); + msg.m_databaseid = MyDatabaseId; + + /* session time since the last report */ + TimestampDifference(((last_report == 0) ? MyStartTimestamp : last_report), + GetCurrentTimestamp(), + &secs, &usecs); + msg.m_session_time = secs * 1000000 + usecs; + + msg.m_aborted = (!disconnect || pgStatSessionDisconnected) ? 0 : 1; + + msg.m_active_time = pgStatActiveTime; + pgStatActiveTime = 0; + + msg.m_idle_in_xact_time = pgStatTransactionIdleTime; + pgStatTransactionIdleTime = 0; + + /* report a new session only the first time */ + msg.m_count = (last_report == 0) ? 1 : 0; + + pgstat_send(&msg, sizeof(PgStat_MsgConn)); +} + + /* ---------- * pgstat_reset_counters() - * @@ -3324,6 +3376,28 @@ pgstat_report_activity(BackendState state, const char *cmd_str) } current_timestamp = GetCurrentTimestamp(); + /* + * If the state has changed from "active" or "idle in transaction", + * calculate the duration. + */ + if ((beentry->st_state == STATE_RUNNING || + beentry->st_state == STATE_IDLEINTRANSACTION || + beentry->st_state == STATE_IDLEINTRANSACTION_ABORTED) && + state != beentry->st_state) + { + long secs; + int usecs; + + TimestampDifference(beentry->st_state_start_timestamp, + current_timestamp, + &secs, &usecs); + + if (beentry->st_state == STATE_RUNNING) + pgStatActiveTime += secs * 1000000 + usecs; + else + pgStatTransactionIdleTime += secs * 1000000 + usecs; + } + /* * Now update the status entry */ @@ -4876,6 +4950,10 @@ PgstatCollectorMain(int argc, char *argv[]) pgstat_recv_replslot(&msg.msg_replslot, len); break; + case PGSTAT_MTYPE_CONNECTION: + pgstat_recv_connection(&msg.msg_conn, len); + break; + default: break; } @@ -4950,6 +5028,11 @@ reset_dbentry_counters(PgStat_StatDBEntry *dbentry) dbentry->last_checksum_failure = 0; dbentry->n_block_read_time = 0; dbentry->n_block_write_time = 0; + dbentry->n_connections = 0; + dbentry->n_session_time = 0; + dbentry->n_active_time = 0; + dbentry->n_idle_in_xact_time = 0; + dbentry->n_aborted = 0; dbentry->stat_reset_timestamp = GetCurrentTimestamp(); dbentry->stats_timestamp = 0; @@ -6895,6 +6978,26 @@ pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len) } } +/* ---------- + * pgstat_recv_connection() - + * + * Process connection information. + * ---------- + */ +static void +pgstat_recv_connection(PgStat_MsgConn *msg, int len) +{ + PgStat_StatDBEntry *dbentry; + + dbentry = pgstat_get_db_entry(msg->m_databaseid, true); + + dbentry->n_connections += msg->m_count; + dbentry->n_session_time += msg->m_session_time; + dbentry->n_active_time += msg->m_active_time; + dbentry->n_idle_in_xact_time += msg->m_idle_in_xact_time; + dbentry->n_aborted += msg->m_aborted; +} + /* ---------- * pgstat_recv_tempfile() - * diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 411cfadbff..888ffea1cd 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -4498,6 +4498,11 @@ PostgresMain(int argc, char *argv[], * perform normal shutdown. */ case 'X': + /* report as normal client disconnection */ + pgStatSessionDisconnected = true; + + /* FALLTHROUGH */ + case EOF: /* diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c index 0d0d2e6d2b..657ab19492 100644 --- a/src/backend/utils/adt/pgstatfuncs.c +++ b/src/backend/utils/adt/pgstatfuncs.c @@ -1629,6 +1629,74 @@ pg_stat_get_db_blk_write_time(PG_FUNCTION_ARGS) PG_RETURN_FLOAT8(result); } +Datum +pg_stat_get_db_session_time(PG_FUNCTION_ARGS) +{ + Oid dbid = PG_GETARG_OID(0); + double result = 0.0; + PgStat_StatDBEntry *dbentry; + + /* convert counter from microsec to millisec for display */ + if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) != NULL) + result = ((double) dbentry->n_session_time) / 1000.0; + + PG_RETURN_FLOAT8(result); +} + +Datum +pg_stat_get_db_active_time(PG_FUNCTION_ARGS) +{ + Oid dbid = PG_GETARG_OID(0); + double result = 0.0; + PgStat_StatDBEntry *dbentry; + + /* convert counter from microsec to millisec for display */ + if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) != NULL) + result = ((double) dbentry->n_active_time) / 1000.0; + + PG_RETURN_FLOAT8(result); +} + +Datum +pg_stat_get_db_idle_in_transaction_time(PG_FUNCTION_ARGS) +{ + Oid dbid = PG_GETARG_OID(0); + double result = 0.0; + PgStat_StatDBEntry *dbentry; + + /* convert counter from microsec to millisec for display */ + if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) != NULL) + result = ((double) dbentry->n_idle_in_xact_time) / 1000.0; + + PG_RETURN_FLOAT8(result); +} + +Datum +pg_stat_get_db_connections(PG_FUNCTION_ARGS) +{ + Oid dbid = PG_GETARG_OID(0); + int64 result = 0.0; + PgStat_StatDBEntry *dbentry; + + if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) != NULL) + result = (int64) (dbentry->n_connections); + + PG_RETURN_INT64(result); +} + +Datum +pg_stat_get_db_aborted_sessions(PG_FUNCTION_ARGS) +{ + Oid dbid = PG_GETARG_OID(0); + int64 result = 0.0; + PgStat_StatDBEntry *dbentry; + + if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) != NULL) + result = (int64) (dbentry->n_aborted); + + PG_RETURN_INT64(result); +} + Datum pg_stat_get_bgwriter_timed_checkpoints(PG_FUNCTION_ARGS) { diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 22340baf1c..95f65eb95e 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -5430,6 +5430,26 @@ proname => 'pg_stat_get_db_blk_write_time', provolatile => 's', proparallel => 'r', prorettype => 'float8', proargtypes => 'oid', prosrc => 'pg_stat_get_db_blk_write_time' }, +{ oid => '9575', descr => 'statistics: session time, in seconds', + proname => 'pg_stat_get_db_session_time', provolatile => 's', + proparallel => 'r', prorettype => 'float8', proargtypes => 'oid', + prosrc => 'pg_stat_get_db_session_time' }, +{ oid => '9576', descr => 'statistics: session active time, in seconds', + proname => 'pg_stat_get_db_active_time', provolatile => 's', + proparallel => 'r', prorettype => 'float8', proargtypes => 'oid', + prosrc => 'pg_stat_get_db_active_time' }, +{ oid => '9577', descr => 'statistics: session idle in transaction time, in seconds', + proname => 'pg_stat_get_db_idle_in_transaction_time', provolatile => 's', + proparallel => 'r', prorettype => 'float8', proargtypes => 'oid', + prosrc => 'pg_stat_get_db_idle_in_transaction_time' }, +{ oid => '9578', descr => 'statistics: number of connections', + proname => 'pg_stat_get_db_connections', provolatile => 's', + proparallel => 'r', prorettype => 'int8', proargtypes => 'oid', + prosrc => 'pg_stat_get_db_connections' }, +{ oid => '9579', descr => 'statistics: number of aborted sessions', + proname => 'pg_stat_get_db_aborted_sessions', provolatile => 's', + proparallel => 'r', prorettype => 'int8', proargtypes => 'oid', + prosrc => 'pg_stat_get_db_aborted_sessions' }, { oid => '3195', descr => 'statistics: information about WAL archiver', proname => 'pg_stat_get_archiver', proisstrict => 'f', provolatile => 's', proparallel => 'r', prorettype => 'record', proargtypes => '', diff --git a/src/include/pgstat.h b/src/include/pgstat.h index a821ff4f15..97064cde05 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -71,6 +71,7 @@ typedef enum StatMsgType PGSTAT_MTYPE_DEADLOCK, PGSTAT_MTYPE_CHECKSUMFAILURE, PGSTAT_MTYPE_REPLSLOT, + PGSTAT_MTYPE_CONNECTION, } StatMsgType; /* ---------- @@ -616,6 +617,21 @@ typedef struct PgStat_MsgChecksumFailure TimestampTz m_failure_time; } PgStat_MsgChecksumFailure; +/* ---------- + * PgStat_MsgConn Sent by pgstat_connection to update connection statistics. + * ---------- + */ +typedef struct PgStat_MsgConn +{ + PgStat_MsgHdr m_hdr; + Oid m_databaseid; + PgStat_Counter m_count; + PgStat_Counter m_session_time; + PgStat_Counter m_active_time; + PgStat_Counter m_idle_in_xact_time; + PgStat_Counter m_aborted; +} PgStat_MsgConn; + /* ---------- * PgStat_Msg Union over all possible messages. @@ -648,6 +664,7 @@ typedef union PgStat_Msg PgStat_MsgTempFile msg_tempfile; PgStat_MsgChecksumFailure msg_checksumfailure; PgStat_MsgReplSlot msg_replslot; + PgStat_MsgConn msg_conn; } PgStat_Msg; @@ -690,6 +707,11 @@ typedef struct PgStat_StatDBEntry TimestampTz last_checksum_failure; PgStat_Counter n_block_read_time; /* times in microseconds */ PgStat_Counter n_block_write_time; + PgStat_Counter n_connections; + PgStat_Counter n_session_time; + PgStat_Counter n_active_time; + PgStat_Counter n_idle_in_xact_time; + PgStat_Counter n_aborted; TimestampTz stat_reset_timestamp; TimestampTz stats_timestamp; /* time of db stats file update */ @@ -1341,6 +1363,11 @@ extern PgStat_MsgWal WalStats; extern PgStat_Counter pgStatBlockReadTime; extern PgStat_Counter pgStatBlockWriteTime; +/* + * Updated in PostgresMain upon disconnect. + */ +extern bool pgStatSessionDisconnected; + /* ---------- * Functions called from postmaster * ---------- diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index cf2a9b4408..a8b9539e17 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -1848,6 +1848,11 @@ pg_stat_database| SELECT d.oid AS datid, pg_stat_get_db_checksum_last_failure(d.oid) AS checksum_last_failure, pg_stat_get_db_blk_read_time(d.oid) AS blk_read_time, pg_stat_get_db_blk_write_time(d.oid) AS blk_write_time, + pg_stat_get_db_session_time(d.oid) AS session_time, + pg_stat_get_db_active_time(d.oid) AS active_time, + pg_stat_get_db_idle_in_transaction_time(d.oid) AS idle_in_transaction_time, + pg_stat_get_db_connections(d.oid) AS connections, + pg_stat_get_db_aborted_sessions(d.oid) AS aborted_sessions, pg_stat_get_db_stat_reset_time(d.oid) AS stats_reset FROM ( SELECT 0 AS oid, NULL::name AS datname -- 2.26.2