10.08.2024, 16:40, "Kirill Reshke" <reshkekir...@gmail.com>:

On Thu, 4 Jul 2024 at 21:46, Sergey Solovev
<sergey.solov...@tantorlabs.ru> wrote:


 Hi.

 We encountered a problem with excessive logging when transaction is
 sampled.
 When it is getting sampled every statement is logged, event SELECT. This
 can
 lead to performance degradation and log polluting.
 I have added new setting to filter statements when transaction is
 sampled - log_transaction.

 Overview
 ===================
 This parameter works very similar to log_statement, but it filters out
 statements only
 in sampled transaction. It has same values as log_statement, access
 rights (only superuser),
 setup in postgresql.conf or by superuser with SET.
 But by default it has value "all" in compliance with existing behaviour
 (to log every statement).

 Example usage
 ==================
 Log every DDL, but only subset of other statements.

 postgresql.conf
  > log_transaction = ddl
  > log_statement = all
  > log_transaction_sample_rate = 1
  > log_statement_sample_rate = 0.3

 backend:
  > BEGIN;
  > CREATE TABLE t1(value text);
  > INSERT INTO t1(value) VALUES ('hello'), ('world');
  > SELECT * FROM t1;
  > DROP TABLE t1;
  > COMMIT;

 logfile:
  > LOG: duration: 6.465 ms statement: create table t1(value text);
  > LOG: statement: insert into t1(value) values ('hello'), ('world');
  > LOG: duration: 6.457 ms statement: drop table t1;

 As you can see CREATE and DROP were logged with duration (because it is
 DDL), but
 only INSERT was logged (without duration) - not SELECT.

 Testing
 ===================
 All tests are passed - default configuration is compatible with existing
 behaviour.
 Honestly, I could not find any TAP/regress tests that would test logging
 - some
 tests only use logs to ensure test results.
 I did not find suitable place for such tests, so no tests provided

 Implementation details
 ===================
 I have modified signature of check_log_duration function - this accepts
 List of
 statements that were executed. This is to decide whether we should log
 current statement if transaction is sampled.
 But this list can be empty, in that case we decide to log. NIL is passed
 only
 in fast path, PARSE and BIND functions - by default these are logged
 if transaction is sampled, so we are compliant with existing behaviour
 again.
 In first implementation version, there was boolean flag (instead of
 List), but
 it was replaced by List to defer determining (function call) whether it
 is worth logging.

 -------------
 Best regards, Sergey Solovev


Hi!

As I understand, the need of this GUC variable comes from fact, that
is the transaction is sampled for logging, all statements within this
tx are logged and this is not configurable now, correct?
Well, if this is the case, why should we add a new GUC? Maybe we
should use `log_statement` in this case too, so there is a bug, that
log_statement is honored not during tx sampling?


Also, tests are failing[1]


[1] https://cirrus-ci.com/task/5645711230894080
 

--
Best regards,
Kirill Reshke
Hi, Kirill!
 
Thanks for review.
Also, tests are failing[1]
That is my mistake: I have tested patch on REL_17BETA2, but set 18 version, and
it seems that I run tests without proper ./configure flags. I reproduced error and fixed
it in new patch.
As I understand, the need of this GUC variable comes from fact, that
is the transaction is sampled for logging, all statements within this
tx are logged and this is not configurable now, correct?
Yes. That's the point - all or nothing.
Well, if this is the case, why should we add a new GUC? Maybe we
should use `log_statement` in this case too, so there is a bug, that
log_statement is honored not during tx sampling?
That can be good solution, but I proceed from possibility of flexible logging setup.
Shown example (when all DDL statements were logged) is just one of such use cases.
Another example is security policy: when we must log every data modification -
set `log_transaction = mod`.
From b5e779771e8a7582951aff6f43a716e9e5975884 Mon Sep 17 00:00:00 2001
From: "Sergey Solovev" <sergey.solov...@tantorlabs.ru>
Date: Thu, 4 Jul 2024 17:02:13 +0300
Subject: [PATCH] Add log_transaction configuration setting

This setting has same behaviour as log_statement - it
filter out statements that should not be logged. But
works with trasactions, that were sampled using
log_transaction_sample_rate setting.
Such feature might be helpful to avoid excessive log
messages count. At least, filter out SELECTs.
---
 doc/src/sgml/config.sgml            | 23 +++++++++++++
 src/backend/tcop/fastpath.c         |  2 +-
 src/backend/tcop/postgres.c         | 51 +++++++++++++++++++++++++----
 src/backend/utils/misc/guc_tables.c | 11 +++++++
 src/include/tcop/tcopprot.h         |  3 +-
 5 files changed, 81 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 17d84bd321..41939b4e3b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7049,6 +7049,29 @@ local0.*    /var/log/postgresql
        </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-log-transaction" xreflabel="log_transaction">
+      <term><varname>log_transaction</varname> (<type>enum</type>)
+      <indexterm>
+       <primary><varname>log_transaction</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Controls which SQL statements in sampled transaction are logged. Transaction is
+        getting sampled according to <xref linkend="guc-log-transaction-sample-rate"/> setting.
+        Valid values are same as for <xref linkend="guc-log-statement"/>: <literal>none</literal> (off),
+        <literal>ddl</literal>, <literal>mod</literal>, and <literal>all</literal> (all statements).
+        Behaviour also similar to <xref linkend="guc-log-statement"/>.
+       </para>
+
+       <para>
+        The default is <literal>all</literal>.
+        Only superusers and users with the appropriate <literal>SET</literal>
+        privilege can change this setting.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-log-startup-progress-interval" xreflabel="log_startup_progress_interval">
       <term><varname>log_startup_progress_interval</varname> (<type>integer</type>)
       <indexterm>
diff --git a/src/backend/tcop/fastpath.c b/src/backend/tcop/fastpath.c
index deac71335c..3293f75400 100644
--- a/src/backend/tcop/fastpath.c
+++ b/src/backend/tcop/fastpath.c
@@ -306,7 +306,7 @@ HandleFunctionRequest(StringInfo msgBuf)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, was_logged))
+	switch (check_log_duration(msec_str, was_logged, NIL))
 	{
 		case 1:
 			ereport(LOG,
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index e39c6804a7..7f264fa07b 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -94,6 +94,8 @@ bool		Log_disconnections = false;
 
 int			log_statement = LOGSTMT_NONE;
 
+int			log_transaction = LOGSTMT_ALL;
+
 /* GUC variable for maximum stack depth (measured in kilobytes) */
 int			max_stack_depth = 100;
 
@@ -180,6 +182,7 @@ static int	SocketBackend(StringInfo inBuf);
 static int	ReadCommand(StringInfo inBuf);
 static void forbidden_in_wal_sender(char firstchar);
 static bool check_log_statement(List *stmt_list);
+static bool	check_log_transaction_statement(List *stmt_list);
 static int	errdetail_execute(List *raw_parsetree_list);
 static int	errdetail_params(ParamListInfo params);
 static int	errdetail_abort(void);
@@ -1358,7 +1361,7 @@ exec_simple_query(const char *query_string)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, was_logged))
+	switch (check_log_duration(msec_str, was_logged, parsetree_list))
 	{
 		case 1:
 			ereport(LOG,
@@ -1594,7 +1597,7 @@ exec_parse_message(const char *query_string,	/* string to execute */
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, false))
+	switch (check_log_duration(msec_str, false, NIL))
 	{
 		case 1:
 			ereport(LOG,
@@ -2048,7 +2051,7 @@ exec_bind_message(StringInfo input_message)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, false))
+	switch (check_log_duration(msec_str, false, NIL))
 	{
 		case 1:
 			ereport(LOG,
@@ -2298,7 +2301,7 @@ exec_execute_message(const char *portal_name, long max_rows)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, was_logged))
+	switch (check_log_duration(msec_str, was_logged, portal->stmts))
 	{
 		case 1:
 			ereport(LOG,
@@ -2358,6 +2361,27 @@ check_log_statement(List *stmt_list)
 	return false;
 }
 
+static bool
+check_log_transaction_statement(List *stmt_list)
+{
+	ListCell *stmt_item;
+	if (log_transaction == LOGSTMT_NONE)
+		return false;
+	if (log_transaction == LOGSTMT_ALL)
+		return true;
+
+	/* Else we have to inspect the statement(s) to see whether to log */
+	foreach (stmt_item, stmt_list)
+	{
+		Node *stmt = (Node *)lfirst(stmt_item);
+
+		if (GetCommandLogLevel(stmt) <= log_transaction)
+			return true;
+	}
+
+	return false;
+}
+
 /*
  * check_log_duration
  *		Determine whether current command's duration should be logged
@@ -2374,9 +2398,13 @@ check_log_statement(List *stmt_list)
  *
  * was_logged should be true if caller already logged query details (this
  * essentially prevents 2 from being returned).
+ *
+ * stmts can be either raw grammar output or a list of planned
+ * statements that were executed during current transaction or NIL if
+ * tag could not be determined.
  */
 int
-check_log_duration(char *msec_str, bool was_logged)
+check_log_duration(char *msec_str, bool was_logged, List *stmts)
 {
 	if (log_duration || log_min_duration_sample >= 0 ||
 		log_min_duration_statement >= 0 || xact_is_sampled)
@@ -2418,11 +2446,20 @@ check_log_duration(char *msec_str, bool was_logged)
 				(log_statement_sample_rate == 1 ||
 				 pg_prng_double(&pg_global_prng_state) <= log_statement_sample_rate);
 
-		if (exceeded_duration || in_sample || log_duration || xact_is_sampled)
+		/*
+		 * Check than current statement should be logged as transaction sampled statement according
+		 * to log_transaction paramenter. Some statements' tags could not be determined (like PARSE or BIND),
+		 * for this kinds of statements NIL passed - this is indicator to log it without paramenter check
+		 * for compatibility with old code.
+		 */
+		if (!in_sample)
+			in_sample = (stmts == NIL || check_log_transaction_statement(stmts)) && xact_is_sampled;
+
+		if (exceeded_duration || in_sample || log_duration)
 		{
 			snprintf(msec_str, 32, "%ld.%03d",
 					 secs * 1000 + msecs, usecs % 1000);
-			if ((exceeded_duration || in_sample || xact_is_sampled) && !was_logged)
+			if ((exceeded_duration || in_sample) && !was_logged)
 				return 2;
 			else
 				return 1;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d28b0bcb40..a6f80550cb 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -4864,6 +4864,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"log_transaction", PGC_SUSET, LOGGING_WHAT,
+			gettext_noop("Sets the type of statements logged when transaction is sampled."),
+			NULL
+		},
+		&log_transaction,
+		LOGSTMT_ALL, log_statement_options,
+		NULL, NULL, NULL
+	},
+
+
 	{
 		{"syslog_facility", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Sets the syslog \"facility\" to be used when syslog enabled."),
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 643ce9cffa..a1a4474b1c 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -42,6 +42,7 @@ typedef enum
 
 extern PGDLLIMPORT bool Log_disconnections;
 extern PGDLLIMPORT int log_statement;
+extern PGDLLIMPORT int log_transaction;
 
 /* Flags for restrict_nonsystem_relation_kind value */
 #define RESTRICT_RELKIND_VIEW			0x01
@@ -82,7 +83,7 @@ extern void PostgresMain(const char *dbname,
 extern long get_stack_depth_rlimit(void);
 extern void ResetUsage(void);
 extern void ShowUsage(const char *title);
-extern int	check_log_duration(char *msec_str, bool was_logged);
+extern int	check_log_duration(char *msec_str, bool was_logged, List *stmts);
 extern void set_debug_options(int debug_flag,
 							  GucContext context, GucSource source);
 extern bool set_plan_disabling_options(const char *arg,
-- 
2.34.1

Reply via email to