Hi,

> This is all true but note that in successful cases (where the table is
> published) all the work done by FilterByTable (accessing caches,
> transaction-related stuff) can add noticeable overhead as anyway we do
> that later in pgoutput_change(). I think I gave the same comment
> earlier as well but didn't see any satisfactory answer or performance
> data for successful cases to back this proposal.
I did some benchmarking yesterday at [1] and found it adds 20% CPU time.
That led me to a basic idea which I think is worth sharing.

The "transaction-related stuff" comes from the syscache/systable access
(everything except the HistoricSnapshot setup). The syscache is required
in the following situations:

1. relfilenode (from WAL) -> relid.
2. relid -> namespaceid (to check if the relid is a toast relation).
3. if it is a toast relation, get its original relid.
4. access the data from pg_publication_tables.
5. see if the relid is a partition; if yes, we may need its root
   relation.

Actually we already have a RelationSyncCache for #4, and it *only* needs
to access the syscache when replicate_valid is false. I think that case
should be rare, but the caller doesn't know that, so the caller must
prepare the transaction stuff in advance even though in most cases it is
not used. So I think we can get an optimization here, and the attached
patch does that:

    Author: yizhi.fzh <yizhi....@alibaba-inc.com>
    Date:   Wed Feb 21 18:40:03 2024 +0800

        Make get_rel_sync_entry less dependent on transaction state.

        get_rel_sync_entry needs a transaction only when a
        replicate_valid = false entry is found, which should be a rare
        case. However the caller can't know whether an entry is valid,
        so it has to prepare the transaction state before calling this
        function. Such preparation is expensive.

        This patch makes get_rel_sync_entry manage the transaction
        state itself, and only when necessary, so the callers don't
        need to prepare it blindly.

Then comes #1. Actually we have RelfilenumberMapHash as a cache; when
that cache is hit (suppose this is the usual case), no transaction stuff
is needed. I have two ideas then:

1. Optimize the cache-hit situation, like what we just did for
   get_rel_sync_entry, for all 5 kinds of data, and only pay the cost in
   the cache-miss case. For the data in #2, #3 and #5, all the keys are
   relid, so I think a single HTAB should be OK.

2. Add the content for #1, #2, #3 and #5 to the WAL when wal_level is
   set to logical.
In either case, the changes for get_rel_sync_entry should be needed.

> Note, users can configure to stream in-progress transactions, in which
> case they shouldn't see such a big problem.

People would see the changes spilled to disk, but the CPU cost for the
reorder buffer still has to be paid, I think.

[1] https://www.postgresql.org/message-id/87o7cadqj3.fsf%40163.com

-- 
Best Regards
Andy Fan
>From 90b8df330df049bd1d7e881dc6e9b108c17b0924 Mon Sep 17 00:00:00 2001
From: "yizhi.fzh" <yizhi....@alibaba-inc.com>
Date: Wed, 21 Feb 2024 18:40:03 +0800
Subject: [PATCH v1 1/1] Make get_rel_sync_entry less dependent on transaction
 state.

get_rel_sync_entry needs a transaction only when a replicate_valid =
false entry is found, which should be a rare case. However the caller
can't know whether an entry is valid, so it has to prepare the
transaction state before calling this function. Such preparation is
expensive.

This patch makes get_rel_sync_entry manage the transaction state
itself, and only when necessary, so the callers don't need to prepare
it blindly.
---
 src/backend/replication/pgoutput/pgoutput.c | 60 ++++++++++++++++++++-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 998f92d671..25e55590a2 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
 #include "postgres.h"
 
 #include "access/tupconvert.h"
+#include "access/relation.h"
 #include "catalog/partition.h"
 #include "catalog/pg_publication.h"
 #include "catalog/pg_publication_rel.h"
@@ -214,6 +215,11 @@ static void init_rel_sync_cache(MemoryContext cachectx);
 static void cleanup_rel_sync_cache(TransactionId xid, bool is_commit);
 static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data,
 											 Relation relation);
+static RelationSyncEntry *get_rel_sync_entry_by_relid(PGOutputData *data,
+													  Oid relid);
+static RelationSyncEntry *get_rel_sync_entry_internal(PGOutputData *data,
+													  Relation relation,
+													  Oid relid);
 static void rel_sync_cache_relation_cb(Datum arg, Oid relid);
 static void rel_sync_cache_publication_cb(Datum arg, int cacheid,
 										  uint32 hashvalue);
@@ -1962,11 +1968,29 @@ set_schema_sent_in_streamed_txn(RelationSyncEntry *entry, TransactionId xid)
  */
 static RelationSyncEntry *
 get_rel_sync_entry(PGOutputData *data, Relation relation)
+{
+	return get_rel_sync_entry_internal(data, relation, InvalidOid);
+}
+
+static RelationSyncEntry *
+__attribute__ ((unused))
+get_rel_sync_entry_by_relid(PGOutputData *data, Oid relid)
+{
+	return get_rel_sync_entry_internal(data, NULL, relid);
+}
+
+static RelationSyncEntry *
+get_rel_sync_entry_internal(PGOutputData *data, Relation relation, Oid oid)
 {
 	RelationSyncEntry *entry;
 	bool		found;
 	MemoryContext oldctx;
-	Oid			relid = RelationGetRelid(relation);
+	Oid			relid = OidIsValid(oid) ? oid : RelationGetRelid(relation);
+	bool		started_xact = false,
+				using_subtxn;
+
+	/* either oid or relation is provided. */
+	Assert(OidIsValid(oid) || RelationIsValid(relation));
+	Assert(!(OidIsValid(oid) && RelationIsValid(relation)));
 
 	Assert(RelationSyncCache != NULL);
 
@@ -1993,6 +2017,23 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
 		entry->attrmap = NULL;
 	}
 
+	if (!entry->replicate_valid && !IsTransactionOrTransactionBlock())
+	{
+		/*
+		 * Validating the entry needs to access the syscache, which must be
+		 * done in a transaction state; if that's not ready, start one.
+		 */
+		using_subtxn = IsTransactionOrTransactionBlock();
+
+		if (using_subtxn)
+			BeginInternalSubTransaction(__func__);
+		else
+			StartTransactionCommand();
+
+		started_xact = true;
+	}
+
 	/* Validate the entry */
 	if (!entry->replicate_valid)
 	{
@@ -2198,9 +2239,19 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
 		if (entry->pubactions.pubinsert || entry->pubactions.pubupdate ||
 			entry->pubactions.pubdelete)
 		{
+			bool		rel_opened = false;
+
+			if (!RelationIsValid(relation))
+			{
+				relation = relation_open(oid, AccessShareLock);
+				rel_opened = true;
+			}
+
 			/* Initialize the tuple slot and map */
 			init_tuple_slot(data, relation, entry);
 
+			if (rel_opened)
+				relation_close(relation, AccessShareLock);
+
 			/* Initialize the row filter */
 			pgoutput_row_filter_init(data, rel_publications, entry);
 
@@ -2215,6 +2266,13 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
 		entry->replicate_valid = true;
 	}
 
+	if (started_xact)
+	{
+		AbortCurrentTransaction();
+		if (using_subtxn)
+			RollbackAndReleaseCurrentSubTransaction();
+	}
+
 	return entry;
 }
 
-- 
2.34.1