On Mon, Jan 11, 2021 at 10:56 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> Agreed. Attached the updated patch.

Thanks for the updated patch. Looks like the comment crosses the 80
char limit, I have corrected it. And also changed the variable name
from remotetypeid to remotetypid, so that logicalrep_typmap_gettypname
will not cross the 80 char limit. And also added a commit message.
Attaching v3 patch, please have a look. Both make check and make
check-world passes.

> > I quickly searched in places where error callbacks are being used, I
> > think we need a similar kind of fix for conversion_error_callback in
> > postgres_fdw.c, because get_attname and get_rel_name are being used
> > which do catalogue lookups. ISTM that all the other error callbacks
> > are good in the sense that they are not doing sys catalogue lookups.
>
> Indeed. If we need to disallow the catalog lookup during executing
> error callbacks we might want to have an assertion checking that in
> SearchCatCacheInternal(), in addition to Assert(IsTransactionState()).

I tried to add(as attached in
v1-0001-Avoid-Sys-Cache-Lookups-in-Error-Callbacks.patch) the
Assert(!error_context_stack); in SearchCatCacheInternal, initdb itself
fails [1]. That means, we are doing a bunch of catalogue access from
error context callbacks. Given this, I'm not quite sure whether we can
have such an assertion in SearchCatCacheInternal.

Although unrelated to what we are discussing here -  when I looked at
SearchCatCacheInternal, I found that the function SearchCatCache has
SearchCatCacheInternal in the function comment, I think we should
correct it. Thoughts? If okay, I will post a separate patch for this
minor comment fix.

[1]
running bootstrap script ... TRAP:
FailedAssertion("error_context_stack", File: "catcache.c", Line: 1220,
PID: 310728)
/home/bharath/workspace/postgres/instnew/bin/postgres(ExceptionalCondition+0xd0)[0x56056984c8ba]
/home/bharath/workspace/postgres/instnew/bin/postgres(+0x76eb1a)[0x560569826b1a]
/home/bharath/workspace/postgres/instnew/bin/postgres(SearchCatCache1+0x3a)[0x5605698269af]
/home/bharath/workspace/postgres/instnew/bin/postgres(SearchSysCache1+0xc1)[0x5605698448b2]
/home/bharath/workspace/postgres/instnew/bin/postgres(get_typtype+0x1f)[0x56056982e389]
/home/bharath/workspace/postgres/instnew/bin/postgres(CheckAttributeType+0x29)[0x5605692bafe9]
/home/bharath/workspace/postgres/instnew/bin/postgres(CheckAttributeNamesTypes+0x2c9)[0x5605692bafac]
/home/bharath/workspace/postgres/instnew/bin/postgres(heap_create_with_catalog+0x11f)[0x5605692bc436]
/home/bharath/workspace/postgres/instnew/bin/postgres(boot_yyparse+0x7f0)[0x5605692a0d3f]
/home/bharath/workspace/postgres/instnew/bin/postgres(+0x1ecb36)[0x5605692a4b36]
/home/bharath/workspace/postgres/instnew/bin/postgres(AuxiliaryProcessMain+0x5e0)[0x5605692a4997]
/home/bharath/workspace/postgres/instnew/bin/postgres(main+0x268)[0x5605694c6bce]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fa2777ce0b3]
/home/bharath/workspace/postgres/instnew/bin/postgres(_start+0x2e)[0x5605691794de]

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From a806415a49e3fc8c1e71a102aefb0c10c08deb05 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddy@enterprisedb.com>
Date: Mon, 11 Jan 2021 12:26:12 +0530
Subject: [PATCH v3] fix slot store error callback

Avoid accessing system catalogues inside slot_store_error_callback
error context callback, because the the entire transaction might
have been broken at this point.

Since logicalrep_typmap_gettypname() could search the sys cache by
calling to format_type_be(), store both local and remote type names
to SlotErrCallbackArg so that we can just set the names in the error
callback without sys cache lookup.

Author: Masahiko Sawada
---
 src/backend/replication/logical/worker.c | 50 +++++++++++++-----------
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 1b1d70ed68..95f28c9acf 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -132,8 +132,9 @@ static dlist_head lsn_mapping = DLIST_STATIC_INIT(lsn_mapping);
 typedef struct SlotErrCallbackArg
 {
 	LogicalRepRelMapEntry *rel;
-	int			local_attnum;
 	int			remote_attnum;
+	char		*local_typename;
+	char		*remote_typename;
 } SlotErrCallbackArg;
 
 /*
@@ -430,30 +431,19 @@ static void
 slot_store_error_callback(void *arg)
 {
 	SlotErrCallbackArg *errarg = (SlotErrCallbackArg *) arg;
-	LogicalRepRelMapEntry *rel;
-	char	   *remotetypname;
-	Oid			remotetypoid,
-				localtypoid;
+	LogicalRepRelMapEntry *rel = errarg->rel;
 
 	/* Nothing to do if remote attribute number is not set */
 	if (errarg->remote_attnum < 0)
 		return;
 
-	rel = errarg->rel;
-	remotetypoid = rel->remoterel.atttyps[errarg->remote_attnum];
-
-	/* Fetch remote type name from the LogicalRepTypMap cache */
-	remotetypname = logicalrep_typmap_gettypname(remotetypoid);
-
-	/* Fetch local type OID from the local sys cache */
-	localtypoid = get_atttype(rel->localreloid, errarg->local_attnum + 1);
-
+	Assert(errarg->local_typename);
+	Assert(errarg->remote_typename);
 	errcontext("processing remote data for replication target relation \"%s.%s\" column \"%s\", "
 			   "remote type %s, local type %s",
 			   rel->remoterel.nspname, rel->remoterel.relname,
 			   rel->remoterel.attnames[errarg->remote_attnum],
-			   remotetypname,
-			   format_type_be(localtypoid));
+			   errarg->remote_typename, errarg->local_typename);
 }
 
 /*
@@ -474,8 +464,9 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
 
 	/* Push callback + info on the error context stack */
 	errarg.rel = rel;
-	errarg.local_attnum = -1;
 	errarg.remote_attnum = -1;
+	errarg.local_typename = NULL;
+	errarg.remote_typename = NULL;
 	errcallback.callback = slot_store_error_callback;
 	errcallback.arg = (void *) &errarg;
 	errcallback.previous = error_context_stack;
@@ -491,11 +482,17 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
 		if (!att->attisdropped && remoteattnum >= 0)
 		{
 			StringInfo	colvalue = &tupleData->colvalues[remoteattnum];
+			Oid remotetypid = rel->remoterel.atttyps[remoteattnum];
 
 			Assert(remoteattnum < tupleData->ncols);
 
-			errarg.local_attnum = i;
+			/*
+			 * Set the attribute number and both local and remote type names
+			 * from the local sys cache and LogicalRepTypMap respectively.
+			 */
 			errarg.remote_attnum = remoteattnum;
+			errarg.local_typename = format_type_be(att->atttypid);
+			errarg.remote_typename = logicalrep_typmap_gettypname(remotetypid);
 
 			if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
 			{
@@ -543,8 +540,9 @@ slot_store_data(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
 				slot->tts_isnull[i] = true;
 			}
 
-			errarg.local_attnum = -1;
 			errarg.remote_attnum = -1;
+			errarg.local_typename = NULL;
+			errarg.remote_typename = NULL;
 		}
 		else
 		{
@@ -600,8 +598,9 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
 
 	/* For error reporting, push callback + info on the error context stack */
 	errarg.rel = rel;
-	errarg.local_attnum = -1;
 	errarg.remote_attnum = -1;
+	errarg.local_typename = NULL;
+	errarg.remote_typename = NULL;
 	errcallback.callback = slot_store_error_callback;
 	errcallback.arg = (void *) &errarg;
 	errcallback.previous = error_context_stack;
@@ -622,9 +621,15 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
 		if (tupleData->colstatus[remoteattnum] != LOGICALREP_COLUMN_UNCHANGED)
 		{
 			StringInfo	colvalue = &tupleData->colvalues[remoteattnum];
+			Oid remotetypid = rel->remoterel.atttyps[remoteattnum];
 
-			errarg.local_attnum = i;
+			/*
+			 * Set the attribute number and both local and remote type names
+			 * from the local sys cache and LogicalRepTypMap respectively.
+			 */
 			errarg.remote_attnum = remoteattnum;
+			errarg.local_typename =	format_type_be(att->atttypid);
+			errarg.remote_typename = logicalrep_typmap_gettypname(remotetypid);
 
 			if (tupleData->colstatus[remoteattnum] == LOGICALREP_COLUMN_TEXT)
 			{
@@ -668,8 +673,9 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
 				slot->tts_isnull[i] = true;
 			}
 
-			errarg.local_attnum = -1;
 			errarg.remote_attnum = -1;
+			errarg.local_typename = NULL;
+			errarg.remote_typename = NULL;
 		}
 	}
 
-- 
2.25.1

From 80e4f9e15d02a4337287627d2b23af558def5cb7 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddy@enterprisedb.com>
Date: Mon, 11 Jan 2021 12:48:28 +0530
Subject: [PATCH v1] Avoid Sys Cache Lookups in Error Callbacks

When in SearchCatCacheInternal, Make sure we're not in an error
callback. It's unsafe to access the system cahe or catalogue
within an error callback because the xact might have been broken.
---
 src/backend/utils/cache/catcache.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index ba95755867..c58e1116ab 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1212,6 +1212,13 @@ SearchCatCacheInternal(CatCache *cache,
 	/* Make sure we're in an xact, even if this ends up being a cache hit */
 	Assert(IsTransactionState());
 
+	/*
+	 * Make sure we're not in an error callback. It's unsafe to access the
+	 * system cahe or catalogue within an error callback because the xact might
+	 * have been broken.
+	 */
+	Assert(!error_context_stack);
+
 	Assert(cache->cc_nkeys == nkeys);
 
 	/*
-- 
2.25.1

Reply via email to