Thanks Amit for the review comments.

On Mon, Oct 5, 2020 at 9:39 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> Few comments:
> ==============
> 1.
> + /* Report error with names of the missing localrel column(s). */
> + if (!bms_is_empty(missingatts))
> + {
> + StringInfoData missingattsbuf;
> + int    missingattcnt = 0;
> + remoterel->nspname,
> + remoterel->relname,
> + missingattsbuf.data)));
> + }
>
> I think it is better to move the above code in a separate function
> (say logicalrep_report_missing_attrs or something like that).
>

Added a new function logicalrep_report_missing_attrs().

>
> 2. I think we always need to call bms_free(missingatts) because it is
> possible that there is no missing attribute and in that case, we won't
> free the memory allocated in bms_add_range.
>

Done. Yes we palloc memory for missingatts bitmap irrespective of missing
attributes. Added bms_free() out of if(!bms_is_empty(missingatts)) as well.
I also kept bms_free() before ereport(ERROR,..) to free up before throwing
the error. In anycase, only one bms_free() would get hit.

if (!bms_is_empty(missingatts))
{
StringInfoData missingattsbuf;
int   missingattcnt = 0;

bms_free(missingatts);
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg_plural("logical replication target relation \"%s.%s\" is missing
replicated column: %s",
 "logical replication target relation \"%s.%s\" is missing replicated
columns: %s",
 missingattcnt,
 remoterel->nspname,
 remoterel->relname,
 missingattsbuf.data)));
}
bms_free(missingatts);

>
> 3. The patch doesn't seem to be freeing the memory allocated for
missingattsbuf.
>

I don't think we need to do that. We are passing missingattsbuf.data to
ereport and we are safe without freeing up missingattsbuf(we don't reach
the code after ereprot(ERROR,...)as the table sync worker anyways goes away
after throwing missing attributes error( if (sigsetjmp(local_sigjmp_buf, 1)
!= 0) in StartBackgroundWorker and then proc_exit(1)). Each time a new
table sync bg worker is spawned.

2020-10-06 10:18:27.063 IST [1599963] ERROR:  logical replication target
relation "public.t1" is missing replicated column: "@C1"
2020-10-06 10:18:47.179 IST [1600134] ERROR:  logical replication target
relation "public.t1" is missing replicated column: "@C1"
2020-10-06 10:18:57.234 IST [1600214] ERROR:  logical replication target
relation "public.t1" is missing replicated column: "@C1"

2020-10-06 10:19:27.415 IST [1600458] ERROR:  logical replication target
relation "public.t1" is missing replicated columns: "%b1", "@C1"
2020-10-06 10:19:42.506 IST [1600588] ERROR:  logical replication target
relation "public.t1" is missing replicated columns: "%b1", "@C1"
2020-10-06 10:19:52.565 IST [1600669] ERROR:  logical replication target
relation "public.t1" is missing replicated columns: "%b1", "@C1"

>
> 4.
>   ereport(ERROR,
>   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> - errmsg("logical replication target relation \"%s.%s\" is missing "
>
> From the second line onwards, the message lines are not aligned in
> errmsg_plural.
>

Done.

>
> 5. Also, in the above message, keep the error string in a single line.
> For ex. see one of the existing messages:
> errmsg_plural("WAL segment size must be a power of two between 1 MB
> and 1 GB, but the control file specifies %d byte", .. I think it will
> be easy to read that way. I know this is not exactly related to your
> patch but improving it while changing this message seems fine.
>

Done.

Attaching v7 patch please consider it for further review.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From c050262f5ca70712f324129c32ccd4527657b2c5 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddy@enterprisedb.com>
Date: Tue, 6 Oct 2020 10:49:25 +0530
Subject: [PATCH v7] Detail message with names of missing columns in logical
 replication

In logical replication when a subscriber is missing some columns,
it currently emits an error message that says "some" columns are
missing(see logicalrep_rel_open()), but it doesn't specify what
the missing column names are. The comment there also says that
finding the missing column names is a todo item(/* TODO, detail
message with names of missing columns */).

This patch finds the missing columns on the subscriber relation
using the publisher relation columns and show them in the error
message which makes error to be more informative to the user.
---
 src/backend/replication/logical/relation.c | 56 ++++++++++++++++++----
 1 file changed, 46 insertions(+), 10 deletions(-)

diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 2bb8e7d57b..123e49d953 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -228,6 +228,47 @@ logicalrep_rel_att_by_name(LogicalRepRelation *remoterel, const char *attname)
 	return -1;
 }
 
+/*
+ * Report error with names of the missing local relation column(s) if any,
+ * otherwise return.
+ */
+static void
+logicalrep_report_missing_attrs(LogicalRepRelation *remoterel,
+								Bitmapset *missingatts)
+{
+	int i;
+
+	/* Report error with names of the missing localrel column(s). */
+	if (!bms_is_empty(missingatts))
+	{
+		StringInfoData missingattsbuf;
+		int			   missingattcnt = 0;
+
+		initStringInfo(&missingattsbuf);
+		while ((i = bms_first_member(missingatts)) >= 0)
+		{
+			missingattcnt++;
+			if (missingattcnt == 1)
+				appendStringInfo(&missingattsbuf, _("\"%s\""),
+									remoterel->attnames[i]);
+			else
+				appendStringInfo(&missingattsbuf, _(", \"%s\""),
+									remoterel->attnames[i]);
+		}
+
+		bms_free(missingatts);
+		ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						errmsg_plural("logical replication target relation \"%s.%s\" is missing replicated column: %s",
+									  "logical replication target relation \"%s.%s\" is missing replicated columns: %s",
+									  missingattcnt,
+									  remoterel->nspname,
+									  remoterel->relname,
+									  missingattsbuf.data)));
+	}
+
+	bms_free(missingatts);
+}
+
 /*
  * Open the local relation associated with the remote one.
  *
@@ -286,11 +327,11 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 	if (!entry->localrelvalid)
 	{
 		Oid			relid;
-		int			found;
 		Bitmapset  *idkey;
 		TupleDesc	desc;
 		MemoryContext oldctx;
 		int			i;
+		Bitmapset   *missingatts;
 
 		/* Try to find and lock the relation by name. */
 		relid = RangeVarGetRelid(makeRangeVar(remoterel->nspname,
@@ -318,7 +359,7 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 		entry->attrmap = make_attrmap(desc->natts);
 		MemoryContextSwitchTo(oldctx);
 
-		found = 0;
+		missingatts = bms_add_range(NULL, 0, remoterel->natts - 1);
 		for (i = 0; i < desc->natts; i++)
 		{
 			int			attnum;
@@ -335,16 +376,11 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 
 			entry->attrmap->attnums[i] = attnum;
 			if (attnum >= 0)
-				found++;
+				missingatts = bms_del_member(missingatts, attnum);
 		}
 
-		/* TODO, detail message with names of missing columns */
-		if (found < remoterel->natts)
-			ereport(ERROR,
-					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-					 errmsg("logical replication target relation \"%s.%s\" is missing "
-							"some replicated columns",
-							remoterel->nspname, remoterel->relname)));
+		/* Report error with names of the missing localrel column(s) if any. */
+		logicalrep_report_missing_attrs(remoterel, missingatts);
 
 		/*
 		 * Check that replica identity matches. We allow for stricter replica
-- 
2.25.1

Reply via email to