Thanks Amit for the review comments.

On Mon, Oct 5, 2020 at 9:39 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> Few comments:
> ==============
> 1.
> +	/* Report error with names of the missing localrel column(s). */
> +	if (!bms_is_empty(missingatts))
> +	{
> +		StringInfoData missingattsbuf;
> +		int missingattcnt = 0;
> +		remoterel->nspname,
> +		remoterel->relname,
> +		missingattsbuf.data)));
> +	}
>
> I think it is better to move the above code in a separate function
> (say logicalrep_report_missing_attrs or something like that).
>

Added a new function logicalrep_report_missing_attrs().

>
> 2. I think we always need to call bms_free(missingatts) because it is
> possible that there is no missing attribute and in that case, we won't
> free the memory allocated in bms_add_range.
>

Done. Yes, we palloc memory for the missingatts bitmap irrespective of
whether any attributes are missing. Added bms_free() outside of
if (!bms_is_empty(missingatts)) as well. I also kept bms_free() before
ereport(ERROR, ...) to free the bitmap before throwing the error. In any
case, only one bms_free() would get hit.

    if (!bms_is_empty(missingatts))
    {
        StringInfoData missingattsbuf;
        int missingattcnt = 0;

        bms_free(missingatts);
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg_plural("logical replication target relation \"%s.%s\" is missing replicated column: %s",
                               "logical replication target relation \"%s.%s\" is missing replicated columns: %s",
                               missingattcnt,
                               remoterel->nspname,
                               remoterel->relname,
                               missingattsbuf.data)));
    }

    bms_free(missingatts);

>
> 3. The patch doesn't seem to be freeing the memory allocated for missingattsbuf.
>

I don't think we need to do that. We pass missingattsbuf.data to
ereport, and it is safe not to free missingattsbuf: we never reach the
code after ereport(ERROR, ...), because the table sync worker goes away
after throwing the missing-attributes error (see
if (sigsetjmp(local_sigjmp_buf, 1) != 0) in StartBackgroundWorker,
followed by proc_exit(1)). A new table sync bg worker is spawned each
time.

2020-10-06 10:18:27.063 IST [1599963] ERROR: logical replication target relation "public.t1" is missing replicated column: "@C1"
2020-10-06 10:18:47.179 IST [1600134] ERROR: logical replication target relation "public.t1" is missing replicated column: "@C1"
2020-10-06 10:18:57.234 IST [1600214] ERROR: logical replication target relation "public.t1" is missing replicated column: "@C1"
2020-10-06 10:19:27.415 IST [1600458] ERROR: logical replication target relation "public.t1" is missing replicated columns: "%b1", "@C1"
2020-10-06 10:19:42.506 IST [1600588] ERROR: logical replication target relation "public.t1" is missing replicated columns: "%b1", "@C1"
2020-10-06 10:19:52.565 IST [1600669] ERROR: logical replication target relation "public.t1" is missing replicated columns: "%b1", "@C1"

>
> 4.
> ereport(ERROR,
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> - errmsg("logical replication target relation \"%s.%s\" is missing "
>
> From the second line onwards, the message lines are not aligned in
> errmsg_plural.
>

Done.

>
> 5. Also, in the above message, keep the error string in a single line.
> For ex. see one of the existing messages:
> errmsg_plural("WAL segment size must be a power of two between 1 MB
> and 1 GB, but the control file specifies %d byte", .. I think it will
> be easy to read that way. I know this is not exactly related to your
> patch but improving it while changing this message seems fine.
>

Done.

Attaching the v7 patch; please consider it for further review.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

From c050262f5ca70712f324129c32ccd4527657b2c5 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddy@enterprisedb.com>
Date: Tue, 6 Oct 2020 10:49:25 +0530
Subject: [PATCH v7] Detail message with names of missing columns in logical replication

In logical replication, when a subscriber is missing some columns, it
currently emits an error message that says "some" columns are missing
(see logicalrep_rel_open()), but it doesn't specify what the missing
column names are. The comment there also says that finding the missing
column names is a todo item (/* TODO, detail message with names of
missing columns */).

This patch finds the missing columns on the subscriber relation using
the publisher relation columns and shows them in the error message,
which makes the error more informative to the user.
---
 src/backend/replication/logical/relation.c | 56 ++++++++++++++++++----
 1 file changed, 46 insertions(+), 10 deletions(-)

diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 2bb8e7d57b..123e49d953 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -228,6 +228,47 @@ logicalrep_rel_att_by_name(LogicalRepRelation *remoterel, const char *attname)
 	return -1;
 }
 
+/*
+ * Report error with names of the missing local relation column(s) if any,
+ * otherwise return.
+ */
+static void
+logicalrep_report_missing_attrs(LogicalRepRelation *remoterel,
+								Bitmapset *missingatts)
+{
+	int			i;
+
+	/* Report error with names of the missing localrel column(s). */
+	if (!bms_is_empty(missingatts))
+	{
+		StringInfoData missingattsbuf;
+		int			missingattcnt = 0;
+
+		initStringInfo(&missingattsbuf);
+		while ((i = bms_first_member(missingatts)) >= 0)
+		{
+			missingattcnt++;
+			if (missingattcnt == 1)
+				appendStringInfo(&missingattsbuf, _("\"%s\""),
+								 remoterel->attnames[i]);
+			else
+				appendStringInfo(&missingattsbuf, _(", \"%s\""),
+								 remoterel->attnames[i]);
+		}
+
+		bms_free(missingatts);
+		ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg_plural("logical replication target relation \"%s.%s\" is missing replicated column: %s",
+							   "logical replication target relation \"%s.%s\" is missing replicated columns: %s",
+							   missingattcnt,
+							   remoterel->nspname,
+							   remoterel->relname,
+							   missingattsbuf.data)));
+	}
+
+	bms_free(missingatts);
+}
+
 /*
  * Open the local relation associated with the remote one.
  *
@@ -286,11 +327,11 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 	if (!entry->localrelvalid)
 	{
 		Oid			relid;
-		int			found;
 		Bitmapset  *idkey;
 		TupleDesc	desc;
 		MemoryContext oldctx;
 		int			i;
+		Bitmapset  *missingatts;
 
 		/* Try to find and lock the relation by name. */
 		relid = RangeVarGetRelid(makeRangeVar(remoterel->nspname,
@@ -318,7 +359,7 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 		entry->attrmap = make_attrmap(desc->natts);
 		MemoryContextSwitchTo(oldctx);
 
-		found = 0;
+		missingatts = bms_add_range(NULL, 0, remoterel->natts - 1);
 		for (i = 0; i < desc->natts; i++)
 		{
 			int			attnum;
@@ -335,16 +376,11 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 
 			entry->attrmap->attnums[i] = attnum;
 			if (attnum >= 0)
-				found++;
+				missingatts = bms_del_member(missingatts, attnum);
 		}
 
-		/* TODO, detail message with names of missing columns */
-		if (found < remoterel->natts)
-			ereport(ERROR,
-					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-					 errmsg("logical replication target relation \"%s.%s\" is missing "
-							"some replicated columns",
-							remoterel->nspname, remoterel->relname)));
+		/* Report error with names of the missing localrel column(s) if any. */
+		logicalrep_report_missing_attrs(remoterel, missingatts);
 
 		/*
 		 * Check that replica identity matches. We allow for stricter replica
-- 
2.25.1
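
For anyone who wants to try the bookkeeping from the patch outside the PostgreSQL tree, here is a minimal standalone sketch. It is not the patch's code. It assumes at most 64 remote columns so that a uint64_t can stand in for Bitmapset, and it writes into a caller-supplied buffer instead of a StringInfo. The names ColumnSet, columnset_all, columnset_del and format_missing are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for Bitmapset: one bit per remote column (max 64). */
typedef uint64_t ColumnSet;

/* Mark every remote column in [0, natts) as missing, like bms_add_range. */
static ColumnSet
columnset_all(int natts)
{
	if (natts <= 0)
		return 0;
	return (natts >= 64) ? UINT64_MAX : ((UINT64_C(1) << natts) - 1);
}

/* Clear the bit of a remote column that was found locally, like bms_del_member. */
static ColumnSet
columnset_del(ColumnSet set, int attnum)
{
	assert(attnum >= 0 && attnum < 64);
	return set & ~(UINT64_C(1) << attnum);
}

/*
 * Write the quoted, comma-separated names of the columns still in the set
 * into buf and return how many names were written. Stops early if buf is
 * too small, leaving only the names that fit completely.
 */
static int
format_missing(ColumnSet missing, const char *const *attnames, int natts,
			   char *buf, size_t buflen)
{
	int			count = 0;
	size_t		used = 0;

	if (buflen > 0)
		buf[0] = '\0';

	for (int i = 0; i < natts && i < 64; i++)
	{
		int			n;

		if (!(missing & (UINT64_C(1) << i)))
			continue;			/* column i exists locally */

		n = snprintf(buf + used, buflen - used, "%s\"%s\"",
					 count > 0 ? ", " : "", attnames[i]);
		if (n < 0 || (size_t) n >= buflen - used)
		{
			/* Did not fit: drop the partial name and stop. */
			if (buflen > 0)
				buf[used] = '\0';
			break;
		}
		used += (size_t) n;
		count++;
	}
	return count;
}
```

The flow mirrors logicalrep_rel_open() in the patch: start with all remote columns in the set, delete each one that maps to a local column, and report whatever is left. The returned count plays the role of missingattcnt, which the patch passes to errmsg_plural() to pick the singular or plural message.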