On Wed, Dec 20, 2023 at 12:02 PM Peter Smith <smithpb2...@gmail.com> wrote: > > Here are some comments for the patch v50-0002.
Thank You for the feedback. I have addressed these in v52. > ====== > GENERAL > > (I made a short study of all the ereports in this patch -- here are > some findings) > > ~~~ > > 0.1 Don't need the parentheses. > > Checking all the ereports I see that half of them have the redundant > parentheses and half of them do not; You might as well make them all > use the new style where the extra parentheses are not needed. > > e.g. > + ereport(LOG, > + (errmsg("skipping slot synchronization"), > + errdetail("enable_syncslot is disabled."))); > > e.g. > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("cannot drop replication slot \"%s\"", name), > + errdetail("This slot is being synced from the primary server."))); > > and many more like this. Search for all the ereports. > > ~~~ > > 0.2 > + ereport(LOG, > + (errmsg("dropped replication slot \"%s\" of dbid %d as it " > + "was not sync-ready", NameStr(s->data.name), > + s->data.database))); > > I felt maybe that could be: > > errmsg("dropped replication slot \"%s\" of dbid %d", ... > errdetail("It was not sync-ready.") > > (now this shares the same errmsg with another ereport) > > ~~~ > > 0.3. > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("skipping sync of slot \"%s\" as it is a user created" > + " slot", remote_slot->name), > + errdetail("This slot has failover enabled on the primary and" > + " thus is sync candidate but user created slot with" > + " the same name already exists on the standby."))); > > This seemed too wordy. Can't it be shortened (maybe like below) > without losing any of the vital information? > > errmsg("skipping sync of slot \"%s\"", ...) > errdetail("A user-created slot with the same name already exists on > the standby.") I have modified it a little bit more. Please see now. I wanted to add the info that slot-sync worker is exiting instead of skipping a slot and that the concerned slot is a failover slot on primary. These were the other comments around the same. > ~~~ > > 0.4 > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("exiting from slot synchronization due to bad configuration"), > + /* translator: second %s is a GUC variable name */ > + errdetail("The primary slot \"%s\" specified by %s is not valid.", > + PrimarySlotName, "primary_slot_name"))); > > /The primary slot/The primary server slot/ > > ~~~ > > 0.5 > + ereport(ERROR, > + (errmsg("could not fetch primary_slot_name \"%s\" info from the " > + "primary: %s", PrimarySlotName, res->err))); > > /primary:/primary server:/ > > ~~~ > > 0.6 > The continuations for long lines are inconsistent. Sometimes there are > trailing spaces and sometimes there are leading spaces. And sometimes > there are both at the same time which would cause double-spacing in > the message! Please make them all the same. I think using leading > spaces is easier but YMMV. > > e.g. > + elog(ERROR, > + "not synchronizing local slot \"%s\" LSN(%X/%X)" > + " to remote slot's LSN(%X/%X) as synchronization " > + " would move it backwards", remote_slot->name, > + LSN_FORMAT_ARGS(slot->data.restart_lsn), > + LSN_FORMAT_ARGS(remote_slot->restart_lsn)); > > ====== > src/backend/replication/logical/slotsync.c > > 1. check_primary_info > > + /* No need to check further, return that we are cascading standby */ > + if (remote_in_recovery) > + { > + *am_cascading_standby = true; > + ExecClearTuple(tupslot); > + walrcv_clear_result(res); > + CommitTransactionCommand(); > + return; > + } > + > + valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull)); > + Assert(!isnull); > + > + if (!valid) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("exiting from slot synchronization due to bad configuration"), > + /* translator: second %s is a GUC variable name */ > + errdetail("The primary slot \"%s\" specified by %s is not valid.", > + PrimarySlotName, "primary_slot_name"))); > + ExecClearTuple(tupslot); > + walrcv_clear_result(res); > + CommitTransactionCommand(); > +} > > Now that there is a common cleanup/return code this function be > reduced further like below: > > SUGGESTION > > if (remote_in_recovery) > { > /* No need to check further, return that we are cascading standby */ > *am_cascading_standby = true; > } > else > { > /* We are a normal standby. */ > > valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull)); > Assert(!isnull); > > if (!valid) > ... > } > > ExecClearTuple(tupslot); > walrcv_clear_result(res); > CommitTransactionCommand(); > } > > ~~~ > > 2. ReplSlotSyncWorkerMain > > + /* > + * One can promote the standby and we can no longer be a cascading > + * standby. So recheck here. > + */ > + if (am_cascading_standby) > + check_primary_info(wrconn, &am_cascading_standby); > > Minor rewording of that new comment. > > SUGGESTION > If the standby was promoted then what was previously a cascading > standby might no longer be one, so recheck each time. > > ====== > src/test/recovery/t/050_verify_slot_order.pl > > 3. > +################################################## > +# Test that a synchronized slot can not be decoded, altered and > dropped by the user > +################################################## > > /and dropped/or dropped/ > > ~~~ > > 4. > + > +($result, $stdout, $stderr) = $standby1->psql( > + 'postgres', > + qq[ALTER_REPLICATION_SLOT lsub1_slot (failover);], > + replication => 'database'); > +ok($stderr =~ /ERROR: cannot alter replication slot "lsub1_slot"/, > + "synced slot on standby cannot be altered"); > + > > Add a comment for this test part > > SUGGESTION > Attempting to alter a synced slot should result in an error > > ~~~ > > 5. > IMO it would be better if the tests were done in the same order > mentioned in the comment. So either change the tests or change the > comment. > > ====== > Kind Regards, > Peter Smith. > Fujitsu Australia