Re: Skipping schema changes in publication

Peter Smith Sun, 20 Jul 2025 23:47:34 -0700

Hi Shlok.

Some review comments for patch v16-0003.


======
Commit message

1.
The column "prexcept" of system catalog "pg_publication_rel" is set to
"true" when publication is created with EXCEPT table or EXCEPT column
list. If column "prattrs" of system catalog "pg_publication_rel" is also
set or column "puballtables" of system catalog "pg_publication" is
"false", it indicates the column list is specified with EXCEPT clause
and columns in "prattrs" are excluded from being published.

~

Somehow, this seems to contain too much information, making it a bit
confusing. Can't you chop this down to something like below?

SUGESTION
When column "prexcept" of system catalog "pg_publication_rel" is set
to "true", and column "prattrs" of system catalog "pg_publication_rel"
is not NULL, that means the publication was created with "EXCEPT
(column-list)", and the columns in "prattrs" will be excluded from
being published.

======
doc/src/sgml/logical-replication.sgml

2.
    Generated columns can also be specified in a column list. This allows
    generated columns to be published, regardless of the publication parameter
    <link linkend="sql-createpublication-params-with-publish-generated-columns">
+   <literal>publish_generated_columns</literal></link>. Generated
columns can be
+   specified in a column list using the <literal>EXCEPT</literal> clause. This
+   excludes the specified generated columns from being published, regardless of
+   the <link 
linkend="sql-createpublication-params-with-publish-generated-columns">
+   <literal>publish_generated_columns</literal></link> setting. However, for
+   generated columns that are not listed in the <literal>EXCEPT</literal>
+   clause, whether they are published or not still depends on the value of
+   <link linkend="sql-createpublication-params-with-publish-generated-columns">
    <literal>publish_generated_columns</literal></link>. See
    <xref linkend="logical-replication-gencols"/> for details.
   </para>

~~

For this part:

"Generated columns can be specified in a column list using the
<literal>EXCEPT</literal> clause. This excludes the specified
generated columns from being published, regardless of..."

I think the whole paragraph already said "Generated columns can also
be specified in a column list", so you don't need to repeat it.
Instead, maybe say something like below.

SUGGESTION
Specifying generated columns in a column list using the
<literal>EXCEPT</literal> clause excludes those columns from being
published, regardless of...

~~~

3.
-                               Publication p1
-  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
-----------+------------+---------+---------+---------+-----------+----------
- postgres | f          | t       | t       | t       | t         | f
+                                        Publication p1
+ Owner  | All tables | Inserts | Updates | Deletes | Truncates |
Generated columns | Via root
+--------+------------+---------+---------+---------+-----------+-------------------+----------
+ ubuntu | f          | t       | t       | t       | t         | none
             | f
 Tables:
     "public.t1" (id, a, b, d)
+    "public.t2" EXCEPT (a, d)
 </programlisting></para>


I noticed the Owner changed from "postgres" to "ubuntu". Do you think
it is better to keep this as "postgres" for the example?

======
doc/src/sgml/ref/create_publication.sgml

4.
The tables added to a publication that publishes UPDATE and/or DELETE
operations must have REPLICA IDENTITY defined. Otherwise those
operations will be disallowed on those tables.

In order for UPDATE or DELETE operations to work, all the REPLICA
IDENTITY columns must be published. So, any column list must name all
REPLICA IDENTITY columns, and any EXCEPT column list must not name any
REPLICA IDENTITY columns.

A row filter expression (i.e., the WHERE clause) must contain only
columns that are covered by the REPLICA IDENTITY, in order for UPDATE
and DELETE operations to be published. For publication of INSERT
operations, any column may be used in the WHERE expression. The row
filter allows simple expressions that don't have user-defined
functions, user-defined operators, user-defined types, user-defined
collations, non-immutable built-in functions, or references to system
columns.

The generated columns that are part of the column list specified with
the EXCEPT clause are not published, regardless of the
publish_generated_columns option. However, generated columns that are
not part of the column list specified with the EXCEPT clause are
published according to the value of the publish_generated_columns
option. See Section 29.6 for details.

The generated columns that are part of REPLICA IDENTITY must be
published explicitly either by listing them in the column list or by
enabling the publish_generated_columns option, in order for UPDATE and
DELETE operations to be published.

~~

Notice all those 5 paragraphs (above) are talking about REPLICA
IDENTITY, except the 4th paragraph. Maybe the 4th paragraph should be
moved to last, to keep all the REPLICA IDENTITY stuff together.

======
src/backend/catalog/pg_publication.c

5. pub_form_cols_map

  * Returns a bitmap representing the columns of the specified table.
  *
  * Generated columns are included if include_gencols_type is
- * PUBLISH_GENCOLS_STORED.
+ * PUBLISH_GENCOLS_STORED. Columns that are in the exceptcols are excluded from
+ * the column list.
  */
 Bitmapset *
-pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type)
+pub_form_cols_map(Relation relation, PublishGencolsType include_gencols_type,
+   Bitmapset *except_cols)

Forgot to add the underscore in the function comment.

/exceptcols/except_cols/

~~~

6. pg_get_publication_tables

+
+ /*
+ * We fetch pubtuple if publication is not FOR ALL TABLES and not
+ * FOR TABLES IN SCHEMA. So if prexcept is true, it indicates that
+ * prattrs contains columns to be excluded for replication.
+ */
+ exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
+   Anum_pg_publication_rel_prexcept,
+   &isnull);
+
+ if (!isnull && DatumGetBool(exceptDatum) && !nulls[2])
+ except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);

But, you cannot have EXCEPT for null column list, so shouldn't the
!nulls[2] check be done to also guard the SysCacheGetAttr call?

======
src/backend/parser/gram.y

7.

Shlok wrote [1-reply #11]
The main reason I used a separate 'opt_except_column_list' is because
'opt_column_list' can also be NULL. But the column list specified with
EXCEPT not be NULL. So, 'opt_except_column_list' is defined such that
it cannot be null.

~

Yeah, but IMO that leads to excessive duplicated code. I think the
code can perhaps be a lot simpler if the grammar is written more like
the synopsis:

e.g. TABLE name opt_EXCEPT opt_column_list

where - opt_EXCEPT is null, and opt_column_list is null... means no col list
where - opt_EXCEPT is null, and opt_column_list is not null... means
normal col list
where - opt_EXCEPT is not null, and opt_column_list not null... means
EXCEPT col list
where - opt_EXCEPT is not null, and opt_column_list null... SYNTAX ERROR

So code it something like this (just adding opt_EXCEPT to the existing
productions)

%type <boolean> opt_ordinality opt_without_overlaps opt_EXCEPT
...
opt_EXCEPT:
EXCEPT { $$ = true; }
| /*EMPTY*/ { $$ = false; }
;
...
TABLE relation_expr opt_EXCEPT opt_column_list OptWhereClause
{
  $$ = makeNode(PublicationObjSpec);
  $$->pubobjtype = PUBLICATIONOBJ_TABLE;
  $$->pubtable = makeNode(PublicationTable);
  $$->pubtable->relation = $2;
  $$->pubtable->except = $3;
  $$->pubtable->columns = $4;
  if ($3 && !$4)
    ereport(ERROR,
      (errcode(ERRCODE_SYNTAX_ERROR),
      errmsg("EXCEPT without column list"),
      parser_errposition(@3)));
  $$->pubtable->whereClause = $5;
  $$->location = @1;
}

etc.

======
src/bin/psql/describe.c

8.
  if (!PQgetisnull(res, i, 3))
+ {
+ if (!PQgetisnull(res, i, 4) && strcmp(PQgetvalue(res, i, 4), "t") == 0)
+ appendPQExpBuffer(buf, " EXCEPT");
  appendPQExpBuffer(buf, " (%s)", PQgetvalue(res, i, 3));
+ }

This growing list of columns makes it hard to understand this function
without looking back at the caller all the time. Maybe you can add a
function comment that at least explains what those attributes 1,2,3,4
represent?

======
src/bin/psql/tab-complete.in.c

9.
+ else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|SET",
"TABLE", MatchAny))
+ COMPLETE_WITH("EXCEPT");

Since it is not allowed to have an EXCEPT with no column list,
shouldn't this say "EXCEPT ("?

~~~

10.
  else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "TABLE",
MatchAny) && !ends_with(prev_wd, ','))
- COMPLETE_WITH("WHERE (", "WITH (");
+ COMPLETE_WITH("EXCEPT", "WHERE (", "WITH (");

Ditto. Since it is not allowed to have an EXCEPT with no column list,
shouldn't this say "EXCEPT ("?


======
src/test/regress/expected/publication.out

11.
+-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
+CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c);
+ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
pub_test_except1_a_idx;
+ERROR:  index "pub_test_except1_a_idx" for table "pub_test_except1"
does not exist
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
+ERROR:  cannot update table "pub_test_except1"
+DETAIL:  Column list used by the publication does not cover the
replica identity.
+DROP INDEX pub_test_except1_ac_idx;


What's happening here? I'm not sure these are the kind of errors you
were trying to cause.

======
src/test/regress/sql/publication.sql

12.
+-- Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL)
+ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;


SUGGESTION. Change that comment to:
Verify fails - EXCEPT col-list cannot...

~~~

13.
+-- Verify that EXCEPT col-list cannot contain RI cols (when using INDEX)
+CREATE UNIQUE INDEX pub_test_except1_ac_idx ON pub_test_except1 (a, c);
+ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
pub_test_except1_a_idx;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
+DROP INDEX pub_test_except1_ac_idx;

SUGGESTION. Change that comment to:
Verify fails - EXCEPT col-list cannot...

~~~

14.
+-- Verify that so long as no clash between RI cols and the EXCEPT
+CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a);
+ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX
pub_test_except1_a_idx;
+UPDATE pub_test_except1 SET a = 3 WHERE a = 1;
+

That comment doesn't make sense. Missing words?

======
.../t/036_rep_changes_except_table.pl

15.
(I haven't reviewed this file in detail yet, but here is a general comment)

I know this patch currently lives in the same thread as all the EXCEPT
TABLE stuff, but that seems just happenstance to me. IMO, this is a
separate enhancement that just shares the keyword EXCEPT. So, I felt
it should have quite separate tests too.

e.g. How about: 037_rep_changes_except_collist.pl

======
[1] 
https://www.postgresql.org/message-id/CANhcyEW2LK4diNeCG862DE40yQoV3VAgf59kXUq2TuR8fnw5vQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Skipping schema changes in publication

Reply via email to