Hello, On Wed, Jun 8, 2022 at 5:47 PM shiy.f...@fujitsu.com <shiy.f...@fujitsu.com> wrote: > Hi hackers, > > I saw a problem in logical replication, when the target table on subscriber > is a > partitioned table, it only checks whether the Replica Identity of partitioned > table is consistent with the publisher, and doesn't check Replica Identity of > the partition. > > For example: > -- publisher -- > create table tbl (a int not null, b int); > create unique INDEX ON tbl (a); > alter table tbl replica identity using INDEX tbl_a_idx; > create publication pub for table tbl; > > -- subscriber -- > -- table tbl (parent table) has RI index, while table child has no RI index. > create table tbl (a int not null, b int) partition by range (a); > create table child partition of tbl default; > create unique INDEX ON tbl (a); > alter table tbl replica identity using INDEX tbl_a_idx; > create subscription sub connection 'port=5432 dbname=postgres' publication > pub; > > -- publisher -- > insert into tbl values (11,11); > update tbl set a=a+1; > > It caused an assertion failure on subscriber: > TRAP: FailedAssertion("OidIsValid(idxoid) || (remoterel->replident == > REPLICA_IDENTITY_FULL)", File: "worker.c", Line: 2088, PID: 1616523) > > The backtrace is attached. > > We got the assertion failure because idxoid is invalid, as table child has no > Replica Identity or Primary Key. We have a check in > check_relation_updatable(), > but what it checked is table tbl (the parent table) and it passed the check. > > I think one approach to fix it is to check the target partition in this case, > instead of the partitioned table.
That makes sense. A normal user update of a partitioned table will only perform CheckCmdReplicaIdentity() for leaf partitions and the logical replication updates should do the same. I may have been confused at the time to think that ALTER TABLE REPLICA IDENTITY makes sure that the replica identities of all relations in a partition tree are forced to be the same at all times, though it seems that the patch to do so [1] didn't actually get in. I guess adding a test case would have helped. > When trying to fix it, I saw some other problems about updating partition map > cache. > > a) In logicalrep_partmap_invalidate_cb(), the type of the entry in > LogicalRepPartMap should be LogicalRepPartMapEntry, instead of > LogicalRepRelMapEntry. Indeed. > b) In logicalrep_partition_open(), it didn't check if the entry is valid. Yeah, that's bad. Actually, it seems that localrelvalid stuff for LogicalRepRelMapEntry came in 3d65b0593c5, but it's likely we missed in that commit that LogicalRepPartMapEntrys contain copies of, not pointers to, the relevant parent's entry. This patch fixes that oversight. > c) When the publisher send new relation mapping, only relation map cache will > be > updated, and partition map cache wouldn't. I think it also should be updated > because it has remote relation information, too. Yes, again a result of forgetting that the partition map entries have copies of relation map entries. +logicalrep_partmap_invalidate I wonder why not call this logicalrep_partmap_update() to go with logicalrep_relmap_update()? It seems confusing to have logicalrep_partmap_invalidate() right next to logicalrep_partmap_invalidate_cb(). +/* + * Invalidate the existing entry in the partition map. I think logicalrep_partmap_invalidate() may update *multiple* entries, because the hash table scan may find multiple PartMapEntrys containing a copy of the RelMapEntry with given remoteid, that is, for multiple partitions of a given local parent table mapped to that remote relation. So, please fix the comment as: Invalidate/Update the entries in the partition map that refer to 'remoterel' Likewise: + /* Invalidate the corresponding partition map as well */ Maybe, this should say: Also update all entries in the partition map that refer to 'remoterel'. In 0002: +logicalrep_check_updatable +1 to Amit's suggestion to use "mark" instead of "check". @@ -1735,6 +1735,13 @@ apply_handle_insert_internal(ApplyExecutionData *edata, static void check_relation_updatable(LogicalRepRelMapEntry *rel) { + /* + * If it is a partitioned table, we don't check it, we will check its + * partition later. + */ + if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) + return; Why do this? I mean why if logicalrep_check_updatable() doesn't care if the relation is partitioned or not -- it does all the work regardless. I suggest we don't add this check in check_relation_updatable(). + /* Check if we can do the update or delete. */ Maybe mention "leaf partition", as: Check if we can do the update or delete on the leaf partition. BTW, the following hunk in patch 0002 should really be a part of 0001. @@ -584,7 +594,6 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root, Oid partOid = RelationGetRelid(partrel); AttrMap *attrmap = root->attrmap; bool found; - int i; MemoryContext oldctx; if (LogicalRepPartMap == NULL) > Thanks to Hou Zhijie for helping me in the first patch. Thank you both for the fixes. > I will add a test for it later That would be very welcome. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com [1] https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql