On Wed, Jul 5, 2023 at 9:01 AM Peter Smith <smithpb2...@gmail.com> wrote: > > Hi. Here are some review comments for this patch. > > +1 for the patch idea. > > ------ > > I wasn't sure about the code comment adjustments suggested by Amit [1]: > "Accordingly, the comments atop build_replindex_scan_key(), > FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression() > should also be adjusted." > > Actually, I thought the FindUsableIndexForReplicaIdentityFull() > function comment is *already* describing the limitation about the > leftmost column (see fragment below), so IIUC the Sawada-san patch is > only trying to expose that same information in the PG docs. > > [FindUsableIndexForReplicaIdentityFull comment fragment] > * We also skip indexes if the remote relation does not contain the leftmost > * column of the index. This is because in most such cases sequential scan is > * favorable over index scan. >
This implies that the leftmost column of the index must be non-expression but I feel what the patch intends to say in docs is more straightforward and it doesn't match what the proposed docs says. > ~ > > OTOH, it may be better if these limitation rule details were not > scattered in the code. e.g. build_replindex_scan_key() function > comment can be simplified: > > CURRENT: > * This is not generic routine, it expects the idxrel to be a btree, > non-partial > * and have at least one column reference (i.e. cannot consist of only > * expressions). > > SUGGESTION: > This is not a generic routine. It expects the 'idxrel' to be an index > deemed "usable" by the function > FindUsableIndexForReplicaIdentityFull(). > Note that for PK/ReplicaIdentity, we don't even call FindUsableIndexForReplicaIdentityFull() but build_replindex_scan_key() would still be called for such index. So, I am not sure your proposed wording is an improvement. > ------ > doc/src/sgml/logical-replication.sgml > > 1. > the key. When replica identity <literal>FULL</literal> is specified, > indexes can be used on the subscriber side for searching the rows. > Candidate > indexes must be btree, non-partial, and have at least one column reference > - (i.e. cannot consist of only expressions). These restrictions > - on the non-unique index properties adhere to some of the restrictions that > - are enforced for primary keys. If there are no such suitable indexes, > + at the leftmost column indexes (i.e. cannot consist of only > expressions). These > + restrictions on the non-unique index properties adhere to some of > the restrictions > + that are enforced for primary keys. If there are no such suitable > indexes, > the search on the subscriber side can be very inefficient, therefore > replica identity <literal>FULL</literal> should only be used as a > fallback if no other solution is possible. If a replica identity other > > Isn't this using the word "indexes" with different meanings in the > same sentence? e.g. IIUC "leftmost column indexes" is referring to the > ordinal number of the index fields. TBH, I am not sure the patch > wording is even describing the limitation in quite the same way as > what the code is actually doing. > > HEAD (code comment): > * We also skip indexes if the remote relation does not contain the leftmost > * column of the index. This is because in most such cases sequential scan is > * favorable over index scan. > > HEAD (rendered docs) > Candidate indexes must be btree, non-partial, and have at least one > column reference (i.e. cannot consist of only expressions). These > restrictions on the non-unique index properties adhere to some of the > restrictions that are enforced for primary keys. > > PATCHED (rendered docs) > Candidate indexes must be btree, non-partial, and have at least one > column reference at the leftmost column indexes (i.e. cannot consist > of only expressions). These restrictions on the non-unique index > properties adhere to some of the restrictions that are enforced for > primary keys. > > MY SUGGESTION: > Candidate indexes must be btree, non-partial, and have at least one > column reference (i.e. cannot consist of only expressions). > Furthermore, the leftmost field of the candidate index must be a > column of the published table. These restrictions on the non-unique > index properties adhere to some of the restrictions that are enforced > for primary keys. > I don't know if this suggestion is what the code is actually doing. In function RemoteRelContainsLeftMostColumnOnIdx(), we have the following checks: ========== keycol = indexInfo->ii_IndexAttrNumbers[0]; if (!AttributeNumberIsValid(keycol)) return false; if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol)) return false; return attrmap->attnums[AttrNumberGetAttrOffset(keycol)] >= 0; ========== The first of these checks indicates that the leftmost column of the index should be non-expression, second and third indicates what you suggest in your wording. We can also think that what you wrote in a way is a superset of "leftmost index column is a non-expression" and "leftmost index column should be present in remote rel" but I feel it would be better to explicit about the first part as it is easy to understand for users at least in docs. -- With Regards, Amit Kapila.