David Rowley wrote: > Hi Álvaro,
Hi David, Thanks for the review. Attached is a delta patch that fixes most things, except your item 14 below. Before getting into the details of the items you list, in this version I also fixed a couple of issues noted by Jaime Casanova; namely a) ALTER INDEX SET TABLESPACE threw an error for partitioned indexes. Handled the same was as for partitioned tables: silently do nothing. b) Expression indexes not considered at all. (You also commented on this point). Jaime reported it for this case: create index on t_part USING gin (string_to_array(btrim((t)::text, '-'::text), '--'::text)); I dealt with this by adding code to CompareIndexInfo that runs equal() on the expression lists, after applying map_variable_attnos() (to support the case of attaching a table with dropped or reordered columns). As far as I can see, this works correctly for very simple expressions, though I didn't test Jaime's case specifically; I suppose I should add a few more tests. On to your notes: > 1. "either partition" -> "either the partition" > 4. Extra newline > 8. Missing if (attachInfos) pfree(attachInfos); > 10. lock -> locks: > 11. "already the right" -> "already in the right" > 16. It's not exactly a problem, but I had a quick look and I don't see > any other uses of list_free() checking for a non-NIL list first: Fixed. > 2. In findDependentObjects(), should the following if test be below > the ReleaseDeletionLock call? > > + if (foundDep->deptype == DEPENDENCY_INTERNAL_AUTO) > + break; > ReleaseDeletionLock(object); No, as far as I understand we should keep that lock if we break out of the loop -- it's the lock on the object being deleted (the index partition in this case). I did break the comment in two, so that now the "First," paragraph appears below the "if () break" test. That seems to make more sense. It looks like this now: /* * 3. Not all the owning objects have been visited, so * transform this deletion request into a delete of this * owning object. * * For INTERNAL_AUTO dependencies, we don't enforce this; * in other words, we don't follow the links back to the * owning object. */ if (foundDep->deptype == DEPENDENCY_INTERNAL_AUTO) break; /* * First, release caller's lock on this object and get * deletion lock on the owning object. (We must release * caller's lock to avoid deadlock against a concurrent * deletion of the owning object.) */ ReleaseDeletionLock(object); AcquireDeletionLock(&otherObject, 0); > 3. The following existing comment indicates that we'll be creating a > disk file for the index. Should we update this comment to say that > this is the case only for RELKIND_INDEX? > Maybe: > > /* > * create the index relation's relcache entry and for non-partitioned > * indexes, a physical disk file. (If we fail further down, it's the > * smgr's responsibility to remove the disk file again.) > */ I used this: /* * create the index relation's relcache entry and, if necessary, the * physical disk file. (If we fail further down, it's the smgr's * responsibility to remove the disk file again, if any.) */ > 5. Do you think it's out of scope to support expression indexes? Answered above. > 6. pg_inherits.inhseqno is int, not smallint. I understand you copied > this from StoreCatalogInheritance1(), so maybe a fix should be put in > before this patch is? Hmm, yeah, will fix. > 7. Is the following a bug fix? If so, should it not be fixed and > backpatched, rather than sneaked in here? > > @@ -10119,6 +10195,7 @@ ATExecChangeOwner(Oid relationOid, Oid > newOwnerId, bool recursing, LOCKMODE lock > * relation, as well as its toast table (if it has one). > */ > if (tuple_class->relkind == RELKIND_RELATION || > + tuple_class->relkind == RELKIND_PARTITIONED_TABLE || No, it's not a bug fix -- this block is only concerned with searching index for this relkind, and prior to this patch PARTITIONED_TABLEs do not have indexes, so it's ok not to consider the case. > 9. The first OR branch of the Assert will always be false, so might as > well get rid of it. True. Removed. > 12. It seems to be RangeVarGetRelidExtended() that locks the partition > index, not the callback. > > /* no deadlock risk: our callback above already acquired the lock */ Hmm ... yeah, you're right. Fixed. > 13. We seem to only be holding an AccessShareLock on the partitioned > table. What happens if someone concurrently does ALTER TABLE > partitioned_table DETACH our_partition;? > > parentTbl = relation_open(parentIdx->rd_index->indrelid, AccessShareLock); > > and; > /* Make sure it indexes a partition of the other index's table */ > partDesc = RelationGetPartitionDesc(parentTbl); > found = false; > for (i = 0; i < partDesc->nparts; i++) > { > if (partDesc->oids[i] == state.partitionOid) > { > found = true; > break; > } > } > > Wouldn't you also need an AccessExclusiveLock to prevent another > session trying to ALTER INDEX part_index ATTACH PARTITION leaf_index; > at the same time? The concurrent session could be trying to ATTACH it > to a subpartition and this comment could be attaching to the parent. But AccessShare does conflict with AccessExclusive. So even with just AccessShare we've already prevented others from detaching/attaching partitions. I think I should spend a bit of time going over the locking considerations again and make sure it is all sound. > 14. validatePartitionedIndex does not verify that the index it is > checking is valid. Consider the following: > > create table part (a int, b int) partition by list(a); > create table part1 partition of part for values in(1) partition by list(b) > create table part1 partition of part for values in(1) partition by list(b); > create table part2 partition of part1 for values in (1); > create index on only part1 (a); > create index on only part (a); > alter index part_a_idx attach partition part1_a_idx; > \d part > Table "public.part" > Column | Type | Collation | Nullable | Default > --------+---------+-----------+----------+--------- > a | integer | | | > b | integer | | | > Partition key: LIST (a) > Indexes: > "part_a_idx" btree (a) <--- should be INVALID > Number of partitions: 1 (Use \d+ to list them.) > > \d part1 > Table "public.part1" > Column | Type | Collation | Nullable | Default > --------+---------+-----------+----------+--------- > a | integer | | | > b | integer | | | > Partition of: part FOR VALUES IN (1) > Partition key: LIST (b) > Indexes: > "part1_a_idx" btree (a) INVALID > Number of partitions: 1 (Use \d+ to list them.) > > But that probably means you also need more code to search up to parent > partitions and try and validate any invalid indexes once a > sub-partition index is made valid. > > As, if the part_a_idx had correctly been marked as INVALID, and then we do: > > CREATE INDEX part2_a_idx ON part2 (a); > ALTER INDEX part1_a_idx ATTACH PARTITION part2_a_idx; > > This would validate part1_a_idx, but we'd also then need to attempt > validation on part_a_idx. (I'm still reading, so perhaps you've > already thought of that.) In prior incarnations of the patch, I had an if-test to prevent attaching invalid indexes, but I decided to remove it at some point -- mainly thinking of attaching a partition for which a CREATE INDEX CONCURRENTLY was running which already had the index as invalid and was later expected to become valid. I suppose that doesn't really work anyway because of locking considerations (you can't attach a partition in which CIC is concurrently running, can you). I'll think some more about this case and post an updated version later. > 15. Can you explain why you do the following for CREATE INDEX, but not > for CREATE INDEX IF NOT EXISTS? I cannot -- just an oversight. Fixed. It's true that makeNode() fixes the hole, but we're supposed to be strict about initializing structures. > 17. Just a stylistic thing; I think it would be neater to set isindex > using the existing switch(). It would just need some case shuffling. > Or maybe it's even better to set relation->rd_amroutine->amoptions > into a local variable and use it unconditionally in the call to > extractRelOptions(). Yeah, that last suggestion leads to simpler code, so I did it that way. > 18. At most it can't do any harm, but is the following still needed? I > originally thought this would have been for pg_index changes. What's > changed? > > -#define CATALOG_VERSION_NO 201712251 > +#define CATALOG_VERSION_NO 201712291 Yeah, probably not needed anymore. Will lose it. > The tests seem fine, but I'd like to see a test for #14 once that's fixed. Sure thing. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 192891bc4f..71e20f2740 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -3005,14 +3005,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l will be disallowed outright (we'll tell the user to issue a <command>DROP</command> against the referenced object, instead). While a regular internal dependency will prevent - the dependent object from being dropped while any such dependencies remain, - <literal>DEPENDENCY_INTERNAL_AUTO</literal> will allow such a drop as long - as the object can be found by following any of such dependencies. - Example: an index - on a partition is made internal-auto-dependent on both the partition - itself as well as on the index on the parent partitioned table; - so the partition index is dropped together with either partition it indexes, - or with the parent index it is attached to. + the dependent object from being dropped while any such dependencies + remain, <literal>DEPENDENCY_INTERNAL_AUTO</literal> will allow such + a drop as long as the object can be found by following any of such + dependencies. + Example: an index on a partition is made internal-auto-dependent on + both the partition itself as well as on the index on the parent + partitioned table; so the partition index is dropped together with + either the partition it indexes, or with the parent index it is + attached to. </para> </listitem> </varlistentry> diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c index a46e859488..be60270ea5 100644 --- a/src/backend/catalog/dependency.c +++ b/src/backend/catalog/dependency.c @@ -637,14 +637,16 @@ findDependentObjects(const ObjectAddress *object, * For INTERNAL_AUTO dependencies, we don't enforce this; * in other words, we don't follow the links back to the * owning object. - * + */ + if (foundDep->deptype == DEPENDENCY_INTERNAL_AUTO) + break; + + /* * First, release caller's lock on this object and get * deletion lock on the owning object. (We must release * caller's lock to avoid deadlock against a concurrent * deletion of the owning object.) */ - if (foundDep->deptype == DEPENDENCY_INTERNAL_AUTO) - break; ReleaseDeletionLock(object); AcquireDeletionLock(&otherObject, 0); diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 6c4649c594..4e10026297 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -42,6 +42,7 @@ #include "catalog/pg_constraint.h" #include "catalog/pg_constraint_fn.h" #include "catalog/pg_depend.h" +#include "catalog/pg_inherits_fn.h" #include "catalog/pg_operator.h" #include "catalog/pg_opclass.h" #include "catalog/pg_tablespace.h" @@ -56,6 +57,7 @@ #include "nodes/nodeFuncs.h" #include "optimizer/clauses.h" #include "parser/parser.h" +#include "rewrite/rewriteManip.h" #include "storage/bufmgr.h" #include "storage/lmgr.h" #include "storage/predicate.h" @@ -869,9 +871,9 @@ index_create(Relation heapRelation, } /* - * create the index relation's relcache entry and physical disk file. (If - * we fail further down, it's the smgr's responsibility to remove the disk - * file again.) + * create the index relation's relcache entry and, if necessary, the + * physical disk file. (If we fail further down, it's the smgr's + * responsibility to remove the disk file again, if any.) */ indexRelation = heap_create(indexRelationName, namespaceId, @@ -1044,7 +1046,6 @@ index_create(Relation heapRelation, recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL_AUTO); } - /* Store dependency on collations */ /* The default collation is pinned, so don't bother recording it */ for (i = 0; i < indexInfo->ii_NumIndexAttrs; i++) @@ -1745,7 +1746,7 @@ BuildIndexInfo(Relation index) * attmap is an attribute map where info2 is input and info1 is output. */ bool -CompareIndexInfo(IndexInfo *info1, IndexInfo *info2, AttrNumber *attmap) +CompareIndexInfo(IndexInfo *info1, IndexInfo *info2, AttrNumber *attmap, int maplen) { int i; @@ -1759,7 +1760,9 @@ CompareIndexInfo(IndexInfo *info1, IndexInfo *info2, AttrNumber *attmap) /* * and columns match through the attribute map (actual attribute numbers - * might differ!) + * might differ!) Note that this implies that index columns that are + * expressions appear in the same positions. We will next compare the + * expressions themselves. */ for (i = 0; i < info1->ii_NumIndexAttrs; i++) { @@ -1769,11 +1772,24 @@ CompareIndexInfo(IndexInfo *info1, IndexInfo *info2, AttrNumber *attmap) } /* - * Expression indexes are currently not considered equal. Not needed for - * current callers. + * For expression indexes: either both are expression indexes, or neither + * is; if they are, make sure the expressions match. */ - if (info1->ii_Expressions != NIL || info2->ii_Expressions != NIL) + if ((info1->ii_Expressions != NIL) != (info2->ii_Expressions != NIL)) return false; + else + { + bool found_whole_row; + Node *mapped; + + mapped = map_variable_attnos((Node *) info2->ii_Expressions, + 1, 0, attmap, maplen, + InvalidOid, &found_whole_row); + if (found_whole_row) + elog(ERROR, "having fun with whole-row expressions, are we?"); + + return equal(info1->ii_Expressions, mapped); + } /* Index predicates must be identical */ if (!equal(info1->ii_Predicate, info2->ii_Predicate)) diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c index bde5176cab..973e8d5835 100644 --- a/src/backend/catalog/pg_inherits.c +++ b/src/backend/catalog/pg_inherits.c @@ -410,7 +410,7 @@ typeInheritsFrom(Oid subclassTypeId, Oid superclassTypeId) * Create a single pg_inherits row with the given data */ void -StoreSingleInheritance(Oid relationId, Oid parentOid, int16 seqNumber) +StoreSingleInheritance(Oid relationId, Oid parentOid, int32 seqNumber) { Datum values[Natts_pg_inherits]; bool nulls[Natts_pg_inherits]; @@ -424,7 +424,7 @@ StoreSingleInheritance(Oid relationId, Oid parentOid, int16 seqNumber) */ values[Anum_pg_inherits_inhrelid - 1] = ObjectIdGetDatum(relationId); values[Anum_pg_inherits_inhparent - 1] = ObjectIdGetDatum(parentOid); - values[Anum_pg_inherits_inhseqno - 1] = Int16GetDatum(seqNumber); + values[Anum_pg_inherits_inhseqno - 1] = Int32GetDatum(seqNumber); memset(nulls, 0, sizeof(nulls)); diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 9b7f29c7fc..1a921f70a2 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -816,7 +816,7 @@ DefineIndex(Oid relationId, parentDesc, gettext_noop("could not convert row type")); - if (CompareIndexInfo(cldIdxInfo, indexInfo, attmap)) + if (CompareIndexInfo(cldIdxInfo, indexInfo, attmap, parentDesc->natts)) { /* * Found a match. Attach index to parent and we're diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index a9a1d7fdd7..649a644158 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -4206,9 +4206,13 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode) { AlteredTableInfo *tab = (AlteredTableInfo *) lfirst(ltab); - /* Foreign tables have no storage, nor do partitioned tables. */ + /* + * Foreign tables have no storage, nor do partitioned tables and + * indexes. + */ if (tab->relkind == RELKIND_FOREIGN_TABLE || - tab->relkind == RELKIND_PARTITIONED_TABLE) + tab->relkind == RELKIND_PARTITIONED_TABLE || + tab->relkind == RELKIND_PARTITIONED_INDEX) continue; /* @@ -14084,7 +14088,8 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd) if (has_superclass(RelationGetRelid(attachrelIdxRels[i]))) continue; - if (CompareIndexInfo(info, attachInfos[i], attmap)) + if (CompareIndexInfo(info, attachInfos[i], attmap, + RelationGetDescr(rel)->natts)) { /* bingo. */ IndexSetParentIndex(attachrelIdxRels[i], idx); @@ -14121,6 +14126,8 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd) pfree(idxes); if (attachRelIdxs) pfree(attachRelIdxs); + if (attachInfos) + pfree(attachInfos); } /* @@ -14283,8 +14290,7 @@ ATExecDetachPartition(Relation rel, RangeVar *name) if (!has_superclass(idxid)) continue; - Assert(!has_superclass(idxid) || - (IndexGetRelation(get_partition_parent(idxid), false) == + Assert((IndexGetRelation(get_partition_parent(idxid), false) == RelationGetRelid(rel))); idx = index_open(idxid, AccessExclusiveLock); @@ -14402,16 +14408,16 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name) (errcode(ERRCODE_UNDEFINED_OBJECT), errmsg("index \"%s\" does not exist", name->relname))); - /* no deadlock risk: our callback above already acquired the lock */ + /* no deadlock risk: RangeVarGetRelidExtended already acquired the lock */ partIdx = relation_open(partIdxId, AccessExclusiveLock); - /* we already hold lock on both tables, so this is safe: */ + /* we already hold locks on both tables, so this is safe: */ parentTbl = relation_open(parentIdx->rd_index->indrelid, AccessShareLock); partTbl = relation_open(partIdx->rd_index->indrelid, NoLock); ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partIdx)); - /* Silently do nothing if already the right state */ + /* Silently do nothing if already in the right state */ currParent = !has_superclass(partIdxId) ? InvalidOid : get_partition_parent(partIdxId); if (currParent != RelationGetRelid(parentIdx)) @@ -14463,7 +14469,8 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name) attmap = convert_tuples_by_name_map(RelationGetDescr(parentTbl), RelationGetDescr(partTbl), gettext_noop("could not convert row type")); - if (!CompareIndexInfo(parentInfo, childInfo, attmap)) + if (!CompareIndexInfo(parentInfo, childInfo, attmap, + RelationGetDescr(partTbl)->natts)) ereport(ERROR, (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), errmsg("cannot attach index \"%s\" as a partition of index \"%s\"", diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 04ef741a4c..5bf8dc2e56 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -7390,6 +7390,7 @@ IndexStmt: CREATE opt_unique INDEX opt_concurrently opt_index_name n->concurrent = $4; n->idxname = $8; n->relation = $10; + n->relationId = InvalidOid; n->accessMethod = $11; n->indexParams = $13; n->options = $15; diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c index 16c4f8fad4..1a29dda783 100644 --- a/src/backend/tcop/utility.c +++ b/src/backend/tcop/utility.c @@ -1353,8 +1353,7 @@ ProcessUtilitySlow(ParseState *pstate, commandCollected = true; EventTriggerAlterTableEnd(); - if (inheritors) - list_free(inheritors); + list_free(inheritors); } break; diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index a28046f857..2e4e968a90 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -430,20 +430,26 @@ static void RelationParseRelOptions(Relation relation, HeapTuple tuple) { bytea *options; - bool isindex; + amoptions_function amoptsfn; relation->rd_options = NULL; - /* Fall out if relkind should not have options */ + /* + * Look up any AM-specific parse function; fall out if relkind should not + * have options. + */ switch (relation->rd_rel->relkind) { case RELKIND_RELATION: case RELKIND_TOASTVALUE: - case RELKIND_INDEX: - case RELKIND_PARTITIONED_INDEX: case RELKIND_VIEW: case RELKIND_MATVIEW: case RELKIND_PARTITIONED_TABLE: + amoptsfn = NULL; + break; + case RELKIND_INDEX: + case RELKIND_PARTITIONED_INDEX: + amoptsfn = relation->rd_amroutine->amoptions; break; default: return; @@ -454,12 +460,7 @@ RelationParseRelOptions(Relation relation, HeapTuple tuple) * we might not have any other for pg_class yet (consider executing this * code for pg_class itself) */ - isindex = relation->rd_rel->relkind == RELKIND_INDEX || - relation->rd_rel->relkind == RELKIND_PARTITIONED_INDEX; - options = extractRelOptions(tuple, - GetPgClassDescriptor(), - isindex ? relation->rd_amroutine->amoptions : - NULL); + options = extractRelOptions(tuple, GetPgClassDescriptor(), amoptions); /* * Copy parsed data into CacheMemoryContext. To guard against the diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h index 2b2c166266..f1765af4ba 100644 --- a/src/include/catalog/catversion.h +++ b/src/include/catalog/catversion.h @@ -53,6 +53,6 @@ */ /* yyyymmddN */ -#define CATALOG_VERSION_NO 201712291 +#define CATALOG_VERSION_NO 201712251 #endif diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h index 67d8a20be3..46c271a46c 100644 --- a/src/include/catalog/dependency.h +++ b/src/include/catalog/dependency.h @@ -60,7 +60,7 @@ * Example: an index on a partition is made internal-auto-dependent on * both the partition itself as well as on the index on the parent * partitioned table; so the partition index is dropped together with - * either partition it indexes, or with the parent index it is attached + * either the partition it indexes, or with the parent index it is attached * to. * DEPENDENCY_EXTENSION ('e'): the dependent object is a member of the diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h index 139365c3b3..8ff3a0732e 100644 --- a/src/include/catalog/index.h +++ b/src/include/catalog/index.h @@ -87,7 +87,8 @@ extern void index_drop(Oid indexId, bool concurrent); extern IndexInfo *BuildIndexInfo(Relation index); -extern bool CompareIndexInfo(IndexInfo *info1, IndexInfo *info2, AttrNumber *attmap); +extern bool CompareIndexInfo(IndexInfo *info1, IndexInfo *info2, + AttrNumber *attmap, int maplen); extern void BuildSpeculativeIndexInfo(Relation index, IndexInfo *ii); diff --git a/src/include/catalog/pg_inherits_fn.h b/src/include/catalog/pg_inherits_fn.h index b587f2d042..d0abde4168 100644 --- a/src/include/catalog/pg_inherits_fn.h +++ b/src/include/catalog/pg_inherits_fn.h @@ -24,6 +24,6 @@ extern bool has_subclass(Oid relationId); extern bool has_superclass(Oid relationId); extern bool typeInheritsFrom(Oid subclassTypeId, Oid superclassTypeId); extern void StoreSingleInheritance(Oid relationId, Oid parentOid, - int16 seqNumber); + int32 seqNumber); #endif /* PG_INHERITS_FN_H */