Thanks for the updated patches.

On 2018/03/18 13:17, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>
>> I think what I should be doing is the same as the returning stuff: keep
>> a tupdesc around, and use a single slot, whose descriptor is changed
>> just before the projection.
>
> Yes, this works, though it's ugly. Not any uglier than what's already
> there, though, so I think it's okay.
>
> The only thing that I remain unhappy about this patch is the whole
> adjust_and_expand_partition_tlist() thing. I fear we may be doing
> redundant and/or misplaced work. I'll look into it next week.
>
> 0001 should be pretty much ready to push -- adjustments to ExecInsert
> and ModifyTableState I already mentioned.

This seems like good cleanup.

While at it, why not also get rid of mt_onconflict in favor of always just
using its counterpart in ModifyTable -- onConflictAction?

> 0002 is stuff I would like to get rid of completely -- changes to
> planner code so that it better supports functionality we need for
> 0003.

Hmm. I'm not sure if we can completely get rid of this, because we do need
the adjust_inherited_tlist() facility to translate TargetEntry resnos in
any case.

But as I just said in reply to Pavan's email suggesting deferring
onConflictSet expansion to execution time, we don't need the hack in
adjust_inherited_tlist() if we go with that suggestion.

> 0003 is the main patch. Compared to the previous version, this one
> reuses slots by switching them to different tupdescs as needed.

Your proposed change to use just one slot (the existing mt_conflproj slot)
sounds good. In place of the per-partition slots, it seems we now have an
array to hold the tupleDescs for the onConflictSet target lists of each
partition.

Some comments:

1. I noticed a bug that crashes a test in insert_conflict.sql that uses DO
   NOTHING instead of DO UPDATE SET. It's illegal for ExecInitPartitionInfo
   to expect mt_conflproj_tupdesc to be valid in the DO NOTHING case,
   because ExecInitModifyTable would only set it if (onConflictAction ==
   DO_UPDATE).

2. It seems better to name the new array field in PartitionTupleRouting
   partition_conflproj_tupdescs rather than partition_onconfl_tupdescs to
   be consistent with the new field in ModifyTableState.

3. I think it was an oversight in my original patch, but it seems we should
   allocate the partition_onconfl_tdescs array only if the DO UPDATE action
   is used. Also, ri_onConflictSetProj and ri_onConflictSetWhere should be
   set only in that case. OTOH, we always need to set
   partition_arbiter_indexes, that is, for both DO NOTHING and DO UPDATE
   SET actions.

4. Need to remove the comments for partition_conflproj_slots and
   partition_existing_slots, fields of PartitionTupleRouting that no longer
   exist. A comment for partition_conflproj_tupdescs should be added
   instead.

5. I know the following is there so as not to break the Assert in
   adjust_inherited_tlist(), so why not have a parentOid argument for
   adjust_and_expand_partition_tlist()?

+    appinfo.parent_reloid = 1; // dummy  parentRel->rd_id;

6. There is a sentence in the comment above adjust_inherited_tlist():

   Note that this is not needed for INSERT because INSERT isn't
   inheritable.

   Maybe we need to delete that and mention that we do need it in the case
   of INSERT ON CONFLICT DO UPDATE on partitioned tables, for translating
   the DO UPDATE SET target list.

7. In ExecInsert, it'd be better to have a partArbiterIndexes, just like
   partConflTupdesc, in the outermost scope and then do:

+            /* Use the appropriate list of arbiter indexes. */
+            if (mtstate->mt_partition_tuple_routing != NULL)
+                arbiterIndexes = partArbiterIndexes;
+            else
+                arbiterIndexes = node->arbiterIndexes;

   and

+        /* Use the appropriate tuple descriptor. */
+        if (mtstate->mt_partition_tuple_routing != NULL)
+            onconfl_tupdesc = partConflTupdesc;
+        else
+            onconfl_tupdesc = mtstate->mt_conflproj_tupdesc;

   using arbiterIndexes and onconfl_tupdesc declared in the appropriate
   scopes.

I have made these changes, and the attached updated patches contain them,
including the change I suggested for 0001 (that is, getting rid of
mt_onconflict). I also expanded some comments in 0003 while making those
changes.

Thanks,
Amit
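
To make comments 1 and 3 above concrete, here is a minimal sketch of the allocation rule they describe: the arbiter-index array is needed for any ON CONFLICT action, while the tuple-descriptor array is needed only for DO UPDATE. It assumes stand-in types; RoutingSketch, setup_onconflict_arrays() and free_onconflict_arrays() are hypothetical names for illustration, not the real PartitionTupleRouting or executor API, and calloc() stands in for palloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for PostgreSQL's OnConflictAction enum (simplified copy). */
typedef enum
{
	ONCONFLICT_NONE,
	ONCONFLICT_NOTHING,
	ONCONFLICT_UPDATE
} OnConflictAction;

/* Hypothetical, trimmed-down routing state; NOT the real PartitionTupleRouting. */
typedef struct
{
	int		nparts;
	void  **arbiter_indexes;	/* needed for DO NOTHING and DO UPDATE */
	void  **conflproj_tdescs;	/* needed for DO UPDATE only */
} RoutingSketch;

/* Allocate only the per-partition arrays that the given action can use. */
static int
setup_onconflict_arrays(RoutingSketch *r, int nparts, OnConflictAction action)
{
	assert(r != NULL && nparts > 0);

	r->nparts = nparts;
	r->arbiter_indexes = NULL;
	r->conflproj_tdescs = NULL;

	/* No ON CONFLICT clause: nothing to set up. */
	if (action == ONCONFLICT_NONE)
		return 0;

	/* Arbiter indexes are consulted by both DO NOTHING and DO UPDATE. */
	r->arbiter_indexes = calloc((size_t) nparts, sizeof(void *));
	if (r->arbiter_indexes == NULL)
		return -1;

	/* Projection tuple descriptors exist only for DO UPDATE. */
	if (action == ONCONFLICT_UPDATE)
	{
		r->conflproj_tdescs = calloc((size_t) nparts, sizeof(void *));
		if (r->conflproj_tdescs == NULL)
		{
			free(r->arbiter_indexes);
			r->arbiter_indexes = NULL;
			return -1;
		}
	}
	return 0;
}

/* Release whatever setup_onconflict_arrays() allocated. */
static void
free_onconflict_arrays(RoutingSketch *r)
{
	free(r->arbiter_indexes);
	free(r->conflproj_tdescs);
	r->arbiter_indexes = NULL;
	r->conflproj_tdescs = NULL;
}
```

With this shape, any code that later reads conflproj_tdescs has to check for DO UPDATE first, which is the guard whose absence caused the DO NOTHING crash described in comment 1.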
>From e208f2fb3c2eaac6f7932f8c15afab789679e5ee Mon Sep 17 00:00:00 2001 From: Alvaro Herrera <alvhe...@alvh.no-ip.org> Date: Fri, 16 Mar 2018 14:29:28 -0300 Subject: [PATCH v5 1/3] Simplify ExecInsert API re. ON CONFLICT data Instead of passing the ON CONFLICT-related members of ModifyTableState into ExecInsert(), we can have that routine obtain them from the node, since that is already an argument into the function. While at it, remove arbiterIndexes from ModifyTableState, since that's just a copy of the list already in the ModifyTable node, to which the state node already has access. --- src/backend/executor/execPartition.c | 4 ++-- src/backend/executor/nodeModifyTable.c | 34 +++++++++++++++++++--------------- src/include/nodes/execnodes.h | 3 --- 3 files changed, 21 insertions(+), 20 deletions(-) diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c index f6fe7cd61d..ce9a4e16cf 100644 --- a/src/backend/executor/execPartition.c +++ b/src/backend/executor/execPartition.c @@ -363,8 +363,8 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, if (partrel->rd_rel->relhasindex && leaf_part_rri->ri_IndexRelationDescs == NULL) ExecOpenIndices(leaf_part_rri, - (mtstate != NULL && - mtstate->mt_onconflict != ONCONFLICT_NONE)); + (node != NULL && + node->onConflictAction != ONCONFLICT_NONE)); /* * Build WITH CHECK OPTION constraints for the partition. 
Note that we diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index 3332ae4bf3..745be7ba30 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -258,8 +258,6 @@ static TupleTableSlot * ExecInsert(ModifyTableState *mtstate, TupleTableSlot *slot, TupleTableSlot *planSlot, - List *arbiterIndexes, - OnConflictAction onconflict, EState *estate, bool canSetTag) { @@ -271,6 +269,8 @@ ExecInsert(ModifyTableState *mtstate, List *recheckIndexes = NIL; TupleTableSlot *result = NULL; TransitionCaptureState *ar_insert_trig_tcs; + ModifyTable *node = (ModifyTable *) mtstate->ps.plan; + OnConflictAction onconflict = node->onConflictAction; /* * get the heap tuple out of the tuple table slot, making sure we have a @@ -455,6 +455,7 @@ ExecInsert(ModifyTableState *mtstate, else { WCOKind wco_kind; + bool check_partition_constr; /* * We always check the partition constraint, including when the tuple @@ -463,8 +464,7 @@ ExecInsert(ModifyTableState *mtstate, * trigger might modify the tuple such that the partition constraint * is no longer satisfied, so we need to check in that case. */ - bool check_partition_constr = - (resultRelInfo->ri_PartitionCheck != NIL); + check_partition_constr = (resultRelInfo->ri_PartitionCheck != NIL); /* * Constraints might reference the tableoid column, so initialize @@ -510,6 +510,9 @@ ExecInsert(ModifyTableState *mtstate, uint32 specToken; ItemPointerData conflictTid; bool specConflict; + List *arbiterIndexes; + + arbiterIndexes = node->arbiterIndexes; /* * Do a non-conclusive check for conflicts first. 
@@ -627,7 +630,7 @@ ExecInsert(ModifyTableState *mtstate, if (resultRelInfo->ri_NumIndices > 0) recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self), estate, false, NULL, - arbiterIndexes); + NIL); } } @@ -1217,8 +1220,8 @@ lreplace:; Assert(mtstate->rootResultRelInfo != NULL); estate->es_result_relation_info = mtstate->rootResultRelInfo; - ret_slot = ExecInsert(mtstate, slot, planSlot, NULL, - ONCONFLICT_NONE, estate, canSetTag); + ret_slot = ExecInsert(mtstate, slot, planSlot, + estate, canSetTag); /* * Revert back the active result relation and the active @@ -1582,6 +1585,7 @@ ExecOnConflictUpdate(ModifyTableState *mtstate, static void fireBSTriggers(ModifyTableState *node) { + ModifyTable *plan = (ModifyTable *) node->ps.plan; ResultRelInfo *resultRelInfo = node->resultRelInfo; /* @@ -1596,7 +1600,7 @@ fireBSTriggers(ModifyTableState *node) { case CMD_INSERT: ExecBSInsertTriggers(node->ps.state, resultRelInfo); - if (node->mt_onconflict == ONCONFLICT_UPDATE) + if (plan->onConflictAction == ONCONFLICT_UPDATE) ExecBSUpdateTriggers(node->ps.state, resultRelInfo); break; @@ -1640,12 +1644,13 @@ getTargetResultRelInfo(ModifyTableState *node) static void fireASTriggers(ModifyTableState *node) { + ModifyTable *plan = (ModifyTable *) node->ps.plan; ResultRelInfo *resultRelInfo = getTargetResultRelInfo(node); switch (node->operation) { case CMD_INSERT: - if (node->mt_onconflict == ONCONFLICT_UPDATE) + if (plan->onConflictAction == ONCONFLICT_UPDATE) ExecASUpdateTriggers(node->ps.state, resultRelInfo, node->mt_oc_transition_capture); @@ -1673,6 +1678,7 @@ fireASTriggers(ModifyTableState *node) static void ExecSetupTransitionCaptureState(ModifyTableState *mtstate, EState *estate) { + ModifyTable *plan = (ModifyTable *) mtstate->ps.plan; ResultRelInfo *targetRelInfo = getTargetResultRelInfo(mtstate); /* Check for transition tables on the directly targeted relation. 
*/ @@ -1680,8 +1686,8 @@ ExecSetupTransitionCaptureState(ModifyTableState *mtstate, EState *estate) MakeTransitionCaptureState(targetRelInfo->ri_TrigDesc, RelationGetRelid(targetRelInfo->ri_RelationDesc), mtstate->operation); - if (mtstate->operation == CMD_INSERT && - mtstate->mt_onconflict == ONCONFLICT_UPDATE) + if (plan->operation == CMD_INSERT && + plan->onConflictAction == ONCONFLICT_UPDATE) mtstate->mt_oc_transition_capture = MakeTransitionCaptureState(targetRelInfo->ri_TrigDesc, RelationGetRelid(targetRelInfo->ri_RelationDesc), @@ -2052,7 +2058,6 @@ ExecModifyTable(PlanState *pstate) { case CMD_INSERT: slot = ExecInsert(node, slot, planSlot, - node->mt_arbiterindexes, node->mt_onconflict, estate, node->canSetTag); break; case CMD_UPDATE: @@ -2136,8 +2141,6 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags) mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans); mtstate->mt_nplans = nplans; - mtstate->mt_onconflict = node->onConflictAction; - mtstate->mt_arbiterindexes = node->arbiterIndexes; /* set up epqstate with dummy subplan data for the moment */ EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam); @@ -2180,7 +2183,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags) if (resultRelInfo->ri_RelationDesc->rd_rel->relhasindex && operation != CMD_DELETE && resultRelInfo->ri_IndexRelationDescs == NULL) - ExecOpenIndices(resultRelInfo, mtstate->mt_onconflict != ONCONFLICT_NONE); + ExecOpenIndices(resultRelInfo, + node->onConflictAction != ONCONFLICT_NONE); /* * If this is an UPDATE and a BEFORE UPDATE trigger is present, the diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h index a953820f43..3b926014b6 100644 --- a/src/include/nodes/execnodes.h +++ b/src/include/nodes/execnodes.h @@ -989,9 +989,6 @@ typedef struct ModifyTableState List **mt_arowmarks; /* per-subplan ExecAuxRowMark lists */ EPQState mt_epqstate; /* for evaluating EvalPlanQual rechecks */ bool 
fireBSTriggers; /* do we need to fire stmt triggers? */ - OnConflictAction mt_onconflict; /* ON CONFLICT type */ - List *mt_arbiterindexes; /* unique index OIDs to arbitrate taking - * alt path */ TupleTableSlot *mt_existing; /* slot to store existing target tuple in */ List *mt_excludedtlist; /* the excluded pseudo relation's tlist */ TupleTableSlot *mt_conflproj; /* CONFLICT ... SET ... projection target */ -- 2.11.0
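
The idea behind 0001 can be illustrated with a small sketch: the callee reads the ON CONFLICT settings from the plan node that the state already points to, so callers no longer pass copies of them. This assumes stand-in types; PlanSketch, StateSketch and arbiters_to_check() are hypothetical names, not the real ModifyTable, ModifyTableState or ExecInsert().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's OnConflictAction enum (simplified copy). */
typedef enum
{
	ONCONFLICT_NONE,
	ONCONFLICT_NOTHING,
	ONCONFLICT_UPDATE
} OnConflictAction;

/* Hypothetical stand-in for the ModifyTable plan node. */
typedef struct
{
	OnConflictAction onConflictAction;
	const unsigned *arbiterIndexes;	/* index OIDs, owned by the plan */
	size_t		narbiters;
} PlanSketch;

/* Hypothetical stand-in for ModifyTableState: it points at its plan and
 * keeps no private copy of the ON CONFLICT fields. */
typedef struct
{
	const PlanSketch *plan;
} StateSketch;

/*
 * The callee derives everything from the state's plan pointer, so there is
 * one source of truth and no extra parameters in the signature.
 * Returns how many arbiter indexes this toy insert would consult.
 */
static size_t
arbiters_to_check(const StateSketch *state)
{
	const PlanSketch *node = state->plan;

	assert(node != NULL);
	if (node->onConflictAction == ONCONFLICT_NONE)
		return 0;
	return node->narbiters;
}
```

Because the state holds only a pointer, a change to the plan's action is visible to the callee immediately, and there is no duplicated field that could drift out of sync.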
>From fa2667d9546ccaff21e2221b4766eec1c160d482 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera <alvhe...@alvh.no-ip.org> Date: Thu, 1 Mar 2018 19:58:50 -0300 Subject: [PATCH v5 2/3] Make some static functions work on TupleDesc rather than Relation --- src/backend/optimizer/prep/preptlist.c | 23 ++++++++--------- src/backend/optimizer/prep/prepunion.c | 45 ++++++++++++++++++---------------- 2 files changed, 36 insertions(+), 32 deletions(-) diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c index 8603feef2b..b6e658fe81 100644 --- a/src/backend/optimizer/prep/preptlist.c +++ b/src/backend/optimizer/prep/preptlist.c @@ -54,7 +54,7 @@ static List *expand_targetlist(List *tlist, int command_type, - Index result_relation, Relation rel); + Index result_relation, TupleDesc tupdesc); /* @@ -116,7 +116,8 @@ preprocess_targetlist(PlannerInfo *root) tlist = parse->targetList; if (command_type == CMD_INSERT || command_type == CMD_UPDATE) tlist = expand_targetlist(tlist, command_type, - result_relation, target_relation); + result_relation, + RelationGetDescr(target_relation)); /* * Add necessary junk columns for rowmarked rels. These values are needed @@ -230,7 +231,7 @@ preprocess_targetlist(PlannerInfo *root) expand_targetlist(parse->onConflict->onConflictSet, CMD_UPDATE, result_relation, - target_relation); + RelationGetDescr(target_relation)); if (target_relation) heap_close(target_relation, NoLock); @@ -247,13 +248,13 @@ preprocess_targetlist(PlannerInfo *root) /* * expand_targetlist - * Given a target list as generated by the parser and a result relation, - * add targetlist entries for any missing attributes, and ensure the - * non-junk attributes appear in proper field order. + * Given a target list as generated by the parser and a result relation's + * tuple descriptor, add targetlist entries for any missing attributes, and + * ensure the non-junk attributes appear in proper field order. 
*/ static List * expand_targetlist(List *tlist, int command_type, - Index result_relation, Relation rel) + Index result_relation, TupleDesc tupdesc) { List *new_tlist = NIL; ListCell *tlist_item; @@ -266,14 +267,14 @@ expand_targetlist(List *tlist, int command_type, * The rewriter should have already ensured that the TLEs are in correct * order; but we have to insert TLEs for any missing attributes. * - * Scan the tuple description in the relation's relcache entry to make - * sure we have all the user attributes in the right order. + * Scan the tuple description to make sure we have all the user attributes + * in the right order. */ - numattrs = RelationGetNumberOfAttributes(rel); + numattrs = tupdesc->natts; for (attrno = 1; attrno <= numattrs; attrno++) { - Form_pg_attribute att_tup = TupleDescAttr(rel->rd_att, attrno - 1); + Form_pg_attribute att_tup = TupleDescAttr(tupdesc, attrno - 1); TargetEntry *new_tle = NULL; if (tlist_item != NULL) diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index b586f941a8..d0d9812da6 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -113,10 +113,10 @@ static void expand_single_inheritance_child(PlannerInfo *root, PlanRowMark *top_parentrc, Relation childrel, List **appinfos, RangeTblEntry **childrte_p, Index *childRTindex_p); -static void make_inh_translation_list(Relation oldrelation, - Relation newrelation, - Index newvarno, - List **translated_vars); +static List *make_inh_translation_list(TupleDesc old_tupdesc, + TupleDesc new_tupdesc, + char *new_rel_name, + Index newvarno); static Bitmapset *translate_col_privs(const Bitmapset *parent_privs, List *translated_vars); static Node *adjust_appendrel_attrs_mutator(Node *node, @@ -1730,8 +1730,11 @@ expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte, appinfo->child_relid = childRTindex; appinfo->parent_reltype = parentrel->rd_rel->reltype; appinfo->child_reltype = 
childrel->rd_rel->reltype; - make_inh_translation_list(parentrel, childrel, childRTindex, - &appinfo->translated_vars); + appinfo->translated_vars = + make_inh_translation_list(RelationGetDescr(parentrel), + RelationGetDescr(childrel), + RelationGetRelationName(childrel), + childRTindex); appinfo->parent_reloid = parentOID; *appinfos = lappend(*appinfos, appinfo); @@ -1788,22 +1791,23 @@ expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte, /* * make_inh_translation_list - * Build the list of translations from parent Vars to child Vars for - * an inheritance child. + * Build the list of translations from parent Vars ("old" rel) to child + * Vars ("new" rel) for an inheritance child. * * For paranoia's sake, we match type/collation as well as attribute name. */ -static void -make_inh_translation_list(Relation oldrelation, Relation newrelation, - Index newvarno, - List **translated_vars) +static List * +make_inh_translation_list(TupleDesc old_tupdesc, TupleDesc new_tupdesc, + char *new_rel_name, + Index newvarno) { List *vars = NIL; - TupleDesc old_tupdesc = RelationGetDescr(oldrelation); - TupleDesc new_tupdesc = RelationGetDescr(newrelation); int oldnatts = old_tupdesc->natts; int newnatts = new_tupdesc->natts; int old_attno; + bool equal_tupdescs; + + equal_tupdescs = equalTupleDescs(old_tupdesc, new_tupdesc); for (old_attno = 0; old_attno < oldnatts; old_attno++) { @@ -1827,10 +1831,9 @@ make_inh_translation_list(Relation oldrelation, Relation newrelation, attcollation = att->attcollation; /* - * When we are generating the "translation list" for the parent table - * of an inheritance set, no need to search for matches. + * When the tupledescs are identical, no need to search for matches. 
*/ - if (oldrelation == newrelation) + if (equal_tupdescs) { vars = lappend(vars, makeVar(newvarno, (AttrNumber) (old_attno + 1), @@ -1867,16 +1870,16 @@ make_inh_translation_list(Relation oldrelation, Relation newrelation, } if (new_attno >= newnatts) elog(ERROR, "could not find inherited attribute \"%s\" of relation \"%s\"", - attname, RelationGetRelationName(newrelation)); + attname, new_rel_name); } /* Found it, check type and collation match */ if (atttypid != att->atttypid || atttypmod != att->atttypmod) elog(ERROR, "attribute \"%s\" of relation \"%s\" does not match parent's type", - attname, RelationGetRelationName(newrelation)); + attname, new_rel_name); if (attcollation != att->attcollation) elog(ERROR, "attribute \"%s\" of relation \"%s\" does not match parent's collation", - attname, RelationGetRelationName(newrelation)); + attname, new_rel_name); vars = lappend(vars, makeVar(newvarno, (AttrNumber) (new_attno + 1), @@ -1886,7 +1889,7 @@ make_inh_translation_list(Relation oldrelation, Relation newrelation, 0)); } - *translated_vars = vars; + return vars; } /* -- 2.11.0
>From 6861c5632f09405dbcc544c97b72e1c5458e9147 Mon Sep 17 00:00:00 2001 From: amit <amitlangot...@gmail.com> Date: Wed, 28 Feb 2018 17:58:00 +0900 Subject: [PATCH v5 3/3] Fix ON CONFLICT to work with partitioned tables --- doc/src/sgml/ddl.sgml | 15 --- src/backend/catalog/heap.c | 2 +- src/backend/catalog/partition.c | 62 ++++++--- src/backend/commands/tablecmds.c | 15 ++- src/backend/executor/execPartition.c | 179 ++++++++++++++++++++++++-- src/backend/executor/nodeModifyTable.c | 70 ++++++++-- src/backend/optimizer/prep/preptlist.c | 25 ++-- src/backend/optimizer/prep/prepunion.c | 45 ++++++- src/backend/parser/analyze.c | 7 - src/include/catalog/partition.h | 2 +- src/include/executor/execPartition.h | 10 ++ src/include/nodes/execnodes.h | 1 + src/include/optimizer/prep.h | 11 ++ src/test/regress/expected/insert_conflict.out | 73 +++++++++-- src/test/regress/sql/insert_conflict.sql | 64 +++++++-- 15 files changed, 480 insertions(+), 101 deletions(-) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 3a54ba9d5a..8805b88d82 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -3324,21 +3324,6 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02 <listitem> <para> - Using the <literal>ON CONFLICT</literal> clause with partitioned tables - will cause an error if the conflict target is specified (see - <xref linkend="sql-on-conflict" /> for more details on how the clause - works). Therefore, it is not possible to specify - <literal>DO UPDATE</literal> as the alternative action, because - specifying the conflict target is mandatory in that case. On the other - hand, specifying <literal>DO NOTHING</literal> as the alternative action - works fine provided the conflict target is not specified. In that case, - unique constraints (or exclusion constraints) of the individual leaf - partitions are considered. 
- </para> - </listitem> - - <listitem> - <para> When an <command>UPDATE</command> causes a row to move from one partition to another, there is a chance that another concurrent <command>UPDATE</command> or <command>DELETE</command> misses this row. diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index 3d80ff9e5b..13489162df 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -1776,7 +1776,7 @@ heap_drop_with_catalog(Oid relid) elog(ERROR, "cache lookup failed for relation %u", relid); if (((Form_pg_class) GETSTRUCT(tuple))->relispartition) { - parentOid = get_partition_parent(relid); + parentOid = get_partition_parent(relid, false); LockRelationOid(parentOid, AccessExclusiveLock); /* diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c index 786c05df73..8dc73ae092 100644 --- a/src/backend/catalog/partition.c +++ b/src/backend/catalog/partition.c @@ -192,6 +192,7 @@ static int get_partition_bound_num_indexes(PartitionBoundInfo b); static int get_greatest_modulus(PartitionBoundInfo b); static uint64 compute_hash_value(int partnatts, FmgrInfo *partsupfunc, Datum *values, bool *isnull); +static Oid get_partition_parent_recurse(Relation inhRel, Oid relid, bool getroot); /* * RelationBuildPartitionDesc @@ -1384,24 +1385,43 @@ check_default_allows_bound(Relation parent, Relation default_rel, /* * get_partition_parent + * Obtain direct parent or topmost ancestor of given relation * - * Returns inheritance parent of a partition by scanning pg_inherits + * Returns direct inheritance parent of a partition by scanning pg_inherits; + * or, if 'getroot' is true, the topmost parent in the inheritance hierarchy. * * Note: Because this function assumes that the relation whose OID is passed * as an argument will have precisely one parent, it should only be called * when it is known that the relation is a partition. 
*/ Oid -get_partition_parent(Oid relid) +get_partition_parent(Oid relid, bool getroot) +{ + Relation inhRel; + Oid parentOid; + + inhRel = heap_open(InheritsRelationId, AccessShareLock); + + parentOid = get_partition_parent_recurse(inhRel, relid, getroot); + if (parentOid == InvalidOid) + elog(ERROR, "could not find parent of relation %u", relid); + + heap_close(inhRel, AccessShareLock); + + return parentOid; +} + +/* + * get_partition_parent_recurse + * Recursive part of get_partition_parent + */ +static Oid +get_partition_parent_recurse(Relation inhRel, Oid relid, bool getroot) { - Form_pg_inherits form; - Relation catalogRelation; SysScanDesc scan; ScanKeyData key[2]; HeapTuple tuple; - Oid result; - - catalogRelation = heap_open(InheritsRelationId, AccessShareLock); + Oid result = InvalidOid; ScanKeyInit(&key[0], Anum_pg_inherits_inhrelid, @@ -1412,18 +1432,26 @@ get_partition_parent(Oid relid) BTEqualStrategyNumber, F_INT4EQ, Int32GetDatum(1)); - scan = systable_beginscan(catalogRelation, InheritsRelidSeqnoIndexId, true, + /* Obtain the direct parent, and release resources before recursing */ + scan = systable_beginscan(inhRel, InheritsRelidSeqnoIndexId, true, NULL, 2, key); - tuple = systable_getnext(scan); - if (!HeapTupleIsValid(tuple)) - elog(ERROR, "could not find tuple for parent of relation %u", relid); - - form = (Form_pg_inherits) GETSTRUCT(tuple); - result = form->inhparent; - + if (HeapTupleIsValid(tuple)) + result = ((Form_pg_inherits) GETSTRUCT(tuple))->inhparent; systable_endscan(scan); - heap_close(catalogRelation, AccessShareLock); + + /* + * If we were asked to recurse, do so now. Except that if we didn't get a + * valid parent, then the 'relid' argument was already the topmost parent, + * so return that. 
+ */ + if (getroot) + { + if (OidIsValid(result)) + return get_partition_parent_recurse(inhRel, result, getroot); + else + return relid; + } return result; } @@ -2505,7 +2533,7 @@ generate_partition_qual(Relation rel) return copyObject(rel->rd_partcheck); /* Grab at least an AccessShareLock on the parent table */ - parent = heap_open(get_partition_parent(RelationGetRelid(rel)), + parent = heap_open(get_partition_parent(RelationGetRelid(rel), false), AccessShareLock); /* Get pg_class.relpartbound */ diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 218224a156..6003afdd03 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -1292,7 +1292,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid, */ if (is_partition && relOid != oldRelOid) { - state->partParentOid = get_partition_parent(relOid); + state->partParentOid = get_partition_parent(relOid, false); if (OidIsValid(state->partParentOid)) LockRelationOid(state->partParentOid, AccessExclusiveLock); } @@ -5843,7 +5843,8 @@ ATExecDropNotNull(Relation rel, const char *colName, LOCKMODE lockmode) /* If rel is partition, shouldn't drop NOT NULL if parent has the same */ if (rel->rd_rel->relispartition) { - Oid parentId = get_partition_parent(RelationGetRelid(rel)); + Oid parentId = get_partition_parent(RelationGetRelid(rel), + false); Relation parent = heap_open(parentId, AccessShareLock); TupleDesc tupDesc = RelationGetDescr(parent); AttrNumber parent_attnum; @@ -14360,7 +14361,7 @@ ATExecDetachPartition(Relation rel, RangeVar *name) if (!has_superclass(idxid)) continue; - Assert((IndexGetRelation(get_partition_parent(idxid), false) == + Assert((IndexGetRelation(get_partition_parent(idxid, false), false) == RelationGetRelid(rel))); idx = index_open(idxid, AccessExclusiveLock); @@ -14489,7 +14490,7 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name) /* Silently do nothing if already in the right 
state */ currParent = !has_superclass(partIdxId) ? InvalidOid : - get_partition_parent(partIdxId); + get_partition_parent(partIdxId, false); if (currParent != RelationGetRelid(parentIdx)) { IndexInfo *childInfo; @@ -14722,8 +14723,10 @@ validatePartitionedIndex(Relation partedIdx, Relation partedTbl) /* make sure we see the validation we just did */ CommandCounterIncrement(); - parentIdxId = get_partition_parent(RelationGetRelid(partedIdx)); - parentTblId = get_partition_parent(RelationGetRelid(partedTbl)); + parentIdxId = get_partition_parent(RelationGetRelid(partedIdx), + false); + parentTblId = get_partition_parent(RelationGetRelid(partedTbl), + false); parentIdx = relation_open(parentIdxId, AccessExclusiveLock); parentTbl = relation_open(parentTblId, AccessExclusiveLock); Assert(!parentIdx->rd_index->indisvalid); diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c index ce9a4e16cf..4c0812c5d2 100644 --- a/src/backend/executor/execPartition.c +++ b/src/backend/executor/execPartition.c @@ -19,6 +19,7 @@ #include "executor/executor.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "optimizer/prep.h" #include "utils/lsyscache.h" #include "utils/rls.h" #include "utils/ruleutils.h" @@ -64,6 +65,8 @@ ExecSetupPartitionTupleRouting(ModifyTableState *mtstate, Relation rel) int num_update_rri = 0, update_rri_index = 0; PartitionTupleRouting *proute; + int nparts; + ModifyTable *node = mtstate ? 
(ModifyTable *) mtstate->ps.plan : NULL; /* * Get the information about the partition tree after locking all the @@ -74,20 +77,16 @@ ExecSetupPartitionTupleRouting(ModifyTableState *mtstate, Relation rel) proute->partition_dispatch_info = RelationGetPartitionDispatchInfo(rel, &proute->num_dispatch, &leaf_parts); - proute->num_partitions = list_length(leaf_parts); - proute->partitions = (ResultRelInfo **) palloc(proute->num_partitions * - sizeof(ResultRelInfo *)); + proute->num_partitions = nparts = list_length(leaf_parts); + proute->partitions = + (ResultRelInfo **) palloc(nparts * sizeof(ResultRelInfo *)); proute->parent_child_tupconv_maps = - (TupleConversionMap **) palloc0(proute->num_partitions * - sizeof(TupleConversionMap *)); - proute->partition_oids = (Oid *) palloc(proute->num_partitions * - sizeof(Oid)); + (TupleConversionMap **) palloc0(nparts * sizeof(TupleConversionMap *)); + proute->partition_oids = (Oid *) palloc(nparts * sizeof(Oid)); /* Set up details specific to the type of tuple routing we are doing. */ - if (mtstate && mtstate->operation == CMD_UPDATE) + if (node && node->operation == CMD_UPDATE) { - ModifyTable *node = (ModifyTable *) mtstate->ps.plan; - update_rri = mtstate->resultRelInfo; num_update_rri = list_length(node->plans); proute->subplan_partition_offsets = @@ -109,6 +108,21 @@ ExecSetupPartitionTupleRouting(ModifyTableState *mtstate, Relation rel) */ proute->partition_tuple_slot = MakeTupleTableSlot(NULL); + /* + * We might need these arrays for conflict checking and handling the + * DO UPDATE action + */ + if (node && node->onConflictAction != ONCONFLICT_NONE) + { + /* Indexes are always needed. */ + proute->partition_arbiter_indexes = + (List **) palloc(nparts * sizeof(List *)); + /* Only needed for the DO UPDATE action. 
*/ + if (node->onConflictAction == ONCONFLICT_UPDATE) + proute->partition_conflproj_tdescs = + (TupleDesc *) palloc(nparts * sizeof(TupleDesc)); + } + i = 0; foreach(cell, leaf_parts) { @@ -475,9 +489,6 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, &mtstate->ps, RelationGetDescr(partrel)); } - Assert(proute->partitions[partidx] == NULL); - proute->partitions[partidx] = leaf_part_rri; - /* * Save a tuple conversion map to convert a tuple routed to this partition * from the parent's type to the partition's. @@ -487,6 +498,148 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, RelationGetDescr(partrel), gettext_noop("could not convert row type")); + /* + * Initialize information about this partition that's needed to handle + * the ON CONFLICT clause. + */ + if (node && node->onConflictAction != ONCONFLICT_NONE) + { + TupleConversionMap *map = proute->parent_child_tupconv_maps[partidx]; + int firstVarno = mtstate->resultRelInfo[0].ri_RangeTableIndex; + Relation firstResultRel = mtstate->resultRelInfo[0].ri_RelationDesc; + TupleDesc partrelDesc = RelationGetDescr(partrel); + TupleDesc rootrelDesc = RelationGetDescr(firstResultRel); + ExprContext *econtext = mtstate->ps.ps_ExprContext; + ListCell *lc; + List *my_arbiterindexes = NIL; + + if (node->onConflictAction == ONCONFLICT_UPDATE) + { + List *onconflset; + TupleDesc tupDesc; + + /* + * Expand the ON CONFLICT DO UPDATE SET target list so that it + * contains any attributes of partition that are missing in the + * original list (including any dropped columns). We may need to + * adjust it for inheritance translation of attributes if the + * partition's tuple descriptor doesn't match the root parent's, + * so pass it through adjust_and_expand_partition_tlist() instead + * of directly calling expand_targetlist(). + */ + if (map != NULL) + { + /* + * Convert the Vars to contain partition's atttribute numbers + */ + + /* First convert references to EXCLUDED pseudo-relation. 
*/ + onconflset = map_partition_varattnos(node->onConflictSet, + INNER_VAR, + partrel, + firstResultRel, NULL); + /* Then convert references to main target relation. */ + onconflset = map_partition_varattnos(onconflset, + firstVarno, + partrel, + firstResultRel, NULL); + + /* + * We also need to change TargetEntry nodes to have correct + * resnos. + */ + onconflset = + adjust_and_expand_partition_tlist(rootrelDesc, + partrelDesc, + RelationGetRelationName(partrel), + firstVarno, + RelationGetRelid(firstResultRel), + onconflset); + } + else + /* Just expand. */ + onconflset = expand_targetlist(node->onConflictSet, + CMD_UPDATE, + firstVarno, + partrelDesc); + + /* + * We must set mtstate->mt_conflproj's tuple descriptor to this + * before trying to use it for projection. + */ + tupDesc = ExecTypeFromTL(onconflset, partrelDesc->tdhasoid); + PinTupleDesc(tupDesc); + proute->partition_conflproj_tdescs[partidx] = tupDesc; + + leaf_part_rri->ri_onConflictSetProj = + ExecBuildProjectionInfo(onconflset, econtext, + mtstate->mt_conflproj, + &mtstate->ps, partrelDesc); + + if (node->onConflictWhere) + { + if (map != NULL) + { + /* + * Convert the Vars to contain partition's atttribute + * numbers + */ + List *onconflwhere; + + /* First convert references to EXCLUDED pseudo-relation. */ + onconflwhere = map_partition_varattnos((List *) + node->onConflictWhere, + INNER_VAR, + partrel, + firstResultRel, NULL); + /* Then convert references to main target relation. */ + onconflwhere = map_partition_varattnos((List *) + onconflwhere, + firstVarno, + partrel, + firstResultRel, NULL); + leaf_part_rri->ri_onConflictSetWhere = + ExecInitQual(onconflwhere, &mtstate->ps); + } + else + /* Just reuse the original one. */ + leaf_part_rri->ri_onConflictSetWhere = + resultRelInfo->ri_onConflictSetWhere; + } + } + + /* Initialize arbiter indexes list, if any. 
*/ + foreach(lc, ((ModifyTable *) mtstate->ps.plan)->arbiterIndexes) + { + Oid parentArbiterIndexOid = lfirst_oid(lc); + int i; + + /* + * Find parentArbiterIndexOid's child in this partition and add it + * to my_arbiterindexes. + */ + for (i = 0; i < leaf_part_rri->ri_NumIndices; i++) + { + Relation index = leaf_part_rri->ri_IndexRelationDescs[i]; + Oid indexOid = RelationGetRelid(index); + + if (parentArbiterIndexOid == + get_partition_parent(indexOid, true)) + my_arbiterindexes = lappend_oid(my_arbiterindexes, + indexOid); + } + } + + /* + * Use this list instead of the original one containing parent's + * indexes. + */ + proute->partition_arbiter_indexes[partidx] = my_arbiterindexes; + } + + Assert(proute->partitions[partidx] == NULL); + proute->partitions[partidx] = leaf_part_rri; + MemoryContextSwitchTo(oldContext); return leaf_part_rri; diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index 745be7ba30..21466920f1 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -56,6 +56,7 @@ static bool ExecOnConflictUpdate(ModifyTableState *mtstate, ResultRelInfo *resultRelInfo, + TupleDesc onConflictSetTupdesc, ItemPointer conflictTid, TupleTableSlot *planSlot, TupleTableSlot *excludedSlot, @@ -269,8 +270,10 @@ ExecInsert(ModifyTableState *mtstate, List *recheckIndexes = NIL; TupleTableSlot *result = NULL; TransitionCaptureState *ar_insert_trig_tcs; + TupleDesc partConflTupdesc = NULL; ModifyTable *node = (ModifyTable *) mtstate->ps.plan; OnConflictAction onconflict = node->onConflictAction; + List *partArbiterIndexes = NIL; /* * get the heap tuple out of the tuple table slot, making sure we have a @@ -286,8 +289,8 @@ ExecInsert(ModifyTableState *mtstate, /* Determine the partition to heap_insert the tuple into */ if (mtstate->mt_partition_tuple_routing) { - int leaf_part_index; PartitionTupleRouting *proute = mtstate->mt_partition_tuple_routing; + int leaf_part_index; /* * Away 
we go ... If we end up not finding a partition after all, @@ -374,6 +377,16 @@ ExecInsert(ModifyTableState *mtstate, tuple, proute->partition_tuple_slot, &slot); + + /* determine this partition's ON CONFLICT information */ + if (onconflict != ONCONFLICT_NONE) + { + partArbiterIndexes = + proute->partition_arbiter_indexes[leaf_part_index]; + if (onconflict == ONCONFLICT_UPDATE) + partConflTupdesc = + proute->partition_conflproj_tdescs[leaf_part_index]; + } } resultRelationDesc = resultRelInfo->ri_RelationDesc; @@ -512,7 +525,11 @@ ExecInsert(ModifyTableState *mtstate, bool specConflict; List *arbiterIndexes; - arbiterIndexes = node->arbiterIndexes; + /* Use the appropriate list of arbiter indexes. */ + if (mtstate->mt_partition_tuple_routing != NULL) + arbiterIndexes = partArbiterIndexes; + else + arbiterIndexes = node->arbiterIndexes; /* * Do a non-conclusive check for conflicts first. @@ -541,8 +558,16 @@ ExecInsert(ModifyTableState *mtstate, * tuple. */ TupleTableSlot *returning = NULL; + TupleDesc onconfl_tupdesc; + + /* Use the appropriate tuple descriptor. */ + if (mtstate->mt_partition_tuple_routing != NULL) + onconfl_tupdesc = partConflTupdesc; + else + onconfl_tupdesc = mtstate->mt_conflproj_tupdesc; if (ExecOnConflictUpdate(mtstate, resultRelInfo, + onconfl_tupdesc, &conflictTid, planSlot, slot, estate, canSetTag, &returning)) { @@ -1149,6 +1174,18 @@ lreplace:; TupleConversionMap *tupconv_map; /* + * Disallow an INSERT ON CONFLICT DO UPDATE that causes the + * original row to migrate to a different partition. Maybe this + * can be implemented some day, but it seems a fringe feature with + * little redeeming value. 
+ */ + if (((ModifyTable *) mtstate->ps.plan)->onConflictAction == ONCONFLICT_UPDATE) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("invalid ON UPDATE specification"), + errdetail("The result tuple would appear in a different partition than the original tuple."))); + + /* * When an UPDATE is run on a leaf partition, we will not have * partition tuple routing set up. In that case, fail with * partition constraint violation error. @@ -1399,6 +1436,7 @@ lreplace:; static bool ExecOnConflictUpdate(ModifyTableState *mtstate, ResultRelInfo *resultRelInfo, + TupleDesc onConflictSetTupdesc, ItemPointer conflictTid, TupleTableSlot *planSlot, TupleTableSlot *excludedSlot, @@ -1514,6 +1552,7 @@ ExecOnConflictUpdate(ModifyTableState *mtstate, ExecCheckHeapTupleVisible(estate, &tuple, buffer); /* Store target's existing tuple in the state's dedicated slot */ + ExecSetSlotDescriptor(mtstate->mt_existing, RelationGetDescr(relation)); ExecStoreTuple(&tuple, mtstate->mt_existing, buffer, false); /* @@ -1557,6 +1596,7 @@ ExecOnConflictUpdate(ModifyTableState *mtstate, } /* Project the new tuple version */ + ExecSetSlotDescriptor(mtstate->mt_conflproj, onConflictSetTupdesc); ExecProject(resultRelInfo->ri_onConflictSetProj); /* @@ -2163,8 +2203,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags) subplan = (Plan *) lfirst(l); /* Initialize the usesFdwDirectModify flag */ - resultRelInfo->ri_usesFdwDirectModify = bms_is_member(i, - node->fdwDirectModifyPlans); + resultRelInfo->ri_usesFdwDirectModify = + bms_is_member(i, node->fdwDirectModifyPlans); /* * Verify result relation is a valid target for the current operation @@ -2237,7 +2277,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags) if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE && (operation == CMD_INSERT || update_tuple_routing_needed)) mtstate->mt_partition_tuple_routing = - ExecSetupPartitionTupleRouting(mtstate, rel); + ExecSetupPartitionTupleRouting(mtstate, 
rel); /* * Build state for collecting transition tuples. This requires having a @@ -2353,9 +2393,13 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags) econtext = mtstate->ps.ps_ExprContext; relationDesc = resultRelInfo->ri_RelationDesc->rd_att; - /* initialize slot for the existing tuple */ - mtstate->mt_existing = - ExecInitExtraTupleSlot(mtstate->ps.state, relationDesc); + /* + * Initialize slot for the existing tuple. We determine which + * tupleDesc to use for this after we have determined which relation + * the insert/update will be applied to, possibly after performing + * tuple routing. + */ + mtstate->mt_existing = ExecInitExtraTupleSlot(mtstate->ps.state, NULL); /* carried forward solely for the benefit of explain */ mtstate->mt_excludedtlist = node->exclRelTlist; @@ -2363,8 +2407,16 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags) /* create target slot for UPDATE SET projection */ tupDesc = ExecTypeFromTL((List *) node->onConflictSet, relationDesc->tdhasoid); + PinTupleDesc(tupDesc); + mtstate->mt_conflproj_tupdesc = tupDesc; + + /* + * Just like the "existing tuple" slot, we'll defer deciding which + * tupleDesc to use for this slot to a point where tuple routing has + * been performed. + */ mtstate->mt_conflproj = - ExecInitExtraTupleSlot(mtstate->ps.state, tupDesc); + ExecInitExtraTupleSlot(mtstate->ps.state, NULL); /* build UPDATE SET projection state */ resultRelInfo->ri_onConflictSetProj = diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c index b6e658fe81..6eda8be4b1 100644 --- a/src/backend/optimizer/prep/preptlist.c +++ b/src/backend/optimizer/prep/preptlist.c @@ -53,10 +53,6 @@ #include "utils/rel.h" -static List *expand_targetlist(List *tlist, int command_type, - Index result_relation, TupleDesc tupdesc); - - /* * preprocess_targetlist * Driver for preprocessing the parse tree targetlist. 
@@ -227,11 +223,20 @@ preprocess_targetlist(PlannerInfo *root) * while we have the relation open. */ if (parse->onConflict) - parse->onConflict->onConflictSet = - expand_targetlist(parse->onConflict->onConflictSet, - CMD_UPDATE, - result_relation, - RelationGetDescr(target_relation)); + { + Assert(target_relation != NULL); + /* + * For partitioned tables, there is no point in expanding here. + * Instead, we do it once we know which of its partitions is chosen + * for a given tuple, using that partition's tuple descriptor. + */ + if (target_relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE) + parse->onConflict->onConflictSet = + expand_targetlist(parse->onConflict->onConflictSet, + CMD_UPDATE, + result_relation, + RelationGetDescr(target_relation)); + } if (target_relation) heap_close(target_relation, NoLock); @@ -252,7 +257,7 @@ preprocess_targetlist(PlannerInfo *root) * tuple descriptor, add targetlist entries for any missing attributes, and * ensure the non-junk attributes appear in proper field order. */ -static List * +List * expand_targetlist(List *tlist, int command_type, Index result_relation, TupleDesc tupdesc) { diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index d0d9812da6..cb30cfa2d7 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -2354,7 +2354,10 @@ adjust_child_relids_multilevel(PlannerInfo *root, Relids relids, * therefore the TargetEntry nodes are fresh copies that it's okay to * scribble on. * - * Note that this is not needed for INSERT because INSERT isn't inheritable. + * This is also used for INSERT ON CONFLICT DO UPDATE performed on partitioned + * tables, to translate the DO UPDATE SET target list from root parent + * attribute numbers to the chosen partition's attribute numbers, which means + * this function is called from the executor.
*/ static List * adjust_inherited_tlist(List *tlist, AppendRelInfo *context) @@ -2438,6 +2441,46 @@ adjust_inherited_tlist(List *tlist, AppendRelInfo *context) } /* + * Given a targetlist for the parentRel of the given varno, adjust it to be in + * the correct order and to contain all the needed elements for the given + * partition. + */ +List * +adjust_and_expand_partition_tlist(TupleDesc parentDesc, + TupleDesc partitionDesc, + char *partitionRelname, + int parentVarno, + int parentOid, + List *targetlist) +{ + AppendRelInfo appinfo; + List *result_tl; + + /* + * First, fix the target entries' resnos, by using inheritance translation. + */ + appinfo.type = T_AppendRelInfo; + appinfo.parent_relid = parentVarno; + appinfo.parent_reltype = InvalidOid; + appinfo.child_relid = 0; + appinfo.child_reltype = InvalidOid; + appinfo.parent_reloid = parentOid; + appinfo.translated_vars = + make_inh_translation_list(parentDesc, partitionDesc, + partitionRelname, 1); + result_tl = adjust_inherited_tlist((List *) targetlist, &appinfo); + + /* + * Add any attributes that are missing in the source list, such + * as dropped columns in the partition. + */ + result_tl = expand_targetlist(result_tl, CMD_UPDATE, + parentVarno, partitionDesc); + + return result_tl; +} + +/* * adjust_appendrel_attrs_multilevel * Apply Var translations from a toplevel appendrel parent down to a child.
* diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index c3a9617f67..92696f0607 100644 --- a/src/backend/parser/analyze.c +++ b/src/backend/parser/analyze.c @@ -1025,13 +1025,6 @@ transformOnConflictClause(ParseState *pstate, TargetEntry *te; int attno; - if (targetrel->rd_partdesc) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("%s cannot be applied to partitioned table \"%s\"", - "ON CONFLICT DO UPDATE", - RelationGetRelationName(targetrel)))); - /* * All INSERT expressions have been parsed, get ready for potentially * existing SET statements that need to be processed like an UPDATE. diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h index 2faf0ca26e..70ddb225a1 100644 --- a/src/include/catalog/partition.h +++ b/src/include/catalog/partition.h @@ -51,7 +51,7 @@ extern PartitionBoundInfo partition_bounds_copy(PartitionBoundInfo src, extern void check_new_partition_bound(char *relname, Relation parent, PartitionBoundSpec *spec); -extern Oid get_partition_parent(Oid relid); +extern Oid get_partition_parent(Oid relid, bool getroot); extern List *get_qual_from_partbound(Relation rel, Relation parent, PartitionBoundSpec *spec); extern List *map_partition_varattnos(List *expr, int fromrel_varno, diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h index 03a599ad57..93f490233e 100644 --- a/src/include/executor/execPartition.h +++ b/src/include/executor/execPartition.h @@ -90,6 +90,14 @@ typedef struct PartitionDispatchData *PartitionDispatch; * given leaf partition's rowtype after that * partition is chosen for insertion by * tuple-routing. 
+ * partition_conflproj_tdescs Array of TupleDescs per partition, each + * describing the record type of the ON CONFLICT + * DO UPDATE SET target list as applied to a + * given partition + * partition_arbiter_indexes Array of Lists with each slot containing the + * list of OIDs of a given partition's indexes + * that are to be used as arbiter indexes for + * ON CONFLICT checking *----------------------- */ typedef struct PartitionTupleRouting @@ -106,6 +114,8 @@ typedef struct PartitionTupleRouting int num_subplan_partition_offsets; TupleTableSlot *partition_tuple_slot; TupleTableSlot *root_tuple_slot; + TupleDesc *partition_conflproj_tdescs; + List **partition_arbiter_indexes; } PartitionTupleRouting; extern PartitionTupleRouting *ExecSetupPartitionTupleRouting(ModifyTableState *mtstate, diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h index 3b926014b6..0e394c9dce 100644 --- a/src/include/nodes/execnodes.h +++ b/src/include/nodes/execnodes.h @@ -992,6 +992,7 @@ typedef struct ModifyTableState TupleTableSlot *mt_existing; /* slot to store existing target tuple in */ List *mt_excludedtlist; /* the excluded pseudo relation's tlist */ TupleTableSlot *mt_conflproj; /* CONFLICT ... SET ... 
projection target */ + TupleDesc mt_conflproj_tupdesc; /* tuple descriptor for it */ struct PartitionTupleRouting *mt_partition_tuple_routing; /* Tuple-routing support info */ struct TransitionCaptureState *mt_transition_capture; diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h index 38608770a2..4d3e8a9b90 100644 --- a/src/include/optimizer/prep.h +++ b/src/include/optimizer/prep.h @@ -14,6 +14,7 @@ #ifndef PREP_H #define PREP_H +#include "access/tupdesc.h" #include "nodes/plannodes.h" #include "nodes/relation.h" @@ -42,6 +43,9 @@ extern List *preprocess_targetlist(PlannerInfo *root); extern PlanRowMark *get_plan_rowmark(List *rowmarks, Index rtindex); +extern List *expand_targetlist(List *tlist, int command_type, + Index result_relation, TupleDesc tupdesc); + /* * prototypes for prepunion.c */ @@ -65,4 +69,11 @@ extern SpecialJoinInfo *build_child_join_sjinfo(PlannerInfo *root, extern Relids adjust_child_relids_multilevel(PlannerInfo *root, Relids relids, Relids child_relids, Relids top_parent_relids); +extern List *adjust_and_expand_partition_tlist(TupleDesc parentDesc, + TupleDesc partitionDesc, + char *partitionRelname, + int parentVarno, + int parentOid, + List *targetlist); + #endif /* PREP_H */ diff --git a/src/test/regress/expected/insert_conflict.out b/src/test/regress/expected/insert_conflict.out index 2650faedee..a9677f06e6 100644 --- a/src/test/regress/expected/insert_conflict.out +++ b/src/test/regress/expected/insert_conflict.out @@ -786,16 +786,67 @@ select * from selfconflict; (3 rows) drop table selfconflict; --- check that the following works: --- insert into partitioned_table on conflict do nothing -create table parted_conflict_test (a int, b char) partition by list (a); -create table parted_conflict_test_1 partition of parted_conflict_test (b unique) for values in (1); +-- check ON CONFLICT handling with partitioned tables +create table parted_conflict_test (a int unique, b char) partition by list (a); +create table 
parted_conflict_test_1 partition of parted_conflict_test (b unique) for values in (1, 2); +-- no indexes required here insert into parted_conflict_test values (1, 'a') on conflict do nothing; -insert into parted_conflict_test values (1, 'a') on conflict do nothing; --- however, on conflict do update is not supported yet -insert into parted_conflict_test values (1) on conflict (b) do update set a = excluded.a; -ERROR: ON CONFLICT DO UPDATE cannot be applied to partitioned table "parted_conflict_test" --- but it works OK if we target the partition directly -insert into parted_conflict_test_1 values (1) on conflict (b) do -update set a = excluded.a; +-- index on a required, which does exist in parent +insert into parted_conflict_test values (1, 'a') on conflict (a) do nothing; +insert into parted_conflict_test values (1, 'a') on conflict (a) do update set b = excluded.b; +-- targeting partition directly will work +insert into parted_conflict_test_1 values (1, 'a') on conflict (a) do nothing; +insert into parted_conflict_test_1 values (1, 'b') on conflict (a) do update set b = excluded.b; +-- index on b required, which doesn't exist in parent +insert into parted_conflict_test values (2, 'b') on conflict (b) do update set a = excluded.a; +ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification +-- targeting partition directly will work +insert into parted_conflict_test_1 values (2, 'b') on conflict (b) do update set a = excluded.a; +-- should see (2, 'b') +select * from parted_conflict_test order by a; + a | b +---+--- + 2 | b +(1 row) + +-- now check that DO UPDATE works correctly for target partition with +-- different attribute numbers +create table parted_conflict_test_2 (b char, a int unique); +alter table parted_conflict_test attach partition parted_conflict_test_2 for values in (3); +truncate parted_conflict_test; +insert into parted_conflict_test values (3, 'a') on conflict (a) do update set b = excluded.b; +insert into 
parted_conflict_test values (3, 'b') on conflict (a) do update set b = excluded.b; +-- should see (3, 'b') +select * from parted_conflict_test order by a; + a | b +---+--- + 3 | b +(1 row) + +-- case where parent will have a dropped column, but the partition won't +alter table parted_conflict_test drop b, add b char; +create table parted_conflict_test_3 partition of parted_conflict_test for values in (4); +truncate parted_conflict_test; +insert into parted_conflict_test (a, b) values (4, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test (a, b) values (4, 'b') on conflict (a) do update set b = excluded.b where parted_conflict_test.b = 'a'; +-- should see (4, 'b') +select * from parted_conflict_test order by a; + a | b +---+--- + 4 | b +(1 row) + +-- case with multi-level partitioning +create table parted_conflict_test_4 partition of parted_conflict_test for values in (5) partition by list (a); +create table parted_conflict_test_4_1 partition of parted_conflict_test_4 for values in (5); +truncate parted_conflict_test; +insert into parted_conflict_test (a, b) values (5, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test (a, b) values (5, 'b') on conflict (a) do update set b = excluded.b where parted_conflict_test.b = 'a'; +-- should see (5, 'b') +select * from parted_conflict_test order by a; + a | b +---+--- + 5 | b +(1 row) + drop table parted_conflict_test; diff --git a/src/test/regress/sql/insert_conflict.sql b/src/test/regress/sql/insert_conflict.sql index 32c647e3f8..73122479a3 100644 --- a/src/test/regress/sql/insert_conflict.sql +++ b/src/test/regress/sql/insert_conflict.sql @@ -472,15 +472,59 @@ select * from selfconflict; drop table selfconflict; --- check that the following works: --- insert into partitioned_table on conflict do nothing -create table parted_conflict_test (a int, b char) partition by list (a); -create table parted_conflict_test_1 partition of parted_conflict_test (b unique) 
for values in (1); +-- check ON CONFLICT handling with partitioned tables +create table parted_conflict_test (a int unique, b char) partition by list (a); +create table parted_conflict_test_1 partition of parted_conflict_test (b unique) for values in (1, 2); + +-- no indexes required here insert into parted_conflict_test values (1, 'a') on conflict do nothing; -insert into parted_conflict_test values (1, 'a') on conflict do nothing; --- however, on conflict do update is not supported yet -insert into parted_conflict_test values (1) on conflict (b) do update set a = excluded.a; --- but it works OK if we target the partition directly -insert into parted_conflict_test_1 values (1) on conflict (b) do -update set a = excluded.a; + +-- index on a required, which does exist in parent +insert into parted_conflict_test values (1, 'a') on conflict (a) do nothing; +insert into parted_conflict_test values (1, 'a') on conflict (a) do update set b = excluded.b; + +-- targeting partition directly will work +insert into parted_conflict_test_1 values (1, 'a') on conflict (a) do nothing; +insert into parted_conflict_test_1 values (1, 'b') on conflict (a) do update set b = excluded.b; + +-- index on b required, which doesn't exist in parent +insert into parted_conflict_test values (2, 'b') on conflict (b) do update set a = excluded.a; + +-- targeting partition directly will work +insert into parted_conflict_test_1 values (2, 'b') on conflict (b) do update set a = excluded.a; + +-- should see (2, 'b') +select * from parted_conflict_test order by a; + +-- now check that DO UPDATE works correctly for target partition with +-- different attribute numbers +create table parted_conflict_test_2 (b char, a int unique); +alter table parted_conflict_test attach partition parted_conflict_test_2 for values in (3); +truncate parted_conflict_test; +insert into parted_conflict_test values (3, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test values (3, 'b') on 
conflict (a) do update set b = excluded.b; + +-- should see (3, 'b') +select * from parted_conflict_test order by a; + +-- case where parent will have a dropped column, but the partition won't +alter table parted_conflict_test drop b, add b char; +create table parted_conflict_test_3 partition of parted_conflict_test for values in (4); +truncate parted_conflict_test; +insert into parted_conflict_test (a, b) values (4, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test (a, b) values (4, 'b') on conflict (a) do update set b = excluded.b where parted_conflict_test.b = 'a'; + +-- should see (4, 'b') +select * from parted_conflict_test order by a; + +-- case with multi-level partitioning +create table parted_conflict_test_4 partition of parted_conflict_test for values in (5) partition by list (a); +create table parted_conflict_test_4_1 partition of parted_conflict_test_4 for values in (5); +truncate parted_conflict_test; +insert into parted_conflict_test (a, b) values (5, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test (a, b) values (5, 'b') on conflict (a) do update set b = excluded.b where parted_conflict_test.b = 'a'; + +-- should see (5, 'b') +select * from parted_conflict_test order by a; + drop table parted_conflict_test; -- 2.11.0