Fujita-san, Thanks for the review.
On 2018/03/05 22:00, Etsuro Fujita wrote: > I started reviewing this. I think the analysis you mentioned upthread > would be correct, but I'm not sure the patch is the right way to go > because I think that exception handling added by the patch throughout > ExecInsert such as: > > @@ -408,7 +408,10 @@ ExecInsert(ModifyTableState *mtstate, > slot = ExecBRInsertTriggers(estate, resultRelInfo, slot); > > if (slot == NULL) /* "do nothing" */ > - return NULL; > + { > + result = NULL; > + goto restore_estate_result_rel; > + } > > would reduce the code readability. Hmm yeah, it is a bit hacky. > An alternative fix for this would be to handle the set/reset of > estate->es_result_relation_info in a higher level ie, ExecModifyTable, > like the attached: (1) before calling ExecInsert, we do the preparation > work for tuple routing of one tuple (eg, determine the partition for the > tuple and convert the format of the tuple in a separate function, which > also sets estate->es_result_relation_info to the partition for ExecInsert > to work on it; then (2) we call ExecInsert, which just inserts into the > partition; then (3) we reset estate->es_result_relation_info back to the > root partitioned table. One good thing about the alternative is to return > ExecInsert somehow to PG9.6, which would make maintenance easier. COPY > has the same issue, so the attached also fixes that. (Maybe we could do > some refactoring to use the same code in both cases, but I'm not sure we > should do that as a fix.) What do you think about the alternative? Your patch seems like a good cleanup overall, fixes this bug, and seems to make future maintenance easier. So, +1. I agree that doing a surgery like this on COPY is better left for later. I noticed (as you may have too) that it cannot be applied to the 10 branch as is. I made the necessary changes so that it can be. See attached patch with filename suffixed "-10backpatch". Also attached is your patch unchanged to have both of them together. Thanks, Amit
*** a/src/backend/commands/copy.c --- b/src/backend/commands/copy.c *************** *** 2768,2779 **** CopyFrom(CopyState cstate) * tuples inserted by an INSERT command. */ processed++; ! if (saved_resultRelInfo) ! { ! resultRelInfo = saved_resultRelInfo; ! estate->es_result_relation_info = resultRelInfo; ! } } } --- 2768,2780 ---- * tuples inserted by an INSERT command. */ processed++; + } ! /* Restore the saved ResultRelInfo */ ! if (saved_resultRelInfo) ! { ! resultRelInfo = saved_resultRelInfo; ! estate->es_result_relation_info = resultRelInfo; } } *** a/src/backend/executor/nodeModifyTable.c --- b/src/backend/executor/nodeModifyTable.c *************** *** 67,72 **** static void ExecSetupChildParentMapForTcs(ModifyTableState *mtstate); --- 67,77 ---- static void ExecSetupChildParentMapForSubplan(ModifyTableState *mtstate); static TupleConversionMap *tupconv_map_for_subplan(ModifyTableState *node, int whichplan); + static TupleTableSlot *ExecPrepareTupleRouting(ModifyTableState *mtstate, + PartitionTupleRouting *proute, + EState *estate, + ResultRelInfo *rootRelInfo, + TupleTableSlot *slot); /* * Verify that the tuples to be produced by INSERT or UPDATE match the *************** *** 265,271 **** ExecInsert(ModifyTableState *mtstate, { HeapTuple tuple; ResultRelInfo *resultRelInfo; - ResultRelInfo *saved_resultRelInfo = NULL; Relation resultRelationDesc; Oid newId; List *recheckIndexes = NIL; --- 270,275 ---- *************** *** 282,381 **** ExecInsert(ModifyTableState *mtstate, * get information on the (current) result relation */ resultRelInfo = estate->es_result_relation_info; - - /* Determine the partition to heap_insert the tuple into */ - if (mtstate->mt_partition_tuple_routing) - { - int leaf_part_index; - PartitionTupleRouting *proute = mtstate->mt_partition_tuple_routing; - - /* - * Away we go ... If we end up not finding a partition after all, - * ExecFindPartition() does not return and errors out instead. - * Otherwise, the returned value is to be used as an index into arrays - * proute->partitions[] and proute->partition_tupconv_maps[] that will - * get us the ResultRelInfo and TupleConversionMap for the partition, - * respectively. - */ - leaf_part_index = ExecFindPartition(resultRelInfo, - proute->partition_dispatch_info, - slot, - estate); - Assert(leaf_part_index >= 0 && - leaf_part_index < proute->num_partitions); - - /* - * Save the old ResultRelInfo and switch to the one corresponding to - * the selected partition. (We might need to initialize it first.) - */ - saved_resultRelInfo = resultRelInfo; - resultRelInfo = proute->partitions[leaf_part_index]; - if (resultRelInfo == NULL) - { - resultRelInfo = ExecInitPartitionInfo(mtstate, - saved_resultRelInfo, - proute, estate, - leaf_part_index); - Assert(resultRelInfo != NULL); - } - - /* We do not yet have a way to insert into a foreign partition */ - if (resultRelInfo->ri_FdwRoutine) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("cannot route inserted tuples to a foreign table"))); - - /* For ExecInsertIndexTuples() to work on the partition's indexes */ - estate->es_result_relation_info = resultRelInfo; - - /* - * If we're capturing transition tuples, we might need to convert from - * the partition rowtype to parent rowtype. - */ - if (mtstate->mt_transition_capture != NULL) - { - if (resultRelInfo->ri_TrigDesc && - (resultRelInfo->ri_TrigDesc->trig_insert_before_row || - resultRelInfo->ri_TrigDesc->trig_insert_instead_row)) - { - /* - * If there are any BEFORE or INSTEAD triggers on the - * partition, we'll have to be ready to convert their result - * back to tuplestore format. - */ - mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL; - - mtstate->mt_transition_capture->tcs_map = - TupConvMapForLeaf(proute, saved_resultRelInfo, - leaf_part_index); - } - else - { - /* - * Otherwise, just remember the original unconverted tuple, to - * avoid a needless round trip conversion. - */ - mtstate->mt_transition_capture->tcs_original_insert_tuple = tuple; - mtstate->mt_transition_capture->tcs_map = NULL; - } - } - if (mtstate->mt_oc_transition_capture != NULL) - { - mtstate->mt_oc_transition_capture->tcs_map = - TupConvMapForLeaf(proute, saved_resultRelInfo, - leaf_part_index); - } - - /* - * We might need to convert from the parent rowtype to the partition - * rowtype. - */ - tuple = ConvertPartitionTupleSlot(proute->parent_child_tupconv_maps[leaf_part_index], - tuple, - proute->partition_tuple_slot, - &slot); - } - resultRelationDesc = resultRelInfo->ri_RelationDesc; /* --- 286,291 ---- *************** *** 495,501 **** ExecInsert(ModifyTableState *mtstate, * No need though if the tuple has been routed, and a BR trigger * doesn't exist. */ ! if (saved_resultRelInfo != NULL && !(resultRelInfo->ri_TrigDesc && resultRelInfo->ri_TrigDesc->trig_insert_before_row)) check_partition_constr = false; --- 405,411 ---- * No need though if the tuple has been routed, and a BR trigger * doesn't exist. */ ! if (resultRelInfo->ri_PartitionRoot != NULL && !(resultRelInfo->ri_TrigDesc && resultRelInfo->ri_TrigDesc->trig_insert_before_row)) check_partition_constr = false; *************** *** 686,694 **** ExecInsert(ModifyTableState *mtstate, if (resultRelInfo->ri_projectReturning) result = ExecProcessReturning(resultRelInfo, slot, planSlot); - if (saved_resultRelInfo) - estate->es_result_relation_info = saved_resultRelInfo; - return result; } --- 596,601 ---- *************** *** 1209,1230 **** lreplace:; proute->root_tuple_slot, &slot); ! ! /* ! * For ExecInsert(), make it look like we are inserting into the ! * root. ! */ Assert(mtstate->rootResultRelInfo != NULL); ! estate->es_result_relation_info = mtstate->rootResultRelInfo; ret_slot = ExecInsert(mtstate, slot, planSlot, NULL, ONCONFLICT_NONE, estate, canSetTag); /* ! * Revert back the active result relation and the active ! * transition capture map that we changed above. */ estate->es_result_relation_info = resultRelInfo; if (mtstate->mt_transition_capture) { mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL; --- 1116,1139 ---- proute->root_tuple_slot, &slot); ! /* Prepare for tuple routing of the tuple */ Assert(mtstate->rootResultRelInfo != NULL); ! slot = ExecPrepareTupleRouting(mtstate, proute, estate, ! mtstate->rootResultRelInfo, slot); ret_slot = ExecInsert(mtstate, slot, planSlot, NULL, ONCONFLICT_NONE, estate, canSetTag); /* ! * Revert back the active result relation that ! * ExecPrepareTupleRouting would have changed. */ estate->es_result_relation_info = resultRelInfo; + + /* + * Also, revert back the active transition capture map that we + * changed above. + */ if (mtstate->mt_transition_capture) { mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL; *************** *** 1835,1840 **** tupconv_map_for_subplan(ModifyTableState *mtstate, int whichplan) --- 1744,1851 ---- } } + /* + * ExecPrepareTupleRouting --- prepare for tuple routing of one tuple + */ + static TupleTableSlot * + ExecPrepareTupleRouting(ModifyTableState *mtstate, + PartitionTupleRouting *proute, + EState *estate, + ResultRelInfo *rootRelInfo, + TupleTableSlot *slot) + { + ResultRelInfo *resultRelInfo; + HeapTuple tuple; + int leaf_part_index; + + Assert(proute != NULL); + + /* + * Away we go ... If we end up not finding a partition after all, + * ExecFindPartition() does not return and errors out instead. + * Otherwise, the returned value is to be used as an index into arrays + * proute->partitions[] and proute->partition_tupconv_maps[] that will + * get us the ResultRelInfo and TupleConversionMap for the partition, + * respectively. + */ + leaf_part_index = ExecFindPartition(rootRelInfo, + proute->partition_dispatch_info, + slot, + estate); + Assert(leaf_part_index >= 0 && leaf_part_index < proute->num_partitions); + + /* Get the ResultRelInfo corresponding to the selected partition. */ + resultRelInfo = proute->partitions[leaf_part_index]; + if (resultRelInfo == NULL) + { + resultRelInfo = ExecInitPartitionInfo(mtstate, rootRelInfo, + proute, estate, + leaf_part_index); + Assert(resultRelInfo != NULL); + } + + /* We do not yet have a way to insert into a foreign partition */ + if (resultRelInfo->ri_FdwRoutine) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot route inserted tuples to a foreign table"))); + + /* + * For ExecInsert(), make it look like we are inserting into the partition. + */ + estate->es_result_relation_info = resultRelInfo; + + /* Get the heap tuple out of the tuple table slot. */ + tuple = ExecMaterializeSlot(slot); + + /* + * If we're capturing transition tuples, we might need to convert from the + * partition rowtype to parent rowtype. + */ + if (mtstate->mt_transition_capture != NULL) + { + if (resultRelInfo->ri_TrigDesc && + (resultRelInfo->ri_TrigDesc->trig_insert_before_row || + resultRelInfo->ri_TrigDesc->trig_insert_instead_row)) + { + /* + * If there are any BEFORE or INSTEAD triggers on the partition, + * we'll have to be ready to convert their result back to + * tuplestore format. + */ + mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL; + mtstate->mt_transition_capture->tcs_map = + TupConvMapForLeaf(proute, rootRelInfo, leaf_part_index); + } + else + { + /* + * Otherwise, just remember the original unconverted tuple, to + * avoid a needless round trip conversion. + */ + mtstate->mt_transition_capture->tcs_original_insert_tuple = tuple; + mtstate->mt_transition_capture->tcs_map = NULL; + } + } + if (mtstate->mt_oc_transition_capture != NULL) + { + mtstate->mt_oc_transition_capture->tcs_map = + TupConvMapForLeaf(proute, rootRelInfo, leaf_part_index); + } + + /* + * We might need to convert from the parent rowtype to the partition + * rowtype. + */ + tuple = ConvertPartitionTupleSlot(proute->parent_child_tupconv_maps[leaf_part_index], + tuple, + proute->partition_tuple_slot, + &slot); + + return slot; + } + + /* ---------------------------------------------------------------- * ExecModifyTable * *************** *** 1846,1851 **** static TupleTableSlot * --- 1857,1863 ---- ExecModifyTable(PlanState *pstate) { ModifyTableState *node = castNode(ModifyTableState, pstate); + PartitionTupleRouting *proute = node->mt_partition_tuple_routing; EState *estate = node->ps.state; CmdType operation = node->operation; ResultRelInfo *saved_resultRelInfo; *************** *** 2051,2059 **** ExecModifyTable(PlanState *pstate) --- 2063,2082 ---- switch (operation) { case CMD_INSERT: + + /* Prepare for tuple routing of the tuple if needed */ + if (proute) + slot = ExecPrepareTupleRouting(node, proute, estate, + resultRelInfo, slot); slot = ExecInsert(node, slot, planSlot, node->mt_arbiterindexes, node->mt_onconflict, estate, node->canSetTag); + /* + * Revert back the active result relation that + * ExecPrepareTupleRouting would have changed. + */ + if (proute) + estate->es_result_relation_info = resultRelInfo; break; case CMD_UPDATE: slot = ExecUpdate(node, tupleid, oldtuple, slot, planSlot,
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index ef301b6328..ef1c5337f5 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -2803,12 +2803,13 @@ CopyFrom(CopyState cstate) * tuples inserted by an INSERT command. */ processed++; + } - if (saved_resultRelInfo) - { - resultRelInfo = saved_resultRelInfo; - estate->es_result_relation_info = resultRelInfo; - } + /* Restore the saved ResultRelInfo */ + if (saved_resultRelInfo) + { + resultRelInfo = saved_resultRelInfo; + estate->es_result_relation_info = resultRelInfo; } } diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index 4742593374..d3c459282c 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -259,7 +259,6 @@ ExecInsert(ModifyTableState *mtstate, { HeapTuple tuple; ResultRelInfo *resultRelInfo; - ResultRelInfo *saved_resultRelInfo = NULL; Relation resultRelationDesc; Oid newId; List *recheckIndexes = NIL; @@ -275,100 +274,6 @@ ExecInsert(ModifyTableState *mtstate, * get information on the (current) result relation */ resultRelInfo = estate->es_result_relation_info; - - /* Determine the partition to heap_insert the tuple into */ - if (mtstate->mt_partition_dispatch_info) - { - int leaf_part_index; - TupleConversionMap *map; - - /* - * Away we go ... If we end up not finding a partition after all, - * ExecFindPartition() does not return and errors out instead. - * Otherwise, the returned value is to be used as an index into arrays - * mt_partitions[] and mt_partition_tupconv_maps[] that will get us - * the ResultRelInfo and TupleConversionMap for the partition, - * respectively. - */ - leaf_part_index = ExecFindPartition(resultRelInfo, - mtstate->mt_partition_dispatch_info, - slot, - estate); - Assert(leaf_part_index >= 0 && - leaf_part_index < mtstate->mt_num_partitions); - - /* - * Save the old ResultRelInfo and switch to the one corresponding to - * the selected partition. - */ - saved_resultRelInfo = resultRelInfo; - resultRelInfo = mtstate->mt_partitions + leaf_part_index; - - /* We do not yet have a way to insert into a foreign partition */ - if (resultRelInfo->ri_FdwRoutine) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("cannot route inserted tuples to a foreign table"))); - - /* For ExecInsertIndexTuples() to work on the partition's indexes */ - estate->es_result_relation_info = resultRelInfo; - - /* - * If we're capturing transition tuples, we might need to convert from - * the partition rowtype to parent rowtype. - */ - if (mtstate->mt_transition_capture != NULL) - { - if (resultRelInfo->ri_TrigDesc && - (resultRelInfo->ri_TrigDesc->trig_insert_before_row || - resultRelInfo->ri_TrigDesc->trig_insert_instead_row)) - { - /* - * If there are any BEFORE or INSTEAD triggers on the - * partition, we'll have to be ready to convert their result - * back to tuplestore format. - */ - mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL; - mtstate->mt_transition_capture->tcs_map = - mtstate->mt_transition_tupconv_maps[leaf_part_index]; - } - else - { - /* - * Otherwise, just remember the original unconverted tuple, to - * avoid a needless round trip conversion. - */ - mtstate->mt_transition_capture->tcs_original_insert_tuple = tuple; - mtstate->mt_transition_capture->tcs_map = NULL; - } - } - if (mtstate->mt_oc_transition_capture != NULL) - mtstate->mt_oc_transition_capture->tcs_map = - mtstate->mt_transition_tupconv_maps[leaf_part_index]; - - /* - * We might need to convert from the parent rowtype to the partition - * rowtype. - */ - map = mtstate->mt_partition_tupconv_maps[leaf_part_index]; - if (map) - { - Relation partrel = resultRelInfo->ri_RelationDesc; - - tuple = do_convert_tuple(tuple, map); - - /* - * We must use the partition's tuple descriptor from this point - * on, until we're finished dealing with the partition. Use the - * dedicated slot for that. - */ - slot = mtstate->mt_partition_tuple_slot; - Assert(slot != NULL); - ExecSetSlotDescriptor(slot, RelationGetDescr(partrel)); - ExecStoreTuple(tuple, slot, InvalidBuffer, true); - } - } - resultRelationDesc = resultRelInfo->ri_RelationDesc; /* @@ -477,7 +382,7 @@ ExecInsert(ModifyTableState *mtstate, * No need though if the tuple has been routed, and a BR trigger * doesn't exist. */ - if (saved_resultRelInfo != NULL && + if (resultRelInfo->ri_PartitionRoot != NULL && !(resultRelInfo->ri_TrigDesc && resultRelInfo->ri_TrigDesc->trig_insert_before_row)) check_partition_constr = false; @@ -645,9 +550,6 @@ ExecInsert(ModifyTableState *mtstate, if (resultRelInfo->ri_projectReturning) result = ExecProcessReturning(resultRelInfo, slot, planSlot); - if (saved_resultRelInfo) - estate->es_result_relation_info = saved_resultRelInfo; - return result; } @@ -1378,6 +1280,110 @@ ExecOnConflictUpdate(ModifyTableState *mtstate, return true; } +/* + * ExecPrepareTupleRouting --- prepare for tuple routing of one tuple + */ +static TupleTableSlot * +ExecPrepareTupleRouting(ModifyTableState *mtstate, + EState *estate, + ResultRelInfo *rootRelInfo, + TupleTableSlot *slot) +{ + ResultRelInfo *resultRelInfo; + HeapTuple tuple; + int leaf_part_index; + TupleConversionMap *map; + + /* + * Away we go ... If we end up not finding a partition after all, + * ExecFindPartition() does not return and errors out instead. + * Otherwise, the returned value is to be used as an index into arrays + * mtstate->mt_partitions[] and mtstate->mt_partition_tupconv_maps[] that + * will get us the ResultRelInfo and TupleConversionMap for the partition, + * respectively. + */ + leaf_part_index = ExecFindPartition(rootRelInfo, + mtstate->mt_partition_dispatch_info, + slot, + estate); + Assert(leaf_part_index >= 0 && + leaf_part_index < mtstate->mt_num_partitions); + + /* Get the ResultRelInfo corresponding to the selected partition. */ + resultRelInfo = &mtstate->mt_partitions[leaf_part_index]; + + /* We do not yet have a way to insert into a foreign partition */ + if (resultRelInfo->ri_FdwRoutine) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot route inserted tuples to a foreign table"))); + + /* + * For ExecInsert(), make it look like we are inserting into the partition. + */ + estate->es_result_relation_info = resultRelInfo; + + /* Get the heap tuple out of the tuple table slot. */ + tuple = ExecMaterializeSlot(slot); + + /* + * If we're capturing transition tuples, we might need to convert from the + * partition rowtype to parent rowtype. + */ + if (mtstate->mt_transition_capture != NULL) + { + if (resultRelInfo->ri_TrigDesc && + (resultRelInfo->ri_TrigDesc->trig_insert_before_row || + resultRelInfo->ri_TrigDesc->trig_insert_instead_row)) + { + /* + * If there are any BEFORE or INSTEAD triggers on the partition, + * we'll have to be ready to convert their result back to + * tuplestore format. + */ + mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL; + mtstate->mt_transition_capture->tcs_map = + mtstate->mt_transition_tupconv_maps[leaf_part_index]; + } + else + { + /* + * Otherwise, just remember the original unconverted tuple, to + * avoid a needless round trip conversion. + */ + mtstate->mt_transition_capture->tcs_original_insert_tuple = tuple; + mtstate->mt_transition_capture->tcs_map = NULL; + } + } + if (mtstate->mt_oc_transition_capture != NULL) + mtstate->mt_oc_transition_capture->tcs_map = + mtstate->mt_transition_tupconv_maps[leaf_part_index]; + + /* + * We might need to convert from the parent rowtype to the partition + * rowtype. + */ + map = mtstate->mt_partition_tupconv_maps[leaf_part_index]; + if (map) + { + Relation partrel = resultRelInfo->ri_RelationDesc; + + tuple = do_convert_tuple(tuple, map); + + /* + * We must use the partition's tuple descriptor from this point + * on, until we're finished dealing with the partition. Use the + * dedicated slot for that. + */ + slot = mtstate->mt_partition_tuple_slot; + Assert(slot != NULL); + ExecSetSlotDescriptor(slot, RelationGetDescr(partrel)); + ExecStoreTuple(tuple, slot, InvalidBuffer, true); + } + + return slot; +} + /* * Process BEFORE EACH STATEMENT triggers @@ -1545,6 +1551,7 @@ ExecSetupTransitionCaptureState(ModifyTableState *mtstate, EState *estate) } } + /* ---------------------------------------------------------------- * ExecModifyTable * @@ -1763,9 +1770,20 @@ ExecModifyTable(PlanState *pstate) switch (operation) { case CMD_INSERT: + + /* Prepare for tuple routing of the tuple if needed */ + if (node->mt_partition_dispatch_info) + slot = ExecPrepareTupleRouting(node, estate, + resultRelInfo, slot); slot = ExecInsert(node, slot, planSlot, node->mt_arbiterindexes, node->mt_onconflict, estate, node->canSetTag); + /* + * Revert back the active result relation that + * ExecPrepareTupleRouting would have changed. + */ + if (node->mt_partition_dispatch_info) + estate->es_result_relation_info = resultRelInfo; break; case CMD_UPDATE: slot = ExecUpdate(node, tupleid, oldtuple, slot, planSlot,