On 13/07/2020 08:47, Amit Langote wrote:
It's been over 11 months since there was any significant commentary on
the contents of the patches themselves, so perhaps I should reiterate
what the patches are about and why it might still be a good idea to
consider them.
The thread started with some very valid criticism of the way
executor's partition tuple routing logic looks randomly sprinkled over
in nodeModifyTable.c, execPartition.c. In the process of making it
look less random, we decided to get rid of the global variable
es_result_relation_info to avoid complex maneuvers of
setting/resetting it correctly when performing partition tuple
routing, causing some other churn beside the partitioning code. Same
with another global variable TransitionCaptureState.tcs_map. So, the
patches neither add any new capabilities, nor improve performance, but
they do make the code in this area a bit easier to follow.
Actually, there is a problem that some of the changes here conflict
with patches being discussed on other threads ([1], [2]), so much so
that I decided to absorb some changes here into another "refactoring"
patch that I have posted at [2].
Thanks for the summary. It's been a bit hard to follow what depends on
what across these threads, and how they work together. It seems that
this patch set is the best place to start.
Attached rebased patches.
0001 contains preparatory FDW API changes to stop relying on
es_result_relation_info being set correctly.
Makes sense. The only thing I don't like about this is the way the
ForeignScan->resultRelIndex field is set. make_foreignscan() initializes
it to -1, and the FDW's PlanDirectModify() function is expected to set
it, like you did in postgres_fdw:
@@ -2319,6 +2322,11 @@ postgresPlanDirectModify(PlannerInfo *root,
rebuild_fdw_scan_tlist(fscan, returningList);
}
+ /*
+ * Set the index of the subplan result rel.
+ */
+ fscan->resultRelIndex = subplan_index;
+
table_close(rel, NoLock);
return true;
}
It has to be set to that value (subplan_index is an argument to
PlanDirectModify()), the FDW doesn't have any choice there, so this is
just additional boilerplate code that has to be copied to every FDW that
implements direct modify. Furthermore, if the FDW doesn't set it
correctly, you could have some very interesting results, like updating
wrong table. It would be better to set it in make_modifytable(), just
after calling PlanDirectModify().
I'm also a bit unhappy with the way it's updated in set_plan_refs():
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -904,6 +904,13 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
rc->rti += rtoffset;
rc->prti += rtoffset;
}
+ /*
+ * Caution: Do not change the relative ordering
of this loop
+ * and the statement below that adds the result
relations to
+ * root->glob->resultRelations, because we need
to use the
+ * current value of
list_length(root->glob->resultRelations)
+ * in some plans.
+ */
foreach(l, splan->plans)
{
lfirst(l) = set_plan_refs(root,
@@ -1243,6 +1250,14 @@ set_foreignscan_references(PlannerInfo *root,
}
fscan->fs_relids = offset_relid_set(fscan->fs_relids, rtoffset);
+
+ /*
+ * Adjust resultRelIndex if it's valid (note that we are called before
+ * adding the RT indexes of ModifyTable result relations to the global
+ * list)
+ */
+ if (fscan->resultRelIndex >= 0)
+ fscan->resultRelIndex +=
list_length(root->glob->resultRelations);
}
/*
That "Caution" comment is well deserved, but could we make this more
robust to begin with? The most straightforward solution would be to pass
down the "current resultRelIndex" as an extra parameter to
set_plan_refs(), similar to rtoffset. If we did that, we wouldn't
actually need to set it before setrefs.c processing at all.
I'm a bit wary of adding another argument to set_plan_refs() because
that's a lot of code churn, but it does seem like the most natural
solution to me. Maybe create a new context struct to hold the
PlannerInfo, rtoffset, and the new "currentResultRelIndex" value,
similar to fix_scan_expr_context, to avoid passing through so many
arguments.
Another idea is to merge "resultRelIndex" and a "range table index" into
one value. Range table entries that are updated would have a
ResultRelInfo, others would not. I'm not sure if that would end up being
cleaner or messier than what we have now, but might be worth trying.
0002 removes es_result_relation_info in favor passing the active
result relation around as a parameter in the various functions that
need it
Looks good.
0003 Moves UPDATE tuple-routing logic into a new function
0004 removes the global variable TransitionCaptureState.tcs_map which
needed to be set/reset whenever the active result relation relation
changes in favor of a new field in ResultRelInfo to store the same map
I didn't look closely, but these make sense at a quick glance.
- Heikki