On Fri, Mar 9, 2018 at 6:55 AM, Pavan Deolasee <pavan.deola...@gmail.com> wrote: >> * Is this actually needed at all?: >> >> > + /* In MERGE when and condition, no system column is allowed */ >> > + if (pstate->p_expr_kind == EXPR_KIND_MERGE_WHEN_AND && >> > + attnum < InvalidAttrNumber && >> > + !(attnum == TableOidAttributeNumber || attnum == >> > ObjectIdAttributeNumber)) >> > + ereport(ERROR, >> > + (errcode(ERRCODE_INVALID_COLUMN_REFERENCE), >> > + errmsg("system column \"%s\" reference in WHEN AND >> > condition is invalid", >> > + colname), >> > + parser_errposition(pstate, location))); >> >> We're talking about the scantuple here. It's not like excluded.*.
I often care about things like system columns not because of the user-visible functionality, but because it reassures me that the design is robust. >> I think that this is a blocker, unfortunately. > > You mean OVERRIDING or ruleutils? I meant OVERRIDING, but ruleutils seems like something we need, too. >> >> * I still think we probably need to add something to ruleutils.c, so >> >> that MERGE Query structs can be deparsed -- see get_query_def(). >> > >> > Ok. I will look at it. Not done in this version though. >> >> I also wonder what it would take to support referencing CTEs. Other >> implementations do have this. Why can't we? > > > Hmm. I will look at them. But TBH I would like to postpone them to v12 if > they turn out to be tricky. We have already covered a very large ground in > the last few weeks, but we're approaching the feature freeze date. I quickly implemented CTE support myself (not wCTE support, since MERGE doesn't use RETURNING), and it wasn't tricky. It seems to work when I mechanically duplicate the approach taken with other types of DML statement in the parser. I have written a few tests, and so far it holds up. I also undertook something quite a bit harder: Changing the representation of the range table from parse analysis on. As I mentioned in passing at one point, I'm concerned about the fact that there are two RTEs for the target relation in all cases. You introduced a separate Query.resultRelation-style RTI index, Query.mergeTarget_relation, and we see stuff like this through every stage of query processing, from parse analysis right through to execution. One problem with the existing approach is that it leaves many cases where EXPLAIN MERGE shows the target relation alias "t" as "t_1" for some planner nodes, though not for others. More importantly, I suspect that having two distinct RTEs has introduced special cases that are not really needed, and will be more bug-prone (in fact, there are bugs already, which I'll get to at the end). I think that it's fair to say that what happens in the patch with EvalPlanQual()'s RTI argument is ugly, especially because we also have a separate resultRelInfo that we *don't* use. We should aspire to have a MERGE implementation that isn't terribly different to UPDATE or DELETE, especially within the executor. I wrote a very rough patch that arranged for the MERGE rtable to have only a single relation RTE. My approach was to teach transformFromClauseItem() and friends to recognize that they shouldn't create a new RTE for the MERGE target RangeVar. I actually added some hard coding to getRTEForSpecialRelationTypes() for this, which is ugly as sin, but the general approach is likely salvageable (we could invent a special type of RangeVar to do this, perhaps). The point here is that everything just works if there isn't a separate RTE for the join's leftarg and the setTargetTable() target, with exactly one exception: partitioning becomes *thoroughly* broken. Every single partitioning test fails with "ERROR: no relation entry for relid 1", and occasionally some other "can't happen" error. This looks like it would be hard to fix -- at least, I'd find it hard to fix. This is an extract from header comments for inheritance_planner() (master branch): * We have to handle this case differently from cases where a source relation * is an inheritance set. Source inheritance is expanded at the bottom of the * plan tree (see allpaths.c), but target inheritance has to be expanded at * the top. The reason is that for UPDATE, each target relation needs a * different targetlist matching its own column set. Fortunately, * the UPDATE/DELETE target can never be the nullable side of an outer join, * so it's OK to generate the plan this way. Of course, that isn't true of MERGE: The MERGE target *can* be the nullable side of an outer join. That's probably a big complication for using one target RTE. Your approach to implementing partitioning [1] seems to benefit from having two different RTEs, in a way -- you sidestep the restriction. As you put it, "Since MERGE need both the facilities [both INSERT and UPDATE facilities], I'd to pretty much merge both the machineries". As I understand it, you "merge" these machineries by having find_mergetarget_for_rel() work backwards. You are mapping from one relation tree to another relation tree in an ad-hoc fashion -- both trees for the same underlying RTE. (You compensate for this later, in the executor, with the special EvalPlanQual() RTI stuff I mentioned already.) I have some broad concerns here. I would especially like to hear from hackers that know more about partitioning/inheritance than I do on these concerns. They are: * Is it okay to take this approach with partitioning? I worry about things like ExecAuxRowMark handling. We avoid calling EvalPlanQualSetPlan() within ExecModifyTable() for CMD_MERGE, so I'm pretty sure that that's already broken as-is. Separately, can't see why it's okay that CMD_MERGE doesn't have mt_transition_capture initialization occur per-child, so that's probably another bug of the same general variety. To be clear, this is the specific part of the patch that avoids going through child plans as described: > @@ -1927,7 +2590,14 @@ ExecModifyTable(PlanState *pstate) > { > /* advance to next subplan if any */ > node->mt_whichplan++; > - if (node->mt_whichplan < node->mt_nplans) > + > + /* > + * If we are executing MERGE, we only need to execute the first > + * subplan since it's guranteed to return all the required tuples. > + * In fact, running remaining subplans would be a problem since we > + * will end up fetching the same tuples N times. > + */ > + if (node->mt_whichplan < node->mt_nplans && (operation != > CMD_MERGE)) > { > resultRelInfo++; > subplanstate = node->mt_plans[node->mt_whichplan]; * Is there a way to make what I describe (having a single target RTE) work with partitioning, without any big new special cases, especially in the executor? * Any thoughts on this multiple-RTEs-for-target-rel business more generally? [1] https://postgr.es/m/CABOikdPjjG+JcdNeegrL7=btpdj6yev--v4hu8kzjttwx1s...@mail.gmail.com -- Peter Geoghegan