On 1/8/21 8:22 PM, Alvaro Herrera wrote: > On 2020-Dec-31, Alvaro Herrera wrote: > >> Here's a rebase of Simon/Pavan's MERGE patch to current sources. I >> cleaned up some minor things in it, but aside from rebasing, it's pretty >> much their work (even the commit message is Simon's). > > Rebased, no changes. >
I took a look at this today. Some initial comments (perhaps nitpicking, in some cases). 1) sgml docs This probably needs more work. Some of the sentences (in mvcc.sgml) are so long I can't quite parse them. Maybe that's because I'm not a native speaker, but still ... :-/ Also, there are tags missing - UPDATE/INSERT/... should be <command> or maybe <literal>, depending on the context. I think the new docs are a bit confused which of those it should be, but I'd say <command> should be used for SQL commands and <literal> for MERGE actions? It'd be a mess to list all the places, so the attached patch (0001) tweaks this. Feel free to reject changes that you disagree with. The patch also adds a bunch of XXX comments, suggesting some changes (clarifications, removing unnecessarily duplicate text, etc.) 2) explain.c I'm a bit confused about this case CMD_MERGE: operation = "Merge"; foperation = "Foreign Merge"; break; because the commit message says foreign tables are not supported. So why do we need this? 3) nodeModifyTable.c ExecModifyTable does this: if (operation == CMD_MERGE) { ExecMerge(node, resultRelInfo, estate, slot, junkfilter); continue; } i.e. it handles the MERGE far from the other DML operations. That seems somewhat strange, especially without any comment - can't we refactor this and handle it in the switch with the other DML? 4) preptlist.c I propose to add a brief comment in preprocess_targetlist, explaining what we need to do for CMD_MERGE (see 0001). And I think it'd be good to explain why MERGE uses the same code as UPDATE (it's not obvious to me). 5) WHEN AND I admit the "WHEN AND" conditions sounds a bit cryptic - it took me a while to realize what this refers to. Is that a term established by SQL Standard, or something we invented? 6) walsender.c Huh, why does this patch touch this at all? 7) rewriteHandler.c I see MERGE "doesn't support" rewrite rules in the sense that it simply ignores them. Shouldn't it error-out instead? Seems like a foot-gun to me, because people won't realize this limitation and may not notice their rules don't fire. 8) varlena.c Again, why are these changes to length checks in a MERGE patch? 9) parsenodes.h Should we rename mergeTarget_relation to mergeTargetRelation? The current name seems like a mix between two naming schemes. 10) per valgrind, there's a bug in ExecDelete The table_tuple_delete may not set tmfd (which is no stack), leaving it not initialized. But the code later branches depending on this. The 0002 patch fixes that by simply setting it to OK before the call, which makes the valgrind error go away. But maybe it should be fixed in a different way (e.g. by setting it at the beginning of table_tuple_delete). regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
>From a332b1c279e847ec533ff34f4444fc35303672cd Mon Sep 17 00:00:00 2001 From: Tomas Vondra <to...@2ndquadrant.com> Date: Sat, 9 Jan 2021 03:17:56 +0100 Subject: [PATCH 1/2] review --- doc/src/sgml/mvcc.sgml | 20 +++++++++++--------- doc/src/sgml/ref/create_policy.sgml | 6 +++--- doc/src/sgml/ref/merge.sgml | 12 +++++++++--- src/backend/commands/explain.c | 2 +- src/backend/executor/execMerge.c | 5 ++++- src/backend/executor/nodeModifyTable.c | 3 ++- src/backend/optimizer/prep/preptlist.c | 6 ++++++ src/backend/parser/parse_agg.c | 1 + src/backend/replication/walsender.c | 4 ++-- src/backend/rewrite/rewriteHandler.c | 4 ++++ src/backend/utils/adt/varlena.c | 3 +++ src/include/nodes/parsenodes.h | 7 ++++--- 12 files changed, 50 insertions(+), 23 deletions(-) diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml index 9ec8474185..02c9f3fdea 100644 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -429,22 +429,24 @@ COMMIT; with both <command>INSERT</command> and <command>UPDATE</command> subcommands looks similar to <command>INSERT</command> with an <literal>ON CONFLICT DO UPDATE</literal> clause but does not guarantee - that either <command>INSERT</command> and <command>UPDATE</command> will occur. + that either <command>INSERT</command> or <command>UPDATE</command> will occur. - If MERGE attempts an UPDATE or DELETE and the row is concurrently updated + /* XXX This is a very long and hard to understand sentence :-( */ + /* XXX Do we want to mention EvalPlanQual here? There's no explanation what it does in this file, so maybe elaborate what it does or leave that detail for a README? */ + If MERGE attempts an <command>UPDATE</command> or <command>DELETE</command> and the row is concurrently updated but the join condition still passes for the current target and the current - source tuple, then MERGE will behave the same as the UPDATE or DELETE commands + source tuple, then <command>MERGE</command> will behave the same as the <command>UPDATE</command> or <command>DELETE</command> commands and perform its action on the latest version of the row, using standard - EvalPlanQual. MERGE actions can be conditional, so conditions must be + EvalPlanQual. <command>MERGE</command> actions can be conditional, so conditions must be re-evaluated on the latest row, starting from the first action. On the other hand, if the row is concurrently updated or deleted so that - the join condition fails, then MERGE will execute a NOT MATCHED action, if it - exists and the AND WHEN qual evaluates to true. + the join condition fails, then <command>MERGE</command> will execute a <literal>NOT MATCHED</literal> action, if it + exists and the <literal>AND WHEN</literal> qual evaluates to true. - If MERGE attempts an INSERT and a unique index is present and a duplicate - row is concurrently inserted then a uniqueness violation is raised. MERGE - does not attempt to avoid the ERROR by attempting an UPDATE. + If <command>MERGE</command> attempts an <command>INSERT</command> and a unique index is present and a duplicate + row is concurrently inserted then a uniqueness violation is raised. <command>MERGE</command> + does not attempt to avoid the <literal>ERROR</literal> by attempting an <command>UPDATE</command>. </para> <para> diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml index ad20230c58..0a64674f2d 100644 --- a/doc/src/sgml/ref/create_policy.sgml +++ b/doc/src/sgml/ref/create_policy.sgml @@ -97,9 +97,9 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable <para> No separate policy exists for <command>MERGE</command>. Instead policies - defined for <literal>SELECT</literal>, <literal>INSERT</literal>, - <literal>UPDATE</literal> and <literal>DELETE</literal> are applied - while executing MERGE, depending on the actions that are activated. + defined for <command>SELECT</command>, <command>INSERT</command>, + <command>UPDATE</command> and <command>DELETE</command> are applied + while executing <command>MERGE</command>, depending on the actions that are activated. </para> </refsect1> diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml index b2a9f67cfa..c2901b0d58 100644 --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -98,7 +98,7 @@ DELETE </para> <para> - There is no MERGE privilege. + There is no <literal>MERGE</literal> privilege. You must have the <literal>UPDATE</literal> privilege on the column(s) of the <replaceable class="parameter">target_table_name</replaceable> referred to in the <literal>SET</literal> clause @@ -217,6 +217,7 @@ DELETE At least one <literal>WHEN</literal> clause is required. </para> <para> + /* XXX Do we need to repeat the same thing for WHEN MATCHED and WHEN NOT MATCHED? Could this say that it applies to both? */ If the <literal>WHEN</literal> clause specifies <literal>WHEN MATCHED</literal> and the candidate change row matches a row in the <replaceable class="parameter">target_table_name</replaceable> @@ -242,6 +243,7 @@ DELETE clause will be activated and the corresponding action will occur for that row. The expression may not contain functions that possibly performs writes to the database. + /* XXX Does it mean that it has to be marked as STABLE, or that a write crashes the DB? */ </para> <para> A condition on a <literal>WHEN MATCHED</literal> clause can refer to columns @@ -379,6 +381,7 @@ DELETE <para> An expression to assign to the column. The expression can use the old values of this and other columns in the table. + /* XXX Which table? Source or target, or both? What if it's NOT MATCHED? */ </para> </listitem> </varlistentry> @@ -492,6 +495,7 @@ MERGE <replaceable class="parameter">total-count</replaceable> In summary, statement triggers for an event type (say, INSERT) will be fired whenever we <emphasis>specify</emphasis> an action of that kind. Row-level triggers will fire only for the one event type <emphasis>activated</emphasis>. + /* XXX What does "activated" mean? Maybe "executed" would be better? */ So a <command>MERGE</command> might fire statement triggers for both <command>UPDATE</command> and <command>INSERT</command>, even though only <command>UPDATE</command> row triggers were fired. @@ -504,6 +508,8 @@ MERGE <replaceable class="parameter">total-count</replaceable> rows will be used to modify the target row, later attempts to modify will cause an error. This can also occur if row triggers make changes to the target table which are then subsequently modified by <command>MERGE</command>. + /* XXX Not sure the preceding sentence makes sense. What does the 'which' refer to? Row triggers, changes, or what? */ + /* XXX It seems the rule is that INSERT actions are <literal> while INSERT statements (in SQL sense) are <command>. But this mixes that up? */ If the repeated action is an <command>INSERT</command> this will cause a uniqueness violation while a repeated <command>UPDATE</command> or <command>DELETE</command> will cause a cardinality violation; the latter behavior @@ -554,7 +560,7 @@ MERGE <replaceable class="parameter">total-count</replaceable> <title>Examples</title> <para> - Perform maintenance on CustomerAccounts based upon new Transactions. + Perform maintenance on <literal>CustomerAccounts</literal> based upon new <literal>Transactions</literal>. <programlisting> MERGE CustomerAccount CA @@ -599,7 +605,7 @@ WHEN MATCHED THEN DELETE; </programlisting> - The wine_stock_changes table might be, for example, a temporary table + The <literal>wine_stock_changes</literal> table might be, for example, a temporary table recently loaded into the database. </para> diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index fbaf50c258..f914a00e9f 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -3690,7 +3690,7 @@ show_modifytable_info(ModifyTableState *mtstate, List *ancestors, break; case CMD_MERGE: operation = "Merge"; - foperation = "Foreign Merge"; + foperation = "Foreign Merge"; /* XXX Doesn't the commit message say foreign tables are not yet supported? */ break; default: operation = "???"; diff --git a/src/backend/executor/execMerge.c b/src/backend/executor/execMerge.c index 0e245e1361..a7492a6c4b 100644 --- a/src/backend/executor/execMerge.c +++ b/src/backend/executor/execMerge.c @@ -105,7 +105,7 @@ ExecMerge(ModifyTableState *mtstate, ResultRelInfo *resultRelInfo, * A concurrent update can: * * 1. modify the target tuple so that it no longer satisfies the - * additional quals attached to the current WHEN MATCHED action OR + * additional quals attached to the current WHEN MATCHED action * * In this case, we are still dealing with a WHEN MATCHED case, but we * should recheck the list of WHEN MATCHED actions and choose the first @@ -118,6 +118,9 @@ ExecMerge(ModifyTableState *mtstate, ResultRelInfo *resultRelInfo, * tuple, so we now instead find a qualifying WHEN NOT MATCHED action to * execute. * + * XXX Hmmm, what if the updated tuple would now match one that was + * considered NOT MATCHED so far? + * * A concurrent delete, changes a WHEN MATCHED case to WHEN NOT MATCHED. * * ExecMergeMatched takes care of following the update chain and diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index c0a97eba91..9c14709e47 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -2124,7 +2124,6 @@ ExecModifyTable(PlanState *pstate) { /* advance to next subplan if any */ node->mt_whichplan++; - if (node->mt_whichplan < node->mt_nplans) { resultRelInfo++; @@ -2169,6 +2168,8 @@ ExecModifyTable(PlanState *pstate) EvalPlanQualSetSlot(&node->mt_epqstate, planSlot); slot = planSlot; + /* XXX Wouldn't it be "nicer" to handle MERGE in the switch, together + * with the other MT operations? This seems a bit out of place. */ if (operation == CMD_MERGE) { ExecMerge(node, resultRelInfo, estate, slot, junkfilter); diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c index 8848e9a03f..b2543f6814 100644 --- a/src/backend/optimizer/prep/preptlist.c +++ b/src/backend/optimizer/prep/preptlist.c @@ -117,6 +117,10 @@ preprocess_targetlist(PlannerInfo *root) tlist = expand_targetlist(tlist, command_type, result_relation, target_relation); + /* + * For MERGE we need to handle the target list for the target relation, + * and also target list for each action (only INSERT/UPDATE matter). + */ if (command_type == CMD_MERGE) { ListCell *l; @@ -343,6 +347,8 @@ expand_targetlist(List *tlist, int command_type, * generate a NULL for dropped columns (we want to drop any old * values). * + * XXX Should this explain why MERGE has the same logic as UPDATE? + * * When generating a NULL constant for a dropped column, we label * it INT4 (any other guaranteed-to-exist datatype would do as * well). We can't label it with the dropped column's datatype diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c index f7c152beb4..5d49c8c37e 100644 --- a/src/backend/parser/parse_agg.c +++ b/src/backend/parser/parse_agg.c @@ -436,6 +436,7 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr) break; case EXPR_KIND_MERGE_WHEN_AND: if (isAgg) + /* XXX "WHEN AND" seems rather strange. ... */ err = _("aggregate functions are not allowed in WHEN AND conditions"); else err = _("grouping operations are not allowed in WHEN AND conditions"); diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index d59f4fde95..d50543b1ba 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -1198,6 +1198,7 @@ StartLogicalReplication(StartReplicationCmd *cmd) /* Also update the sent position status in shared memory */ SpinLockAcquire(&MyWalSnd->mutex); + /* XXX Huh? Why does MERGE patch change walsender? */ MyWalSnd->sentPtr = MyReplicationSlot->data.confirmed_flush; SpinLockRelease(&MyWalSnd->mutex); @@ -2930,8 +2931,7 @@ WalSndDone(WalSndSendDataCallback send_data) replicatedPtr = XLogRecPtrIsInvalid(MyWalSnd->flush) ? MyWalSnd->write : MyWalSnd->flush; - if (WalSndCaughtUp && - sentPtr == replicatedPtr && + if (WalSndCaughtUp && sentPtr == replicatedPtr && !pq_is_send_pending()) { QueryCompletion qc; diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c index e2706c181c..80ae946d17 100644 --- a/src/backend/rewrite/rewriteHandler.c +++ b/src/backend/rewrite/rewriteHandler.c @@ -1774,6 +1774,7 @@ fill_extraUpdatedCols(RangeTblEntry *target_rte, Relation target_relation) } } + /* * matchLocks - * match the list of locks and returns the matching rules @@ -3946,6 +3947,9 @@ RewriteQuery(Query *parsetree, List *rewrite_events) /* * XXX MERGE doesn't support write rules because they would violate * the SQL Standard spec and would be unclear how they should work. + * + * XXX So does't support means 'ignores'? Should that either fail + * or at least print some warning? */ if (event == CMD_MERGE) product_queries = NIL; diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index 0281351dcf..7a768a7b5b 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -2312,6 +2312,7 @@ varstrfastcmp_locale(char *a1p, int len1, char *a2p, int len2, SortSupport ssup) int result; bool arg1_match; + /* XXX Huh? Why is this in MERGE patch? */ if (len1 < 0 || len2 < 0) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), @@ -2505,6 +2506,7 @@ varstr_abbrev_convert(Datum original, SortSupport ssup) memset(pres, 0, sizeof(Datum)); len = VARSIZE_ANY_EXHDR(authoritative); + /* XXX Huh? Why is this in MERGE patch? */ if (len < 0) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), @@ -3260,6 +3262,7 @@ bytea_catenate(bytea *t1, bytea *t2) len1 = VARSIZE_ANY_EXHDR(t1); len2 = VARSIZE_ANY_EXHDR(t2); + /* XXX Huh? Why is this in MERGE patch? */ if (len1 < 0 || len2 < 0) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 3cba38044e..08fea9b8f7 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -172,6 +172,7 @@ typedef struct Query List *withCheckOptions; /* a list of WithCheckOption's (added * during rewrite) */ + /* XXX Why not mergeTragetRelation? */ int mergeTarget_relation; List *mergeSourceTargetList; List *mergeActionList; /* list of actions for MERGE (only) */ @@ -1581,11 +1582,11 @@ typedef struct UpdateStmt typedef struct MergeStmt { NodeTag type; - RangeVar *relation; /* target relation to merge into */ + RangeVar *relation; /* target relation to merge into */ Node *source_relation; /* source relation */ - Node *join_condition; /* join condition between source and target */ + Node *join_condition; /* join condition between source and target */ List *mergeWhenClauses; /* list of MergeWhenClause(es) */ - WithClause *withClause; /* WITH clause */ + WithClause *withClause; /* WITH clause */ } MergeStmt; typedef struct MergeWhenClause -- 2.26.2
>From 08e3f0fc99d050c7675acaf395551e9e407d8f25 Mon Sep 17 00:00:00 2001 From: Tomas Vondra <to...@2ndquadrant.com> Date: Sat, 9 Jan 2021 20:26:58 +0100 Subject: [PATCH 2/2] fix valgrind failure --- src/backend/executor/nodeModifyTable.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index 9c14709e47..8f5746cf95 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -829,6 +829,9 @@ ExecDelete(ModifyTableState *mtstate, * mode transactions. */ ldelete:; + + tmfd.result = TM_Ok; + result = table_tuple_delete(resultRelationDesc, tupleid, estate->es_output_cid, estate->es_snapshot, -- 2.26.2