Re: bug report: some issues about pg_15_stable(8fa4a1ac61189efffb8b851ee77e1bc87360c445)

Dean Rasheed Tue, 20 Feb 2024 06:50:17 -0800

On Mon, 19 Feb 2024 at 17:48, zwj <[email protected]> wrote:
>
> Hello,
>
>    I found an issue while using the latest version of PG15 
> (8fa4a1ac61189efffb8b851ee77e1bc87360c445).
>    This question is about 'merge into'.
>
>    When two merge into statements are executed concurrently, I obtain the 
> following process and results.
>    Firstly, the execution results of each Oracle are different, and secondly, 
> I tried to understand its execution process and found that it was not very 
> clear.
>


Hmm, looking at this I think there is a problem with how UNION ALL
subqueries are pulled up, and I don't think it's necessarily limited
to MERGE.

Looking at the plan for this MERGE operation:

explain (verbose, costs off)
merge into mergeinto_0023_tb01 a using (select aid,name,year from
mergeinto_0023_tb02 union all select aid,name,year from
mergeinto_0023_tb03) c on (a.id=c.aid) when matched then update set
year=c.year when not matched then insert(id,name,year)
values(c.aid,c.name,c.year);

                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge on public.mergeinto_0023_tb01 a
   ->  Merge Right Join
         Output: a.ctid, mergeinto_0023_tb02.year,
mergeinto_0023_tb02.aid, mergeinto_0023_tb02.name,
(ROW(mergeinto_0023_tb02.aid, mergeinto_0023_tb02.name,
mergeinto_0023_tb02.year))
         Merge Cond: (a.id = mergeinto_0023_tb02.aid)
         ->  Sort
               Output: a.ctid, a.id
               Sort Key: a.id
               ->  Seq Scan on public.mergeinto_0023_tb01 a
                     Output: a.ctid, a.id
         ->  Sort
               Output: mergeinto_0023_tb02.year,
mergeinto_0023_tb02.aid, mergeinto_0023_tb02.name,
(ROW(mergeinto_0023_tb02.aid, mergeinto_0023_tb02.name,
mergeinto_0023_tb02.year))
               Sort Key: mergeinto_0023_tb02.aid
               ->  Append
                     ->  Seq Scan on public.mergeinto_0023_tb02
                           Output: mergeinto_0023_tb02.year,
mergeinto_0023_tb02.aid, mergeinto_0023_tb02.name,
ROW(mergeinto_0023_tb02.aid, mergeinto_0023_tb02.name,
mergeinto_0023_tb02.year)
                     ->  Seq Scan on public.mergeinto_0023_tb03
                           Output: mergeinto_0023_tb03.year,
mergeinto_0023_tb03.aid, mergeinto_0023_tb03.name,
ROW(mergeinto_0023_tb03.aid, mergeinto_0023_tb03.name,
mergeinto_0023_tb03.year)

The "ROW(...)" targetlist entries are added because
preprocess_rowmarks() adds a rowmark to the UNION ALL subquery, which
at that point is the only non-target relation in the jointree. It does
this intending that the same values be returned during EPQ rechecking.
However, pull_up_subqueries() causes the UNION all subquery and its
leaf subqueries to be pulled up into the main query as appendrel
entries. So when it comes to EPQ rechecking, the rowmark does
absolutely nothing, and EvalPlanQual() does a full re-scan of
mergeinto_0023_tb02 and mergeinto_0023_tb03 and a re-sort for each
concurrently modified row.

A similar thing happens for UPDATE and DELETE, if they're joined to a
UNION ALL subquery. However, AFAICS that doesn't cause a problem
(other than being pretty inefficient) since, for UPDATE and DELETE,
the join to the UNION ALL subquery will always be an inner join, I
think, and so the join output will always be correct.

However, for MERGE, the join may be an outer join, so during an EPQ
recheck, we're joining the target relation (fixed to return just the
updated row) to the full UNION ALL subquery. So if it's an outer join,
the join output will return all-but-one of the subquery rows as not
matched rows in addition to the one matched row that we want, whereas
the EPQ mechanism is expecting the plan to return just one row.

On the face of it, the simplest fix is to tweak is_simple_union_all()
to prevent UNION ALL subquery pullup for MERGE, forcing a
subquery-scan plan. A quick test shows that that fixes the reported
issue.

is_simple_union_all() already has a test for rowmarks, and a comment
saying that locking isn't supported, but since it is called before
preprocess_rowmarks(), it doesn't know that the subquery is about to
be marked.

However, that leaves the question of whether we should do the same for
UPDATE and DELETE. There doesn't appear to be a live bug there, so
maybe they're best left alone. Also, back-patching a change like that
might make existing queries less efficient. But I feel like I might be
overlooking something here, and this doesn't seem to be how EPQ
rechecks are meant to work (doing a full re-scan of non-target
relations). Also, if the concurrent update were an update of a key
column that was included in the join condition, the re-scan would
follow the update to a new matching source row, which is inconsistent
with what would happen if it were a join to a regular relation.

Thoughts?

Regards,
Dean

Re: bug report: some issues about pg_15_stable(8fa4a1ac61189efffb8b851ee77e1bc87360c445)

Reply via email to