On Thu, 9 Jan 2025, 12:26 Feike Steenbergen, <feikesteenber...@gmail.com>
wrote:

> I'm trying to change a few applications to fully use this, as PostgreSQL
> 17 added this support.
>
> The application does something like this:
>
> - fetch information from a source system and store it in a temp table
> - run a MERGE with a table (target) in this database,
>   updating, inserting and deleting in a single statement
> - the target table holds information for multiple systems
>
> The temp table (source) doesn't contain the identifier for the system, as
> we can inject that at runtime.
>
> This is the shape of the statement:
>
>     MERGE INTO
>         merge_target AS t
>     USING
>         merge_source AS s ON (t.time = s.time AND t.device_id = $1)
>     WHEN MATCHED THEN
>         UPDATE set
>             value = s.value
>     WHEN NOT MATCHED THEN
>         INSERT (device_id, time, value) VALUES ($1, time, value)
>     WHEN NOT MATCHED BY SOURCE
>         AND t.device_id = $1
>         THEN DELETE;
>
> If we run this however, there is always a Seq Scan against merge_target,
> whereas the filter of `AND t.device_id = $1` uses a (Bitmap) Index scan
> in other types of queries.
>
> Previously we would - in a CTE - do a delete *and then* the merge.
>
> Which performed much better as the DELETE would not do a Seq Scan.
>
> Attached a full reproducing test case. Some numbers:
>
> Previously, DELETE in CTE, then merge:
>     Planning Time: 6.700 ms
>     Execution Time: 7.473 ms
>
> Using the MERGE WHEN MATCHED ON SOURCE THEN DELETE:
>     Planning Time: 0.994 ms
>     Execution Time: 65.695 ms
>
> My questions are:
>
> - is the Seq Scan expected by others in this mailing list as well?
> - Is it required to do the Seq Scan?
> - is this something that could be optimized?
>
> Kind regards,
>
> Feike Steenbergen
>

An excellent post. I wish all posters provided nice contained example cases
like this one.

I note ,in the documentation, that a Warning box got added which says
this...

If both WHEN NOT MATCHED BY SOURCE and WHEN NOT MATCHED [BY TARGET] clauses
are specified, the MERGE command will perform a FULL join between
data_source and the target table. For this to work, at least one
join_condition subexpression must use an operator that can support a hash
join, or all of the subexpressions must use operators that can support a
merge join.


This could be a hint as to the reason maybe ? The NOT MATCHED BY SOURCE is
new feature to 17. I'm looking forward to others replies here.

>

Reply via email to