Generally speaking, when executing UNION ; a DISTINCT is run afterward on the 
resultset.

So, if you're sure that each part of UNION cannot return a line returned by 
another one, you may use UNION ALL, you'll cut the cost of the final implicit 
DISTINCT.


----- Mail original -----
De: "Mark Pasterkamp" <markpasterkamp1...@gmail.com>
À: pgsql-hackers@lists.postgresql.org
Envoyé: Jeudi 15 Août 2019 20:37:06
Objet: UNION ALL


Dear all, 


I was wondering if someone could help me understands what a union all actually 
does. 


For my thesis I am using Apache Calcite to rewrite queries into using 
materialized views which I then give to a Postgres database. 
For some queries, this means that they will be rewritten in a UNION ALL style 
query between an expression and a table scan of a materialized view. 
However, contrary to what I expected, the UNION ALL query is actually a lot 
slower. 


As an example, say I have 2 tables: actor and movie. Furthermore, there is also 
a foreign key index on movie to actor. 
I also have a materialized view with the join of these 2 tables for all movies 
<= 2015 called A. 
Now, if I want to query all entries in the join between actor and movie, I 
would assume that a UNION ALL between the join of actor and movie for movies 
>2015 and A is faster than executing the original query.. 
If I look at the explain analyze part, I can certainly see a reduction in cost 
up until the UNION ALL part, which carries a respective cost more than negating 
the cost reduction up to a point where I might as well not use the existing 
materialized view. 


I have some trouble understanding this phenomenon. 
One thought which came to my mind was that perhaps UNION ALL might create a 
temporary table containing both result sets, and then do a table scan and 
return that result. 

this would greatly increase IO cost which could attribute to the problem. 
However, I am really not sure what UNION ALL actually does to append both 
result sets so I was wondering if someone would be able to help me out with 
this. 




Mark


Reply via email to