On 28/11/2020 19:21, Andrey Lepikhov wrote:
On 27.11.2020 21:49, Heikki Linnakangas wrote:
On 31/10/2020 11:26, Andrey V. Lepikhov wrote:
+ /*
+ * Process restrictlist to seperate out the self join
quals from
+ * the other quals. e.g x = x goes to selfjoinquals and a
= b to
+ * otherjoinquals.
+ */
+ split_selfjoin_quals(root, restrictlist, &selfjoinquals,
+ &otherjoinquals);
+
+ if (list_length(selfjoinquals) == 0)
+ {
+ /*
+ * Have a chance to remove join if target list
contains vars from
+ * the only one relation.
+ */
I don't understand the logic here. If 'selfjoinquals' is empty, it means
that there is no join qual between the two relations, right? How can we
ever remove the join in that case? And how does the target list affect
that? Can you give an example query of that?
Maybe it is a problem of variable naming. Following the idea of David
Rowley, we split quals into two subsets: {x==x} and another, for example
{x=y}.
First set is an trivial case of self-join: if we have unique index on
the attribute 'x', then this join is self-join.
Second set is give us a chance: if right side is unique for right side
of the qual and no vars from right side end up in the target list of the
join, then this is a self-join case. Example:
CREATE TABLE a(x int, y int);
CREATE UNIQUE INDEX ON a(x);
SELECT a1.* FROM a a1, a a2 WHERE a1.x = a2.x; -- self-join
CREATE UNIQUE INDEX ON a(y);
SELECT a1.* FROM a a1, a a2 WHERE a1.x = a2.y; -- self-join too
The latter join is not "useless". The patch is returning incorrect
result for that query:
postgres=# insert into a values (1, 2);
INSERT 0 1
postgres=# insert into a values (2, 1);
INSERT 0 1
postgres=# SELECT a1.* FROM a a1, a a2 WHERE a1.x = a2.y; -- WRONG RESULT
x | y
---+---
(0 rows)
postgres=# set enable_self_join_removal=off;
SET
postgres=# SELECT a1.* FROM a a1, a a2 WHERE a1.x = a2.y; -- CORRECT RESULT
x | y
---+---
1 | 2
2 | 1
(2 rows)
- Heikki