Hi Tom:

Thanks for your reply! I have self reviewed the below message at 3 different
time periods to prevent from too inaccurate replies. It may be more detailed
than it really needed, but it probably can show where I am lost.

On Sat, Oct 15, 2022 at 3:27 AM Tom Lane <t...@sss.pgh.pa.us> wrote:

>
> If the pulled-up join doesn't go into the nullable side of the upper
> join then you've changed semantics.  In this case, it'd amount to

reassociating a semijoin that was within the righthand side of another
> semijoin to above that other semijoin.


I understand your reply as:

select * from t1 *left join* t2 on exists (select 1 from t3 where t3.a =
t2.a);
= select * from t1 *left join* (t2 semi join t3 on t3.a = t2.a) on true;
-- go to nullable side
!=  select * from (t1 *left join* t2 on true) semi join t3 on (t3.a =
t2.a);  -- go to above the JoinExpr

I CAN follow the above.  And for this case it is controlled by below code:

pull_up_sublinks_qual_recurse

switch (j->jointype)
{
          case JOIN_INNER:
                       ...
          case JOIN_LEFT:
                j->quals = pull_up_sublinks_qual_recurse(root, j->quals,

               &j->rarg,

               rightrelids,

               NULL, NULL);
              break;
            ...
}

and I didn't change this.   My question is could we assume

A *semijoin* B ON EXISTS (SELECT 1 FROM C on (Pbc))
=  (A *semijoin* (B *semijoin* C on (Pbc))) on TRUE.  (current master did)
=  (A *semijoin* B ON true) *semijoin* C on (Pbc)        (my current
thinking)

Note that there is no 'left outer join' at this place.   Since there are too
many places called pull_up_sublinks_qual_recurse, to make things
less confused, I prepared a patch for this one line change to show where
exactly I changed (see patch 2);  I think this is the first place I lost.

The discussion of outer join
> reordering in optimizer/README says that that doesn't work,


I think you are talking about the graph "Valid OUTER JOIN Optimizations".
I can follow until below.

"
SEMI joins work a little bit differently.  A semijoin can be reassociated
into or out of the lefthand side of another semijoin, left join, or
antijoin, but not into or out of the righthand side.  ..
"
I am unclear why
 (A semijoin B on (Pab)) semijoin C on (Pbc)
!= A semijoin (B semijoin C on (Pbc)) on (Pab);

Seems both return rows from A which match both semijoin (Pab) and
(Pbc).  or I misunderstand the above words in the first place?

At last, when I checked optimizer/README, it looks like we used
a 'nullable side'  while it should be 'nonnullable side'? see patch 1
for details.

-- 
Best Regards
Andy Fan
From 8327cfcf969704e9db73bc43e8f270b70263e493 Mon Sep 17 00:00:00 2001
From: Andy Fan <zhihui.fan1213@gmail.com>
Date: Mon, 17 Oct 2022 08:38:41 +0800
Subject: [PATCH v1 1/2] a typo error? I think it should be nonnullable side
 rather thanks

nullable side.
---
 src/backend/optimizer/README | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 41c120e0cdf..0e4232f409f 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -217,7 +217,7 @@ RIGHT JOIN is equivalent to LEFT JOIN after switching the two input
 tables, so the same identities work for right joins.
 
 An example of a case that does *not* work is moving an innerjoin into or
-out of the nullable side of an outer join:
+out of the nonnullable side of an outer join:
 
 	A leftjoin (B join C on (Pbc)) on (Pab)
 	!= (A leftjoin B on (Pab)) join C on (Pbc)
-- 
2.21.0

From ab226166518936d81029fbaab7c8aba9fe1550ec Mon Sep 17 00:00:00 2001
From: Andy Fan <zhihui.fan1213@gmail.com>
Date: Mon, 17 Oct 2022 09:08:37 +0800
Subject: [PATCH v1 2/2] This patch doesn't mean I insist on this at all, it
 just shows where I'm lost.

---
 src/backend/optimizer/prep/prepjointree.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 41c7066d90a..581ff2db36b 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -551,12 +551,24 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				 * joins can get stacked onto either j->larg or j->rarg,
 				 * depending on which rels they reference.
 				 */
+
+				/*
+				 * Here the j comes from convert_EXISTS_sublink_to_join, so
+				 * j->jointype must be SEMIJOIN, so no left/right/full outer
+				 * join involved. I prefer to believe that:
+				 *
+				 * A semi join B on exists (select 1 from C on (Pbc))  -- id1
+				 * = A semi join (B semi join C on (Pbc) ) on true  --  id2 current master does.
+				 * = (A semi join B on true) semi join C on (Pbc) -- id3 this patch does.
+				 *
+				 * I think this is the place I lost:(
+				 */
 				j->quals = pull_up_sublinks_qual_recurse(root,
 														 j->quals,
-														 &j->larg,
-														 available_rels1,
-														 &j->rarg,
-														 child_rels);
+														 jtlink1,
+														 bms_union(available_rels1, child_rels),
+														 NULL,
+														 NULL);
 				/* Return NULL representing constant TRUE */
 				return NULL;
 			}
-- 
2.21.0

Reply via email to