Re: [HACKERS] One question about transformation ANY Sublinks into joins

Armor Sat, 23 Jul 2016 09:40:42 -0700

After this sublink is pulled up as a semi join, make_join_rel (see
src/backend/optimizer/path/joinrels.c) considers two strategies when it builds
the join rel: a real semi join and, if a unique path can be created for the
RHS, a plain join against the unique-ified RHS.
In this case the hash join over the unique-ified RHS (the HashAggregate on t2
in the plan) is cheaper than the semi join, so the planner chose the Hash Join
rather than a Hash Semi Join.
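
To illustrate why a plain join over a unique-ified RHS can stand in for the
semi join, here is a small sketch. It is not PostgreSQL code: it uses Python's
built-in sqlite3 module as a stand-in engine, and the sample rows are
hypothetical, chosen so that t2 contains a duplicate (id2, c2) pair.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id1 integer, c1 integer);
    CREATE TABLE t2 (id2 integer, c2 integer);
    INSERT INTO t1 VALUES (1, 10), (2, 20), (3, 30);
    -- (1, 10) appears twice in t2 on purpose
    INSERT INTO t2 VALUES (1, 10), (1, 10), (2, 99), (3, 30);
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# The query from this thread: an IN sublink with a correlated predicate.
in_sublink = run("""
    SELECT * FROM t1
    WHERE id1 IN (SELECT id2 FROM t2 WHERE c1 = c2)
    ORDER BY id1
""")

# The same condition written as a semi join (EXISTS form).
semi_join = run("""
    SELECT * FROM t1
    WHERE EXISTS (SELECT 1 FROM t2
                  WHERE t1.id1 = t2.id2 AND t1.c1 = t2.c2)
    ORDER BY id1
""")

# Plain inner join against a de-duplicated RHS, analogous to the
# HashAggregate feeding the Hash Join in the plan.
unique_inner = run("""
    SELECT t1.* FROM t1
    JOIN (SELECT DISTINCT id2, c2 FROM t2) AS u
      ON t1.id1 = u.id2 AND t1.c1 = u.c2
    ORDER BY t1.id1
""")

# Plain inner join WITHOUT de-duplication: not equivalent, because the
# duplicate t2 row repeats the matching t1 row.
plain_inner = run("""
    SELECT t1.* FROM t1
    JOIN t2 ON t1.id1 = t2.id2 AND t1.c1 = t2.c2
    ORDER BY t1.id1
""")

print(in_sublink, semi_join, unique_inner, plain_inner, sep="\n")
```

With this data the first three queries return the same rows, while the last
one returns the t1 row (1, 10) twice. That is why a plain Hash Join is only
acceptable when the RHS has been made unique first.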



------------------
Jerry Yu
https://github.com/scarbrofair

------------------ Original ------------------
From:  "Robert Haas";<robertmh...@gmail.com>;
Date:  Fri, Jul 22, 2016 00:23 AM
To:  "Armor"<yupengst...@qq.com>; 
Cc:  "pgsql-hackers"<pgsql-hackers@postgresql.org>; 
Subject:  Re: [HACKERS] One question about transformation ANY Sublinks into 
joins



On Sun, Jul 17, 2016 at 5:33 AM, Armor <yupengst...@qq.com> wrote:
> Hi
>     I ran a simple SQL query with the latest PG:
> postgres=# explain select * from t1 where id1 in (select id2 from t2 where
> c1=c2);
>                          QUERY PLAN
> ------------------------------------------------------------
>  Seq Scan on t1  (cost=0.00..43291.83 rows=1130 width=8)
>    Filter: (SubPlan 1)
>    SubPlan 1
>      ->  Seq Scan on t2  (cost=0.00..38.25 rows=11 width=4)
>            Filter: (t1.c1 = c2)
> (5 rows)
>
> and the table schemas are as follows:
>
> postgres=# \d t1
>       Table "public.t1"
>  Column |  Type   | Modifiers
> --------+---------+-----------
>  id1    | integer |
>  c1     | integer |
>
> postgres=# \d t2
>       Table "public.t2"
>  Column |  Type   | Modifiers
> --------+---------+-----------
>  id2    | integer |
>  c2     | integer |
>
>      I found that PG decides not to pull up this sublink because the WHERE
> clause of the sublink refers to Vars of the parent query; for details, please
> see the function convert_ANY_sublink_to_join in
> src/backend/optimizer/plan/subselect.c.
>      However, for such a simple sublink, which has no aggregates, no window
> functions, and no LIMIT, maybe we can carefully pull up the predicates in the
> WHERE clause that refer to Vars of the parent query, then pull up the sublink
> itself and produce a query plan like the following:
>
> postgres=# explain select * from t1 where id1 in (select id2 from t2 where
> c1=c2);
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  Hash Join  (cost=49.55..99.23 rows=565 width=8)
>    Hash Cond: ((t1.id1 = t2.id2) AND (t1.c1 = t2.c2))
>    ->  Seq Scan on t1  (cost=0.00..32.60 rows=2260 width=8)
>    ->  Hash  (cost=46.16..46.16 rows=226 width=8)
>          ->  HashAggregate  (cost=43.90..46.16 rows=226 width=8)
>                Group Key: t2.id2, t2.c2
>                ->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=8)

It would need to be a Hash Semi Join rather than a Hash Join, wouldn't it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
