po 30. 3. 2020 v 18:06 odesÃlatel Pavel Stehule <pavel.steh...@gmail.com> napsal:
> Hi > > when I was in talk with Silvio Moioli, I found strange hash join. Hash was > created from bigger table. > > > https://www.postgresql.org/message-id/79dd683d-3296-1b21-ab4a-28fdc2d98807%40suse.de > > Now it looks so materialized CTE disallow hash > > > create table bigger(a int); > create table smaller(a int); > insert into bigger select random()* 10000 from generate_series(1,100000); > insert into smaller select i from generate_series(1,100000) g(i); > > analyze bigger, smaller; > > -- no problem > explain analyze select * from bigger b join smaller s on b.a = s.a; > > postgres=# explain analyze select * from bigger b join smaller s on b.a = > s.a; > QUERY PLAN > > > ---------------------------------------------------------------------------------------------------------------------------- > Hash Join (cost=3084.00..7075.00 rows=100000 width=8) (actual > time=32.937..87.276 rows=99994 loops=1) > Hash Cond: (b.a = s.a) > -> Seq Scan on bigger b (cost=0.00..1443.00 rows=100000 width=4) > (actual time=0.028..8.546 rows=100000 loops=1) > -> Hash (cost=1443.00..1443.00 rows=100000 width=4) (actual > time=32.423..32.423 rows=100000 loops=1) > Buckets: 131072 Batches: 2 Memory Usage: 2785kB > -> Seq Scan on smaller s (cost=0.00..1443.00 rows=100000 > width=4) (actual time=0.025..9.931 rows=100000 loops=1) > Planning Time: 0.438 ms > Execution Time: 91.193 ms > (8 rows) > > but with materialized CTE > > postgres=# explain analyze with b as materialized (select * from bigger), > s as materialized (select * from smaller) select * from b join s on b.a = > s.a; > QUERY PLAN > > > ---------------------------------------------------------------------------------------------------------------------- > Merge Join (cost=23495.64..773995.64 rows=50000000 width=8) (actual > time=141.242..193.375 rows=99994 loops=1) > Merge Cond: (b.a = s.a) > CTE b > -> Seq Scan on bigger (cost=0.00..1443.00 rows=100000 width=4) > (actual time=0.026..11.083 rows=100000 loops=1) > CTE s > -> Seq Scan on smaller (cost=0.00..1443.00 rows=100000 width=4) > (actual time=0.015..9.161 rows=100000 loops=1) > -> Sort (cost=10304.82..10554.82 rows=100000 width=4) (actual > time=78.775..90.953 rows=100000 loops=1) > Sort Key: b.a > Sort Method: external merge Disk: 1376kB > -> CTE Scan on b (cost=0.00..2000.00 rows=100000 width=4) > (actual time=0.033..39.274 rows=100000 loops=1) > -> Sort (cost=10304.82..10554.82 rows=100000 width=4) (actual > time=62.453..74.004 rows=99996 loops=1) > Sort Key: s.a > Sort Method: external sort Disk: 1768kB > -> CTE Scan on s (cost=0.00..2000.00 rows=100000 width=4) > (actual time=0.018..31.669 rows=100000 loops=1) > Planning Time: 0.303 ms > Execution Time: 199.919 ms > (16 rows) > > It doesn't use hash join - the estimations are perfect, but plan is > suboptimal > I was wrong, the estimation on CTE is ok, but JOIN estimation is bad Merge Join (cost=23495.64..773995.64 rows=50000000 width=8) (actual time=141.242..193.375 rows=99994 loops=1) > Regards > > Pavel > >