čt 28. 11. 2019 v 4:48 odesílatel Pavel Stehule <pavel.steh...@gmail.com> napsal:
> Hi > > čt 28. 11. 2019 v 3:56 odesílatel David Fetter <da...@fetter.org> napsal: > >> On Wed, Nov 27, 2019 at 08:47:56AM +0100, Pavel Stehule wrote: >> > Hi >> > >> > I have a report from my customer about migration his application from >> > Oracle to Postgres. >> > >> > The most significant issue was missing correct estimation for coalesce >> > function. He had to rewrite coalesce(var, X) = X to "var IS NULL or var >> = >> > X". Then the result was very satisfactory. >> > >> > Example: >> > >> > create table xxx(a int); >> > insert into xxx select null from generate_series(1,10000); >> > insert into xxx select 1 from generate_series(1,1000); >> > insert into xxx select 0 from generate_series(1,1000); >> > analyze xxx; >> > >> > postgres=# explain analyze select * from xxx where coalesce(a, 0) = 0; >> > QUERY PLAN >> > >> > >> ---------------------------------------------------------------------------------------------------- >> > Seq Scan on xxx (cost=0.00..194.00 rows=60 width=4) (actual >> > time=0.041..4.276 rows=11000 loops=1) >> > Filter: (COALESCE(a, 0) = 0) >> > Rows Removed by Filter: 1000 >> > Planning Time: 0.099 ms >> > Execution Time: 5.412 ms >> > (5 rows) >> > >> > postgres=# explain analyze select * from xxx where a is null or a = 0; >> > QUERY PLAN >> > >> > >> ------------------------------------------------------------------------------------------------------- >> > Seq Scan on xxx (cost=0.00..194.00 rows=10167 width=4) (actual >> > time=0.052..5.891 rows=11000 loops=1) >> > Filter: ((a IS NULL) OR (a = 0)) >> > Rows Removed by Filter: 1000 >> > Planning Time: 0.136 ms >> > Execution Time: 7.522 ms >> > (5 rows) >> > >> > I think so pattern coalesce(var, X) = X is very common so can be very >> > interesting to support it better. >> >> Better support sounds great! >> >> How specifically might this be better supported? On this relatively >> short table, I see planning times considerably longer, I assume >> because they need to take a function call into account, and execution >> times longer but not all that much longer. I tried with 3 million >> rows, and got the representative samples below: >> >> shackle@[local]:5413/ctest(13devel)(149711) # EXPLAIN ANALYZE SELECT * >> FROM xxx WHERE COALESCE(a, 0)=0; >> QUERY PLAN >> >> >> ══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════ >> Gather (cost=1000.00..30391.00 rows=15000 width=4) (actual >> time=1.315..346.406 rows=999772 loops=1) >> Workers Planned: 2 >> Workers Launched: 2 >> -> Parallel Seq Scan on xxx (cost=0.00..27891.00 rows=6250 width=4) >> (actual time=0.029..216.419 rows=333257 loops=3) >> Filter: (COALESCE(a, 0) = 0) >> Rows Removed by Filter: 666743 >> Planning Time: 0.204 ms >> Execution Time: 389.307 ms >> (8 rows) >> >> Time: 391.394 ms >> >> shackle@[local]:5413/ctest(13devel)(149711) # EXPLAIN ANALYZE SELECT * >> FROM xxx WHERE a IS NULL OR a = 0; >> QUERY PLAN >> >> >> ═════════════════════════════════════════════════════════════════════════════════════════════════════════════ >> Seq Scan on xxx (cost=0.00..49766.00 rows=995700 width=4) (actual >> time=0.043..524.401 rows=999772 loops=1) >> Filter: ((a IS NULL) OR (a = 0)) >> Rows Removed by Filter: 2000228 >> Planning Time: 0.106 ms >> Execution Time: 560.593 ms >> (5 rows) >> >> Time: 561.186 ms >> > > I didn't thing about rewriting. The correct solution should be via own > selectivity function. Now for coalesce is used 5% estimation (like for > other functions). Probably it should not be hard code because coalesce is a > node already. But it is part of code that I never modified. > but support functions can be used https://www.cybertec-postgresql.com/en/optimizer-support-functions/ postgres=# create table test(id integer); CREATE TABLE postgres=# insert into test select generate_series(1,100000); INSERT 0 100000 postgres=# insert into test select null from generate_series(1,1000); INSERT 0 1000 postgres=# analyze test; ANALYZE postgres=# create index on test(id); CREATE INDEX postgres=# explain analyze select * from test where coalesce(id, 10) = 10; ┌───────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ╞═══════════════════════════════════════════════════════════════════════════════════════════════════════╡ │ Seq Scan on test (cost=0.00..1708.50 rows=505 width=4) (actual time=0.062..18.370 rows=1001 loops=1) │ │ Filter: (COALESCE(id, 10) = 10) │ │ Rows Removed by Filter: 99999 │ │ Planning Time: 37.212 ms │ │ Execution Time: 18.479 ms │ └───────────────────────────────────────────────────────────────────────────────────────────────────────┘ (5 rows) postgres=# explain analyze select * from test where id is null or id = 10; ┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡ │ Bitmap Heap Scan on test (cost=24.30..482.35 rows=964 width=4) (actual time=0.197..0.334 rows=1001 loops=1) │ │ Recheck Cond: ((id IS NULL) OR (id = 10)) │ │ Heap Blocks: exact=5 │ │ -> BitmapOr (cost=24.30..24.30 rows=964 width=0) (actual time=0.189..0.189 rows=0 loops=1) │ │ -> Bitmap Index Scan on test_id_idx (cost=0.00..19.52 rows=963 width=0) (actual time=0.170..0.170 rows=1000 loops=1) │ │ Index Cond: (id IS NULL) │ │ -> Bitmap Index Scan on test_id_idx (cost=0.00..4.30 rows=1 width=0) (actual time=0.019..0.019 rows=1 loops=1) │ │ Index Cond: (id = 10) │ │ Planning Time: 0.090 ms │ │ Execution Time: 0.413 ms │ └────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ (10 rows) There can be strong benefit from replacement if indexes are used. > Pavel > >> >> Best, >> David. >> -- >> David Fetter <david(at)fetter(dot)org> http://fetter.org/ >> Phone: +1 415 235 3778 >> >> Remember to vote! >> Consider donating to Postgres: http://www.postgresql.org/about/donate >> >