2018-04-17 12:52 GMT+02:00 Tomas Vondra <tomas.von...@2ndquadrant.com>:
> > > On 04/17/2018 07:17 AM, Pavel Stehule wrote: > >> Hi >> >> 2018-04-16 22:42 GMT+02:00 Hackety Man <hackety...@gmail.com <mailto: >> hackety...@gmail.com>>: >> >> ... >> > > > >> A support of parallel query execution is not complete - it doesn't work >> in PostgreSQL 11 too. So although EXISTS variant can be faster (but can be >> - the worst case of EXISTS is same like COUNT), then due disabled parallel >> execution the COUNT(*) is faster now. It is unfortunate, because I believe >> so this issue will be fixed in few years. >> >> > None of the issues seems to be particularly related to parallel query. > It's much more likely a general issue with planning EXISTS / LIMIT and > non-uniform data distribution. I was wrong EXISTS are not supported. It looks like new dimension of performance issues related to parallelism. I understand so this example is worst case. postgres=# EXPLAIN (ANALYZE, BUFFERS) select exists(SELECT * FROM zz_noidx1 WHERE LOWER(text_distinct) = LOWER('Test5000001')); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Result (cost=4.08..4.09 rows=1 width=1) (actual time=423.600..423.600 rows=1 loops=1) Buffers: shared hit=3296 read=2110 InitPlan 1 (returns $0) -> Seq Scan on zz_noidx1 (cost=0.00..20406.00 rows=5000 width=0) (actual time=423.595..423.595 rows=0 loops=1) Filter: (lower(text_distinct) = 'test5000001'::text) Rows Removed by Filter: 1000000 Buffers: shared hit=3296 read=2110 Planning Time: 0.133 ms Execution Time: 423.633 ms postgres=# EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM zz_noidx1 WHERE LOWER(text_distinct) = LOWER('Test5000001'); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- Finalize Aggregate (cost=12661.42..12661.43 rows=1 width=8) (actual time=246.662..246.662 rows=1 loops=1) Buffers: shared hit=817 read=549 -> Gather (cost=12661.21..12661.42 rows=2 width=8) (actual time=246.642..246.656 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=817 read=549 -> Partial Aggregate (cost=11661.21..11661.22 rows=1 width=8) (actual time=242.168..242.169 rows=1 loops=3) Buffers: shared hit=3360 read=2046 -> Parallel Seq Scan on zz_noidx1 (cost=0.00..11656.00 rows=2083 width=0) (actual time=242.165..242.165 rows=0 loops=3) Filter: (lower(text_distinct) = 'test5000001'::text) Rows Removed by Filter: 333333 Buffers: shared hit=3360 read=2046 Planning Time: 0.222 ms Execution Time: 247.927 ms The cost of EXISTS is too low to use parallelism, and value is found too late. When I decrease startup cost to 0 of parallel exec I got similar plan, similar time postgres=# EXPLAIN (ANALYZE, BUFFERS) select exists(SELECT * FROM zz_noidx1 WHERE LOWER(text_distinct) = LOWER('Test5000001')); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Result (cost=2.43..2.44 rows=1 width=1) (actual time=246.398..246.402 rows=1 loops=1) Buffers: shared hit=885 read=489 InitPlan 1 (returns $1) -> Gather (cost=0.00..12156.00 rows=5000 width=0) (actual time=246.393..246.393 rows=0 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=885 read=489 -> Parallel Seq Scan on zz_noidx1 (cost=0.00..11656.00 rows=2083 width=0) (actual time=241.067..241.067 rows=0 loops=3) Filter: (lower(text_distinct) = 'test5000001'::text) Rows Removed by Filter: 333333 Buffers: shared hit=3552 read=1854 Planning Time: 0.138 ms Execution Time: 247.623 ms (13 rows) >From this perspective it looks so cost of EXISTS(subselect) is maybe too low. Regards Pavel > > > regards > > -- > Tomas Vondra http://www.2ndQuadrant.com > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >