Sorry for the duplicates, I received a letter that my letter did not
reach the addressee, I thought the design was incorrect.
On 26.09.2023 12:21, a.rybakina wrote:
I'm sorry I didn't write for a long time, but I really had a very
difficult month, now I'm fully back to work.
*I was able to implement the patches to the end and moved the
transformation of "OR" expressions to ANY.* I haven't seen a big
difference between them yet, one has a transformation before
calculating selectivity (v7.1-Replace-OR-clause-to-ANY.patch), the
other after (v7.2-Replace-OR-clause-to-ANY.patch). Regression tests
are passing, I don't see any problems with selectivity, nothing has
fallen into the coredump, but I found some incorrect transformations.
What is the reason for these inaccuracies, I have not found, but, to
be honest, they look unusual). Gave the error below.
In the patch, I don't like that I had to drag three libraries from
parsing until I found a way around it.The advantage of this approach
compared to the other ([1]) is that at this stage all possible or
transformations are performed, compared to the patch, where the
transformation was done at the parsing stage. That is, here, for
example, there are such optimizations in the transformation:
I took the common element out of the bracket and the rest is converted
to ANY, while, as noted by Peter Geoghegan, we did not have several
bitmapscans, but only one scan through the array.
postgres=# explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 AND prolang=1 OR prolang = 13 AND prolang = 2 OR
prolang = 13 AND prolang = 3;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on pg_proc p1 (cost=0.00..151.66 rows=1 width=68) (actual
time=1.167..1.168 rows=0 loops=1)
Filter: ((prolang = '13'::oid) AND (prolang = ANY (ARRAY['1'::oid,
'2'::oid, '3'::oid])))
Rows Removed by Filter: 3302
Planning Time: 0.146 ms
Execution Time: 1.191 ms
(5 rows)
*While I was testing, I found some transformations that don't work,
although in my opinion, they should:**
**
**1. First case:*
explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 AND prolang=1 OR prolang = 2 AND prolang = 2 OR
prolang = 13 AND prolang = 13;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on pg_proc p1 (cost=0.00..180.55 rows=2 width=68) (actual
time=2.959..3.335 rows=89 loops=1)
Filter: (((prolang = '13'::oid) AND (prolang = '1'::oid)) OR
((prolang = '2'::oid) AND (prolang = '2'::oid)) OR ((prolang =
'13'::oid) AND (prolang = '13'::oid)))
Rows Removed by Filter: 3213
Planning Time: 1.278 ms
Execution Time: 3.486 ms
(5 rows)
Should have left only prolang = '13'::oid:
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on pg_proc p1 (cost=0.00..139.28 rows=1 width=68) (actual
time=2.034..2.034 rows=0 loops=1)
Filter: ((prolang = '13'::oid ))
Rows Removed by Filter: 3302
Planning Time: 0.181 ms
Execution Time: 2.079 ms
(5 rows)
*2. Also does not work:*
postgres=# explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 OR prolang = 2 AND prolang = 2 OR prolang = 13;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on pg_proc p1 (cost=0.00..164.04 rows=176 width=68) (actual
time=2.422..2.686 rows=89 loops=1)
Filter: ((prolang = '13'::oid) OR ((prolang = '2'::oid) AND
(prolang = '2'::oid)) OR (prolang = '13'::oid))
Rows Removed by Filter: 3213
Planning Time: 1.370 ms
Execution Time: 2.799 ms
(5 rows)
Should have left:
Filter: ((prolang = '13'::oid) OR (prolang = '2'::oid))
*3. Or another:*
explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 OR prolang=13 OR prolang = 2 AND prolang = 2;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on pg_proc p1 (cost=0.00..164.04 rows=176 width=68) (actual
time=2.350..2.566 rows=89 loops=1)
Filter: ((prolang = '13'::oid) OR (prolang = '13'::oid) OR
((prolang = '2'::oid) AND (prolang = '2'::oid)))
Rows Removed by Filter: 3213
Planning Time: 0.215 ms
Execution Time: 2.624 ms
(5 rows)
Should have left:
Filter: ((prolang = '13'::oid) OR (prolang = '2'::oid))
*Falls into coredump at me:*
explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 OR prolang = 2 AND prolang = 2 OR prolang = 13;
explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 OR prolang=13 OR prolang = 2 AND prolang = 2;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on pg_proc p1 (cost=0.00..164.04 rows=176 width=68) (actual
time=2.350..2.566 rows=89 loops=1)
Filter: ((prolang = '13'::oid) OR (prolang = '13'::oid) OR
((prolang = '2'::oid) AND (prolang = '2'::oid)))
Rows Removed by Filter: 3213
Planning Time: 0.215 ms
Execution Time: 2.624 ms
(5 rows)
I remind that initially the task was to find an opportunity to
optimize the case of processing a large number of "or" expressions to
optimize memory consumption. The FlameGraph for executing 50,000 "or"
expressionshas grown 1.4Gb and remains in this state until exiting the
psql session (flamegraph1.png) and it sagged a lot in execution time.
If this case is converted to ANY, the query is executed much faster
and memory is optimized (flamegraph2.png). It may be necessary to use
this approach if there is no support for the framework to process ANY,
IN expressions.
Peter Geoghegan also noticed some development of this patch in terms
of preparing some transformations to optimize the query at the stage
of its execution [0].
[0]
https://www.postgresql.org/message-id/CAH2-Wz%3D9N_4%2BEyhtyFqYQRx4OgVbP%2B1aoYU2JQPVogCir61ZEQ%40mail.gmail.com
[1]
https://www.postgresql.org/message-id/attachment/149105/v7-Replace-OR-clause-to-ANY-expressions.patch