Hello, The problems I tried to solve here: 1. Improve perfomance for index search ANY(ARRAY[...]) condition with single item 2. I saw tons of users code like: if len(array) == 1: sql += '{}'.format(array[0]) else: sql += 'ANY(ARRAY[{}])'.format(array) So there will be less lines of code and it will be clearer. 3. Confusing moment that "IN" works well with single item, and "ANY(ARRAY[])" doesn't without any real reason.
The problem was discussed on stackoverflow: https://stackoverflow.com/questions/45061966/index-usage-with-single-item-anyarray That's my first patch so I will be grateful for constructive criticism. --------------------------------------------------------------------------- CREATE TABLE public.t (id serial, a integer, b integer); INSERT INTO t(a, b) SELECT round(random()*1000), round(random()*1000) FROM generate_series(1, 1000000); CREATE INDEX "i_1" ON public.t USING btree (a, b); CREATE INDEX "i_2" ON public.t USING btree (b); --------------------------------------------------------------------------- If "a = 50" in the first query, everything is ok, appropriate index "i_1" is used: SELECT * FROM t WHERE a = 50 ORDER BY b LIMIT 1 "Limit (cost=0.42..4.03 rows=1 width=12) (actual time=0.085..0.085 rows=1 loops=1)" " Buffers: shared hit=1 read=3" " -> Index Scan using i_1 on t (cost=0.42..4683.12 rows=1300 width=12) (actual time=0.084..0.084 rows=1 loops=1)" " Index Cond: (a = 50)" " Buffers: shared hit=1 read=3" "Planning time: 0.637 ms" "Execution time: 0.114 ms" --------------------------------------------------------------------------- With "a IN (50)" result is the same: SELECT * FROM t WHERE a IN (50) ORDER BY b LIMIT 1 "Limit (cost=0.42..4.03 rows=1 width=12) (actual time=0.058..0.058 rows=1 loops=1)" " Buffers: shared hit=4" " -> Index Scan using i_1 on t (cost=0.42..4683.12 rows=1300 width=12) (actual time=0.056..0.056 rows=1 loops=1)" " Index Cond: (a = 50)" " Buffers: shared hit=4" "Planning time: 0.287 ms" "Execution time: 0.105 ms" --------------------------------------------------------------------------- The problem is when I try to use "a = ANY(ARRAY[50])". Wrong index "i_2" is used instead of "i_1" and execution time becomes x25 longer: SELECT * FROM t WHERE a = ANY(ARRAY[50]) ORDER BY b LIMIT 1 "Limit (cost=0.42..38.00 rows=1 width=12) (actual time=2.591..2.591 rows=1 loops=1)" " Buffers: shared hit=491 read=4" " -> Index Scan using i_2 on t (cost=0.42..48853.65 rows=1300 width=12) (actual time=2.588..2.588 rows=1 loops=1)" " Filter: (a = ANY ('{50}'::integer[]))" " Rows Removed by Filter: 520" " Buffers: shared hit=491 read=4" "Planning time: 0.251 ms" "Execution time: 2.627 ms"
improve-single-item-array.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers