On Sat, Sep 5, 2020 at 7:21 PM Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: > > OK, here is a rebased version. Most of the breakage was due to changes > to the BRIN sgml docs.
Hi Tomas, I plan on trying some different queries on different data distributions to get a sense of when the planner chooses a multi-minmax index, and whether the choice is good. Just to start, I used the artificial example in [1], but scaled down a bit to save time. Config is at the default except for: shared_buffers = 1GB random_page_cost = 1.1; effective_cache_size = 4GB; create table t (a bigint, b int) with (fillfactor=95); insert into t select i + 1000*random(), i+1000*random() from generate_series(1,10000000) s(i); update t set a = 1, b = 1 where random() < 0.001; update t set a = 10000000, b = 10000000 where random() < 0.001; analyze t; create index on t using brin (a); CREATE INDEX Time: 1631.452 ms (00:01.631) explain analyze select * from t where a between 1923300::int and 1923600::int; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on t (cost=16.10..43180.43 rows=291 width=12) (actual time=217.770..1131.366 rows=288 loops=1) Recheck Cond: ((a >= 1923300) AND (a <= 1923600)) Rows Removed by Index Recheck: 9999712 Heap Blocks: lossy=56819 -> Bitmap Index Scan on t_a_idx (cost=0.00..16.03 rows=22595 width=0) (actual time=3.054..3.055 rows=568320 loops=1) Index Cond: ((a >= 1923300) AND (a <= 1923600)) Planning Time: 0.328 ms Execution Time: 1131.411 ms (8 rows) Now add the multi-minmax: create index on t using brin (a int8_minmax_multi_ops); CREATE INDEX Time: 6521.026 ms (00:06.521) The first interesting thing is, with both BRIN indexes available, the planner still chooses the conventional BRIN index. Only when I disable it, does it choose the multi-minmax index: explain analyze select * from t where a between 1923300::int and 1923600::int; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on t (cost=68.10..43160.86 rows=291 width=12) (actual time=1.835..4.196 rows=288 loops=1) Recheck Cond: ((a >= 1923300) AND (a <= 1923600)) Rows Removed by Index Recheck: 22240 Heap Blocks: lossy=128 -> Bitmap Index Scan on t_a_idx1 (cost=0.00..68.03 rows=22523 width=0) (actual time=0.691..0.691 rows=1280 loops=1) Index Cond: ((a >= 1923300) AND (a <= 1923600)) Planning Time: 0.250 ms Execution Time: 4.244 ms (8 rows) I wonder if this is a clue that something in the costing unfairly penalizes a multi-minmax index. Maybe not enough to matter in practice, since I wouldn't expect a user to put different kinds of index on the same column. The second thing is, with parallel seq scan, the query is faster than a BRIN bitmap scan, with this pathological data distribution, but the planner won't choose it unless forced to: set enable_bitmapscan = 'off'; explain analyze select * from t where a between 1923300::int and 1923600::int; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Gather (cost=1000.00..120348.10 rows=291 width=12) (actual time=372.766..380.364 rows=288 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on t (cost=0.00..119319.00 rows=121 width=12) (actual time=268.326..366.228 rows=96 loops=3) Filter: ((a >= 1923300) AND (a <= 1923600)) Rows Removed by Filter: 3333237 Planning Time: 0.089 ms Execution Time: 380.434 ms (8 rows) And just to compare size: BRIN 32kB BRIN multi 136kB Btree 188MB [1] https://www.postgresql.org/message-id/459eef3e-48c7-0f5a-8356-992442a78bb6%402ndquadrant.com -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services