Hi!

I think Bitmap Index Scan should take advantage of B-tree INCLUDE columns, 
which it doesn’t at the moment (tested on master as of yesterday).

Consider this (find the setup at the bottom of this mail).

CREATE INDEX idx ON tbl (a, b) INCLUDE (c);

EXPLAIN (analyze, buffers)
SELECT *
  FROM tbl
 WHERE a = 1
   AND c = 1;

The following plan could be produced:

                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tbl  (cost=4.14..8.16 rows=1 width=7616) (actual 
time=0.027..0.029 rows=1 loops=1)
   Recheck Cond: (a = 1)
   Filter: (c = 1)
   Rows Removed by Filter: 7
   Heap Blocks: exact=2
   Buffers: shared hit=2 read=1
   ->  Bitmap Index Scan on idx  (cost=0.00..4.14 rows=1 width=0) (actual 
time=0.018..0.018 rows=8 loops=1)
         Index Cond: (a = 1)
         Buffers: shared read=1
 Planning Time: 0.615 ms
 Execution Time: 0.060 ms


The Bitmap Index Scan does not filter on column C, which is INCLUDEd in the 
index leaf nodes.

Instead, the Bitmap Heap Scan fetches unnecessary blocks and then applies the 
filter.

I would expect a similar execution as with this index:

DROP INDEX idx;
CREATE INDEX idx ON tbl (a, b, c);

                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tbl  (cost=4.14..8.16 rows=1 width=7616) (actual 
time=0.021..0.021 rows=1 loops=1)
   Recheck Cond: ((a = 1) AND (c = 1))
   Heap Blocks: exact=1
   Buffers: shared hit=1 read=1
   ->  Bitmap Index Scan on idx  (cost=0.00..4.14 rows=1 width=0) (actual 
time=0.018..0.018 rows=1 loops=1)
         Index Cond: ((a = 1) AND (c = 1))
         Buffers: shared read=1
 Planning Time: 0.123 ms
 Execution Time: 0.037 ms

(As a side node: I also dislike it how Bitmap Index Scan mixes search 
conditions and filters in “Index Cond”)

Due to the not-filtered column B in the index, the use of column C is here 
pretty much the same as it is for the first index, which has C in INCLUDE.

Am I missing anything or is it just an oversight because INCLUDE was primarily 
done for Index Only Scan?

As a background, here is the use case I see for this scenario:
Filters that can not be used for searching the tree can be put in INCLUDE 
instead of the key-part of the index even if there is no intend to run the 
query as Index Only Scan (e.g. SELECT *). Columns in INCLUDE can be used for 
filtering (works fine for Index Only Scan, btw).

The most prominent example are post-fix or in-fix LIKE ‘%term%’ filters. The 
benefit of putting such columns in INCLUDE is that it is clearly documented 
that the index column is neither used for searching in the tree nor for 
ordering. It is either used for filtering, or for fetching. Knowing this makes 
it easier to extend existing index.

Imagine you have this query to optimize:

SELECT *
  FROM …
 WHERE a = ?
   AND b = ?
 ORDER BY ts DESC
 LIMIT 10

And find this existing index on that table:

CREATE INDEX … ON … (a, b) INCLUDE (c);

In this case it is a rather safe option to replace the index with the following:

CREATE INDEX … ON … (a, b, ts) INCLUDE (c);

On the other hand, if the original index had all column in the key part, 
changing it to (A, B, TS, C) is a rather dangerous option.

-markus

— Setup for the demo

-- demo table: 4 rows per block
-- the row if interest is the first one, the others spill to a second block so 
the problem can also seen in “Buffers"

CREATE TABLE tbl (a INTEGER, b INTEGER, c INTEGER, junk CHAR(1900));

INSERT INTO tbl VALUES (1, 1, 1, 'junk'),
                       (1, 1, 9, 'junk'),
                       (1, 1, 9, 'junk'),
                       (1, 1, 9, 'junk'),
                       (1, 1, 9, 'junk'),
                       (1, 1, 9, 'junk'),
                       (1, 1, 9, 'junk'),
                       (1, 1, 9, 'junk');

-- prevent unwanted plans for demo of Bitmap Index+Heap Scan
SET ENABLE_INDEXSCAN = OFF;
SET ENABLE_SEQSCAN = OFF;

Reply via email to