On Wed, Jun 22, 2016 at 2:23 PM, Jonathan Vanasco <postg...@2xlp.com> wrote:
> > On Jun 22, 2016, at 4:25 AM, Erik Gustafson wrote: > > > don't you want an index on t_a2b.col_a, maybe partial where col_a=1 ? > > that table has indexes on all columns. they're never referenced because > the rows are so short. this was just an example query too, col_a has 200k > variations > > After a lot of testing, I think I found a not-bug but possible > area-for-improvement in the planner when joining against a table for > filtering (using my production 9.5.2 box) > > I checked a query What query? A self-contained email would be nice. against multiple possible indexes using the related columns. only one of > indexes was on the table for each series of tests, and I analyzed the table > after the drop/create of indexes. > > > Note 1: The only time an index-only scan is used, is on this form: > > CREATE INDEX idx_partial_fkey_id_partial ON > table_a(fkey_1, id, col_partial) WHERE col_partial IS NOT FALSE; > > Omitting the col_partial from being indexed will trigger a Bitmap > Heap Scan on the full table with a recheck condition: > > CREATE INDEX idx_partial_fkey_id ON table_a(fkey_1, id) > WHERE col_partial IS NOT FALSE; > > This shouldn't be necessary. the planner knew that `col_partial` > fulfilled the WHERE clause when it used the index, but scanned the table to > check it anyways. > > On most tables the heap scan was negligible, but on a few larger > tables it accounted a 20% increase in execution. > > https://www.postgresql.org/docs/9.6/static/indexes-index-only-scans.html Note especially: "Visibility information is not stored in index entries, only in heap entries; ..." The check against the heap isn't for the truthiness of the predicate but the visibility of the row. > Note 2: > > This is odd, but this index is used by the planner: > CREATE INDEX idx_partial_fkey_id ON table_a(fkey_1, id) > WHERE col_partial IS NOT FALSE; > > but this index is never used: > CREATE INDEX idx_partial_id_fkey ON table_a(id, fkey_1) > WHERE col_partial IS NOT FALSE; > > I honestly don't know why the second index would not be used. The > query time doubled without it when run on a table with 6million rows and > about 20 columns. > > This one requires knowledge of the query; but I am not surprised that reversing the order of columns in a b-tree index has an impact. ------- > > The indexes I tested on: > > CREATE INDEX idx_fkey_1 ON table_a(fkey_1); > CREATE INDEX idx_partial_fkey ON table_a(fkey_1) WHERE col_partial > IS NOT FALSE; > CREATE INDEX idx_partial_fkey_id ON table_a(fkey_1, id) WHERE > col_partial IS NOT FALSE; > CREATE INDEX idx_partial_id_fkey ON table_a(id, fkey_1) WHERE > col_partial IS NOT FALSE; > CREATE INDEX idx_partial_fkey_partial ON table_a(fkey_1, > col_partial) WHERE col_partial IS NOT FALSE; > CREATE INDEX idx_partial_fkey_id_partial ON table_a(fkey_1, id, > col_partial) WHERE col_partial IS NOT FALSE; All at once? David J.