Re: [HACKERS] [WIP] Better partial index-only scans

Joshua Yanovski Mon, 17 Mar 2014 00:16:24 -0700

Here's a SQL script that (1) demonstrates the new index only scan
functionality, and (2) at least on my machine, has a consistently
higher planning time for the version with my change than without it.


On Sun, Mar 16, 2014 at 5:08 AM, Joshua Yanovski <pythones...@gmail.com> wrote:
> Proof of concept initial patch for enabling index only scans for
> partial indices even when an attribute is not in the target list, as
> long as it is only used in restriction clauses that can be proved by
> the index predicate.  This also works for index quals, though they
> still can't be used in the target list.  However, this patch may be
> inefficient since it duplicates effort that is currently delayed until
> after the best plan is chosen.
>
> The patch works by basically repeating the logic from
> create_indexscan_plan in createplan.c that determines which clauses
> can't be discarded, instead of the current approach, which just
> assumes that any attributes referenced anywhere in a restriction
> clause has to be a column in the relevant index.  It should build
> against master and passes make check for me.  It also includes a minor
> fix in the same code in createplan.c to make sure we're explicitly
> comparing an empty list to NIL, but I can take that out if that's not
> considered in scope.  If this were the final patch I'd probably
> coalesce the code used in both places into a single function, but
> since I'm not certain that the implementation in check_index_only
> won't change substantially I held off on that.
>
> Since the original comment suggested that this was not done due to
> planner performance concerns, I assume the performance of this
> approach is unacceptable (though I did a few benchmarks and wasn't
> able to detect a consistent difference--what would be a good test for
> this?).  As such, this is intended as more of a first pass that I can
> build on, but I wanted to get feedback at this stage on where we can
> improve (particularly if there were already ideas on how this might be
> done, as the comment hints).  Index only scans cost less than regular
> index scans so I don't think we can get away with waiting until we've
> chosen the best plan before we do the work described above.  That
> said, as I see it performance could improve in any combination of five
> ways:
> * Improve the performance of determining which clauses can't be
> discarded (e.g. precompute some information about equivalence classes
> for index predicates, mess around with the order in which we check the
> clauses to make it fail faster, switch to real union-find data
> structures for equivalence classes).
> * Find a cleverer way of figuring out whether a partial index can be
> used than just checking which clauses can't be discarded.
> * Use a simpler heuristic (that doesn't match what use to determine
> which clauses can be discarded, but still matches more than we do
> now).
> * Take advantage of work we do here to speed things up elsewhere (e.g.
> if this does get chosen as the best plan we don't need to recompute
> the same information in create_indexscan_plan).
> * Delay determining whether to use an index scan or index only scan
> until after cost analysis somehow.  I'm not sure exactly what this
> would entail.
>
> Since this is my first real work with the codebase, I'd really
> appreciate it if people could help me figure out the best approach
> here (and, more importantly, if one is necessary based on benchmarks).
>  And while this should go without saying, if this patch doesn't
> actually work then please let me know, since all the above is based on
> the assumption that what's there is enough :)
>
> Thanks,
> Joshua Yanovski



-- 
Josh

test_indexscan.sql
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [WIP] Better partial index-only scans

Reply via email to