Here's a SQL script that (1) demonstrates the new index only scan functionality, and (2) at least on my machine, has a consistently higher planning time for the version with my change than without it.
On Sun, Mar 16, 2014 at 5:08 AM, Joshua Yanovski <pythones...@gmail.com> wrote: > Proof of concept initial patch for enabling index only scans for > partial indices even when an attribute is not in the target list, as > long as it is only used in restriction clauses that can be proved by > the index predicate. This also works for index quals, though they > still can't be used in the target list. However, this patch may be > inefficient since it duplicates effort that is currently delayed until > after the best plan is chosen. > > The patch works by basically repeating the logic from > create_indexscan_plan in createplan.c that determines which clauses > can't be discarded, instead of the current approach, which just > assumes that any attributes referenced anywhere in a restriction > clause has to be a column in the relevant index. It should build > against master and passes make check for me. It also includes a minor > fix in the same code in createplan.c to make sure we're explicitly > comparing an empty list to NIL, but I can take that out if that's not > considered in scope. If this were the final patch I'd probably > coalesce the code used in both places into a single function, but > since I'm not certain that the implementation in check_index_only > won't change substantially I held off on that. > > Since the original comment suggested that this was not done due to > planner performance concerns, I assume the performance of this > approach is unacceptable (though I did a few benchmarks and wasn't > able to detect a consistent difference--what would be a good test for > this?). As such, this is intended as more of a first pass that I can > build on, but I wanted to get feedback at this stage on where we can > improve (particularly if there were already ideas on how this might be > done, as the comment hints). Index only scans cost less than regular > index scans so I don't think we can get away with waiting until we've > chosen the best plan before we do the work described above. That > said, as I see it performance could improve in any combination of five > ways: > * Improve the performance of determining which clauses can't be > discarded (e.g. precompute some information about equivalence classes > for index predicates, mess around with the order in which we check the > clauses to make it fail faster, switch to real union-find data > structures for equivalence classes). > * Find a cleverer way of figuring out whether a partial index can be > used than just checking which clauses can't be discarded. > * Use a simpler heuristic (that doesn't match what use to determine > which clauses can be discarded, but still matches more than we do > now). > * Take advantage of work we do here to speed things up elsewhere (e.g. > if this does get chosen as the best plan we don't need to recompute > the same information in create_indexscan_plan). > * Delay determining whether to use an index scan or index only scan > until after cost analysis somehow. I'm not sure exactly what this > would entail. > > Since this is my first real work with the codebase, I'd really > appreciate it if people could help me figure out the best approach > here (and, more importantly, if one is necessary based on benchmarks). > And while this should go without saying, if this patch doesn't > actually work then please let me know, since all the above is based on > the assumption that what's there is enough :) > > Thanks, > Joshua Yanovski -- Josh
test_indexscan.sql
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers