On Sun, Sep 11, 2016 at 9:20 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > Michael Malis <michaelmal...@gmail.com> writes: > >> As I understand it, a merge join will currently read all tuples from > both > >> subqueries (besides early termination). I believe it should be possible > to > >> take advantages of the indexes on one or both of the tables being read > from > >> to skip a large number of tuples that would currently be read. > >> > > IIUC, what you're proposing is to replace "read past some tuples in the > > index" with "re-descend the index btree". Yes, that would win if it >
Roughly yes, although is it necessary to rescan the btree from the top? Is it not possible to "resume" the scan from the previous location? > > You might want to troll the archives for past discussions of index skip > > scan, which is more or less the same idea (could possibly be exactly the > > same idea, depending on how it's implemented). > After searching a bit, all I was able to find was this <https://www.postgresql.org/message-id/CADLWmXXbTSBxP-MzJuPAYSsL_2f0iPm5VWPbCvDbVvfX93FKkw%40mail.gmail.com>. It looks like someone had a rough patch of applying a similar idea to DISTINCT. I can't tell what happened to it, but in the patch they mention that it should be possible to do a "local traversal ie from the current page". A different, but similar optimization, would be to first check the join condition with the data in the index. Then only fetch the full row only if the data in the index passes the join condition. I imagine that this would be easier to implement, although less efficent than what I proposed above because it has to scan the entire index. Thanks, Michael