Hi,

On 07/10/2015 10:43 PM, Tom Lane wrote:
> Tomas Vondra <tomas.von...@2ndquadrant.com> writes:
>> currently partial indexes end up not using index only scans in most
>> cases, because check_index_only() is overly conservative, as explained
>> in this comment:
>> ...

>> I've done a bunch of tests, and I do see a small (hardly noticeable)
>> increase in planning time with a long list of WHERE clauses, because
>> all of those need to be checked against the index predicate. Not sure
>> if this is what's meant by 'quite expensive' in the comment. Moreover,
>> this was more than compensated for by the IOS benefits (even with
>> everything in RAM).

>> But maybe it's possible to fix that somehow? For example, we're
>> certainly doing those checks elsewhere when deciding which clauses need
>> to be evaluated at run-time, so maybe we could cache that somehow?

> The key problem here is that you're doing those proofs vastly earlier
> than before, for indexes that might not get used at all in the final
> plan. If you do some tests with multiple partial indexes you will
> probably see a bigger planning-time penalty.

Hmmm. Maybe we could get a bit smarter by looking at the attnums of
each clause before doing the expensive stuff (which is
predicate_implied_by, I believe), exploiting a few simple observations:

  * if the clause is already covered by attrs_used, we don't need to
    process it at all

  * if the clause uses attributes not included in the index predicate,
    we know it can't be implied

Of course, those are local optimizations, and can't fix some of the problems (e.g. a lot of partial indexes).
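
To make those two checks concrete, consider a toy example (all the
names here are made up, and the comments describe what the proposed
pre-filtering would conclude for each clause):

  CREATE TABLE t (a int, b int, c int);
  CREATE INDEX t_a_part_idx ON t (a) WHERE b = 1;

  SELECT a FROM t WHERE a < 100 AND c = 2 AND b = 1;

  -- "a < 100" only references column a, which is in attrs_used anyway
  -- (it's in the target list), so there's nothing to prove for it
  --
  -- "c = 2" references column c, which does not appear in the index
  -- predicate (b = 1), so it can't possibly be implied; skip the proof
  --
  -- "b = 1" is the only clause for which predicate_implied_by()
  -- actually needs to run

So with a bit of luck, most clauses never reach the expensive proof
at all.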

> Perhaps we should bite the bullet and do it anyway, but I'm pretty
> suspicious of any claim that the planning cost is minimal.

Perhaps - I'm not claiming the planning cost is minimal. It was in the
tests I've done, but no doubt it's possible to construct examples where
the planning time gets much worse. With 30 partial indexes, I saw an
increase from ~0.01 ms to ~2.5 ms on simple queries.
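
For reference, the test setup was roughly of this shape (just a
sketch, the actual schema and predicates differed):

  CREATE TABLE t (a int, b int);

  -- 30 partial indexes, each with a different predicate, so the
  -- planner has to consider an implication proof for each of them
  DO $$
  BEGIN
    FOR i IN 1..30 LOOP
      EXECUTE format(
        'CREATE INDEX t_a_part_%s_idx ON t (a) WHERE b = %s', i, i);
    END LOOP;
  END $$;

  EXPLAIN ANALYZE SELECT a FROM t WHERE a < 100 AND b = 1;
  -- compare the "Planning time" line with and without the patch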

But maybe we could get at least some of the benefits by planning the
index scans as we do today, and only doing the IOS check later? Of
course, this won't help with cases where the index scan is thrown away
while the index only scan would win, but it does help with cases where
we end up doing an index scan anyway.

That's essentially what I'm struggling with right now - I have a 3TB
data set, and the plan looks like this:

                               QUERY PLAN
------------------------------------------------------------------------
 Sort  (cost=1003860164.92..1003860164.92 rows=1 width=16)
   Sort Key: orders.o_orderpriority
   ->  HashAggregate
         Group Key: orders.o_orderpriority
         ->  Merge Semi Join
               Merge Cond:
               ->  Index Scan using pk_orders on orders
                     Filter: ((o_orderdate >= '1997-07-01'::date) AND
                     (o_orderdate < '1997-10-01 00:00:00'::timestamp))
               ->  Index Scan using lineitem_l_orderkey_idx_part1 on
                   lineitem

and the visibility checks from the Index Scans are killing the I/O. An
IOS is likely to perform much better here (but I haven't run the query
yet).
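
Whether an IOS actually avoids those heap visits depends on how much
of the tables is marked all-visible, which is easy to sanity-check
from pg_class (just a quick check, nothing to do with the patch
itself):

  SELECT relname, relpages, relallvisible,
         round(100.0 * relallvisible / nullif(relpages, 0), 1)
           AS pct_all_visible
  FROM pg_class
  WHERE relname IN ('orders', 'lineitem');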

regards


--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

