Re: [HACKERS] index-only scans

Robert Haas Fri, 12 Aug 2011 08:14:21 -0700

On Fri, Aug 12, 2011 at 9:31 AM, Cédric Villemain
<cedric.villemain.deb...@gmail.com> wrote:
>> Well... PostgreSQL can only use the index on a or the index on b, not
>> both.  This patch doesn't change that.  I'm not trying to use indexes
>> in some completely new way; I'm just trying to make them faster by
>> optimizing away the heap access.
>
> For this kind of plan :
> Bitmap Heap Scan
>  Recheck Cond
>   BitmapAnd
>      Bitmap Index Scan
>      Bitmap Index Scan
>
> It may prevent useless Heap Fetch during "Bitmap Heap Scan", isn't it ?


The input to the bitmap heap scan is just a TID bitmap.  It's too late
to decide we want the index tuples at that point; we've forgotten
where they are, and even if we remembered it, it would necessarily be
any cheaper than checking the heap.  We could optimize this to avoid a
heap fetch in the case where we don't need any of the tuple data at
all, but that's going to be somewhat rare, I think.

>>>> Actually, I can see a possible way to allow an index-only type
>>>> optimization to be used for bitmap scans.  As you scan the index, any
>>>> tuples that can be handled index-only get returned immediately; the
>>>> rest are thrown into a bitmap.  Once you're done examining the index,
>>>> you then do a bitmap heap scan to get the tuples that couldn't be
>>>> handled index-only.  This seems like it might be our best hope for a
>>>> "fast count(*)" type optimization, especially if you could combine it
>>>> with some method of scanning the index in physical order rather than
>>>> logical order.
>>>
>>> IIRC we expose some ideas around that, yes. (optimizing bitmap)
>>>
>>> Maybe a question that will explain me more about the feature
>>> limitation (if any):
>>> Does an index-only scan used when the table has no vismap set will
>>> cost (in duration, IO, ...) more than a normal Index scan ?
>>
>> Yeah, it'll do a bit of extra work - the btree AM will cough up the
>> tuple uselessly, and we'll check the visibility map, also uselessly.
>> Then we'll end up doing it the regular way anyhow.  I haven't measured
>> that effect yet; hopefully it's fairly small.
>
> If it is small, or if we can reduce it to be near absent.
> Then... why do we need to distinguish Index Scan at all ? (or
> Index*-Only* scan, which aren't 100% 'Only' btw)
> It is then just an internal optimisation on how we can access/use an
> index. (really good and promising one)

Right, that's kind of what I was going for.  But the overhead isn't
going to be exactly zero, so I think it makes sense to disable it in
the cases where it clearly can't work.  The question is really more
whether we need to get more fine-grained than that, and actually do a
cost-based comparison of an index-only scan vs. a regular index scan.
I hope not, but I haven't tested it yet.

One other thing to think about is that the choice to use an index-scan
is not free of externalities.  The index-only scan is hopefully faster
than touching all the heap pages, but even if it weren't, not touching
all of the heap pages potentially means avoiding evicting things from
shared_buffers that some *other* query might need.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] index-only scans

Reply via email to