Re: Index Skip Scan

Dmitry Dolgov Sat, 17 Nov 2018 15:27:23 -0800

> On Fri, 16 Nov 2018 at 16:06, Jesper Pedersen <jesper.peder...@redhat.com> 
> wrote:
>
> On 11/15/18 6:41 AM, Alexander Kuzmenkov wrote:
> >>> But having this logic inside _bt_next means that it will make a
> >>> non-skip index
> >>> only scan a bit slower, am I right?
> >>
> >> Correct.
> >
> > Well, it depends on how the skip scan is implemented. We don't have to
> > make normal scans slower, because skip scan is just a separate thing.
> >
> > My main point was that current implementation is good as a proof of
> > concept, but it is inefficient for some data and needs some unreliable
> > planner logic to work around this inefficiency. And now we also have
> > execution-time fallback because planning-time fallback isn't good
> > enough. This looks overly complicated. Let's try to design an algorithm
> > that works well regardless of the particular data and doesn't need these
> > heuristics. It should be possible to do so for btree.
> >
> > Of course, I understand the reluctance to implement an entire new type
> > of btree traversal. Anastasia Lubennikova suggested a tweak for the
> > current method that may improve the performance for small groups of
> > equal values. When we search for the next unique key, first check if it
> > is contained on the current btree page using its 'high key'. If it is,
> > find it on the current page. If not, search from the root of the tree
> > like we do now. This should improve the performance for small equal
> > groups, because there are going to be several such groups on the page.
> > And this is exactly where we have the regression compared to unique +
> > index scan.
>
> Robert suggested something similar in [1]. I'll try and look at that
> when I'm back from my holiday.

Yeah, probably you're right. Unfortunately, I've misunderstood the previous
Robert's message in this thread with the similar approach. Jesper, I hope you
don't mind if I'll post an updated patch? _bt_skip is changed there in the
suggested way, so that it checks the current page before searching from the
root of a tree, and I've removed the fallback logic. After some
initial tests I see
that with this version skip scan over a table with 10^7 rows and 10^6
distinct values is slightly slower than a regular scan, but not that much.

> > By the way, what is the data for which we intend this feature to work?
> > Obviously a non-unique btree index, but how wide are the tuples, and how
> > big the equal groups? It would be good to have some real-world examples.
>
> Although my primary use-case is int I agree that we should test the
> different data types, and tuple widths.

My personal motivation here is exactly that we face use-cases for skip scan
from time to time. Usually it's quite few distinct values (up to a dozen or
so, which means that equal groups are quite big), but with the variety of types
and widths.

> On Thu, 15 Nov 2018 at 15:28, James Coleman <jtc...@gmail.com> wrote:
>
> > Is skip scan only possible for index-only scan?
>
> I haven't see much discussion of this question yet. Is there a
> particular reason to lock ourselves into thinking about this only in
> an index only scan?

I guess, the only reason is to limit the scope of the patch.

0001-Index-skip-scan-v4.patch
Description: Binary data

Re: Index Skip Scan

Reply via email to