Re: Feature request for adoptive indexes

Tomas Vondra Tue, 26 Oct 2021 13:43:41 -0700



On 10/26/21 21:39, Pavel Borisov wrote:

I've already answered OP but it seems in the wrong thread, so I copy ithere:
I think now in many cases you can effectively use covering index to havefast index-only scans without index duplication. It will help if youdon't have great selectivity on the last column (most probably youdon't). E.g.:
CREATE INDEX ON table_name (`job`,`nlp`,`year`) INCLUDE (`scan_id`,`issue_flag`, `sequence`)
But I consider the feature can be useful when there is a very littleselectivity in the first index columns. I.e. if (job`,`nlp`,`year') hasmany repeats and the most selection is done in the last column. I am notsure how often this can arise but in general, I see it as a usefulb-tree generalization.
I'm not sure how it should be done. In my view, we need to add anordered posting tree as a leaf element if b-tree and now we have indexstorage only for tuples. The change of on-disk format was previously noteasy in nbtree and if we consider the change, we need an extra bit tomark posting trees among index tuples. Maybe it could be done in a waysimilar to deduplicated tuples if some bits in the tuple header arestill could be freed.
Thoughts?

    If I get what you propose, you want to have a "top" tree for (job, nlp,
    year), which "splits" the data set into subsets of ~5000-7000 rows. And
    then for each subset you want a separate "small" trees on each of the
    other columns, so in this case three trees.

    Well, the problem with this is pretty obvious - each of the small trees
    requires separate copies of the leaf pages. And remember, in a btree the
    internal pages are usually less than 1% of the index, so this pretty
    much triples the size of the index. And if you insert a row into the
    index, it has to insert the item pointer into each of the small trees,
    likely requiring a separate I/O for each.

    So I'd bet this is not any different from just having three separate
    indexes - it doesn't save space, doesn't save I/O, nothing.
Tomas, I really think we should not try realizing this feature usingexisting index pages that contain only tuples. You are right, it willcause large overhead. If instead we decide and succeed in creating"posting trees" as a new on-disk page entry type we can have an indexwith space comparable to the abovementioned covering index but withsorting of values in these trees (i.e. all values are sorted, and "key"ones).

Well, there was no explanation about how it could/should be implemented,and maybe there is some elaborate way to handle the "posting trees" thatI can't quite think of (at least not in the btree context).

I'm still rather skeptical about it - for such feature to be useful theprefix columns must not be very selective, i.e. the posting trees areexpected to be fairly large (e.g. 5-7k rows). It pretty much has to torequire multiple (many) index pages, in order for the "larger" btreeindex to be slower. And at that point I'd expect the extra overhead tobe worse than simply defining multiple simple indexes.

A simple experiment would be to measure timing for queries with acondition on "sequence" using two indexes:


1) (job, nlp, year, sequence)
2) (job, nlp, year, scan_id, issue_flag, sequence)

The (1) index is "optimal" i.e. there's unlikely to be a better indexfor this query, at least no tree-like. (2) is something like the "worst"case index that we can use for this query.


For the new feature to be useful, two things would need to be true:

* query with (2) is much slower than (1)
* the new index would need to be close to (1)

Obviously, if the new index is slower than (2), it's mostly uselessright off the bat. And it probably can't be faster than (1) in practice,as it still is basically a btree index (at least the top half).

So I'd expect the performance to be somewhere between (1) and (2), butif (2) is very close to (1) - which I'd bet it is - then the potentialbenefit is also pretty small.

Perhaps I'm entirely wrong and there's a new type of index, bettersuited for cases similar to this. The "posting tree" reference actuallymade me thinking that maybe btree_gin might be applicable here?



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Feature request for adoptive indexes

Reply via email to