> On Sun, Jun 07, 2020 at 06:51:22PM +1200, David Rowley wrote:
>
> > * one in create_distinct_paths as per current implementation
> >
> > with what seems to be similar content.
>
> I think we need to have UniqueKeys in RelOptInfo so we can describe
> what a relation is unique by. There's no point for example in
> creating skip scan paths for a relation that's already unique on
> whatever we might try to skip scan on. e.g someone does:
>
> SELECT DISTINCT unique_and_indexed_column FROM tab;
>
> Since there's a unique index on unique_and_indexed_column then we
> needn't try to create a skipscan path for it.
>
> However, the advantages of having UniqueKeys on the RelOptInfo goes a
> little deeper than that. We can make use of it anywhere where we
> currently do relation_has_unique_index_for() for. Plus we get what
> Andy wants and can skip useless DISTINCT operations when the result is
> already unique on the distinct clause. Sure we could carry all the
> relation's unique properties around in Paths, but that's not the right
> place. It's logically a property of the relation, not the path
> specifically. RelOptInfo is a good place to store the properties of
> relations.
>
> The idea of the meaning of uniquekeys within a path is that the path
> is specifically making those keys unique. We're not duplicating the
> RelOptInfo's uniquekeys there.
>
> If we have a table like:
>
> CREATE TABLE tab (
>     a INT PRIMARY KEY,
>     b INT NOT NULL
> );
>
> CREATE INDEX tab_b_idx ON tab (b);
>
> Then I'd expect a query such as: SELECT DISTINCT b FROM tab; to have
> the uniquekeys for tab's RelOptInfo set to {a}, and the seqscan and
> index scan paths uniquekey properties set to NULL, but the skipscan
> index path uniquekeys for tab_b_idx set to {b}. Then when we go
> create the distinct paths Andy's work will see that there's no
> RelOptInfo uniquekeys for the distinct clause, but the skip scan work
> will loop over the unique_pathlist and find that we have a skipscan
> path with the required uniquekeys, a.k.a {b}.
>
> Does that make sense?
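
The lookup described in the quoted example can be sketched as a toy model in Python. This is not PostgreSQL code: the Rel and Path classes, the plan_distinct function and its return strings are made up for illustration. Only the names uniquekeys and unique_pathlist come from the thread. It also assumes that a set of unique keys satisfies a DISTINCT clause whenever it is a subset of the clause's columns, and it ignores NULL handling and equivalence classes.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass
class Path:
    """Hypothetical stand-in for a planner path."""
    name: str
    # Keys that this particular path makes unique; None if it adds none.
    uniquekeys: Optional[FrozenSet[str]] = None


@dataclass
class Rel:
    """Hypothetical stand-in for RelOptInfo."""
    # Keys the relation is unique by no matter which path is chosen.
    uniquekeys: List[FrozenSet[str]]
    pathlist: List[Path] = field(default_factory=list)
    unique_pathlist: List[Path] = field(default_factory=list)


def plan_distinct(rel: Rel, distinct_cols: List[str]) -> str:
    wanted = frozenset(distinct_cols)

    # Relation level: if the rel is already unique on a subset of the
    # DISTINCT columns, the DISTINCT step can be skipped entirely.
    if any(keys <= wanted for keys in rel.uniquekeys):
        return "DISTINCT is redundant"

    # Path level: look for a path that itself makes the columns unique.
    for path in rel.unique_pathlist:
        if path.uniquekeys is not None and path.uniquekeys <= wanted:
            return f"use {path.name}"

    # Neither applies, so an explicit unique/aggregate step is needed.
    return "add explicit DISTINCT step"


# The table from the example: a is the primary key, b has a plain index.
tab = Rel(
    uniquekeys=[frozenset({"a"})],
    pathlist=[Path("seqscan"), Path("index scan on tab_b_idx")],
    unique_pathlist=[Path("skip scan on tab_b_idx", frozenset({"b"}))],
)

print(plan_distinct(tab, ["b"]))  # found via unique_pathlist
print(plan_distinct(tab, ["a"]))  # found via the rel's uniquekeys
```

In this model SELECT DISTINCT b is answered by the skip scan path, because the rel's own uniquekeys {a} do not cover b, while SELECT DISTINCT a needs no distinct step at all.
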
Yes, from this point of view it makes sense. I've already posted the
first version of index skip scan based on this implementation [1].
There could be rough edges, but overall I hope we're on the same page.

[1]: https://www.postgresql.org/message-id/flat/20200609102247.jdlatmfyeecg52fi%40localhost