On Sat, 7 Mar 2020 at 03:49, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: > > On Wed, Mar 04, 2020 at 11:32:00AM +1300, David Rowley wrote: > >The reason it must be done this way is that when the RelOptInfo that > >we're performing the DISTINCT on is a joinrel, then we're not going to > >see any IndexPaths in the RelOptInfo's pathlist. We'll have some sort > >of Join path instead. I understand you're not yet supporting doing > >this optimisation when joins are involved, but it should be coded in > >such a way that it'll work when we do. (It's probably also a separate > >question as to whether we should have this only work when there are no > >joins. I don't think I personally object to it for stage 1, but > >perhaps someone else might think differently.) > > > > I don't follow. Can you elaborate more? > > AFAICS skip-scan is essentially a capability of an (index) AM. I don't > see how we could ever do that for joinrels? We can do that at the scan > level, below a join, but that's what this patch already supports, I > think. When you say "supporting this optimisation" with joins, do you > mean doing skip-scan for join inputs, or on top of the join?
The skip index scan Path would still be created at the base rel level, but the join path on the join relation would have one of the sub-paths of the join as an index skip scan. An example query that could make use of this is: SELECT * FROM some_table WHERE a IN(SELECT indexed_col_with_few_distinct_values FROM big_table); In this case, we might want to create a Skip Scan path on big_table using the index on the "indexed_col_with_few_distinct_values", then Hash Join to "some_table". That class of query is likely stage 2 or 3 of this work, but we need to lay foundations that'll support it. As for not having IndexScan paths in joinrels. Yes, of course, but that's exactly why create_distinct_paths() cannot work the way the patch currently codes it. The patch does: + /* + * XXX: In case of index scan quals evaluation happens + * after ExecScanFetch, which means skip results could be + * fitered out. Consider the following query: + * + * select distinct (a, b) a, b, c from t where c < 100; + * + * Skip scan returns one tuple for one distinct set of (a, + * b) with arbitrary one of c, so if the choosed c does + * not match the qual and there is any c that matches the + * qual, we miss that tuple. + */ + if (path->pathtype == T_IndexScan && which will never work for join relations since they'll only have paths for Loop/Merge/Hash type joins. The key here is to determine which skip scan paths we should create when we're building the normal index paths then see if we can make use of those when planning joins. Subsequently, we'll then see if we can make use of the resulting join paths during create_distinct_paths(). Doing it this way will allow us to use skip scans in queries such as: SELECT DISTINCT t1.z FROM t1 INNER JOIN t2 ON t1.a = t2.unique_col; We'll first create the skip scan paths on t1, then when creating the join paths we'll create additional join paths which use the skipscan path. Because t1.unique_col will at most have 1 join partner for each t2 row, then the join path will have the same unique_keys as the skipscan path. That'll allow us to use the join path which has the skip scan on whichever side of the join the t1 relation ends up. All create_distinct_paths() should be doing is looking for paths that are already implicitly unique on the distinct clause and consider using those in a cost-based way. It shouldn't be making such paths itself. > >For storing these new paths with UniqueKeys, I'm not sure exactly if > >we can just add_path() such paths into the RelOptInfo's pathlist. > >What we don't want to do is accidentally make use of paths which > >eliminate duplicate values when we don't want that behaviour. If we > >did store these paths in RelOptInfo->pathlist then we'd need to go and > >modify a bunch of places to ignore such paths. set_cheapest() would > >have to do something special for them too, which makes me think > >pathlist is the incorrect place. Parallel query added > >partial_pathlist, so perhaps we need unique_pathlist to make this > >work. > > > > Hmmm, good point. Do we actually produce incorrect plans with the > current patch, using skip-scan path when we should not? I don't think so. The patch is only creating skip scan paths on the base rel when we discover it's valid to do so. That's not the way it should work though. How the patch currently works would be similar to initially only creating a SeqScan path for a query such as: SELECT * FROM tab ORDER BY a;, but then, during create_ordered_paths() go and create some IndexPath to scan the btree index on tab.a because we suddenly realise that it'll be good to use that for the ORDER BY. The planner does not work that way. We always create all the paths that we think will be useful during set_base_rel_pathlists(). We then make use of only existing paths in the upper planner. See what build_index_paths() in particular: /* see if we can generate ordering operators for query_pathkeys */ match_pathkeys_to_index(index, root->query_pathkeys, &orderbyclauses, &orderbyclausecols); We'll need something similar to that but for the query_uniquekeys and ensure we build the skip scan paths when we think they'll be useful and do so during the call to set_base_rel_pathlists(). Later in stage 2 or 3, we can go build skip scan paths when there are semi/anti joins that could make use of them. Making that work will just be some plumbing work in build_index_paths() and making use of those paths during add_paths_to_joinrel(). David