> On Sun, Jul 12, 2020 at 12:48:47PM +0000, Floris Van Nee wrote: > > > > Good point, thanks for looking at this. With the latest planner version > > there > > are indeed more possibilities to use skipping. It never occured to me that > > some of those paths will still rely on index scan returning full data set. > > I'll look > > in details and add verification to prevent putting something like this on > > top of > > skip scan in the next version. > > I believe the required changes are something like in attached patch. There > were a few things I've changed: > - build_uniquekeys was constructing the list incorrectly. For a DISTINCT a,b, > it would create two unique keys, one with a and one with b. However, it > should be one unique key with (a,b).
Yes, I've also noticed that while preparing fix for index scan not covered by index and included it. > - the uniquekeys that is built, still contains some redundant keys, that are > normally eliminated from the path keys lists. I guess you're talking about: + if (EC_MUST_BE_REDUNDANT(ec)) + continue; Can you add some test cases to your changes to show the effect of it? It seem to me redundant keys are already eliminated at this point by either make_pathkeys_for_uniquekeys or even earlier for distinct on, but could be I've missed something. Along the lines I'm also curious about this part: - ListCell *k; - List *exprs = NIL; - - foreach(k, ec->ec_members) - { - EquivalenceMember *mem = (EquivalenceMember *) lfirst(k); - exprs = lappend(exprs, mem->em_expr); - } - - result = lappend(result, makeUniqueKey(exprs, false, false)); + EquivalenceMember *mem = (EquivalenceMember*) lfirst(list_head(ec->ec_members)); I'm curious about this myself, maybe someone can clarify. It looks like generaly speaking there could be more than one member (if not ec_has_volatile), which "representing knowledge that multiple items are effectively equal". Is this information is not interesting enough to preserve it in unique keys? > - the distinct_pathkeys may be NULL, even though there's a possibility for > skipping. But it wouldn't create the uniquekeys in this case. This makes the > planner not choose skip scans even though it could. For example in queries > that do SELECT DISTINCT ON (a) * FROM t1 WHERE a=1 ORDER BY a,b; Since a is > constant, it's eliminated from regular pathkeys. What would be the point of skipping if it's a constant? > - to combat the issues mentioned earlier, there's now a check in > build_index_paths that checks if the query_pathkeys matches the > useful_pathkeys. Note that we have to use the path keys here rather than any > of the unique keys. The unique keys are only Expr nodes - they do not contain > the necessary information about ordering. Due to elimination of some constant > path keys, we have to search the attributes of the index to find the correct > prefix to use in skipping. IIUC here you mean this function, right? + prefix = find_index_prefix_for_pathkey(root, + index, + BackwardScanDirection, + llast_node(PathKey, + root->distinct_pathkeys)); Doesn't it duplicate the job already done in build_index_pathkeys by building those pathkeys again? If yes, probably it's possible to reuse useful_pathkeys. Not sure about unordered indexes, but looks like query_pathkeys should also match in this case. Will also look at the follow up questions in the next email.