Sorry, there's a too-hard-to-read part. At Thu, 25 Jul 2019 20:17:37 +0900 (Tokyo Standard Time), Kyotaro Horiguchi <horikyota....@gmail.com> wrote in <20190725.201737.192223037.horikyota....@gmail.com> > Hello. > > At Wed, 24 Jul 2019 22:49:32 +0200, Dmitry Dolgov <9erthali...@gmail.com> > wrote in <CA+q6zcXgwDMiowOGbr7gimTY3NV-LbcwP=rbma_l56pc+9p...@mail.gmail.com> > > > On Mon, Jul 22, 2019 at 7:10 PM Jesper Pedersen > > > <jesper.peder...@redhat.com> wrote: > > > > > > On 7/22/19 1:44 AM, David Rowley wrote: > > > > Here are the comments I noted down during the review: > > > > > > > > cost_index: > > > > > > > > I know you've not finished here, but I think it'll need to adjust > > > > tuples_fetched somehow to account for estimate_num_groups() on the > > > > Path's unique keys. Any Eclass with an ec_has_const = true does not > > > > need to be part of the estimate there as there can only be at most one > > > > value for these. > > > > > > > > For example, in a query such as: > > > > > > > > SELECT x,y FROM t WHERE x = 1 GROUP BY x,y; > > > > > > > > you only need to perform estimate_num_groups() on "y". > > > > > > > > I'm really not quite sure on what exactly will be required from > > > > amcostestimate() here. The cost of the skip scan is not the same as > > > > the normal scan. So other that API needs adjusted to allow the caller > > > > to mention that we want skip scans estimated, or there needs to be > > > > another callback. > > > > > > > > > > I think this part will become more clear once the index skip scan patch > > > is rebased, as we got the uniquekeys field on the Path, and the > > > indexskipprefixy info on the IndexPath node. > > > > Here is what I came up with to address the problems, mentioned above in this > > thread. It passes tests, but I haven't tested it yet more thoughtful (e.g. > > it > > occurred to me, that `_bt_read_closest` probably wouldn't work, if a next > > key, > > that passes an index condition is few pages away - I'll try to tackle it > > soon). > > Just another small step forward, but I hope it's enough to rebase on top of > > it > > planner changes. > > > > Also I've added few tags, mostly to mention reviewers contribution. > > I have some comments. > > + * The order of columns in the index should be the same, as for > + * unique distincs pathkeys, otherwise we cannot use _bt_search > + * in the skip implementation - this can lead to a missing > + * records. > > It seems that it is enough that distinct pathkeys is contained in > index pathkeys. If it's right, that is almost checked in existing > code: > > > if (pathkeys_contained_in(needed_pathkeys, path->pathkeys)) > > It is perfect when needed_pathkeys is distinct_pathkeys. So > additional check is required if and only if it is not the case.
So I think the following change will work. - + /* Consider index skip scan as well */ - + if (enable_indexskipscan && - + IsA(path, IndexPath) && - + ((IndexPath *) path)->indexinfo->amcanskip && - + (path->pathtype == T_IndexOnlyScan || - + path->pathtype == T_IndexScan) && - + root->distinct_pathkeys != NIL) + + if (enable_indexskipscan && + + IsA(path, IndexPath) && + + ((IndexPath *) path)->indexskipprefix >= 0 && + + (needed_pathkeys == root->distinct_pathkeys || + + pathkeys_contained_in(root->distinct_pathkeys, + + path->pathkeys))) Additional comments on the condition above are: > path->pathtype is always one of T_IndexOnlyScan or T_IndexScan so > the check against them isn't needed. If you have concern on that, > we can add that as Assert(). > > I feel uncomfortable to look into indexinfo there. Couldnd't we > use indexskipprefix == -1 to signal !amcanskip from > create_index_path? > > > + /* > + * XXX: In case of index scan quals evaluation happens after > + * ExecScanFetch, which means skip results could be fitered out > + */ > > Why can't we use skipscan path if having filter condition? If > something bad happens, the reason must be written here instead of > what we do. > > > + if (path->pathtype == T_IndexScan && > + parse->jointree != NULL && > + parse->jointree->quals != NULL && > + ((List *)parse->jointree->quals)->length != 0) > > It's better to use list_length instead of peeping inside. It > handles the NULL case as well. (The structure has recently > changed but .length is not, though.) > > > + * If advancing direction is different from index direction, we must > + * skip right away, but _bt_skip requires a starting point. > > It doesn't seem needed to me. Could you elaborate on the reason > for that? > > > + * If advancing direction is different from index direction, we must > + * skip right away, but _bt_skip requires a starting point. > + */ > + if (direction * indexonlyscan->indexorderdir < 0 && > + !node->ioss_FirstTupleEmitted) > > I'm confused by this. "direction" there is the physical scan > direction (fwd/bwd) of index scan, which is already compensated > by indexorderdir. Thus the condition means we do that when > logical ordering (ASC/DESC) is DESC. (Though I'm not sure what > "index direction" exactly means...) regards. -- Kyotaro Horiguchi NTT Open Source Software Center