Hello. At Wed, 24 Jul 2019 22:49:32 +0200, Dmitry Dolgov <9erthali...@gmail.com> wrote in <CA+q6zcXgwDMiowOGbr7gimTY3NV-LbcwP=rbma_l56pc+9p...@mail.gmail.com> > > On Mon, Jul 22, 2019 at 7:10 PM Jesper Pedersen > > <jesper.peder...@redhat.com> wrote: > > > > On 7/22/19 1:44 AM, David Rowley wrote: > > > Here are the comments I noted down during the review: > > > > > > cost_index: > > > > > > I know you've not finished here, but I think it'll need to adjust > > > tuples_fetched somehow to account for estimate_num_groups() on the > > > Path's unique keys. Any Eclass with an ec_has_const = true does not > > > need to be part of the estimate there as there can only be at most one > > > value for these. > > > > > > For example, in a query such as: > > > > > > SELECT x,y FROM t WHERE x = 1 GROUP BY x,y; > > > > > > you only need to perform estimate_num_groups() on "y". > > > > > > I'm really not quite sure on what exactly will be required from > > > amcostestimate() here. The cost of the skip scan is not the same as > > > the normal scan. So other that API needs adjusted to allow the caller > > > to mention that we want skip scans estimated, or there needs to be > > > another callback. > > > > > > > I think this part will become more clear once the index skip scan patch > > is rebased, as we got the uniquekeys field on the Path, and the > > indexskipprefixy info on the IndexPath node. > > Here is what I came up with to address the problems, mentioned above in this > thread. It passes tests, but I haven't tested it yet more thoughtful (e.g. it > occurred to me, that `_bt_read_closest` probably wouldn't work, if a next key, > that passes an index condition is few pages away - I'll try to tackle it > soon). > Just another small step forward, but I hope it's enough to rebase on top of it > planner changes. > > Also I've added few tags, mostly to mention reviewers contribution.
I have some comments. + * The order of columns in the index should be the same, as for + * unique distincs pathkeys, otherwise we cannot use _bt_search + * in the skip implementation - this can lead to a missing + * records. It seems that it is enough that distinct pathkeys is contained in index pathkeys. If it's right, that is almost checked in existing code: > if (pathkeys_contained_in(needed_pathkeys, path->pathkeys)) It is perfect when needed_pathkeys is distinct_pathkeys. So additional check is required if and only if it is not the case. > if (enable_indexskipscan && > IsA(path, IndexPath) && > ((IndexPath *) path)->indexinfo->amcanskip && > (path->pathtype == T_IndexOnlyScan || > path->pathtype == T_IndexScan) && > (needed_pathkeys == root->distinct_pathkeys || > pathkeys_contained_in(root->distinct_pathkeys, > path->pathkeys))) path->pathtype is always one of T_IndexOnlyScan or T_IndexScan so the check against them isn't needed. If you have concern on that, we can add that as Assert(). I feel uncomfortable to look into indexinfo there. Couldnd't we use indexskipprefix == -1 to signal !amcanskip from create_index_path? + /* + * XXX: In case of index scan quals evaluation happens after + * ExecScanFetch, which means skip results could be fitered out + */ Why can't we use skipscan path if having filter condition? If something bad happens, the reason must be written here instead of what we do. + if (path->pathtype == T_IndexScan && + parse->jointree != NULL && + parse->jointree->quals != NULL && + ((List *)parse->jointree->quals)->length != 0) It's better to use list_length instead of peeping inside. It handles the NULL case as well. (The structure has recently changed but .length is not, though.) + * If advancing direction is different from index direction, we must + * skip right away, but _bt_skip requires a starting point. It doesn't seem needed to me. Could you elaborate on the reason for that? + * If advancing direction is different from index direction, we must + * skip right away, but _bt_skip requires a starting point. + */ + if (direction * indexonlyscan->indexorderdir < 0 && + !node->ioss_FirstTupleEmitted) I'm confused by this. "direction" there is the physical scan direction (fwd/bwd) of index scan, which is already compensated by indexorderdir. Thus the condition means we do that when logical ordering (ASC/DESC) is DESC. (Though I'm not sure what "index direction" exactly means...) regards. -- Kyotaro Horiguchi NTT Open Source Software Center