On Wed, 12 Oct 2022 at 13:06, Klint Gore <kgo...@une.edu.au> wrote: > Limit (cost=0.56..28349.28 rows=1 width=4) (actual time=0.039..0.040 rows=1 > loops=1) > -> Unique (cost=0.56..28349.28 rows=1 width=4) (actual time=0.039..0.039 > rows=1 loops=1) > -> Index Only Scan using idx on tbl (cost=0.56..28349.28 > rows=995241 width=4) (actual time=0.038..0.038 rows=1 loops=1) > Index Cond: (fld = 230) > Heap Fetches: 0 > Planning Time: 0.066 ms > Execution Time: 0.047 ms > > With the distinct and the limit, the planner somehow knows to push the either > the distinct or the limit into the index only scan so the unique for distinct > only had 1 row and the outer limit only had 1 row. Without the limit, the > distinct still does the index only scan but has to do the unique on the > million rows and execution time goes to about 100ms.
I think that would be very simple to fix. I believe I've done that locally but just detecting if needed_pathkeys == NULL in create_final_distinct_paths(). i.e. - if (pathkeys_contained_in(needed_pathkeys, path->pathkeys)) + if (needed_pathkeys == NIL) + { + Node *limitCount = makeConst(INT8OID, -1, InvalidOid, + sizeof(int64), + Int64GetDatum(1), false, + FLOAT8PASSBYVAL); + add_path(distinct_rel, (Path *) + create_limit_path(root, distinct_rel, path, NULL, + limitCount, LIMIT_OPTION_COUNT, 0, + 1)); + } + else if (pathkeys_contained_in(needed_pathkeys, path->pathkeys)) That just adds a Limit Path instead of the Unique Path. i.e: postgres=# explain (analyze, costs off) select distinct a from t1 where a = 0; QUERY PLAN -------------------------------------------------------------------------------------- Limit (actual time=0.074..0.075 rows=1 loops=1) -> Index Only Scan using t1_a_idx on t1 (actual time=0.072..0.073 rows=1 loops=1) Index Cond: (a = 0) Heap Fetches: 1 Planning Time: 0.146 ms Execution Time: 0.100 ms (6 rows) However, I might be wrong about that. I've not given it too much thought. David