Hi everyone,
Who can tell me which solution is better below:
Solution 1: Change the configuration parameters
set enable_seqscan = off
Solution 2: Add DISTINCT clause to SQL
explain analyze select DISTINCT 2 from analyze_word_reports where (cseid =
94) limit 1;
If I don't want to change SQL, is Solution 1 OK?
At 2022-10-12 09:47:17, "David Rowley" <[email protected]> wrote:
>On Wed, 12 Oct 2022 at 13:06, Klint Gore <[email protected]> wrote:
>> Limit (cost=0.56..28349.28 rows=1 width=4) (actual time=0.039..0.040 rows=1
>> loops=1)
>> -> Unique (cost=0.56..28349.28 rows=1 width=4) (actual time=0.039..0.039
>> rows=1 loops=1)
>> -> Index Only Scan using idx on tbl (cost=0.56..28349.28
>> rows=995241 width=4) (actual time=0.038..0.038 rows=1 loops=1)
>> Index Cond: (fld = 230)
>> Heap Fetches: 0
>> Planning Time: 0.066 ms
>> Execution Time: 0.047 ms
>>
>> With the distinct and the limit, the planner somehow knows to push the
>> either the distinct or the limit into the index only scan so the unique for
>> distinct only had 1 row and the outer limit only had 1 row. Without the
>> limit, the distinct still does the index only scan but has to do the unique
>> on the million rows and execution time goes to about 100ms.
>
>I think that would be very simple to fix. I believe I've done that
>locally but just detecting if needed_pathkeys == NULL in
>create_final_distinct_paths().
>
>i.e.
>
>- if (pathkeys_contained_in(needed_pathkeys,
>path->pathkeys))
>+ if (needed_pathkeys == NIL)
>+ {
>+ Node *limitCount = makeConst(INT8OID,
>-1, InvalidOid,
>+
> sizeof(int64),
>+
> Int64GetDatum(1), false,
>+
> FLOAT8PASSBYVAL);
>+ add_path(distinct_rel, (Path *)
>+
>create_limit_path(root, distinct_rel, path, NULL,
>+
> limitCount, LIMIT_OPTION_COUNT, 0,
>+
> 1));
>+ }
>+ else if
>(pathkeys_contained_in(needed_pathkeys, path->pathkeys))
>
>That just adds a Limit Path instead of the Unique Path. i.e:
>
>postgres=# explain (analyze, costs off) select distinct a from t1 where a = 0;
> QUERY PLAN
>--------------------------------------------------------------------------------------
> Limit (actual time=0.074..0.075 rows=1 loops=1)
> -> Index Only Scan using t1_a_idx on t1 (actual time=0.072..0.073
>rows=1 loops=1)
> Index Cond: (a = 0)
> Heap Fetches: 1
> Planning Time: 0.146 ms
> Execution Time: 0.100 ms
>(6 rows)
>
>However, I might be wrong about that. I've not given it too much thought.
>
>David