Hi Rafia.

Em qui., 22 de ago. de 2024 às 17:17, Rafia Sabih <rafia.pghack...@gmail.com>
escreveu:

>
>
> On Thu, 22 Aug 2024 at 15:02, Ranier Vilela <ranier...@gmail.com> wrote:
>
>> Hi.
>>
>> Em qui., 22 de ago. de 2024 às 04:34, Richard Guo <guofengli...@gmail.com>
>> escreveu:
>>
>>> I ran into a query plan where the Result node seems redundant to me:
>>>
>>> create table t (a int, b int, c int);
>>> insert into t select i%10, i%10, i%10 from generate_series(1,100)i;
>>> create index on t (a, b);
>>> analyze t;
>>>
>>> set enable_hashagg to off;
>>> set enable_seqscan to off;
>>>
>>> explain (verbose, costs off)
>>> select distinct b, a from t order by a, b;
>>>                        QUERY PLAN
>>> ---------------------------------------------------------
>>>  Result
>>>    Output: b, a
>>>    ->  Unique
>>>          Output: a, b
>>>          ->  Index Only Scan using t_a_b_idx on public.t
>>>                Output: a, b
>>> (6 rows)
>>>
>>> What I expect is that both the Scan node and the Unique node output
>>> 'b, a', and we do not need an additional projection step, something
>>> like:
>>>
>>> explain (verbose, costs off)
>>> select distinct b, a from t order by a, b;
>>>                     QUERY PLAN
>>> ---------------------------------------------------
>>>  Unique
>>>    Output: b, a
>>>    ->  Index Only Scan using t_a_b_idx on public.t
>>>          Output: b, a
>>> (4 rows)
>>>
>>> I looked into this a little bit and found that in function
>>> create_ordered_paths, we decide whether a projection step is needed
>>> based on a simple pointer comparison between sorted_path->pathtarget
>>> and final_target.
>>>
>>>     /* Add projection step if needed */
>>>     if (sorted_path->pathtarget != target)
>>>         sorted_path = apply_projection_to_path(root, ordered_rel,
>>>                                                sorted_path, target);
>>>
>>> This does not seem right to me, as PathTargets are not canonical, so
>>> we cannot guarantee that two identical PathTargets will have the same
>>> pointer.  Actually, for the query above, the two PathTargets are
>>> identical but have different pointers.
>>>
>> Could memcmp solve this?
>>
>> With patch attached, using memcmp to compare the pointers.
>>
>> select distinct b, a from t order by a, b;
>>             QUERY PLAN
>> ----------------------------------
>>  Sort
>>    Output: b, a
>>    Sort Key: t.a, t.b
>>    ->  HashAggregate
>>          Output: b, a
>>          Group Key: t.a, t.b
>>          ->  Seq Scan on public.t
>>                Output: a, b, c
>> (8 rows)
>>
>> attached patch for consideration.
>>
>> best regards,
>> Ranier Vilela
>>
>
> +1 for the idea of removing this redundant node.
> I had a look in this patch, and I was wondering if we still need
> sorted_path->pathtarget != target in the condition.
>
Although the test is unnecessary, it is cheap and avoids a possible call to
memcmp.

Thanks.

best regards,
Ranier Vilela

Reply via email to