On Wed, Mar 29, 2023 at 4:00 AM David Rowley <dgrowle...@gmail.com> wrote:

> If you write some tests for this code, it will be useful to prove that
> it actually does something, and also that it does not break again in
> the future. I don't really want to just blindly copy the pattern used
> in 3c6fc5820 for creating incremental sort paths if it's not useful
> here. It would be good to see tests that make an Incremental Sort path
> using the code you're changing.


Thanks for the suggestion.  I've managed to come up with a query that
gets the codes being changed here to perform an incremental sort.

set min_parallel_index_scan_size to 0;
set enable_seqscan to off;

Without making those parallel paths:

explain (costs off)
select * from tenk1 where four = 2 order by four, hundred,
parallel_safe_volatile(thousand);
                          QUERY PLAN
--------------------------------------------------------------
 Incremental Sort
   Sort Key: hundred, (parallel_safe_volatile(thousand))
   Presorted Key: hundred
   ->  Gather Merge
         Workers Planned: 3
         ->  Parallel Index Scan using tenk1_hundred on tenk1
               Filter: (four = 2)
(7 rows)

and with those parallel paths:

explain (costs off)
select * from tenk1 where four = 2 order by four, hundred,
parallel_safe_volatile(thousand);
                          QUERY PLAN
---------------------------------------------------------------
 Gather Merge
   Workers Planned: 3
   ->  Incremental Sort
         Sort Key: hundred, (parallel_safe_volatile(thousand))
         Presorted Key: hundred
         ->  Parallel Index Scan using tenk1_hundred on tenk1
               Filter: (four = 2)
(7 rows)

I've added two tests for the code changes in create_ordered_paths in the
new patch.


> Same for the 0003 patch.


For the code changes in gather_grouping_paths, I've managed to come up
with a query that makes an explicit Sort atop cheapest partial path.

explain (costs off)
select count(*) from tenk1 group by twenty, parallel_safe_volatile(two);
                             QUERY PLAN
--------------------------------------------------------------------
 Finalize GroupAggregate
   Group Key: twenty, (parallel_safe_volatile(two))
   ->  Gather Merge
         Workers Planned: 4
         ->  Sort
               Sort Key: twenty, (parallel_safe_volatile(two))
               ->  Partial HashAggregate
                     Group Key: twenty, parallel_safe_volatile(two)
                     ->  Parallel Seq Scan on tenk1
(9 rows)

Without this logic the plan would look like:

explain (costs off)
select count(*) from tenk1 group by twenty, parallel_safe_volatile(two);
                             QUERY PLAN
--------------------------------------------------------------------
 Finalize GroupAggregate
   Group Key: twenty, (parallel_safe_volatile(two))
   ->  Sort
         Sort Key: twenty, (parallel_safe_volatile(two))
         ->  Gather
               Workers Planned: 4
               ->  Partial HashAggregate
                     Group Key: twenty, parallel_safe_volatile(two)
                     ->  Parallel Seq Scan on tenk1
(9 rows)

This test is also added in the new patch.

But I did not find a query that makes an incremental sort in this case.
After trying for a while it seems to me that we do not need to consider
incremental sort in this case, because for a partial path of a grouped
or partially grouped relation, it is either unordered (HashAggregate or
Append), or it has been ordered by the group_pathkeys (GroupAggregate).
It seems there is no case that we'd have a partial path that is
partially sorted.

So update the patches to v2.

Thanks
Richard

Attachment: v2-0001-Revise-how-we-sort-partial-paths-in-create_ordered_paths.patch
Description: Binary data

Attachment: v2-0002-Revise-how-we-sort-partial-paths-in-gather_grouping_paths.patch
Description: Binary data

Reply via email to