Re: MergeAppend could consider sorting cheapest child path

Alexander Pyhalov Fri, 25 Apr 2025 02:16:46 -0700

Andrei Lepikhov wrote on 2025-04-24 16:01:

On 3/28/25 09:19, Alexander Pyhalov wrote:
Andy Fan wrote on 2024-10-17 03:33:
I've updated the patch. One more interesting case which we found: when a fractional path is selected, it can still be more expensive than the sorted cheapest total path (as we look only at indexes with the necessary pathkeys, not at indexes which allow fetching the data efficiently). So far I couldn't find an artificial example, but we've seen inadequate index selection due to this issue: instead of the index suited for the filters in WHERE, an index suitable for sorting was selected as the one having the cheapest fractional cost.
I think it is necessary to generalise the approach a little.
Each MergeAppend subpath candidate that fits the pathkeys should be compared to the overall-optimal path + explicit Sort node. Let's label this two-node composition as base_path. There are three cases: startup-optimal, total-optimal and fractional-optimal. In the startup case, base_path should use cheapest_startup_path; in the total-optimal case, cheapest_total_path; and for the fractional case, we may employ the get_cheapest_fractional_path routine to detect the proper base_path.
It may provide a more effective plan in the full, fractional and partial scan cases:
1. The Sort node may be pushed down to subpaths under a parallel or async Append.
2. When a minor set of subpaths doesn't have a proper index, and it is profitable to sort them instead of switching to plain Append.
In general, analysing the regression tests changed by this code, I see that the cost model prefers a series of small sorts instead of a single massive one. Maybe it will show some profit in execution time.
I am not afraid of any palpable planning overhead here because we just do cheap cost estimation and comparison operations that don't need additional memory allocations. The caller is responsible for building a proper Sort node if this method is chosen as optimal.
In the attachment, see the patch written according to the idea. There I introduced two new routines:
get_cheapest_path_for_pathkeys_ext
get_cheapest_fractional_path_for_pathkeys_ext
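
To make the three cases concrete, here is a minimal Python sketch of the comparison; it is not the patch's C code. The `Path` class, the `choose` helper, the `sort_cost` argument and the sample costs are all hypothetical. The Sort model is simplified on purpose: it assumes the Sort must read its whole input before returning the first row, so its startup cost is the input's total cost plus a sort charge. The fractional cost is taken as a linear interpolation between startup and total cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    startup_cost: float
    total_cost: float


def fractional_cost(path, fraction):
    # Linear interpolation: fraction=0 gives startup cost, fraction=1 total cost.
    return path.startup_cost + fraction * (path.total_cost - path.startup_cost)


def with_sort(path, sort_cost):
    # Assumption: Sort consumes all input first, so startup == total here
    # (the per-row cost of emitting sorted tuples is ignored).
    startup = path.total_cost + sort_cost
    return Path(f"Sort({path.name})", startup, startup)


def choose(presorted, base_input, sort_cost, fraction):
    # base_path is the "optimal unsorted path + explicit Sort" composition.
    base_path = with_sort(base_input, sort_cost)
    return min([presorted, base_path], key=lambda p: fractional_cost(p, fraction))


# Hypothetical costs, chosen only to show the crossover.
index_scan = Path("index_scan", 0.5, 500.0)
seq_scan = Path("seq_scan", 0.0, 100.0)

for fraction in (0.0, 0.1, 0.5, 1.0):
    print(fraction, choose(index_scan, seq_scan, 20.0, fraction).name)
```

With these made-up numbers the presorted index scan wins for small fractions, and the explicit Sort wins once a large share of the rows is needed.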

Hi. I'm a bit confused that get_cheapest_fractional_path_for_pathkeys_ext() looks only at sorting the cheapest fractional path, and get_cheapest_path_for_pathkeys_ext() in the STARTUP_COST case looks only at sorting cheapest_startup_path. Usually, sorted cheapest_total_path will be cheaper than a sorted fractional/startup path at least by startup cost (as after sorting it includes the total_cost of the input path). But we ignore this case when selecting the cheapest_startup and cheapest_fractional subpaths. As a result, the selected cheapest_startup and cheapest_fractional subpaths may not be the cheapest for startup or for selecting a fraction of rows.


Consider the partition with the following access paths:

1) cheapest_startup without required pathkeys:
  startup_cost: 0.42
  total_cost: 4004

2) some index_path  with required pathkeys:
  startup_cost: 6.6
  total_cost: 2000

3) cheapest_total_path:
  startup_cost: 0.42
  total_cost: 3.48

Here, when selecting the cheapest startup subpath, we'll compare the costs of the index path (2) and sorted cheapest_startup (1), but will ignore sorted cheapest_total_path (3), despite the fact that it really can be the cheapest startup path providing the required sort order.
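
The arithmetic can be checked with a small Python sketch using the three paths listed above. It is only an illustration under stated assumptions: `Path`, `explicit_sort` and `cheapest_startup_sorted` are hypothetical names, and `SORT_OVERHEAD` is an invented constant, since the row counts needed for a real sort estimate are not given here. The Sort is modelled as having a startup cost equal to the input's total_cost plus that overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    startup_cost: float
    total_cost: float
    has_pathkeys: bool


def explicit_sort(path, sort_overhead):
    # Assumption: Sort reads all input before emitting, so its startup cost
    # includes the input's total_cost; per-row emit cost is ignored.
    startup = path.total_cost + sort_overhead
    return Path(f"Sort({path.name})", startup, startup, True)


def cheapest_startup_sorted(paths, sort_inputs, sort_overhead):
    # Candidates: paths that already have the pathkeys, plus an explicit
    # Sort on top of each path in sort_inputs.
    candidates = [p for p in paths if p.has_pathkeys]
    candidates += [explicit_sort(p, sort_overhead) for p in sort_inputs]
    return min(candidates, key=lambda p: p.startup_cost)


cheapest_startup = Path("cheapest_startup", 0.42, 4004.0, False)  # path (1)
index_path = Path("index_path", 6.6, 2000.0, True)                # path (2)
cheapest_total = Path("cheapest_total", 0.42, 3.48, False)        # path (3)
paths = [cheapest_startup, index_path, cheapest_total]

SORT_OVERHEAD = 1.0  # hypothetical; not derived from any row estimate

# Only sorted cheapest_startup is considered as a Sort input.
narrow = cheapest_startup_sorted(paths, [cheapest_startup], SORT_OVERHEAD)
# Sorted cheapest_total is considered as well.
wide = cheapest_startup_sorted(
    paths, [cheapest_startup, cheapest_total], SORT_OVERHEAD
)
print(narrow.name, narrow.startup_cost)
print(wide.name, wide.startup_cost)
```

Under these assumptions the narrow comparison settles on the index path (startup cost 6.6), while also considering sorted cheapest_total yields a candidate with a startup cost of roughly 4.48, which is lower.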


--
Best regards,
Alexander Pyhalov,
Postgres Professional
