On 4/25/25 11:16, Alexander Pyhalov wrote:
Andrei Lepikhov писал(а) 2025-04-24 16:01:
On 3/28/25 09:19, Alexander Pyhalov wrote:
In the attachment, see the patch written according to the idea. There
are I introduced two new routines:
get_cheapest_path_for_pathkeys_ext
get_cheapest_fractional_path_for_pathkeys_ext
Hi. I'm a bit confused that
Thanks for the participation!
get_cheapest_fractional_path_for_pathkeys_ext() looks only on sorting
cheapest fractional path, and get_cheapest_path_for_pathkeys_ext() in
STARTUP_COST case looks only on sorting cheapest_startup_path.
At first, at the moment, I don't understand why we calculate the
cheapest_startup path at all. In my opinion, after commit 6b94e7a [1,
2], the min-fractional path totally covers the case. I began this
discussion in [3] - maybe we need to scrutinise that issue beforehand.
Looking into the min-fractional-path + Sort, we propose a path for the
case when extracting minor part of tuples with sorting later is cheaper
than doing a massive job of non-selective index scan. You also may
imagine the case of a JOIN as a subpath: non-sorted, highly selective
parameterised NestLoop may be way more optimal than MergeJoin, which
fits the pathkeys.
Usually, sorted cheapest_total_path will be cheaper than sorted
fractional/startup path at least by startup cost (as after sorting it
includes total_cost of input path). But we ignore this case when
selecting cheapest_startup and cheapest_fractional subpaths. As result
selected cheapest_startup and cheapest_fractional can be not cheapest
for startup or selecting a fraction of rows.
I don't know what you mean by that. The cheapest_total_path is
considered when we chose optimal cheapest_total path. The same works for
the fractional path - get_cheapest_fractional_path gives us the most
optimal fractional path and probes cheapest_total_path too.
As above, not sure about min-startup case for now. I can imagine
MergeAppend over sophisticated subquery: non-sorted includes highly
parameterised JOINs and the alternative (with pathkeys) includes
HashJoin, drastically increasing startup cost. It is only a theory, of
course. So, lets discover how min-startup works.
At the end, when the sorted path already estimated, we each time compare
it with previously selected pathkeys-cheapest. So, if the sorted path is
worse, we end up with the original path and don't lose anything.
[1]
https://www.postgresql.org/message-id/e8f9ec90-546d-e948-acce-0525f3e92773%40enterprisedb.com
[2]
https://www.postgresql.org/message-id/1581042da8044e71ada2d6e3a51bf7bb%40index.de
[3]
https://www.postgresql.org/message-id/f0206ef2-6b5a-4d07-8770-cfa7cd30f...@gmail.com
--
regards, Andrei Lepikhov