On 10/15/24 12:15, David Rowley wrote:
As for your patch, I'm suspicious that the general idea you're
proposing is an actual improvement.
I didn't intend to treat it as an 'improvement' but as an intermediate
patch. The main purpose here is to debate the way & introduce
considering of number of columns. Conservative development approach has
been preferred before.
it seems you're just charging cpu_operator_cost * <number of columns
to sort by>. It seems like it won't be very hard to fool that into
doing the wrong thing when the first column to sort by is distinct or
almost distinct. There's going to be far fewer or no tiebreaker
comparisons for that case.
As I've written above, it is for starters. It allow to analyse how the
balance between different types of orderings can be changed in extreme
cases. I can join this patch with the following, implementing
differentiation by distincts, but the effect on regression tests will be
smoothed out.
The primary idea is 1) to elaborate GROUP-BY optimisation and 2) give
the optimiser a tool to choose a more optimal sort, comparing
MergeAppend/Sort/IncrementalSort costs.
The whole idea is implemented in the branch [1] and described in the
post [2]. Of course, we differentiate sortings by distinct of the first
column (only one trustworthy statistic). It is not so difficult (but I
doubt the real necessity) to use extended statistics and reflect in the
cost values of different combinations of columns.
As mentioned by Kirill, I also don't understand the cost_sort
signature change. Why would you do that over just doing
list_length(pathkeys) within cost_sort()? Your throwing away a
parameter that might be the most useful one of the bunch for allowing
better sort cost estimates.
Ok, may be it is too much for an intermediate patch. We can change the
interface later, if necessary.
Perhaps that could be done within the EquivalenceClass.
Thanks for the idea!
[1] https://github.com/postgrespro/postgres/tree/sort-columnsnum
[2] https://danolivo.substack.com/p/elaboration-of-the-postgresql-sort
--
regards, Andrei Lepikhov