Re: Incremental Sort Cost Estimation Instability

Andrei Lepikhov Thu, 19 Sep 2024 01:44:42 -0700

On 12/9/2024 12:12, David Rowley wrote:

On Thu, 12 Sept 2024 at 21:51, Andrei Lepikhov <[email protected]> wrote:

Initial problem causes wrong cost_sort estimation. Right now I think
about providing cost_sort() the sort clauses instead of (or in addition
to) the pathkeys.


I'm not quite sure why the sort clauses matter any more than the
EquivalenceClass.  If the EquivalenceClass defines that all members
will have the same value for any given row, then, if we had to choose
any single member to drive the n_distinct estimate from, isn't the
most accurate distinct estimate from the member with the smallest
n_distinct estimate?  (That assumes the less distinct member has every
value the more distinct member has, which might not be true)

Thanks for your efforts! Your idea looks more stable and applicable than
my patch. BTW, it could still provide wrong ndistinct estimations if we
choose a sorting operator under clauses mentioned in the
EquivalenceClass. However, this thread's primary intention is to
stabilize query plans, so I'll try to implement your idea.
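
To make the idea concrete, here is a minimal standalone sketch of "take
the smallest ndistinct among the EquivalenceClass members". It is not
PostgreSQL code: the ECMemberEstimate struct and ec_min_ndistinct() are
hypothetical names invented for illustration, and the per-member
estimates are assumed to have been computed elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one EquivalenceClass member together with its
 * distinct-value estimate. This is NOT a PostgreSQL planner struct.
 */
typedef struct ECMemberEstimate
{
    const char *expr;       /* label of the member expression, e.g. "t1.x" */
    double      ndistinct;  /* estimated number of distinct values */
} ECMemberEstimate;

/*
 * Return the smallest ndistinct estimate among the members of one class.
 * The result does not depend on the order of the members, so it stays the
 * same no matter which member a pathkey happens to reference.
 */
static double
ec_min_ndistinct(const ECMemberEstimate *members, size_t nmembers)
{
    double      result;
    size_t      i;

    assert(members != NULL && nmembers > 0);

    result = members[0].ndistinct;
    for (i = 1; i < nmembers; i++)
    {
        if (members[i].ndistinct < result)
            result = members[i].ndistinct;
    }
    return result;
}
```

Because the minimum is order-independent, an estimate derived this way
would not change when the members are listed in a different order, which
is the stability property this thread is after. Whether the minimum is
also the most accurate choice depends on the caveat above about the less
distinct member containing every value of the more distinct one.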

The second reason was to distinguish sortings by cost (see proposal [1])
because sometimes it could help to save CPU cycles on comparisons.
Having a lot of sort/grouping queries with only sporadic joins, I see
how profitable it could sometimes be - text or numeric grouping over a
mostly Cartesian join may be painful without fine-tuned sorting.

[1] https://www.postgresql.org/message-id/[email protected]


--
regards, Andrei Lepikhov
