On 4/6/2025 00:41, Alexander Korotkov wrote:
On Tue, Jun 3, 2025 at 5:35 PM Andrei Lepikhov <lepi...@gmail.com> wrote:
On 3/6/2025 16:05, Alexander Korotkov wrote:
On Tue, Jun 3, 2025 at 4:53 PM Andrei Lepikhov <lepi...@gmail.com> wrote:
Additionally, as I mentioned earlier, the primary reason for choosing
MergeAppend in the regression test was a slight total cost difference
that triggered the startup cost comparison.
Could you show the query and its EXPLAIN output that are a concern for
you?
My point is that the difference in total cost is very small. For small
datasets it could even be within the fuzzy limit. However, in
practice the difference in total time is as big as the difference in
startup time. So, it would be good to make the total cost difference bigger.
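
To illustrate the point about the fuzzy limit, here is a minimal sketch of a fuzzy two-key cost comparison in the spirit of the planner's path comparison. It is a simplification, not the actual PostgreSQL code: the function name, the string return values, and the sample costs are hypothetical, and the 1% factor is assumed to match the planner's standard fuzz factor.

```python
FUZZ_FACTOR = 1.01  # assumed 1% tolerance; treat as an assumption


def compare_costs_fuzzily(startup1, total1, startup2, total2, fuzz=FUZZ_FACTOR):
    """Return 'path1', 'path2', 'different', or 'equal' (hypothetical labels)."""
    if total1 > total2 * fuzz:
        # path1 is clearly worse on total cost; it is only worth keeping
        # if it is clearly better on startup cost.
        if startup2 > startup1 * fuzz:
            return "different"
        return "path2"
    if total2 > total1 * fuzz:
        # Symmetric case: path2 is clearly worse on total cost.
        if startup1 > startup2 * fuzz:
            return "different"
        return "path1"
    # Total costs are within the fuzz limit, so startup cost decides.
    if startup1 > startup2 * fuzz:
        return "path2"
    if startup2 > startup1 * fuzz:
        return "path1"
    return "equal"


# Made-up numbers: a Sort-based path (startup 100.0, total 120.5) against
# a MergeAppend path (startup 1.0, total 120.0).
winner = compare_costs_fuzzily(100.0, 120.5, 1.0, 120.0)
```

With these made-up numbers the totals differ by less than 1%, so the much lower startup cost alone picks the second path, which is the situation described above.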
For me, it seems like a continuation of the 7d8ac98 discussion. We may
charge a small fee for MergeAppend to adjust the balance, of course.
However, I think this small change requires a series of benchmarks to
determine how it affects the overall cost balance. Without examples, it
is hard to say how important this issue is and whether it is worth
starting such work.
Yes, I think it's fair to charge the MergeAppend node. We currently
cost it similarly to the merge stage of Sort, but it's clearly more
expensive. It works at the executor level, dealing with slots etc.,
while the Sort node has a set of lower-level optimizations.
As I see it, it makes sense to charge MergeAppend for the heap operation
or, what is more logical, reduce the charge on Sort due to internal
optimisations.
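
As a rough illustration of what "charging MergeAppend for the heap operation" could look like, the sketch below mirrors the shape of the MergeAppend heap charge as I understand it (two operator costs per comparison, log2(N) comparisons per tuple). The constants are the default cpu_operator_cost and cpu_tuple_cost settings plus an assumed 0.5 append multiplier; the function name and the extra_per_tuple surcharge are hypothetical and are not taken from any patch in this thread.

```python
import math

CPU_OPERATOR_COST = 0.0025        # default cpu_operator_cost
CPU_TUPLE_COST = 0.01             # default cpu_tuple_cost
APPEND_CPU_COST_MULTIPLIER = 0.5  # assumed per-tuple Append multiplier


def merge_append_overhead(n_streams, tuples, extra_per_tuple=0.0):
    """Return (startup, run) cost of the merge step only, excluding inputs."""
    n = max(n_streams, 2)  # avoid log2(1) == 0 for a single stream
    log_n = math.log2(n)
    comparison_cost = 2.0 * CPU_OPERATOR_COST
    # Building the heap: about N * log2(N) comparisons.
    startup = comparison_cost * n * log_n
    # Each returned tuple costs about log2(N) comparisons to re-heapify.
    run = tuples * comparison_cost * log_n
    # Per-tuple overhead of passing rows through the node.
    run += CPU_TUPLE_COST * APPEND_CPU_COST_MULTIPLIER * tuples
    # Hypothetical surcharge for executor-level slot handling.
    run += extra_per_tuple * tuples
    return startup, run


base = merge_append_overhead(4, 1000)
charged = merge_append_overhead(4, 1000, extra_per_tuple=0.001)
```

Comparing base and charged for a given plan shape shows how much a per-tuple surcharge shifts the total; whether any particular value is appropriate would still need the benchmarks mentioned earlier.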
Playing with both approaches, I found that it breaks many more tests
than the current patch does. Hence, it needs additional work on
analysing the results to understand how correct these changes are.
--
regards, Andrei Lepikhov