On Sat, Oct 5, 2024 at 6:23 PM Richard Guo <guofengli...@gmail.com> wrote:
>
> On Fri, Sep 27, 2024 at 11:53 AM Richard Guo <guofengli...@gmail.com> wrote:
> > Here is an updated version of this patch that fixes the rowcount
> > estimate issue along this routine. (see set_joinpath_size.)
>
> I have worked on inventing some heuristics to limit the planning
> effort of eager aggregation.  One simple yet effective approach I'm
> thinking of is to consider a grouped path as NOT useful if its row
> reduction ratio falls below a predefined minimum threshold.  Currently
> I'm using 0.5 as the threshold, but I'm open to other values.
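
To make the quoted heuristic concrete, here is a minimal sketch. It is
not the patch's actual code: the function names, the macro name, and the
definition of the ratio as "fraction of input rows eliminated by
grouping" are all assumptions made for illustration. Only the 0.5
threshold comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical name; 0.5 is the threshold value mentioned in the mail. */
#define MIN_ROW_REDUCTION_RATIO 0.5

/*
 * Assumed definition: the fraction of input rows that grouping removes.
 * Returns 0.0 when there is no input or when grouping removes nothing.
 */
double
row_reduction_ratio(double input_rows, double grouped_rows)
{
    assert(input_rows >= 0.0 && grouped_rows >= 0.0);

    if (input_rows <= 0.0 || grouped_rows >= input_rows)
        return 0.0;

    return (input_rows - grouped_rows) / input_rows;
}

/*
 * A grouped path is treated as NOT useful when its row reduction ratio
 * falls below the minimum threshold.
 */
bool
grouped_path_is_useful(double input_rows, double grouped_rows)
{
    return row_reduction_ratio(input_rows, grouped_rows)
        >= MIN_ROW_REDUCTION_RATIO;
}
```

Under these assumptions, grouping 1000 rows down to 100 passes the check,
while grouping 1000 rows down to 900 does not.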

I ran the TPC-DS benchmark at scale 10 and observed eager aggregation
applied in several queries, including q4, q8, q11, q23, q31, q33, and
q77.  Notably, the regression in q19 that Tender identified with v11
has disappeared in v13.

Here’s a comparison of Execution Time and Planning Time for the seven
queries with eager aggregation disabled versus enabled (best of 3).

Execution Time:

        EAGER-AGG-OFF           EAGER-AGG-ON
q4      105787.963 ms           34807.938 ms
q8      1407.454 ms             1654.923 ms
q11     67899.213 ms            18670.086 ms
q23     45945.849 ms            42990.652 ms
q31     10463.536 ms            10244.175 ms
q33     2186.928 ms             2217.228 ms
q77     2360.565 ms             2416.674 ms


Planning Time:

        EAGER-AGG-OFF           EAGER-AGG-ON
q4      2.334 ms                2.602 ms
q8      0.685 ms                0.647 ms
q11     0.935 ms                1.094 ms
q23     2.666 ms                2.582 ms
q31     1.051 ms                1.206 ms
q33     1.248 ms                1.796 ms
q77     0.967 ms                0.962 ms


There are good performance improvements in q4 and q11 (roughly 3.0x and
3.6x faster, respectively).  For the other queries, execution times
remain largely unchanged, falling within the margin of error, with no
notable regressions observed.
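
For anyone who wants to check the ratios against the Execution Time
table, a small sketch follows. The helper name is hypothetical and is
not part of the patch; it simply divides the time with eager aggregation
off by the time with it on.

```c
#include <assert.h>

/*
 * Speedup factor: execution time with eager aggregation disabled
 * divided by execution time with it enabled.  Values above 1.0 mean
 * the query ran faster with eager aggregation.
 */
double
speedup(double off_ms, double on_ms)
{
    assert(on_ms > 0.0);
    return off_ms / on_ms;
}
```

With the numbers reported above, speedup(105787.963, 34807.938) is about
3.04 for q4, and speedup(67899.213, 18670.086) is about 3.64 for q11.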

For the planning time, I do not see notable regressions for any of the
seven queries.

It seems that the new cost estimates and the new heuristic are working
pretty well.

Thanks
Richard

