On Wed, Jan 22, 2025 at 1:48 AM Richard Guo wrote:
> This approach would require injecting multiple intermediate
> aggregation nodes into the path tree, for which we currently lack the
> necessary architecture. As a result, I didn't pursue this idea
> further. However, I'm really glad you mentio
On Wed, Jan 22, 2025 at 1:36 AM Robert Haas wrote:
> Thanks for the example. What seems to be happening here is that each
> of the three joins increases the number of rows by a multiple of
> either 166 or 333. Aggregating reduces the number of rows to 3. I am
> not sure that we should be too conce
On Tue, Jan 21, 2025 at 3:33 AM Richard Guo wrote:
> I've been thinking about this proposal, and it's quite appealing. It
> would significantly reduce both the planning effort and implementation
> complexity, while still yielding reasonable planning results.
>
> One concern I have with this propo
On Tue, Jan 21, 2025 at 2:57 AM Tom Lane wrote:
> However, a partial-aggregation path does not generate the same data
> as an unaggregated path, no matter how fuzzy you are willing to be
> about the concept. So I'm having a very hard time accepting that
> it ought to be part of the same RelOptInf
On Tue, Jan 21, 2025 at 1:28 AM Robert Haas wrote:
> On Sun, Jan 19, 2025 at 7:53 AM Richard Guo wrote:
> > If, at last, the conclusion of this discussion is that we should not
> > apply this change until we fix those problems in aggregate estimates,
> > I'd be very sad. This conclusion is absol
On Mon, Jan 20, 2025 at 12:57 PM Tom Lane wrote:
> However, a partial-aggregation path does not generate the same data
> as an unaggregated path, no matter how fuzzy you are willing to be
> about the concept. So I'm having a very hard time accepting that
> it ought to be part of the same RelOptIn
Robert Haas writes:
> So I don't quite know which way to jump here. It now seems to me that
> we have three similar features with three different designs.
> Parameterization added non-comparable paths to the same path list;
> parallel query added them to a different path list in the same
> RelOptI
On Sun, Jan 19, 2025 at 7:53 AM Richard Guo wrote:
> If, at last, the conclusion of this discussion is that we should not
> apply this change until we fix those problems in aggregate estimates,
> I'd be very sad. This conclusion is absolutely correct, for sure, in
> an ideal world, but in the rea
On Sat, Jan 18, 2025 at 6:16 AM Robert Haas wrote:
> On Thu, Jan 16, 2025 at 3:18 AM Richard Guo wrote:
> > If this t1/t2 join is part of a larger SELECT query, I think the cost
> > estimates for the upper join nodes would likely be quite inaccurate.
>
> That's definitely true. However, the quest
On Thu, Jan 16, 2025 at 3:18 AM Richard Guo wrote:
> If this t1/t2 join is part of a larger SELECT query, I think the cost
> estimates for the upper join nodes would likely be quite inaccurate.
That's definitely true. However, the question is not whether the
planner has problems today (it definit
On Fri, Jan 17, 2025 at 6:40 AM Tom Lane wrote:
> * The README addition, and the basically identical text in the
> commit message, don't even provide a reason to believe that the
> transformation is correct let alone that it will result in faster
> execution. I don't understand why it's so hard t
I'm very sorry for not having had any time to look at this patch
before --- it's been on my radar screen for awhile, but $LIFE has
been rather demanding lately.
Anyway, I've now read through the mail thread and portions of the
v16 patch, and I have to concur with Robert's qualms about whether
this
On Wed, Jan 15, 2025 at 11:40 PM Robert Haas wrote:
> On Wed, Jan 15, 2025 at 1:58 AM Richard Guo wrote:
> > I understand that we're currently quite bad at estimating the number
> > of groups after aggregation. In fact, it's not just aggregation
> > estimates — we're also bad at join estimates i
On Wed, Jan 15, 2025 at 1:58 AM Richard Guo wrote:
> I understand that we're currently quite bad at estimating the number
> of groups after aggregation. In fact, it's not just aggregation
> estimates — we're also bad at join estimates in some cases. This is a
> reality we have to face. Here's w
On Wed, Jan 15, 2025 at 12:07 AM Robert Haas wrote:
> On Sun, Jan 12, 2025 at 9:04 PM Richard Guo wrote:
> > Attached is an updated version of this patch that addresses Jian's
> > review comments, along with some more cosmetic tweaks. I'm going to
> > be looking at this patch again from the poin
On Sun, Jan 12, 2025 at 9:04 PM Richard Guo wrote:
> Attached is an updated version of this patch that addresses Jian's
> review comments, along with some more cosmetic tweaks. I'm going to
> be looking at this patch again from the point of view of committing
> it, with the plan to commit it late
On Thu, Jan 9, 2025 at 12:15 PM jian he wrote:
> hi.
> in create_grouping_expr_infos
>
> tce = lookup_type_cache(exprType((Node *) tle->expr),
> TYPECACHE_BTREE_OPFAMILY);
> if (!OidIsValid(tce->btree_opf) ||
> !OidIsValid(tce->btree_opin
hi.
in create_grouping_expr_infos
tce = lookup_type_cache(exprType((Node *) tle->expr),
TYPECACHE_BTREE_OPFAMILY);
if (!OidIsValid(tce->btree_opf) ||
!OidIsValid(tce->btree_opintype))
return;
/*
*
On Wed, Dec 4, 2024 at 11:38 PM Robert Haas wrote:
> On Sun, Nov 10, 2024 at 7:52 PM Richard Guo wrote:
> > Hmm, currently we only consider grouped aggregation for eager
> > aggregation. For grouped aggregation, the window function's
> > arguments, as well as the PARTITION BY expressions, must a
On Sun, Nov 10, 2024 at 7:52 PM Richard Guo wrote:
> Hmm, currently we only consider grouped aggregation for eager
> aggregation. For grouped aggregation, the window function's
> arguments, as well as the PARTITION BY expressions, must appear in the
> GROUP BY clause. That is to say, the depname
On Tue, Nov 12, 2024 at 1:30 AM Robert Haas wrote:
> On Sun, Nov 10, 2024 at 7:52 PM Richard Guo wrote:
> > Hmm, currently we only consider grouped aggregation for eager
> > aggregation. For grouped aggregation, the window function's
> > arguments, as well as the PARTITION BY expressions, must a
On Sun, Nov 10, 2024 at 7:52 PM Richard Guo wrote:
> > I have similar but weaker feelings about ordered aggregates. Consider:
> >
> > explain select t1.id, array_agg(t2.v order by t3.o) from t1, t2, t3
> > where t1.id = t2.id and t2.id = t3.id group by 1;
> >
>
> It seems to me that a partially ag
On Wed, Nov 6, 2024 at 11:43 PM Robert Haas wrote:
> On Wed, Nov 6, 2024 at 3:22 AM Richard Guo wrote:
> > Yeah, ordered aggregates could be a blocker. I think it might be best
> > to prevent the use of eager aggregation if root->numOrderedAggs > 0
> > for now.
> >
> > I've been thinking about t
On Wed, Nov 6, 2024 at 3:22 AM Richard Guo wrote:
> Yeah, ordered aggregates could be a blocker. I think it might be best
> to prevent the use of eager aggregation if root->numOrderedAggs > 0
> for now.
>
> I've been thinking about the window functions case, as Jian He also
> mentioned it some ti
On Fri, Nov 1, 2024 at 9:42 PM Robert Haas wrote:
> On Fri, Nov 1, 2024 at 2:21 AM Richard Guo wrote:
> > ... an aggregated row from the partial
> > aggregation matches the other side of the join if and only if each row
> > in the partial group does, thereby ensuring that all rows in the same
> >
On Thu, Aug 29, 2024 at 10:26 AM Richard Guo wrote:
>
>
> > 2. I think there might be techniques we could use to limit planning
> > effort at an earlier stage when the approach doesn't appear promising.
> > For example, if the proposed grouping column is already unique, the
> > exercise is pointle
On Fri, Nov 1, 2024 at 2:21 AM Richard Guo wrote:
> ... an aggregated row from the partial
> aggregation matches the other side of the join if and only if each row
> in the partial group does, thereby ensuring that all rows in the same
> partial group have the same 'destiny'.
Ah, I really like th
On Thu, Oct 31, 2024 at 9:16 PM jian he wrote:
>
> hi.
> still trying to understand v13. found a bug.
>
> minimum test :
> drop table if exists t1, t2;
> CREATE TABLE t1 (a int, b int, c int);
> CREATE TABLE t2 (a int, b int, c int);
> SET enable_eager_aggregate TO on;
> explain(costs off, setting
On Thu, Oct 31, 2024 at 12:25 AM Robert Haas wrote:
> Well, the key thing here is the reasoning, which you don't really
> spell out either here or in the patch. The patch's justification for
> the use of BTEQUALIMAGE_PROC Is that, if we use non-equal-image
> operators, "we may lose some informatio
On Wed, Oct 30, 2024 at 5:06 AM Robert Haas wrote:
> On Tue, Sep 24, 2024 at 11:20 PM Richard Guo wrote:
> > The reason is that it is very tricky to set the size estimates for a
> > grouped join relation. For a non-grouped join relation, we know that
> > all its paths have the same rowcount esti
On Tue, Oct 29, 2024 at 9:59 PM Robert Haas wrote:
> On Wed, Sep 25, 2024 at 3:03 AM Richard Guo wrote:
> > On Wed, Sep 11, 2024 at 10:52 AM Tender Wang wrote:
> > > 1. In make_one_rel(), we have the below codes:
> > > /*
> > > * Build grouped base relations for each base rel if possible.
> > >
hi.
still trying to understand v13. found a bug.
minimum test :
drop table if exists t1, t2;
CREATE TABLE t1 (a int, b int, c int);
CREATE TABLE t2 (a int, b int, c int);
SET enable_eager_aggregate TO on;
explain(costs off, settings) SELECT avg(t2.a), t1.c FROM t1 JOIN t2 ON
t1.b = t2.b GROUP BY t
On Tue, Oct 29, 2024 at 3:51 AM Richard Guo wrote:
> > 2. join type is FULL JOIN, (i am not sure about other Semijoins and
> > anti-semijoins types).
>
> The presence of a FULL JOIN does not preclude the use of eager
> aggregation. We still can push a partial aggregation down to a level
> that i
On Tue, Oct 29, 2024 at 3:57 AM Richard Guo wrote:
> On Fri, Oct 18, 2024 at 10:22 PM jian he wrote:
> > So overall I doubt here BTEQUALIMAGE_PROC flag usage is correct.
>
> The BTEQUALIMAGE_PROC flag is used to prevent eager aggregation for
> types whose equality operators do not imply bitwise e
On Sat, Oct 5, 2024 at 11:30 PM Richard Guo wrote:
> Here’s a comparison of Execution Time and Planning Time for the seven
> queries with eager aggregation disabled versus enabled (best of 3).
>
> Execution Time:
>
> EAGER-AGG-OFF EAGER-AGG-ON
>
> q4 105787.963 ms
On Tue, Sep 24, 2024 at 11:20 PM Richard Guo wrote:
> The reason is that it is very tricky to set the size estimates for a
> grouped join relation. For a non-grouped join relation, we know that
> all its paths have the same rowcount estimate (well, in theory). But
> this is not true for a groupe
On Wed, Sep 25, 2024 at 3:03 AM Richard Guo wrote:
> On Wed, Sep 11, 2024 at 10:52 AM Tender Wang wrote:
> > 1. In make_one_rel(), we have the below codes:
> > /*
> > * Build grouped base relations for each base rel if possible.
> > */
> > setup_base_grouped_rels(root);
> >
> > As far as I know,
On Fri, Oct 18, 2024 at 12:44 PM jian he wrote:
> 1. root->parse->resultRelation > 0
> just be 100% sure we are only dealing with SELECT, or we can add
> Assert at the end of setup_eager_aggregation.
Can GROUP BY clauses be used in INSERT/UPDATE/DELETE/MERGE statements?
If not, I think there
On Fri, Oct 18, 2024 at 10:22 PM jian he wrote:
> So overall I doubt here BTEQUALIMAGE_PROC flag usage is correct.
The BTEQUALIMAGE_PROC flag is used to prevent eager aggregation for
types whose equality operators do not imply bitwise equality, such as
NUMERIC.
After a second thought, I think it
/*
* Eager aggregation is only possible if equality of grouping keys, as
* defined by the equality operator, implies bitwise equality.
* Otherwise, if we put keys with different byte images into the same
* group, we may lose some information that could be
On Sat, Oct 5, 2024 at 6:23 PM Richard Guo wrote:
>
> On Fri, Sep 27, 2024 at 11:53 AM Richard Guo wrote:
> > Here is an updated version of this patch that fixes the rowcount
> > estimate issue along this routine. (see set_joinpath_size.)
>
in the function setup_eager_aggregation,
can we be mor
On Sat, Oct 5, 2024 at 6:23 PM Richard Guo wrote:
>
> On Fri, Sep 27, 2024 at 11:53 AM Richard Guo wrote:
> > Here is an updated version of this patch that fixes the rowcount
> > estimate issue along this routine. (see set_joinpath_size.)
>
> I have worked on inventing some heuristics to limit th
On Fri, Sep 13, 2024 at 3:48 PM Tender Wang wrote:
> Since MERGE/SPLIT partition has been reverted, the tests *partition_merge*
> and *partition_split* should be removed
> from parallel_schedule. After doing the above, the 0002 patch can be applied.
Yeah, that's what I need to do.
Thanks
Ric
On Wed, Sep 11, 2024 at 10:52 AM Tender Wang wrote:
> 1. In make_one_rel(), we have the below codes:
> /*
> * Build grouped base relations for each base rel if possible.
> */
> setup_base_grouped_rels(root);
>
> As far as I know, each base rel only has one grouped base relation, if
> possible.
>
On Thu, Sep 5, 2024 at 9:40 AM Tender Wang wrote:
> 1. in setup_eager_aggregation(), before calling create_agg_clause_infos(),
> it does
> some checks if eager aggregation is available. Can we move those checks into
> a function,
> for example, can_eager_agg(), like can_partial_agg() does?
We
On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote:
> On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote:
> > I haven't look all of them. I just pick few simple plan test(e.g. 19.sql,
> > 45.sql).
> > For example, 19.sql, eager agg pushdown doesn't get large gain, but a little
> > performance reg
Tender Wang 于2024年9月4日周三 11:48写道:
>
>
> Richard Guo 于2024年8月21日周三 15:11写道:
>
>> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo
>> wrote:
>> > I had a self-review of this patchset and made some refactoring,
>> > especially to the function that creates the RelAggInfo structure for a
>> > given relat
Richard Guo 于2024年8月21日周三 15:11写道:
> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo
> wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the cod
Richard Guo 于2024年8月21日周三 15:11写道:
> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo
> wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the cod
Richard Guo 于2024年8月21日周三 15:11写道:
> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo
> wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the cod
On Wed, Aug 28, 2024 at 11:38 PM Tender Wang wrote:
> I upload EXPLAIN(COSTS ON, ANALYZE) test to [1].
> I ran the same query three times, and I chose the third time result.
> You can check 19_off_explain.out and 19_on_explain.out.
So, in 19_off_explain.out, we got this:
-> Finalize Gr
On Wed, Aug 28, 2024 at 10:26 PM Richard Guo wrote:
> Yeah, I'm concerned about this too. In addition to the inaccuracies
> in aggregation estimates, our estimates for joins are sometimes not
> very accurate either. All this are likely to result in regressions
> with eager aggregation in some ca
Richard Guo 于2024年8月29日周四 10:46写道:
> On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote:
> > On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote:
> > > I haven't look all of them. I just pick few simple plan test(e.g.
> 19.sql, 45.sql).
> > > For example, 19.sql, eager agg pushdown doesn't get la
Richard Guo 于2024年8月29日周四 10:46写道:
> On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote:
> > On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote:
> > > I haven't look all of them. I just pick few simple plan test(e.g.
> 19.sql, 45.sql).
> > > For example, 19.sql, eager agg pushdown doesn't get la
On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote:
> On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote:
> > I haven't look all of them. I just pick few simple plan test(e.g. 19.sql,
> > 45.sql).
> > For example, 19.sql, eager agg pushdown doesn't get large gain, but a little
> > performance reg
On Wed, Aug 28, 2024 at 11:57 AM Tender Wang wrote:
> Rectenly, I do some benchmark tests, mainly on tpch and tpcds.
> tpch tests have no plan diff, so I do not continue to test on tpch.
> tpcds(10GB) tests have 22 plan diff as below:
> 4.sql, 5.sql, 8.sql,11.sql,19.sql,23.sql,31.sql,
> 33.sql,39
On Fri, Aug 23, 2024 at 11:59 PM Robert Haas wrote:
> Here are some initial, high-level thoughts about this patch set.
Thank you for your review and feedback! It helps a lot in moving this
work forward.
> 1. As far as I can see, there's no real performance testing on this
> thread. I expect tha
On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote:
> Rectenly, I do some benchmark tests, mainly on tpch and tpcds.
> tpch tests have no plan diff, so I do not continue to test on tpch.
Interesting to know.
> tpcds(10GB) tests have 22 plan diff as below:
> 4.sql, 5.sql, 8.sql,11.sql,19.sql,23.s
Richard Guo 于2024年8月21日周三 15:11写道:
> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo
> wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the cod
On Wed, Aug 21, 2024 at 3:11 AM Richard Guo wrote:
> Attached is the updated version of the patchset that fixes this bug
> and includes further code refactoring.
Here are some initial, high-level thoughts about this patch set.
1. As far as I can see, there's no real performance testing on this
t
Hey Richard,
Looking more closely at this example
>select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by t2.a
having t2.a is null;
I wonder if the inability to exploit eager aggregation is more based on the
fact that COUNT(*) cannot be decomposed into an aggregation of PARTIAL
COUN
On Sun, Jul 7, 2024 at 10:45 AM Paul George wrote:
> Thanks for reviving this patch and for all of your work on it! Eager
> aggregation pushdown will be beneficial for my work and I'm hoping to see it
> land.
Thanks for looking at this patch!
> The output of both the original query and this on
Richard:
Thanks for reviving this patch and for all of your work on it! Eager
aggregation pushdown will be beneficial for my work and I'm hoping to see
it land.
I was playing around with v9 of the patches and was specifically curious
about this previous statement...
>This patch also makes eager
Richard Guo writes:
> Hi All,
>
> Eager aggregation is a query optimization technique that partially
> pushes a group-by past a join, and finalizes it once all the relations
> are joined. Eager aggregation reduces the number of input rows to the
> join and thus may result in a better overall p
64 matches
Mail list logo