David Rowley <dgrowle...@gmail.com> wrote: > On Wed, 4 Jan 2023 at 23:21, Spring Zhong <spring.zh...@openpie.com> wrote: > > The plan is apparently inefficient, since the hash aggregate goes after the > > Cartesian product. We could expect the query's performance get much > > improved if the HashAggregate node can be pushed down to the SCAN node. > > > Is someone has suggestions on this? > > I think this is being worked on. See [1].
Well, the current version of that patch requires the query to contain at least one aggregate. It shouldn't be a big deal to modify it. However note that this feature pushes the aggregate/grouping only to one side of the join ("fake" aggregate count(*) added to the query): SET enable_agg_pushdown TO on; EXPLAIN select i1,i2, count(*) from t1, t2 group by i1,i2; QUERY PLAN -------------------------------------------------------------------------------- Finalize GroupAggregate (cost=440.02..440.04 rows=1 width=16) Group Key: t1.i1, t2.i2 -> Sort (cost=440.02..440.02 rows=1 width=16) Sort Key: t1.i1, t2.i2 -> Nested Loop (cost=195.00..440.01 rows=1 width=16) -> Partial HashAggregate (cost=195.00..195.01 rows=1 width=12) Group Key: t1.i1 -> Seq Scan on t1 (cost=0.00..145.00 rows=10000 width=4) -> Seq Scan on t2 (cost=0.00..145.00 rows=10000 width=4) If both sides should be grouped, finalization of the partial aggregates would be more difficult, and I'm not sure it'd be worth the effort. > [1] https://commitfest.postgresql.org/41/3764/ -- Antonin Houska Web: https://www.cybertec-postgresql.com