David Rowley <dgrowle...@gmail.com> wrote:

> On Wed, 4 Jan 2023 at 23:21, Spring Zhong <spring.zh...@openpie.com> wrote:
> > The plan is apparently inefficient, since the hash aggregate goes after the 
> > Cartesian product. We could expect the query's performance get much 
> > improved if the HashAggregate node can be pushed down to the SCAN node.
> 
> > Is someone has suggestions on this?
> 
> I think this is being worked on. See [1].

Well, the current version of that patch requires the query to contain at least
one aggregate. It shouldn't be a big deal to modify it. However note that this
feature pushes the aggregate/grouping only to one side of the join ("fake"
aggregate count(*) added to the query):

SET enable_agg_pushdown TO on;

EXPLAIN select i1,i2, count(*) from t1, t2 group by i1,i2;
                                   QUERY PLAN                                   
--------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=440.02..440.04 rows=1 width=16)
   Group Key: t1.i1, t2.i2
   ->  Sort  (cost=440.02..440.02 rows=1 width=16)
         Sort Key: t1.i1, t2.i2
         ->  Nested Loop  (cost=195.00..440.01 rows=1 width=16)
               ->  Partial HashAggregate  (cost=195.00..195.01 rows=1 width=12)
                     Group Key: t1.i1
                     ->  Seq Scan on t1  (cost=0.00..145.00 rows=10000 width=4)
               ->  Seq Scan on t2  (cost=0.00..145.00 rows=10000 width=4)

If both sides should be grouped, finalization of the partial aggregates would
be more difficult, and I'm not sure it'd be worth the effort.

> [1] https://commitfest.postgresql.org/41/3764/

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com


Reply via email to