Richard: Thanks for reviving this patch and for all of your work on it! Eager aggregation pushdown will be beneficial for my work and I'm hoping to see it land.
I was playing around with v9 of the patches and was specifically curious about this previous statement... >This patch also makes eager aggregation work with outer joins. With >outer join, the aggregate cannot be pushed down if any column referenced >by grouping expressions or aggregate functions is nullable by an outer >join above the relation to which we want to apply the partiall >aggregation. Thanks to Tom's outer-join-aware-Var infrastructure, we >can easily identify such situations and subsequently refrain from >pushing down the aggregates. ...and this related comment in eager_aggregate.out: >-- Ensure aggregation cannot be pushed down to the nullable side While I'm new to this work and its subtleties, I'm wondering if this is too broad a condition. I modified the first test query in eager_aggregate.sql to make it a LEFT JOIN and eager aggregation indeed did not happen, which is expected based on the comments upthread. query: SET enable_eager_aggregate=ON; EXPLAIN (VERBOSE, COSTS OFF) SELECT t1.a, sum(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a; plan: QUERY PLAN ------------------------------------------------------------ GroupAggregate Output: t1.a, sum(t2.c) Group Key: t1.a -> Sort Output: t1.a, t2.c Sort Key: t1.a -> Hash Right Join Output: t1.a, t2.c Hash Cond: (t2.b = t1.b) -> Seq Scan on public.eager_agg_t2 t2 Output: t2.a, t2.b, t2.c -> Hash Output: t1.a, t1.b -> Seq Scan on public.eager_agg_t1 t1 Output: t1.a, t1.b (15 rows) (NOTE: I changed the aggregate from avg(...) to sum(...) for simplicity) But, it seems that eager aggregation for the query above can be "replicated" as: query: EXPLAIN (VERBOSE, COSTS OFF) SELECT t1.a, sum(t2.c) FROM eager_agg_t1 t1 LEFT JOIN ( SELECT b, sum(c) c FROM eager_agg_t2 t2p GROUP BY b ) t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a; The output of both the original query and this one match (and the plans with eager aggregation and the subquery are nearly identical if you restore the LEFT JOIN to a JOIN). I admittedly may be missing a subtlety, but does this mean that there are conditions under which eager aggregation can be pushed down to the nullable side? -Paul- On Sat, Jul 6, 2024 at 4:56 PM Richard Guo <guofengli...@gmail.com> wrote: > On Thu, Jun 13, 2024 at 4:07 PM Richard Guo <guofengli...@gmail.com> > wrote: > > I spent some time testing this patchset and found a few more issues. > > ... > > > Hence here is the v8 patchset, with fixes for all the above issues. > > I found an 'ORDER/GROUP BY expression not found in targetlist' error > with this patchset, with the query below: > > create table t (a boolean); > > set enable_eager_aggregate to on; > > explain (costs off) > select min(1) from t t1 left join t t2 on t1.a group by (not (not > t1.a)), t1.a order by t1.a; > ERROR: ORDER/GROUP BY expression not found in targetlist > > This happens because the two grouping items are actually the same and > standard_qp_callback would remove one of them. The fully-processed > groupClause is kept in root->processed_groupClause. However, when > collecting grouping expressions in create_grouping_expr_infos, we are > checking parse->groupClause, which is incorrect. > > The fix is straightforward: check root->processed_groupClause instead. > > Here is a new rebase with this fix. > > Thanks > Richard >