I'm very sorry for not having had any time to look at this patch before --- it's been on my radar screen for awhile, but $LIFE has been rather demanding lately.
Anyway, I've now read through the mail thread and portions of the v16 patch, and I have to concur with Robert's qualms about whether this is ready. A few observations: * The README addition, and the basically identical text in the commit message, don't even provide a reason to believe that the transformation is correct let alone that it will result in faster execution. I don't understand why it's so hard to provide a solid correctness argument. This work was supposedly based on an academic paper; surely that paper must have included a correctness proof? PG might need a few refinements, like being specific about what we expect from the equality operators. But an EXPLAIN plan is not an argument. * As for the performance aspect, we're given Finalize HashAggregate Group Key: a.i -> Nested Loop -> Partial HashAggregate Group Key: b.j -> Seq Scan on b -> Index Only Scan using a_pkey on a Index Cond: (i = b.j) As far as I can see, this will require aggregation to be performed across every row of "b", whereas the naive way would have aggregated across only rows having join partners in "a". If most "b" rows lack a join partner then this will be far slower than the naive way. I do see that it can be better if most "b" rows have multiple join partners, because we'll re-use partial aggregation results instead of (effectively) recalculating them. But the README text makes it sound like this is an unconditional win, which is not the right mindset. (In fact, in this specific example where a.i is presumed unique, how's it a win at all?) * I'm also concerned about what happens with aggregates that can have large partial-aggregation values, such as string_agg(). With the existing usage of partial aggregation for parallel queries, it's possible to be confident that there are not many partial-aggregation values in existence at the same time. I don't think that holds for pushed-down aggregates: for example, I wouldn't be surprised if the planner chooses a join plan that requires stuffing all those values into a hash table, or "materializes" the output of the partial aggregation step. Do we have logic that will avoid blowing out memory during such queries? * I am just as worried as Robert is about the notion of different paths for the same RelOptInfo having different rowcount estimates. That is an extremely fundamental violation of basic planner assumptions. We did bend it for parameterized paths by restating those assumptions as (from optimizer/README): To keep cost estimation rules relatively simple, we make an implementation restriction that all paths for a given relation of the same parameterization (i.e., the same set of outer relations supplying parameters) must have the same rowcount estimate. This is justified by insisting that each such path apply *all* join clauses that are available with the named outer relations. I don't see any corresponding statement here, and it's not clear to me that the point has been thought through adequately. Another aspect that bothers me is that a RelOptInfo is understood to contain a bunch of paths that all yield the same data (the same set of columns), and it seems like that might not be the case here. Certainly partially-aggregated paths will output something different than unaggregated ones, but mightn't different join orders mutate the column set even further? I think that we might be better off building a separate RelOptInfo for each way of pushing down the aggregates, in order to preserve the principle that all the paths in any one RelOptInfo have the same output. This'll mean more RelOptInfos, but not more paths, so I doubt it adds that much performance overhead. Richard Guo <guofengli...@gmail.com> writes: > Back in November, I considered the possibility of introducing a > GroupPathInfo into the Path structure to store the location of the > partial aggregation as well as the estimated rowcount for this grouped > path, similar to how ParamPathInfo functions for parameterized paths. > However, after some exploration, I determined that this was > unnecessary. Why did you determine that was unnecessary? The principal function of ParamPathInfo IMV is to ensure that we use exactly the same rowcount estimate for all the paths that should have the same estimate, and that problem seems to exist here as well. If you don't have a forcing mechanism then paths' estimates will diverge as a result of things like different roundoff errors in different join sequences. Anyway, I agree with Robert that this isn't ready. I don't feel that I can even review it adequately without a lot better internal documentation, specifically a clearer statement of what query shapes the optimization applies to and what's the rationale for the transformation being correct. The commentary in pathnodes.h for the new data structures is likewise so skimpy as to be near useless. regards, tom lane