On Tue, Jan 21, 2025 at 2:57 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> However, a partial-aggregation path does not generate the same data
> as an unaggregated path, no matter how fuzzy you are willing to be
> about the concept. So I'm having a very hard time accepting that
> it ought to be part of the same RelOptInfo, and thus I don't really
> buy that annotating paths with a GroupPathInfo is the way forward.

Agreed. One point I failed to make clear is that I never intended to
put a partial-aggregation path and an unaggregated path into the same
RelOptInfo. One of the basic designs of this patch is that
partial-aggregation paths are placed in a separate category of
RelOptInfos, which I call "grouped relations" (though I admit that's
not the best name). This ensures that we never compare a
partial-aggregation path with an unaggregated path during scan/join
planning, because I am certain that the two categories of paths are
not comparable.

Regarding the GroupPathInfo proposal, my intention is to add a valid
GroupPathInfo only for the partial-aggregation paths. The goal is to
ensure that partial-aggregation paths within this category are
compared only if their partial aggregations are at the same location.

To be honest, I still doubt that this is necessary. I have two main
reasons for this.

1. For a partial-aggregation path, the location where we place the
partial aggregation does not impose any restrictions on further
planning. This is different from the parameterized path case. If two
parameterized paths are equal on every other figure of merit, we will
choose the one with fewer required outer rels, as it means fewer join
restrictions on upper planning. However, for partial-aggregation
paths, we do not have a preference regarding the location of the
partial aggregation. For instance, for path
"A JOIN PartialAgg(B) JOIN C" and path "PartialAgg(A JOIN B) JOIN C",
if one path dominates the other on every figure of merit, it seems to
me that there's no point in keeping the less favorable one, even
though they have their partial aggregations at different join levels.

2. A partial-aggregation path of a rel essentially yields an
aggregated form of that rel's row set. The difference between the row
sets yielded by paths with different locations of partial aggregation
is primarily about the different degrees to which the rows are
aggregated.
These sets are fundamentally homogeneous.

In summary, I think the partial-aggregation paths of the same
"grouped relation" are comparable, regardless of the position of the
partial aggregation within the path tree. So I think we should put
them into the same RelOptInfo.

Of course, I could be very wrong about this. I would greatly
appreciate hearing others' thoughts on this.

Thanks
Richard