On Mon, Jan 29, 2018 at 3:42 AM, Jeevan Chalke <jeevan.cha...@enterprisedb.com> wrote: > Attached new patch set and rebased it on latest HEAD.
I strongly dislike add_single_path_to_append_rel. It adds branches and complexity to code that is already very complex. Most importantly, why are we adding paths to fields in OtherUpperPathExtraData *extra instead of adding them to the path list of some RelOptInfo? If we had an appropriate RelOptInfo to which we could add the paths, then we could make this simpler. If I understand correctly, the reason you're doing it this way is because we have no place to put partially-aggregated, non-partial paths. If we only needed to worry about the parallel query case, we could just build an append of partially-aggregated paths for each child and stick it into the grouped rel's partial pathlist, just as we already do for regular parallel aggregation. There's no reason why add_paths_to_grouping_rel() needs to care about the difference a Partial Aggregate on top of whatever and an Append each branch of which is a Partial Aggregate on top of whatever. However, this won't work for non-partial paths, because add_paths_to_grouping_rel() needs to put paths into the grouped rel's pathlist -- and we can't mix together partially-aggregated paths and fully-aggregated paths in the same path list. But, really, the way we're using grouped_rel->partial_pathlist right now is an awful hack. What I'm thinking we could do is introduce a new UpperRelationKind called UPPERREL_PARTIAL_GROUP_AGG, coming just before UPPERREL_GROUP_AGG. Without partition-wise aggregate, partially_grouped_rel's pathlist would always be NIL, and its partial pathlist would be constructed using the logic in add_partial_paths_to_grouping_rel, which would need renaming. Then, add_paths_to_grouping_rel would use paths from input_rel when doing non-parallel aggregation and paths from partially_grouped_rel when doing parallel aggregation. This would eliminate the ugly grouped_rel->partial_pathlist = NIL assignment at the bottom of create_grouping_paths(), because the grouped_rel's partial_pathlist would never have been (bogusly) populated in the first place, and hence would not need to be reset. All of these changes could be made via a preparatory patch. Then the main patch needs to worry about four cases: 1. Parallel partition-wise aggregate, grouping key doesn't contain partition key. This should just be a matter of adding additional Append paths to partially_grouped_rel->partial_pathlist. The existing code already knows how to stick a Gather and FinalizeAggregate step on top of that, and I don't see why that logic would need any modification or addition. An Append of child partial-grouping paths should be producing the same output as a partial grouping of an Append, except that the former case might produce more separate groups that need to be merged; but that should be OK: we can just throw all the paths into the same path list and let the cheapest one win. 2. Parallel partition-wise aggregate, grouping key contains partition key. For the most part, this is no different from case #1. We won't have groups spanning different partitions in this case, but we might have groups spanning different workers, so we still need a FinalizeAggregate step. As an exception, Gather -> Parallel Append -> [non-partial Aggregate path] would give us a way of doing aggregation in parallel without a separate Finalize step. I'm not sure if we want to consider that to be in scope for this patch. If we do, then we'd add the Parallel Append path to grouped_rel->partial_pathlist. Then, we could stick Gather (Merge) on top if it to produce a path for grouped_rel->pathlist using generate_gather_paths(); alternatively, it can be used by upper planning steps -- something we currently can't ever make work with parallel aggregation. 3. Non-parallel partition-wise aggregate, grouping key contains partition key. Build Append paths from the children of grouped_rel and add them to grouped_rel->pathlist. 3. Non-parallel partition-wise aggregate, grouping key doesn't contain partition key. Build Append paths from the children of partially_grouped_rel and add them to partially_grouped_rel->pathlist. Also add code to generate paths for grouped_rel->pathlist by sticking a FinalizeAggregate step on top of each path from partially_grouped_rel->pathlist. Overall, what I'm trying to argue for here is making this feature look less like its own separate thing and more like part of the general mechanism we've already got: partial paths would turn into regular paths via generate_gather_paths(), and partially aggregated paths would turn into fully aggregated paths by adding FinalizeAggregate. The existing special case that allows us to build a non-partial, fully aggregated path from a partial, partially-aggregated path would be preserved. I think this would probably eliminate some other problems I see in the existing design as well. For example, create_partition_agg_paths doesn't consider using Gather Merge, but that might be a win. With the design above, I think you never need to call create_gather_path() anywhere. In case #1, the existing code takes care of it. In the special case mentioned under #2, if we chose to support that, generate_gather_paths() would take care of it. Both of those places already know about Gather Merge. On another note, I found preferFullAgg to be wicked confusing. To "prefer" something is to like it better, but be willing to accept other options if the preference can't be accommodated. Here, it seems like preferFullAgg = false prevents consideration of full aggregation. So it's really more like allowFullAgg, or, maybe better, try_full_aggregation. Also, try_partition_wise_grouping has a variable isPartialAgg which is always ends up getting set to !preferFullAgg. Having two Boolean variables which are always set to the opposite of each other isn't good. To add to the confusion, the code following the place where isPartialAgg is set sometimes refers to isPartialAgg and sometimes refers to preferFullAgg. I think the comments in this patch still need a good bit of work. They tend to explain what the code does rather than the reason it does it, and they tend to speak vaguely rather than precisely about things happening in other places. For example, consider the need to set partial_pathlist = NIL in create_grouping_paths(). Here's the existing comment: /* * We've been using the partial pathlist for the grouped relation to hold * partially aggregated paths, but that's actually a little bit bogus * because it's unsafe for later planning stages -- like ordered_rel --- * to get the idea that they can use these partial paths as if they didn't * need a FinalizeAggregate step. Zap the partial pathlist at this stage * so we don't get confused. */ Here's your comment about the same hazzard: + * For full aggregation, at this point we are already done with the + * finalization step and thus partial paths are no more needed. Keeping + * those will lead to some unwanted result later in the planning stage. + * Thus like create_grouping_paths(), clear them out. Notice that your comment says that it will created an "unwanted result" and that this result will happen "later", whereas the existing comment is a lot more specific. It says exactly what the problem is (FinalizeAggregate needed) and where the confusion will happen (ordered_rel). Some other examples of comments with similar problems: + * If there are partial subpaths for parallelism, then we need to add + * gather path on top of the append. However, we only do this when full + * aggregation is required. For partial aggregation this can be done at + * later stage. Doesn't really explain why we're doing any of those things, just says that we are. Also, what later stage? + * For non-partial aggregation path, just need to add given append path to + * a grouped_rel. Also, if caller requested a partial aggregation only, + * skip finalize step. Again, why? + * Add all collected append paths into the grouped_rel. For partial + * aggregation mode we need to add a finalize agg node over an append + * path. Why? + /* Similarly, for partial paths. Here we need to add gather node too. */ Why? + * However, in partial aggregation mode this is done at later stage, + * so skip it. When? Here's an example of a much better comment you wrote: + * In find_ec_member_for_tle(), child EC members are ignored if they don't + * belong to the given relids. Thus, if this sort path is based on a child + * relation, we must pass the relids of it. Otherwise, we will end-up into + * an error requiring pathkey item. I haven't studied this patch in enough depth yet to figure out whether I think that makes sense, but clearly when I go to do that this comment is going to be a big help in figuring it out. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company