On Wed, Aug 21, 2024 at 3:11 AM Richard Guo <guofengli...@gmail.com> wrote:
> Attached is the updated version of the patchset that fixes this bug
> and includes further code refactoring.
Here are some initial, high-level thoughts about this patch set.

1. As far as I can see, there's no real performance testing on this thread. I expect that it's possible to show an arbitrarily large gain for the patch by finding a case where partial aggregation is way better than anything we currently know, but that's not very interesting. What I think would be useful to do is find a corpus of existing queries on an existing data set, try them with and without the patch, and see which query plans change and whether the changed plans are actually better. For example, maybe TPC-H or the subset of TPC-DS that we can actually run would be a useful starting point. One could then also measure how much the planning time increases with the patch to get a sense of what the overhead of enabling this feature would be. Even if it's disabled by default, people aren't going to want to enable it if it causes planning times to become much longer on many queries for which there is no benefit.

2. I think there might be techniques we could use to limit planning effort at an earlier stage, when the approach doesn't appear promising. For example, if the proposed grouping column is already unique, the exercise is pointless (I think). Ideally we'd like to detect that without even creating the grouped_rel. But the proposed grouping column might also be *mostly* unique. For example, consider a table with a million rows and a column with 500,000 distinct values (there's a rough sketch of this case below, after point 3). I suspect it will be difficult for partial aggregation to work out to a win in a case like this, because I think that performing the partial aggregation won't reduce the cost of either the final aggregation or the intervening join steps by enough to compensate for its own cost. It would be best to find a way to avoid generating a lot of rels and paths in cases where there's really not much hope of a win.

One could, perhaps, imagine going further with this by postponing eager aggregation planning until after regular paths have been built, so that we have good cardinality estimates. Suppose the query joins a single fact table to a series of dimension tables. The final plan thus uses the fact table as the driving table and joins to the dimension tables one by one. Do we really need to consider partial aggregation at every level? Perhaps only at levels where there's been a significant row count reduction since the last time we tried it, but the row count will increase again at the next level? Maybe there are other heuristics we could use in addition or instead.

3. In general, we are quite bad at estimating what will happen to the row count after an aggregation, and we have no real idea what the distribution of values will be. That might be a problem for this patch, because it seems like the decisions we will make about where to perform the partial aggregation might end up being quite random. At the top of the join tree, I'll need to compare directly aggregating the best join path with various paths that involve a finalize aggregation step at the top and a partial aggregation step further down. But my cost estimates and row counts for the partial aggregate steps seem like they will often be quite poor, which means that the plans using those partial aggregate steps might also be quite poor. Even if they're not, I fear that comparing the cost of those PartialAggregate-Join(s)-FinalizeAggregate paths to the direct Aggregate path will look too much like comparing random numbers. We need to know whether the combination of the FinalizeAggregate step and the PartialAggregate step will be more or less expensive than a plain old Aggregate, but how can we tell that if we don't have accurate cardinality estimates?
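To make the concern in points 2 and 3 a bit more concrete, here is a rough sketch of the kind of case I have in mind. The table and column names are made up, and the plan shapes in the comments are just the two alternatives that would have to be costed against each other, not actual output from the patch:

CREATE TABLE dim (d_id int PRIMARY KEY, d_name text);
CREATE TABLE fact (f_id int, d_id int, amount numeric);

INSERT INTO dim
    SELECT g, 'dim ' || g FROM generate_series(1, 1000) g;
-- 1,000,000 rows; the grouping column f_id has ~500,000 distinct values
INSERT INTO fact
    SELECT g % 500000, (g % 1000) + 1, g
    FROM generate_series(1, 1000000) g;
ANALYZE dim;
ANALYZE fact;

SELECT f.f_id, sum(f.amount)
FROM fact f JOIN dim d ON d.d_id = f.d_id
GROUP BY f.f_id;

-- Plain old aggregation above the join:
--
--   Aggregate                       1,000,000 rows in, ~500,000 groups out
--     -> Join
--          -> Seq Scan on fact
--          -> Seq Scan on dim
--
-- Eager aggregation: partially aggregate the fact table below the join
-- (grouping by f_id plus the join key d_id), then finalize at the top:
--
--   Finalize Aggregate              ~500,000 rows in, ~500,000 groups out
--     -> Join
--          -> Partial Aggregate     1,000,000 rows in, ~500,000 groups out
--               -> Seq Scan on fact
--          -> Seq Scan on dim
--
-- The partial step still has to process all 1,000,000 rows, but it only
-- shrinks the join input by about 2x and still leaves ~500,000 groups for
-- the finalize step, so it's hard to see it paying for itself here. And
-- the ~500,000 figure is exactly the number our estimates are shaky about.

If the grouping column had only a few thousand distinct values instead, the same second shape would shrink the join input by a couple of orders of magnitude, which is presumably the sort of case the patch is aimed at; the trouble is that the planner has to tell those two situations apart using its ndistinct estimates.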
Thanks for working on this.

--
Robert Haas
EDB: http://www.enterprisedb.com