On Tue, 3 Oct 2023 at 20:16, David Rowley <dgrowle...@gmail.com> wrote: > I wonder if the attached patch is too much of a special case fix. I > guess from the lack of complaints previously that there are no other > cases where we could possibly have pathkeys that belong to columns > that are aggregated. I've not gone to much effort to see if I can > craft a case that hits this without the ORDER BY/DISTINCT aggregate > optimisation, however.
I spent more time on this today. I'd been wondering if there was any reason why create_agg_path() would receive a subpath with pathkeys that were anything but the PlannerInfo's group_pathkeys. I mean, how could we do Group Aggregate if it wasn't? I wondered if grouping sets might change that, but it seems the group_pathkeys will be set to the initial grouping set. Given that, it would seem it's safe to just trim off any pathkey that was added to the group_pathkeys by adjust_group_pathkeys_for_groupagg(). PlannerInfo.num_groupby_pathkeys marks the number of pathkeys that existed in group_pathkeys before adjust_group_pathkeys_for_groupagg() made any additions, so we can just trim the list length back to that. I've done this in the attached patch. I also considered if it was worth adding a regression test for this and I concluded that there are better ways to test for this and considered if we should add some code to createplan.c to check that all Path pathkeys have corresponding items in the PathTarget. I've included an additional patch which adds some code in USE_ASSERT_CHECKING builds to verify this. Without the fix it's simple enough to trigger this with a query such as: select two,count(distinct four) from tenk1 group by two order by two; Without the fix the additional asserts cause the regression tests to fail, but with the fix everything passes. Justin's case is quite an obscure way to hit this as it requires partitionwise aggregation plus a single partition so that the Append is removed due to only having a single subplan in setrefs.c. If there had been 2 partitions, then the AppendPath wouldn't have inherited the subpath's pathkeys per code at the end of create_append_path(). So in short, I propose the attached fix without any regression tests because I feel that any regression test would just mark that there was a big in create_agg_path() and not really help with ensuring we don't end up with some similar problem in the future. I have some concerns that the assert_pathkeys_in_target() function might be a little heavyweight for USE_ASSERT_CHECKING builds. So I'm not proposing to commit that without further discussion. Does anyone feel differently? If not, I plan to push the attached strip_aggregate_pathkeys_from_aggpaths_v2.patch early next week. David
strip_aggregated_pathkeys_from_aggpaths_v2.patch
Description: Binary data
assert_pathkeys_in_target.patch
Description: Binary data