wiedld opened a new issue, #14691: URL: https://github.com/apache/datafusion/issues/14691
### Describe the bug

During the EnforceSorting optimizer run, a valid plan may be turned invalid due to the removal of a necessary coalesce. The result is a planning-time failure in the SanityChecker: `does not satisfy distribution requirements: HashPartitioned[[a@0]]). Child-0 output partitioning: UnknownPartitioning(2)`.

We start with a valid input plan:

```
"SortExec: expr=[a@0 ASC], preserve_partitioning=[false]",
"  AggregateExec: mode=SinglePartitioned, gby=[a@0 as a1], aggr=[]",
"    CoalescePartitionsExec",
"      ProjectionExec: expr=[a@0 as a, b@1 as value]",
"        UnionExec",
"          DataSourceExec: file_groups={1 group: [[x]]}, projection=[a, b, c, d, e], file_type=parquet",
"          DataSourceExec: file_groups={1 group: [[x]]}, projection=[a, b, c, d, e], file_type=parquet"
```

The coalesce is then removed, which makes the plan invalid:

```
"SortPreservingMergeExec: [a@0 ASC]",
"  SortExec: expr=[a@0 ASC], preserve_partitioning=[true]",
"    AggregateExec: mode=SinglePartitioned, gby=[a@0 as a1], aggr=[]",
"      ProjectionExec: expr=[a@0 as a, b@1 as value]",
"        UnionExec",
"          DataSourceExec: file_groups={1 group: [[x]]}, projection=[a, b, c, d, e], file_type=parquet",
"          DataSourceExec: file_groups={1 group: [[x]]}, projection=[a, b, c, d, e], file_type=parquet",
```

### To Reproduce

A test case demonstrates this: https://github.com/apache/datafusion/pull/14637/commits/670eff35bce04efdc163ce7823437691aa9f29f6

### Expected behavior

EnforceSorting should not take a valid plan and make it invalid, causing the planning sanity check to fail.

### Additional context

We already have a proposed solution: https://github.com/apache/datafusion/pull/14637

While debugging, I did a minor refactor to `parallelize_sorts` and its helper `remove_bottleneck_in_subplan`. The reason for the refactor ([also summarized here](https://github.com/apache/datafusion/pull/14637#discussion_r1957023902)) was that I noticed a pattern of several necessary nodes being removed and then added back later.
I elected to simplify the code (IMO) by tightening up how we build the `PlanWithCorrespondingCoalescePartitions`, so that we correctly identify which nodes should be removed in the first place, instead of removing them and then adding them back. The refactor is isolated in this commit: https://github.com/apache/datafusion/pull/14637/commits/0661ed7e8934e7f2a711416b85cbafde2a7b99e2
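
For readers unfamiliar with the failure, here is a rough toy model of why dropping the coalesce breaks the plan. This is not DataFusion code: `Partitioning`, `Plan`, `check`, `valid_plan` and `invalid_plan` are all hypothetical names made up for this sketch, and the sort nodes are left out because they do not take part in the distribution check. It assumes, as I understand the real behavior, that an input with a single partition satisfies a hash-partitioning requirement, which is why the aggregate is happy with a coalesced input but not with the two partitions coming out of the union.

```rust
// Toy model only: none of these types are DataFusion's.

#[derive(Debug, Clone, PartialEq)]
enum Partitioning {
    Unknown(usize),
    Hash(Vec<String>, usize),
}

impl Partitioning {
    fn count(&self) -> usize {
        match self {
            Partitioning::Unknown(n) | Partitioning::Hash(_, n) => *n,
        }
    }

    /// Assumption: a single partition satisfies any hash requirement.
    fn satisfies_hash(&self, keys: &[String]) -> bool {
        if self.count() == 1 {
            return true;
        }
        match self {
            Partitioning::Hash(k, _) => k.as_slice() == keys,
            Partitioning::Unknown(_) => false,
        }
    }
}

#[derive(Debug, Clone)]
enum Plan {
    Source,
    Union(Vec<Plan>),
    Projection(Box<Plan>),
    Coalesce(Box<Plan>),
    Aggregate { keys: Vec<String>, input: Box<Plan> },
}

impl Plan {
    fn output_partitioning(&self) -> Partitioning {
        match self {
            Plan::Source => Partitioning::Unknown(1),
            // A union exposes every child partition; the partitioning is unknown.
            Plan::Union(children) => Partitioning::Unknown(
                children.iter().map(|c| c.output_partitioning().count()).sum(),
            ),
            Plan::Projection(input) => input.output_partitioning(),
            // A coalesce always merges its input into one partition.
            Plan::Coalesce(_) => Partitioning::Unknown(1),
            Plan::Aggregate { input, .. } => input.output_partitioning(),
        }
    }

    /// Simplified sanity check: the aggregate requires its input to be
    /// hash partitioned on the group-by keys.
    fn check(&self) -> Result<(), String> {
        match self {
            Plan::Source => Ok(()),
            Plan::Union(children) => children.iter().try_for_each(|c| c.check()),
            Plan::Projection(input) | Plan::Coalesce(input) => input.check(),
            Plan::Aggregate { keys, input } => {
                input.check()?;
                let actual = input.output_partitioning();
                if actual.satisfies_hash(keys) {
                    Ok(())
                } else {
                    Err(format!(
                        "does not satisfy HashPartitioned({:?}); child output partitioning: {:?}",
                        keys, actual
                    ))
                }
            }
        }
    }
}

fn union_of_two_sources() -> Plan {
    Plan::Projection(Box::new(Plan::Union(vec![Plan::Source, Plan::Source])))
}

/// Shape of the valid input plan: the aggregate reads from a coalesce.
fn valid_plan() -> Plan {
    Plan::Aggregate {
        keys: vec!["a@0".to_string()],
        input: Box::new(Plan::Coalesce(Box::new(union_of_two_sources()))),
    }
}

/// Shape of the plan after the coalesce is removed.
fn invalid_plan() -> Plan {
    Plan::Aggregate {
        keys: vec!["a@0".to_string()],
        input: Box::new(union_of_two_sources()),
    }
}
```

In this sketch, `valid_plan().check()` succeeds because the coalesce yields one partition, while `invalid_plan().check()` returns an error reporting a child partitioning of `Unknown(2)`, which mirrors the `UnknownPartitioning(2)` in the SanityChecker message above.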