Blizzara opened a new issue, #16140: URL: https://github.com/apache/datafusion/issues/16140

### Describe the bug

A Substrait plan with an aggregation that has duplicate grouping expressions doesn't provide output columns for all of the duplicates. This causes issues downstream, since expected columns aren't found (we'd [expect to find](https://substrait.io/relations/logical_relations/#aggregate-operation) `The list of grouping expressions in declaration order followed by the list of measures in declaration order --`).

One can wonder whether those duplicates are useful (probably not), but the plan should still be valid (and substrait-spark does seem to produce these in some cases).

A possible solution might be to automatically wrap the Aggregate in a Project which duplicates the missing columns.

### To Reproduce

The following plan fails to read:

```
{
  "extensionUris": [],
  "extensions": [],
  "relations": [
    {
      "root": {
        "input": {
          "aggregate": {
            "input": {
              "read": {
                "common": {
                  "direct": {}
                },
                "baseSchema": {
                  "names": [],
                  "struct": {
                    "types": [],
                    "nullability": "NULLABILITY_NULLABLE"
                  }
                },
                "namedTable": {
                  "names": [
                    "data"
                  ]
                }
              }
            },
            "groupings": [
              {
                "groupingExpressions": [
                  {
                    "literal": {
                      "i32": 1
                    }
                  },
                  {
                    "literal": {
                      "i32": 1
                    }
                  }
                ]
              }
            ],
            "measures": []
          }
        },
        "names": [
          "grouping_col_1",
          "grouping_col_2"
        ]
      }
    }
  ],
  "version": {
    "minorNumber": 54,
    "producer": "manual"
  }
}
```

Changing one of the literals to another value makes the plan pass.

### Expected behavior

The expected plan would be:

```
Aggregate: groupBy=[[Int32(1) AS grouping_col_1, Int32(1) AS grouping_col_2]], aggr=[[]]
  TableScan: data
```

### Additional context

_No response_

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
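
The workaround suggested under "Describe the bug" (aggregate on the distinct grouping expressions, then wrap the result in a Project that re-emits one column per original grouping entry) could look roughly like the sketch below. It is a standalone illustration, not DataFusion code: expressions are modelled as plain strings, and `DedupedGrouping`, `dedup_grouping` and `project_with_names` are hypothetical names invented for this example.

```rust
use std::collections::HashMap;

/// Hypothetical helper type: the outcome of deduplicating grouping expressions.
struct DedupedGrouping {
    /// Distinct expressions, in first-occurrence order (what the Aggregate would group by).
    distinct: Vec<String>,
    /// For each original grouping position, the index into `distinct`.
    projection: Vec<usize>,
}

/// Deduplicate grouping expressions while remembering where each original entry maps to.
/// Assumption: two expressions are "the same" when their string forms are equal.
fn dedup_grouping(exprs: &[&str]) -> DedupedGrouping {
    let mut seen: HashMap<&str, usize> = HashMap::new();
    let mut distinct = Vec::new();
    let mut projection = Vec::with_capacity(exprs.len());
    for &expr in exprs {
        let next = distinct.len();
        // Reuse the index of an earlier identical expression, or claim the next slot.
        let idx = *seen.entry(expr).or_insert(next);
        if idx == next {
            distinct.push(expr.to_string());
        }
        projection.push(idx);
    }
    DedupedGrouping { distinct, projection }
}

/// Build the wrapping projection: one output column per original grouping entry,
/// each pointing at its deduplicated source and carrying the requested output name.
fn project_with_names(grouping: &DedupedGrouping, names: &[&str]) -> Vec<String> {
    grouping
        .projection
        .iter()
        .zip(names)
        .map(|(&idx, name)| format!("{} AS {}", grouping.distinct[idx], name))
        .collect()
}

fn main() {
    // The two identical literals from the plan in "To Reproduce".
    let grouping = dedup_grouping(&["Int32(1)", "Int32(1)"]);
    let columns = project_with_names(&grouping, &["grouping_col_1", "grouping_col_2"]);
    println!("Aggregate groups by: {:?}", grouping.distinct);
    println!("Projection emits:    {:?}", columns);
}
```

With this shape the Aggregate only ever sees distinct grouping expressions, while the wrapping projection restores the column count and order that the Substrait spec leads consumers to expect.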