asubiotto commented on issue #14991: URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2720197770
Nowhere. What I mean is that the `AggregateExec` **should** have `ordering_mode=Sorted` as you share in your previous comment but it **does not** (and I would expect it to). Apologies if this is off-topic for the issue but since you were looking for examples where the sorted properties of the group columns are somehow lost I thought it would be good to share a practical example (I was searching for ordered aggregation support and came across this issue). Specifically `generated_id` is in sorted order and since it is the only group column for the aggregation I would expect `ordering_mode=Sorted` in the `AggregateExec`. For example, if I change the values/agg function and remove the unnest, this ordering is preserved: ``` EXPLAIN WITH unnested AS (SELECT ROW_NUMBER() OVER () AS generated_id, value as ar FROM range(1,5)) SELECT generated_id, sum(ar) FROM unnested group by generated_id; +---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | plan_type | plan | +---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | logical_plan | Aggregate: groupBy=[[unnested.generated_id]], aggr=[[sum(unnested.ar)]] | | | SubqueryAlias: unnested | | | Projection: row_number() ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING AS generated_id, tmp_table.value AS ar | | | WindowAggr: windowExpr=[[row_number() ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING]] | | | TableScan: tmp_table projection=[value] | | physical_plan | AggregateExec: mode=FinalPartitioned, gby=[generated_id@0 as generated_id], aggr=[sum(unnested.ar)], ordering_mode=Sorted | | | SortExec: expr=[generated_id@0 ASC NULLS LAST], preserve_partitioning=[true] | | | CoalesceBatchesExec: target_batch_size=8192 | | | RepartitionExec: partitioning=Hash([generated_id@0], 12), input_partitions=12 | | | AggregateExec: mode=Partial, gby=[generated_id@0 as generated_id], aggr=[sum(unnested.ar)], ordering_mode=Sorted | | | RepartitionExec: partitioning=RoundRobinBatch(12), input_partitions=1 | | | ProjectionExec: expr=[row_number() ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING@1 as generated_id, value@0 as ar] | | | BoundedWindowAggExec: wdw=[row_number() ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: Ok(Field { name: "row_number() ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING", data_type: UInt64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Rows, start_bound: Preceding(UInt64(NULL)), end_bound: Following(UInt64(NULL)), is_causal: false }], mode=[Sorted] | | | LazyMemoryExec: partitions=1, batch_generators=[range: start=1, end=5, batch_size=8192] | | | | +---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 2 row(s) fetched. Elapsed 0.009 seconds. ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org