berkaysynnada commented on code in PR #14813:
URL: https://github.com/apache/datafusion/pull/14813#discussion_r2334061407
##########
datafusion/physical-plan/src/windows/mod.rs:
##########
@@ -337,30 +342,151 @@ pub(crate) fn window_equivalence_properties(
input: &Arc<dyn ExecutionPlan>,
window_exprs: &[Arc<dyn WindowExpr>],
) -> EquivalenceProperties {
- // We need to update the schema, so we can not directly use
- // `input.equivalence_properties()`.
+ // We need to update the schema, so we can't directly use input's
equivalence
+ // properties.
let mut window_eq_properties =
EquivalenceProperties::new(Arc::clone(schema))
.extend(input.equivalence_properties().clone());
- let schema_len = schema.fields.len();
- let window_expr_indices =
- ((schema_len - window_exprs.len())..schema_len).collect::<Vec<_>>();
+ let window_schema_len = schema.fields.len();
+ let input_schema_len = window_schema_len - window_exprs.len();
+ let window_expr_indices =
(input_schema_len..window_schema_len).collect::<Vec<_>>();
+
for (i, expr) in window_exprs.iter().enumerate() {
- if let Some(udf_window_expr) =
expr.as_any().downcast_ref::<StandardWindowExpr>()
+ let partitioning_exprs = expr.partition_by();
+ let no_partitioning = partitioning_exprs.is_empty();
+ // Collect columns defining partitioning, and construct all
`SortOptions`
+ // variations for them. Then, we will check each one whether it
satisfies
+ // the existing ordering provided by the input plan.
+ let partition_by_orders = partitioning_exprs
+ .iter()
+ .map(|pb_order|
sort_options_resolving_constant(Arc::clone(pb_order)));
+ let all_satisfied_lexs = partition_by_orders
+ .multi_cartesian_product()
Review Comment:
I got an idea: check satisfy ordering one by one. I mean:
partition_by_cols: [a,b,c]
ordering_req: [x,y,z]
create a new ordering from the first N(1 initially, and increase 1 by 1)
elements.
compared_ordering: [x]
compare the variations of the first element of partition by [a INC NL] & [a
INC NF] & [a DEC NL] & [a DEC NF]
only one of them can survive (and most of the cases, none of them, and skip
the further computation)
store the surviving one, and append the next ordering element variations and
partition by column, and continue comparison until either ordering or
partition_by elements are over, or there isnt any survivor left
This algorithm would decrease the complexity to O(n) I guess
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]