2010YOUY01 commented on code in PR #21976:
URL: https://github.com/apache/datafusion/pull/21976#discussion_r3194201853


##########
datafusion/physical-optimizer/src/optimizer.rs:
##########
@@ -170,18 +169,12 @@ impl PhysicalOptimizer {
             // those are handled by the later `FilterPushdown` rule.
             // See `FilterPushdownPhase` for more details.
             Arc::new(FilterPushdown::new()),
-            // The EnforceDistribution rule is for adding essential repartitioning to satisfy distribution
-            // requirements. Please make sure that the whole plan tree is determined before this rule.
-            // This rule increases parallelism if doing so is beneficial to the physical plan; i.e. at
-            // least one of the operators in the plan benefits from increased parallelism.
-            Arc::new(EnforceDistribution::new()),
-            // The CombinePartialFinalAggregate rule should be applied after the EnforceDistribution rule
+            // EnsureRequirements: merged EnforceDistribution + EnforceSorting into a
+            // single idempotent rule with distribution-aware pushdown_sorts.
+            // See https://github.com/apache/datafusion/issues/21973
+            // See https://github.com/apache/datafusion/issues/21973

Review Comment:
   ```suggestion
               // Ensures each input plan satisfies the distribution and ordering requirements
               // declared by `ExecutionPlan::required_input_distribution` and
               // `ExecutionPlan::required_input_ordering`.
               // If the requirements are already satisfied, this rule leaves the plan
               // unchanged. For example, it does not add sorting when the input is a file
               // scan whose existing order already satisfies the required ordering.
               // Otherwise, this rule inserts the necessary repartitioning and sorting
               // operators.
               // This used to be implemented as two separate rules: `EnforceDistribution`
               // and `EnforceSorting`. It is now a single idempotent rule with
               // distribution-aware `pushdown_sorts`.
               // See https://github.com/apache/datafusion/issues/21973.
   ```
   Added more comments since this is the entry point.
   
   I also have a question: what does 'distribution-aware `pushdown_sorts`' mean here? We could link to a reference that explains it.
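
To make the behaviour described in the suggested comment concrete, here is a minimal sketch using toy types. `Plan`, `Requirement`, and `ensure_requirements` are hypothetical names for illustration only and are not DataFusion's API; the real rule works on `ExecutionPlan` trees and is considerably more involved. The sketch only shows the two properties the comment claims: a satisfied input is left unchanged, and applying the rule twice gives the same plan as applying it once.

```rust
// Toy model only: these types are hypothetical and are not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// A leaf scan with a partition count and a flag for pre-sorted output.
    Scan { partitions: usize, sorted: bool },
    /// Redistributes its input into `partitions` partitions (ordering is lost).
    Repartition { partitions: usize, input: Box<Plan> },
    /// Sorts its input, keeping the input's partition count.
    Sort { input: Box<Plan> },
}

impl Plan {
    fn partitions(&self) -> usize {
        match self {
            Plan::Scan { partitions, .. } | Plan::Repartition { partitions, .. } => *partitions,
            Plan::Sort { input } => input.partitions(),
        }
    }

    fn is_sorted(&self) -> bool {
        match self {
            Plan::Scan { sorted, .. } => *sorted,
            Plan::Repartition { .. } => false,
            Plan::Sort { .. } => true,
        }
    }
}

/// What a parent operator requires of its input (simplified assumption).
struct Requirement {
    partitions: usize,
    sorted: bool,
}

/// Adds a repartition and/or a sort only where the requirement is not already met.
fn ensure_requirements(input: Plan, req: &Requirement) -> Plan {
    let mut plan = input;
    if plan.partitions() != req.partitions {
        plan = Plan::Repartition { partitions: req.partitions, input: Box::new(plan) };
    }
    if req.sorted && !plan.is_sorted() {
        plan = Plan::Sort { input: Box::new(plan) };
    }
    plan
}
```

In this toy model, a sorted four-partition scan passes through untouched when four sorted partitions are required, while a single-partition scan gets a `Repartition` followed by a `Sort`. Running `ensure_requirements` on that result again returns it unchanged.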



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

