xudong963 commented on PR #15503:
URL: https://github.com/apache/datafusion/pull/15503#issuecomment-2775970366

   @alamb @berkaysynnada Here is my idea for unifying the two methods:
   ```rust
   /// Specifies what statistics to compute
   pub enum StatisticsType {
       /// Only compute global statistics
       Global,
       /// Only compute per-partition statistics
       Partition,
       /// Compute both global and per-partition statistics
       Both,
   }
   
   /// Holds both global and per-partition statistics
   pub struct ExecutionPlanStatistics {
       /// Global statistics for the entire plan
       pub global: Option<Statistics>,
       /// Statistics broken down by partition
       pub partition: Option<Vec<Statistics>>,
   }
   
   /// Returns statistics for this `ExecutionPlan` node based on the requested type.
   /// Only computes what is requested to avoid unnecessary calculations.
   fn statistics(&self, stat_type: StatisticsType) -> Result<ExecutionPlanStatistics> {
       match stat_type {
           StatisticsType::Global => Ok(ExecutionPlanStatistics {
               global: Some(Statistics::new_unknown(&self.schema())),
               partition: None,
           }),
           StatisticsType::Partition => {
               let partition_stats = vec![
                   Statistics::new_unknown(&self.schema());
                   self.properties().partitioning.partition_count()
               ];
               
               Ok(ExecutionPlanStatistics {
                   global: None,
                   partition: Some(partition_stats),
               })
           },
           StatisticsType::Both => {
               let partition_stats = vec![
                   Statistics::new_unknown(&self.schema());
                   self.properties().partitioning.partition_count()
               ];
               
               // Could merge partition stats here for global stats if needed
               let global_stats = Statistics::new_unknown(&self.schema());
               
               Ok(ExecutionPlanStatistics {
                   global: Some(global_stats),
                   partition: Some(partition_stats),
               })
           }
       }
   }
   ```
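
To make the "could merge partition stats" comment concrete, here is a rough, standalone sketch of how the `Both` and `Global` arms might derive global statistics from the per-partition ones. It does not use DataFusion itself: `Statistics` below is a simplified stand-in that tracks only a row count (the real type also carries precision and per-column statistics), and `merge_partition_stats` and the free function `statistics` are hypothetical names used only for illustration.

```rust
/// Simplified stand-in for DataFusion's `Statistics` (assumption: the real
/// type also tracks precision, byte sizes, and per-column statistics).
#[derive(Debug, Clone, PartialEq)]
pub struct Statistics {
    /// Row count, or `None` when it is unknown.
    pub num_rows: Option<usize>,
}

/// Specifies what statistics to compute (as in the proposal above).
pub enum StatisticsType {
    Global,
    Partition,
    Both,
}

/// Holds both global and per-partition statistics (as in the proposal above).
#[derive(Debug, PartialEq)]
pub struct ExecutionPlanStatistics {
    pub global: Option<Statistics>,
    pub partition: Option<Vec<Statistics>>,
}

/// Hypothetical helper: the global row count is the sum of the partition
/// row counts, and becomes unknown as soon as one partition is unknown.
pub fn merge_partition_stats(parts: &[Statistics]) -> Statistics {
    // Summing `Option<usize>` values yields `None` if any element is `None`.
    let num_rows = parts.iter().map(|s| s.num_rows).sum::<Option<usize>>();
    Statistics { num_rows }
}

/// Hypothetical free-function version of `statistics`; the per-partition
/// statistics are passed in rather than read from a plan node.
pub fn statistics(parts: &[Statistics], stat_type: StatisticsType) -> ExecutionPlanStatistics {
    match stat_type {
        StatisticsType::Global => ExecutionPlanStatistics {
            global: Some(merge_partition_stats(parts)),
            partition: None,
        },
        StatisticsType::Partition => ExecutionPlanStatistics {
            global: None,
            partition: Some(parts.to_vec()),
        },
        StatisticsType::Both => ExecutionPlanStatistics {
            global: Some(merge_partition_stats(parts)),
            partition: Some(parts.to_vec()),
        },
    }
}
```

With partitions of 10 and 32 rows, requesting `StatisticsType::Both` in this sketch gives a global row count of 42 alongside the two partition entries; if any partition's count is unknown, the merged global count is unknown as well.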


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

