suremarc commented on code in PR #13296:
URL: https://github.com/apache/datafusion/pull/13296#discussion_r1835029153


##########
datafusion/physical-plan/src/statistics.rs:
##########
@@ -277,6 +262,44 @@ impl MinMaxStatistics {
             .zip(self.min_by_sort_order.iter().skip(1))
             .all(|(max, next_min)| max < next_min)
     }
+
+    /// Computes a bin-packing of the min/max rows in these statistics
+    /// into chains, such that elements in a chain are non-overlapping and ordered
+    /// amongst one another.
+    /// This bin-packing is optimal in the sense that it has the fewest number of chains.
+    pub fn first_fit(&self) -> Vec<Vec<usize>> {

Review Comment:
   If no ranges overlap, they can all be ordered into a single chain. If some 
ranges _do_ overlap, the ranges that overlap one another are placed into 
separate chains. The non-overlap check happens in this logic: 
https://github.com/apache/datafusion/blob/31d371695478839b3112b788ec1966142a6fc0ff/datafusion/physical-plan/src/statistics.rs#L286-L294
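
   To make the behavior concrete, here is a minimal standalone sketch of the idea. It is not the code in the PR. The name `first_fit_chains` and the plain `(i64, i64)` range representation are assumptions for illustration (the real method works on the sorted min/max rows held by `MinMaxStatistics`). The strict `<` comparison mirrors the `max < next_min` check visible in the hunk above.

```rust
/// Hypothetical stand-in for the PR's `first_fit`: packs closed ranges
/// `(min, max)` into chains of mutually non-overlapping, ordered ranges.
/// Returns, for each chain, the indices of the input ranges it contains.
fn first_fit_chains(ranges: &[(i64, i64)]) -> Vec<Vec<usize>> {
    // Visit ranges in ascending order of their minimum.
    let mut order: Vec<usize> = (0..ranges.len()).collect();
    order.sort_by_key(|&i| ranges[i].0);

    let mut chains: Vec<Vec<usize>> = Vec::new();
    for i in order {
        let (min, _) = ranges[i];
        // Find the first chain whose last range ends strictly before
        // this range begins (the non-overlap check).
        let slot = chains.iter().position(|chain| {
            let last = *chain.last().expect("chains are never empty");
            ranges[last].1 < min
        });
        match slot {
            // The range fits at the end of an existing chain.
            Some(c) => chains[c].push(i),
            // It overlaps the tail of every chain, so start a new one.
            None => chains.push(vec![i]),
        }
    }
    chains
}
```

   With `[(0, 10), (11, 20), (21, 30)]` nothing overlaps, so the sketch yields the single chain `[[0, 1, 2]]`. With `[(0, 10), (5, 15), (11, 20)]` the second range overlaps the first, so it starts its own chain and the result is `[[0, 2], [1]]`.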


