geoffreyclaude commented on code in PR #15563:
URL: https://github.com/apache/datafusion/pull/15563#discussion_r2028543416


##########
datafusion/physical-plan/src/topk/mod.rs:
##########
@@ -90,15 +90,38 @@ pub struct TopK {
     scratch_rows: Rows,
     /// stores the top k values and their sort key values, in order
     heap: TopKHeap,
+    /// row converter, for common keys between the sort keys and the input ordering
+    common_prefix_converter: Option<RowConverter>,
+    /// Common sort prefix between the input and the sort expressions to allow early exit optimization
+    common_sort_prefix: Arc<[PhysicalSortExpr]>,
+    /// If true, indicates that all rows of subsequent batches are guaranteed to be worse than the top K
+    pub(crate) finished: bool,
+}
+
+fn build_sort_fields(
+    ordering: &LexOrdering,
+    schema: &SchemaRef,
+) -> Result<Vec<SortField>> {
+    ordering
+        .iter()
+        .map(|e| {
+            Ok(SortField::new_with_options(
+                e.expr.data_type(schema)?,
+                e.options,
+            ))
+        })
+        .collect::<Result<_>>()
 }
 
 impl TopK {
     /// Create a new [`TopK`] that stores the top `k` values, as
     /// defined by the sort expressions in `expr`.
     // TODO: make a builder or some other nicer API
+    #[allow(clippy::too_many_arguments)]

Review Comment:
   The `TopK` code could benefit from a refactoring for readability. 
It has probably grown too far beyond its original compact form, but that is 
out of scope for this PR.
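
As a rough illustration of the "builder or some other nicer API" mentioned in the TODO, here is a minimal sketch of what a follow-up could look like. Everything in it is hypothetical: `TopKConfig`, `TopKConfigBuilder`, the `with_*` methods, and the default batch size are made-up names and values, and sort expressions are plain strings rather than `PhysicalSortExpr`. It is not DataFusion's API, only the general shape of the pattern.

```rust
/// Hypothetical, simplified stand-in for the arguments passed to `TopK::try_new`.
#[derive(Debug, Clone, PartialEq)]
struct TopKConfig {
    k: usize,
    batch_size: usize,
    sort_exprs: Vec<String>,
    common_sort_prefix: Vec<String>,
}

/// Hypothetical builder: required values go through `new`, optional ones through `with_*`.
#[derive(Debug, Clone)]
struct TopKConfigBuilder {
    k: usize,
    batch_size: usize,
    sort_exprs: Vec<String>,
    common_sort_prefix: Vec<String>,
}

impl TopKConfigBuilder {
    fn new(k: usize, sort_exprs: Vec<String>) -> Self {
        Self {
            k,
            batch_size: 8192, // arbitrary default chosen for this sketch
            sort_exprs,
            common_sort_prefix: Vec::new(),
        }
    }

    fn with_batch_size(mut self, batch_size: usize) -> Self {
        self.batch_size = batch_size;
        self
    }

    fn with_common_sort_prefix(mut self, prefix: Vec<String>) -> Self {
        self.common_sort_prefix = prefix;
        self
    }

    /// Validates the combination of options in one place instead of in the constructor.
    fn build(self) -> Result<TopKConfig, String> {
        if self.sort_exprs.is_empty() {
            return Err("at least one sort expression is required".to_string());
        }
        if !self.sort_exprs.starts_with(&self.common_sort_prefix) {
            return Err("common_sort_prefix must be a prefix of sort_exprs".to_string());
        }
        Ok(TopKConfig {
            k: self.k,
            batch_size: self.batch_size,
            sort_exprs: self.sort_exprs,
            common_sort_prefix: self.common_sort_prefix,
        })
    }
}
```

A builder along these lines would let the `#[allow(clippy::too_many_arguments)]` go away and keep the validation of optional arguments in `build`.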

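For readers skimming the hunk, the role of the new `finished` flag can be sketched with a toy model. This is a simplification under stated assumptions, not the code in this PR: rows are `(prefix, rest)` tuples of `u32`, the input is assumed to arrive sorted ascending by the prefix column, and `TopKSketch` and `insert_batch` are made-up names. The real implementation compares converted `Rows` through `common_prefix_converter`.

```rust
use std::collections::BinaryHeap;

/// Hypothetical stand-in for a row: (common-prefix key, remaining sort key).
type Row = (u32, u32);

struct TopKSketch {
    k: usize,
    /// Max-heap: `peek()` returns the worst row currently in the top k.
    heap: BinaryHeap<Row>,
    /// Set once no later batch can contribute a better row.
    finished: bool,
}

impl TopKSketch {
    fn new(k: usize) -> Self {
        Self { k, heap: BinaryHeap::new(), finished: false }
    }

    /// Inserts one batch. The input is assumed sorted ascending by `row.0`.
    fn insert_batch(&mut self, batch: &[Row]) {
        for &row in batch {
            if self.heap.len() < self.k {
                self.heap.push(row);
            } else if let Some(&worst) = self.heap.peek() {
                if row < worst {
                    self.heap.pop();
                    self.heap.push(row);
                }
            }
        }
        // Early-exit check: once the heap is full and the last row's prefix is
        // strictly greater than the worst kept row's prefix, every later row
        // has a prefix at least as large and therefore cannot enter the top k.
        if self.heap.len() == self.k {
            if let (Some(last), Some(worst)) = (batch.last(), self.heap.peek()) {
                if last.0 > worst.0 {
                    self.finished = true;
                }
            }
        }
    }

    fn into_sorted(self) -> Vec<Row> {
        self.heap.into_sorted_vec()
    }
}
```

Note the strict comparison: when the last row's prefix only ties with the worst kept row's prefix, a later row with the same prefix and a smaller remaining key could still qualify, so the sketch keeps reading.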


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

