viirya commented on code in PR #11218:
URL: https://github.com/apache/datafusion/pull/11218#discussion_r1682189556
##########
datafusion/physical-plan/src/spill.rs:
##########
@@ -85,3 +85,104 @@ fn read_spill(sender: Sender<Result<RecordBatch>>, path: &Path) -> Result<()> {
}
Ok(())
}
+
+/// Spill the `RecordBatch` to disk as smaller batches
+/// split by `batch_size_rows`
+/// Return `total_rows` what is spilled
+pub fn spill_record_batch_by_size(
+ batch: &RecordBatch,
+ path: PathBuf,
+ schema: SchemaRef,
+ batch_size_rows: usize,
+) -> Result<usize> {
Review Comment:
Do we need to return the size? Is it possible for this to spill more or fewer
rows than the batch contains?
If not, maybe just return `Result<()>`?
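
To illustrate the question, here is a minimal sketch of the row accounting. It assumes the function writes consecutive, non-overlapping slices of at most `batch_size_rows` rows each; the function body is not shown in this excerpt, so that is an assumption. `split_ranges` and `total_rows` are hypothetical helpers, not part of the PR or of DataFusion.

```rust
/// Hypothetical helper (not from the PR): the (offset, length) pairs obtained
/// by cutting `num_rows` rows into consecutive chunks of at most
/// `batch_size_rows` rows each.
fn split_ranges(num_rows: usize, batch_size_rows: usize) -> Vec<(usize, usize)> {
    assert!(batch_size_rows > 0, "batch_size_rows must be non-zero");
    let mut ranges = Vec::new();
    let mut offset = 0;
    while offset < num_rows {
        // The last chunk may be shorter than `batch_size_rows`.
        let len = batch_size_rows.min(num_rows - offset);
        ranges.push((offset, len));
        offset += len;
    }
    ranges
}

/// Hypothetical helper: total number of rows covered by the chunks.
fn total_rows(ranges: &[(usize, usize)]) -> usize {
    ranges.iter().map(|&(_, len)| len).sum()
}
```

Under that assumption, `total_rows(&split_ranges(n, k))` equals `n` for any non-zero `k`, so the returned count would carry the same information as `batch.num_rows()` on the caller's side.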
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]