Kontinuation commented on code in PR #14644: URL: https://github.com/apache/datafusion/pull/14644#discussion_r1954496042
########## datafusion/physical-plan/src/sorts/stream.rs: ########## @@ -171,20 +173,32 @@ impl<T: CursorArray> std::fmt::Debug for FieldCursorStream<T> { } impl<T: CursorArray> FieldCursorStream<T> { - pub fn new(sort: PhysicalSortExpr, streams: Vec<SendableRecordBatchStream>) -> Self { + pub fn new( + sort: PhysicalSortExpr, + streams: Vec<SendableRecordBatchStream>, + reservation: MemoryReservation, + ) -> Self { let streams = streams.into_iter().map(|s| s.fuse()).collect(); Self { sort, streams: FusedStreams(streams), + reservation, phantom: Default::default(), } } fn convert_batch(&mut self, batch: &RecordBatch) -> Result<ArrayValues<T::Values>> { let value = self.sort.expr.evaluate(batch)?; let array = value.into_array(batch.num_rows())?; + let size_in_mem = array.get_buffer_memory_size(); let array = array.as_any().downcast_ref::<T>().expect("field values"); - Ok(ArrayValues::new(self.sort.options, array)) + let mut array_reservation = self.reservation.new_empty(); + array_reservation.try_grow(size_in_mem)?; + Ok(ArrayValues::new( + self.sort.options, + array, + array_reservation, + )) Review Comment: I think this reservation is needed. When the sort expression is a simple column reference, the `array` simply reuses the buffer in `batch`, this is the case where reservation is not needed. However, the sort expression can be a complex expression such as `l_linenumber + 1`, the result of evaluation takes additional space in this case. Always reserving `array` is a more conservative approach that prevents allocations from overshooting the limit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org