Re: [PR] Fix record batch memory size double counting [datafusion]

via GitHub Thu, 14 Nov 2024 00:33:24 -0800


2010YOUY01 commented on code in PR #13377:
URL: https://github.com/apache/datafusion/pull/13377#discussion_r1841768614



##########
datafusion/physical-plan/src/spill.rs:
##########
@@ -175,11 +247,103 @@ mod tests {
         )?;
 
         let file = BufReader::new(File::open(spill_file.path())?);
-        let reader = arrow::ipc::reader::FileReader::try_new(file, None)?;
+        let reader = FileReader::try_new(file, None)?;
 
         assert_eq!(reader.num_batches(), 4);
         assert_eq!(reader.schema(), schema);
 
         Ok(())
     }
+
+    #[test]
+    fn test_get_record_batch_memory_size() {
+        // Create a simple record batch with two columns
+        let schema = Arc::new(Schema::new(vec![
+            Field::new("ints", DataType::Int32, true),
+            Field::new("float64", DataType::Float64, false),
+        ]));
+
+        let int_array =
+            Int32Array::from(vec![Some(1), Some(2), Some(3), Some(4), 
Some(5)]);
+        let float64_array = Float64Array::from(vec![1.0, 2.0, 3.0, 4.0, 5.0]);
+
+        let batch = RecordBatch::try_new(
+            schema,
+            vec![Arc::new(int_array), Arc::new(float64_array)],
+        )
+        .unwrap();
+
+        let size = get_record_batch_memory_size(&batch);
+        assert_eq!(size, 60);

Review Comment:
   This is a great point. I also feel that this manual memory accounting is 
complex and error-prone. We’d better change all of it. (Maybe also use some 
RAII in the implementation, instead of manually growing and shrinking memory 
usage as we’re doing right now.)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Fix record batch memory size double counting [datafusion]

Reply via email to