blaginin commented on code in PR #14685:
URL: https://github.com/apache/datafusion/pull/14685#discussion_r1974977572


##########
datafusion/core/src/datasource/file_format/parquet.rs:
##########
@@ -1934,7 +1934,8 @@ mod tests {
 
         // test metadata
         assert_eq!(exec.statistics()?.num_rows, Precision::Exact(8));
-        assert_eq!(exec.statistics()?.total_byte_size, Precision::Exact(671));
+        // assert_eq!(exec.statistics()?.total_byte_size, Precision::Exact(671));
+        // todo: uncomment when FileScanConfig::projection_stats puts byte size

Review Comment:
   I don't think those assertions were correct, because the tests had the same 
bug: the projection was set _after_ the source, so the source statistics (which 
were used here) were never updated to reflect the projection.
   
   <img width="629" alt="RustRover-EAP 2025-02-28 08 01 28" 
src="https://github.com/user-attachments/assets/9ed99086-e605-4365-a2ee-fff4a66f4bae" />
   
   The proper fix is to set the byte size in `projection_stats`.
   
   <img width="469" alt="image" 
src="https://github.com/user-attachments/assets/c37adb95-2562-4283-b74f-de0d1fdb37bb" />
   
   Since that is a separate, existing todo, I think it's better to address it 
in a follow-up PR on top of this one.
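
To illustrate the ordering problem, here is a minimal toy sketch. It does not use DataFusion's real types: `ToyConfig`, `ToySource`, and the per-column byte sizes are all hypothetical and exist only to show why a statistic captured before the projection is set goes stale, and why deriving it from the projection at query time avoids that.

```rust
/// Hypothetical scan configuration: per-column byte sizes plus an optional projection.
/// This is a toy model, not DataFusion's `FileScanConfig`.
#[derive(Debug, Clone)]
struct ToyConfig {
    column_bytes: Vec<usize>,
    projection: Option<Vec<usize>>,
}

impl ToyConfig {
    /// Byte size computed from the *current* projection
    /// (the role the comment suggests for `projection_stats`).
    fn projected_byte_size(&self) -> usize {
        match &self.projection {
            Some(indices) => indices.iter().map(|&i| self.column_bytes[i]).sum(),
            None => self.column_bytes.iter().sum(),
        }
    }
}

/// Hypothetical source that snapshots the byte size when it is built.
struct ToySource {
    cached_byte_size: usize,
}

impl ToySource {
    fn from_config(config: &ToyConfig) -> Self {
        ToySource {
            cached_byte_size: config.projected_byte_size(),
        }
    }
}

/// Buggy ordering: the source is built first, the projection is set afterwards,
/// so the cached value still describes all columns.
fn stale_byte_size() -> usize {
    let mut config = ToyConfig {
        column_bytes: vec![100, 200, 300], // arbitrary illustrative sizes
        projection: None,
    };
    let source = ToySource::from_config(&config);
    config.projection = Some(vec![0, 2]);
    source.cached_byte_size
}

/// Same setup, but the byte size is derived from the config after the projection is set.
fn fresh_byte_size() -> usize {
    let mut config = ToyConfig {
        column_bytes: vec![100, 200, 300],
        projection: None,
    };
    config.projection = Some(vec![0, 2]);
    config.projected_byte_size()
}
```

In this toy model `stale_byte_size()` still reports the size of all three columns, while `fresh_byte_size()` reflects only the two projected ones. That mirrors why the old assertion was not testing a properly projected statistic.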



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

