blaginin commented on code in PR #14685:
URL: https://github.com/apache/datafusion/pull/14685#discussion_r1974977572

##########
datafusion/core/src/datasource/file_format/parquet.rs:
##########
@@ -1934,7 +1934,8 @@ mod tests {
         // test metadata
         assert_eq!(exec.statistics()?.num_rows, Precision::Exact(8));
-        assert_eq!(exec.statistics()?.total_byte_size, Precision::Exact(671));
+        // assert_eq!(exec.statistics()?.total_byte_size, Precision::Exact(671));
+        // todo: uncomment when FileScanConfig::projection_stats puts byte size

Review Comment:
   I don't think those lines were correct, because the tests had the same bug: the projection was set _after_ the source, so the source statistics (which were used here) weren't updated properly.

   <img width="629" alt="RustRover-EAP 2025-02-28 08 01 28" src="https://github.com/user-attachments/assets/9ed99086-e605-4365-a2ee-fff4a66f4bae" />

   The proper way to fix this is to set the byte size in `projection_stats`.

   <img width="469" alt="image" src="https://github.com/user-attachments/assets/c37adb95-2562-4283-b74f-de0d1fdb37bb" />

   Since it's a separate existing todo, I think it's better to change it in a PR on top of this one.
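
   To illustrate the ordering problem described in the comment, here is a minimal, self-contained sketch. It does not use DataFusion: `Stats`, `Source`, `Config`, `with_source`, `with_projection`, and `projected_stats` are hypothetical stand-ins, and the per-column sizes are made-up numbers. It also assumes the byte size can be derived per column, which the real `projection_stats` does not do yet according to the todo above. The only point is that a source that snapshots statistics when it is attached keeps the old value if the projection is set afterwards.

```rust
// Hypothetical stand-ins for illustration only; these are NOT DataFusion types.
#[derive(Debug, Clone, PartialEq)]
pub struct Stats {
    pub num_rows: usize,
    /// `None` means the byte size is unknown.
    pub total_byte_size: Option<usize>,
}

#[derive(Debug, Clone)]
pub struct Source {
    /// Snapshot taken at the moment the source is attached.
    pub stats: Stats,
}

#[derive(Debug, Clone)]
pub struct Config {
    pub table_stats: Stats,
    /// Made-up per-column byte sizes (an assumption of this sketch).
    pub column_sizes: Vec<usize>,
    pub projection: Option<Vec<usize>>,
    pub source: Option<Source>,
}

impl Config {
    pub fn new(table_stats: Stats, column_sizes: Vec<usize>) -> Self {
        Config { table_stats, column_sizes, projection: None, source: None }
    }

    /// Statistics restricted to the projected columns, if a projection is set.
    pub fn projected_stats(&self) -> Stats {
        match &self.projection {
            None => self.table_stats.clone(),
            Some(cols) => Stats {
                num_rows: self.table_stats.num_rows,
                total_byte_size: Some(cols.iter().map(|&i| self.column_sizes[i]).sum()),
            },
        }
    }

    pub fn with_projection(mut self, cols: Vec<usize>) -> Self {
        self.projection = Some(cols);
        self
    }

    /// Attaching the source copies whatever statistics are current right now.
    pub fn with_source(mut self) -> Self {
        let stats = self.projected_stats();
        self.source = Some(Source { stats });
        self
    }
}
```

   In this sketch, `Config::new(..).with_source().with_projection(vec![0])` leaves the source holding the full-table byte size, while `with_projection(vec![0]).with_source()` gives the source the projected value. That is the same shape as the bug the tests had.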