alamb commented on issue #16149: URL: https://github.com/apache/datafusion/issues/16149#issuecomment-2976129703
> I am also curious: **Why would uncompressed Parquet be considered an optimization over Snappy-compressed Parquet?** Is the decompression overhead of Snappy significant enough to slow down read performance?

Yes, exactly this: once you have the data locally, the cost of block decompression (such as Snappy) often dominates query performance.

Of course, using no compression comes with a tradeoff: larger files, and more network transfer when the data is not local.
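
To make the tradeoff concrete, here is a rough sketch rather than a real Parquet benchmark. It assumes `zlib` from the Python standard library as a stand-in for Snappy (a different codec with different speed and ratio), and the helper name `measure_read_cost` and the synthetic payload are made up for illustration. It only shows the shape of the tradeoff: compressed bytes are smaller, but every read pays CPU time to decompress them.

```python
import time
import zlib


def measure_read_cost(payload: bytes, repeats: int = 20) -> dict:
    """Compare stored size and per-read decompression time for one buffer.

    zlib is used only as a stand-in for Snappy; absolute numbers will differ.
    """
    compressed = zlib.compress(payload, 1)  # level 1: fastest zlib setting

    # Time how long it takes to get the original bytes back, as a reader would.
    start = time.perf_counter()
    for _ in range(repeats):
        restored = zlib.decompress(compressed)
    decompress_seconds = (time.perf_counter() - start) / repeats

    return {
        "raw_bytes": len(payload),
        "compressed_bytes": len(compressed),
        "decompress_seconds_per_read": decompress_seconds,
        "round_trip_ok": restored == payload,
    }


# Synthetic, repetitive "column" data (hypothetical; real columns vary widely).
column_data = b"".join(str(i % 1000).encode() + b"," for i in range(200_000))

result = measure_read_cost(column_data)
print(result)
```

The uncompressed path skips the `decompress_seconds_per_read` cost entirely but stores and transfers `raw_bytes` instead of `compressed_bytes`. Which side wins depends on whether the query is bound by CPU or by I/O and network.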