adriangb commented on issue #16427:
URL: https://github.com/apache/datafusion/issues/16427#issuecomment-2997541861

Just a thought: do we need an artificial dataset to really highlight the problem / solution? I think the effect is unlikely to be measurable with a dataset that has 25 columns and 500 row groups, especially if we're talking about avoiding parsing but not avoiding IO. My guess is that if you make a dataset with [10k columns](https://github.com/microsoft/amudai/blob/main/docs/spec/src/what_about_parquet.md#wide-schemas) and thousands of row groups, we'll see a difference.
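
A rough back-of-envelope sketch of why the dataset shape matters, not a benchmark. It assumes that footer-parsing work scales roughly with the number of column-chunk metadata entries (one per column per row group), and it takes 1,000 row groups as a stand-in for "thousands". The function name is made up for illustration.

```python
def column_chunk_entries(num_columns: int, num_row_groups: int) -> int:
    """Count column-chunk metadata entries in a Parquet footer.

    Each row group carries one column-chunk entry per column, so the
    total is simply columns * row groups.
    """
    return num_columns * num_row_groups


# Shape mentioned above: 25 columns, 500 row groups.
small = column_chunk_entries(25, 500)

# Proposed wide shape: 10k columns; 1,000 row groups is an assumed
# lower bound for "thousands".
wide = column_chunk_entries(10_000, 1_000)

print(f"small: {small:,} entries")
print(f"wide:  {wide:,} entries")
print(f"ratio: {wide // small}x")
```

Under these assumptions the wide dataset has several hundred times more metadata entries to decode, which is where any saving from skipped parsing would be more likely to show up.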



