Re: [I] Add support for clickbench data and benchmark with page index [datafusion]

via GitHub Mon, 23 Jun 2025 11:33:16 -0700


adriangb commented on issue #16427:
URL: https://github.com/apache/datafusion/issues/16427#issuecomment-2997541861


Just a thought: do we need an artificial dataset to really highlight the
problem / solution? I think it's unlikely to be measurable with a dataset that
has 25 columns and 500 row groups, especially if we're talking about avoiding
parsing but not even avoiding IO. My guess is that if you make a dataset with
[10k columns](https://github.com/microsoft/amudai/blob/main/docs/spec/src/what_about_parquet.md#wide-schemas)
and 1000s of row groups we'll see a difference.
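
To put rough numbers on that guess, here is a back-of-the-envelope sketch. It assumes the metadata that has to be parsed grows with the number of column chunks (one per column per row group), which is how the Parquet footer is laid out. The 25 x 500 shape comes from the comment above; 1,000 row groups is only an assumed lower bound for "1000s", and the function name is made up for illustration.

```python
def column_chunk_entries(num_columns: int, num_row_groups: int) -> int:
    """Count column-chunk metadata entries: one per column per row group."""
    return num_columns * num_row_groups


# Shape quoted in the comment: 25 columns, 500 row groups.
narrow = column_chunk_entries(25, 500)

# Proposed artificial dataset: 10k columns, and (assumed) 1,000 row groups.
wide = column_chunk_entries(10_000, 1_000)

print(narrow)          # 12500
print(wide)            # 10000000
print(wide // narrow)  # 800
```

Under these assumptions the wide dataset carries about 800 times as many metadata entries, which is the kind of gap where skipping parsing might become measurable. This is a count of entries only, not a timing prediction.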


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

