Re: [I] Add support for clickbench data and benchmark with page index [datafusion]

via GitHub Mon, 23 Jun 2025 21:42:51 -0700


zhuqi-lucas commented on issue #16427:
URL: https://github.com/apache/datafusion/issues/16427#issuecomment-2998767526


   
   Thank you @adriangb, that is a good point and I agree with you. I created 
this issue because we can also use it to mock more custom data based on the 
current ClickBench dataset.
   
   > Just a thought: do we need an artificial dataset to really highlight the 
problem / solution? I think it's unlikely to be measurable with a dataset that 
has 25 columns and 500 row groups, especially if we're talking about avoiding 
parsing but not even avoiding IO. My guess is if you make a dataset with [10k 
columns](https://github.com/microsoft/amudai/blob/main/docs/spec/src/what_about_parquet.md#wide-schemas)
 and 1000s of row groups we'll see a difference.
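   
   To put rough numbers on the quoted suggestion: a Parquet footer carries one column-chunk metadata entry per column per row group, so the amount of metadata to parse grows with the product of the two. The sketch below is only back-of-the-envelope arithmetic, not a benchmark. The helper name `column_chunk_entries` is hypothetical, and 1,000 row groups is an assumed lower bound for "1000s".

   ```python
   def column_chunk_entries(num_columns: int, num_row_groups: int) -> int:
       """Count column-chunk metadata entries in a Parquet footer.

       Each row group stores one metadata entry per column, so the
       total is simply columns * row groups.
       """
       return num_columns * num_row_groups


   # Shape mentioned in the quote: 25 columns, 500 row groups.
   small = column_chunk_entries(25, 500)

   # Proposed artificial shape: 10k columns and (assumed) 1,000 row groups.
   wide = column_chunk_entries(10_000, 1_000)

   print(f"small dataset entries: {small}")
   print(f"wide dataset entries:  {wide}")
   print(f"ratio: {wide // small}x")
   ```

   Under these assumptions the wide dataset has several hundred times more footer entries, which is why a difference in metadata parsing cost would be more likely to show up there.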
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


