jayzhan211 commented on issue #11680: URL: https://github.com/apache/datafusion/issues/11680#issuecomment-2259425914
The experiment I did in #11708 shows that:

1. There is not much difference for ClickBench Q17.
2. It outperforms for high cardinality (2,000,000 rows, all values distinct).
3. It simplifies the Repartition Hash code considerably.

@alamb If the benchmark code looks good to you, I think we could reuse the hash.

To further improve ClickBench Q17, the bottleneck is now arrow::Row (RowConverter::append and Rows::push). Do you think there is room for improvement there, or should we find a way to reduce Rows by design?
