adriangb commented on PR #17632: URL: https://github.com/apache/datafusion/pull/17632#issuecomment-3305036177
It shouldn't be too bad: without filters you'd still have to run the hash function once in RepartitionExec and another hash function in HashJoinExec. So we're running 2 times instead of 3 _for rows that match the filter_. For rows that are pruned we run it 1 time instead of 2. And that's only until all of the build sides are done, then we may run it 0 times.