rkrishn7 commented on PR #17632:
URL: https://github.com/apache/datafusion/pull/17632#issuecomment-3305050342

> It shouldn't be too bad: without filters you'd still have to run the hash
> function once in RepartitionExec and another hash function in HashJoinExec. So
> we're running 2 times instead of 3 _for rows that match the filter_. For rows
> that are pruned we run it 1 time instead of 2. And that's only until all of the
> build sides are done, then we may run it 0 times.

The `hash(...) % n != partition_id` portion of the filter gets added once for
each build partition, right? If so, then in the worst case we're evaluating
the hash up to `n` times (once per build partition) just for the dynamic filter?
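
To make the concern concrete, here is a rough sketch, not DataFusion's actual code. It assumes the dynamic filter is a conjunction with one clause per build partition of the form `hash(key) % n != p OR key within bounds_p`. The names `hash_key`, `passes_naive`, `passes_shared`, and `Bounds` are hypothetical, and `DefaultHasher` merely stands in for whatever hash the join uses. The counter shows how many times the hash is evaluated per row when each clause re-hashes the key, versus hashing once and checking only the matching partition's bounds.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical per-partition min/max bounds on the join key.
type Bounds = (u64, u64);

/// Stand-in for the join-key hash; increments `calls` on every evaluation.
fn hash_key(key: u64, calls: &mut usize) -> u64 {
    *calls += 1;
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Assumed filter shape: AND over partitions p of
/// `hash(key) % n != p OR lo_p <= key <= hi_p`, re-hashing in every clause.
/// A row that passes every clause costs `n` hash evaluations.
fn passes_naive(key: u64, bounds: &[Bounds], calls: &mut usize) -> bool {
    let n = bounds.len() as u64;
    bounds.iter().enumerate().all(|(p, &(lo, hi))| {
        hash_key(key, calls) % n != p as u64 || (lo <= key && key <= hi)
    })
}

/// Logically equivalent evaluation: hash once, then check only the bounds
/// of the partition the key routes to.
fn passes_shared(key: u64, bounds: &[Bounds], calls: &mut usize) -> bool {
    if bounds.is_empty() {
        return true; // no clauses, nothing to prune
    }
    let n = bounds.len() as u64;
    let p = (hash_key(key, calls) % n) as usize;
    let (lo, hi) = bounds[p];
    lo <= key && key <= hi
}
```

With four partitions whose bounds admit every key, `passes_naive` evaluates the hash four times for a row while `passes_shared` evaluates it once. Whether the real expression evaluator already shares the hash across clauses is exactly the open question here.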


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org