Huy1Ng commented on issue #1375:
URL: 
https://github.com/apache/datafusion-ballista/issues/1375#issuecomment-3766327789

   At the current stage, dynamic filters only work within **a single task**. 
Let's say you have a cluster with 1 scheduler and 1 executor. 
   You run the query "select * from line_items order by line_id asc". The 
scheduler plans 2 tasks and sends them to the sole executor. Each task can 
use the new threshold it finds in TopK to reduce the number of rows scanned 
upstream, *but the tasks cannot share this information with each other*. 
   
   My approach is to solve this local problem by having tasks on a single 
executor share this dynamic filter with each other. Then we can proceed to 
look for ways to sync this information across the network boundary.
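
   To make the idea concrete, here is a minimal sketch of tasks on one 
executor sharing a TopK threshold. It is not DataFusion or Ballista code: 
`SharedBound` and `run_task` are hypothetical names, the sort key is assumed 
to be a plain `i64`, and the threads stand in for tasks running in the same 
executor process.

```rust
use std::collections::BinaryHeap;
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical shared threshold for an ascending TopK.
/// Any row whose key is >= the bound cannot improve the global result.
struct SharedBound(AtomicI64);

impl SharedBound {
    fn new() -> Self {
        SharedBound(AtomicI64::new(i64::MAX))
    }

    fn get(&self) -> i64 {
        self.0.load(Ordering::Acquire)
    }

    /// Only ever lowers the bound, so concurrent updates cannot loosen it.
    fn tighten(&self, candidate: i64) {
        self.0.fetch_min(candidate, Ordering::AcqRel);
    }
}

/// Hypothetical task body: keeps up to `k` smallest keys of one partition.
/// Returns the kept keys (ascending) and how many rows the bound skipped.
fn run_task(partition: &[i64], k: usize, bound: &SharedBound) -> (Vec<i64>, usize) {
    let mut heap = BinaryHeap::new(); // max-heap: the top is the current k-th smallest
    let mut skipped = 0;
    for &key in partition {
        // The "dynamic filter": drop rows that some task has already ruled out.
        if key >= bound.get() {
            skipped += 1;
            continue;
        }
        heap.push(key);
        if heap.len() > k {
            heap.pop();
        }
        // Once this task holds k rows, its largest kept key is a valid global bound.
        if heap.len() == k {
            bound.tighten(*heap.peek().unwrap());
        }
    }
    (heap.into_sorted_vec(), skipped)
}

fn main() {
    let partitions: Vec<Vec<i64>> = vec![vec![5, 1, 9, 3, 7], vec![8, 2, 6, 4, 10]];
    let k = 3;
    let bound = Arc::new(SharedBound::new());

    // One thread per task, all sharing the same bound.
    let handles: Vec<_> = partitions
        .into_iter()
        .map(|p| {
            let bound = Arc::clone(&bound);
            thread::spawn(move || run_task(&p, k, &bound))
        })
        .collect();

    let mut merged = Vec::new();
    for handle in handles {
        let (top, skipped) = handle.join().unwrap();
        println!("task kept {:?}, skipped {} rows", top, skipped);
        merged.extend(top);
    }
    // Final merge step: combine per-task results and keep the k smallest.
    merged.sort();
    merged.truncate(k);
    println!("global top {}: {:?}", k, merged);
}
```

   With a per-task bound, each task would only skip rows based on what it 
has seen itself. With the shared bound, a threshold found by one task 
immediately prunes rows in the other. How many rows each task skips depends 
on thread scheduling, but the merged result does not.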


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

