alamb commented on PR #21817:
URL: https://github.com/apache/datafusion/pull/21817#issuecomment-4652021544

   > My preferred long-term solution would be to implement a hash table
   > specialized for semi/anti joins. I believe that would be not only faster,
   > but also more general, since the optimization could apply to all data
   > types.
   >
   > One related idea is to fully separate the semi/anti join path from the
   > existing hash join implementation. I think this would make both paths
   > more organized and potentially more performant: see a related issue
   https://github.com/apache/datafusion/issues/22710
   
   
   I think @neilconway did some similar work here (to optimize SEMI joins):
   - https://github.com/apache/datafusion/pull/22794
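
   For readers skimming the thread, here is a rough sketch of why a hash table
   specialized for semi/anti joins could be simpler than a general hash join
   table: the probe only needs to know whether a key exists on the build side,
   not which build rows matched. This is illustrative only and is not
   DataFusion's implementation; `build_key_set` and `probe_semi_anti` are
   hypothetical names, keys are assumed to be single non-null values, and SQL
   NULL semantics are ignored.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Build side: keep only the distinct join keys.
/// No row indices or payload columns are stored (hypothetical helper).
fn build_key_set<K: Hash + Eq + Clone>(build_keys: &[K]) -> HashSet<K> {
    build_keys.iter().cloned().collect()
}

/// Probe side: return the indices of probe rows to emit.
/// `anti == false` keeps rows whose key exists (semi join);
/// `anti == true` keeps rows whose key does not exist (anti join).
fn probe_semi_anti<K: Hash + Eq>(
    keys: &HashSet<K>,
    probe_keys: &[K],
    anti: bool,
) -> Vec<usize> {
    probe_keys
        .iter()
        .enumerate()
        // A row is kept when "key found" differs from the `anti` flag.
        .filter(|(_, k)| keys.contains(*k) != anti)
        .map(|(i, _)| i)
        .collect()
}
```

   Because the set is generic over any hashable key type, the same path would
   apply to all data types, which is the generality argument made in the
   quoted comment.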


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

