Re: [I] Proposal: Hook to better support `CollectLeft` joins in distributed execution [datafusion]

via GitHub Fri, 13 Sep 2024 12:41:23 -0700


thinkharderdev commented on issue #12454:
URL: https://github.com/apache/datafusion/issues/12454#issuecomment-2350031852


   > One challenge I predict with the above scenario is that it seems to assume 
that the order of rows from the build side will be the same on all nodes across 
all partitions (so you can match up the BooleanBuffer across nodes)
   
   Yeah, this is definitely a challenge in the general case if the build-side 
subquery is inlined into the hash join. In our case the build-side subquery is 
a separate stage, so we can pretty easily ensure a consistent row ordering since 
it's read from a shuffle.
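
To illustrate why a consistent build-side row order matters, here is a minimal sketch of combining per-node "matched" bitmaps with a bitwise OR. It is not DataFusion's API: `merge_matched_bitmaps` and `unmatched_build_rows` are hypothetical names, and a plain `Vec<bool>` stands in for Arrow's `BooleanBuffer`. The merge is only meaningful if index `i` refers to the same build-side row on every node.

```rust
/// Hypothetical helper: OR together the per-node bitmaps that record which
/// build-side rows found at least one match on that node.
/// Assumes every node saw the build side in the same row order, so index `i`
/// means the same row everywhere. Returns `None` if there are no bitmaps or
/// the lengths differ (which would indicate the nodes disagree on the build side).
fn merge_matched_bitmaps(per_node: &[Vec<bool>]) -> Option<Vec<bool>> {
    let first = per_node.first()?;
    let mut merged = first.clone();
    for bitmap in &per_node[1..] {
        if bitmap.len() != merged.len() {
            return None;
        }
        for (m, b) in merged.iter_mut().zip(bitmap) {
            // A row counts as matched if any node matched it.
            *m |= *b;
        }
    }
    Some(merged)
}

/// Hypothetical helper: indices of build-side rows that no node matched,
/// i.e. the rows a left/full outer join would still need to emit.
fn unmatched_build_rows(merged: &[bool]) -> Vec<usize> {
    merged
        .iter()
        .enumerate()
        .filter(|(_, matched)| !**matched)
        .map(|(i, _)| i)
        .collect()
}
```

If the nodes read the build side in different orders, the OR would combine bits belonging to different rows and the set of unmatched rows would be wrong, which is why reading the build side from a shuffle with a deterministic order helps here.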


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


