thinkharderdev commented on issue #12454: URL: https://github.com/apache/datafusion/issues/12454#issuecomment-2350031852
> One challenge I predict with the above scenario is that it seems to assume that the order of rows from the build side will be the same on all nodes across all partitions (so you can match up the BooleanBuffer across nodes)

Yeah, this is definitely a challenge in the general case if the build-side subquery is inlined into the hash join. In our case the build-side subquery is a separate stage, so we can pretty easily ensure a consistent row ordering since it is read from a shuffle.
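To make the ordering requirement concrete, here is a minimal sketch of combining per-node "matched build row" bitmaps. It is an illustration under stated assumptions, not DataFusion's implementation: `Vec<bool>` stands in for arrow's `BooleanBuffer`, and the names `merge_matched` and `unmatched_rows` are hypothetical. The merge is only meaningful if bit `i` refers to the same build-side row on every node, which is exactly what a consistent read order from the shuffle provides.

```rust
/// Hypothetical helper: OR together the per-node bitmaps.
/// Each bitmap holds one flag per build-side row, in build-side row order.
/// Vec<bool> is a stand-in for arrow's BooleanBuffer (assumption for this sketch).
fn merge_matched(bitmaps: &[Vec<bool>]) -> Vec<bool> {
    let len = bitmaps.first().map_or(0, |b| b.len());
    let mut merged = vec![false; len];
    for bitmap in bitmaps {
        // Every node must have seen the same number of build-side rows.
        assert_eq!(bitmap.len(), len, "build-side row count differs between nodes");
        for (m, &bit) in merged.iter_mut().zip(bitmap) {
            // A build row counts as matched if any node matched it.
            *m |= bit;
        }
    }
    merged
}

/// Hypothetical helper: indices of build-side rows that no node matched.
fn unmatched_rows(merged: &[bool]) -> Vec<usize> {
    merged
        .iter()
        .enumerate()
        .filter_map(|(i, &m)| if m { None } else { Some(i) })
        .collect()
}

fn main() {
    // Two nodes probing different partitions against the same build side.
    let node_a = vec![true, false, false, true];
    let node_b = vec![false, false, true, true];
    let merged = merge_matched(&[node_a, node_b]);
    println!("unmatched build rows: {:?}", unmatched_rows(&merged));
}
```

If two nodes saw the build rows in different orders, the same OR would silently combine flags for different rows, which is why the ordering guarantee matters here.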
