Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-04-05 Thread via GitHub
berkaysynnada commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2772106637 > This was not caught by the tests 2 weeks ago, because most plans can have a partition executed multiple times - but repartition can not. As the signature of execute() i

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-04-04 Thread via GitHub
goldmedal commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2764514075 > This PR appears to have caused CI failures for some reason so @goldmedal has a PR to revert it: > > * [Revert #15476 to fix the datafusion-examples CI fail #15496](https:/

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-04-04 Thread via GitHub
ctsk commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2762368030 I've amended the PR so that `ExecutionPlan::execute` fails if one tries to execute such a problematic plan. -- This is an automated message from the Apache Git Service. To respond to
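
To illustrate the kind of guard this comment describes, here is a minimal sketch in plain Rust. It is a toy model, not DataFusion code: `Mode`, `check_build_side`, and the error text are hypothetical names chosen for illustration, and the condition (CollectLeft with a build side that has more than one output partition) is taken from the other comments in this thread.

```rust
/// Hypothetical stand-in for the join's partition mode (not the DataFusion type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    CollectLeft,
    Partitioned,
}

/// Toy precondition check: a CollectLeft join expects its build (left) side
/// to expose exactly one output partition. Returns an error instead of
/// silently coalescing the partitions.
fn check_build_side(mode: Mode, left_partitions: usize) -> Result<(), String> {
    if mode == Mode::CollectLeft && left_partitions != 1 {
        return Err(format!(
            "CollectLeft join needs 1 build-side partition, got {}",
            left_partitions
        ));
    }
    Ok(())
}
```

In this sketch a caller who hand-wires a plan would coalesce the left input to a single partition before building the join, or handle the error returned at execution time.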

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-04-04 Thread via GitHub
ctsk commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2769354999 Alright, so what went wrong here is that for CollectLeft joins, the left ExecutionPlan gets executed *for every input partition*. This was not caught by the tests 2 weeks ago, be
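
The failure mode described in this comment can be sketched with a toy model in plain Rust. Nothing here is DataFusion API: `OneShotSource` and `execute_join_partition` are hypothetical names, and the sketch only assumes what the comment states, namely that the left plan is executed once per join partition and that some operators cannot have a partition executed twice.

```rust
use std::collections::HashSet;

/// Toy operator whose partitions may each be executed only once
/// (an assumption modelling the "repartition can not" remark).
struct OneShotSource {
    partitions: usize,
    taken: HashSet<usize>,
}

impl OneShotSource {
    fn new(partitions: usize) -> Self {
        OneShotSource { partitions, taken: HashSet::new() }
    }

    /// Returns the rows of one partition, or an error on a repeated call.
    fn execute(&mut self, partition: usize) -> Result<Vec<i32>, String> {
        if partition >= self.partitions {
            return Err(format!("no such partition: {}", partition));
        }
        if !self.taken.insert(partition) {
            return Err(format!("partition {} was already executed", partition));
        }
        Ok(vec![partition as i32])
    }
}

/// Toy CollectLeft-style join: every join partition re-executes the whole
/// left side to build its hash table.
fn execute_join_partition(
    left: &mut OneShotSource,
    _join_partition: usize,
) -> Result<Vec<i32>, String> {
    let mut build_rows = Vec::new();
    for p in 0..left.partitions {
        build_rows.extend(left.execute(p)?);
    }
    Ok(build_rows)
}
```

With a left source that tolerates repeated execution, the second join partition would simply rebuild the same rows; with the one-shot source above, it fails, which is the behaviour the comment attributes to the CI breakage.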

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-04-02 Thread via GitHub
ctsk commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2772417654 > > Alright, so what went wrong here is that for CollectLeft joins, the left ExecutionPlan gets executed for every input partition. > The issue is now CollectLeft should always re

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-03-30 Thread via GitHub
ctsk commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2764514013 Sorry about that! Thanks for tracking it down @goldmedal.

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-03-30 Thread via GitHub
alamb commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2764497395 This PR appears to have caused CI failures for some reason so @goldmedal has a PR to revert it: - https://github.com/apache/datafusion/pull/15496

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-03-29 Thread via GitHub
Dandandan merged PR #15476: URL: https://github.com/apache/datafusion/pull/15476

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-03-29 Thread via GitHub
Dandandan commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2763855444 Thanks @ctsk

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-03-28 Thread via GitHub
ctsk commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2762302514 Before this PR, if someone hand-wired a CollectLeft HashJoin where the left child has more than one output partition, the HashJoin would automatically add a CoalescePartitions exec. Thi

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-03-28 Thread via GitHub
comphead commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2762263935 > Note that this does break for users of HashJoinExec that > > * Use the CollectLeft mode, with >1 partition on the build side AND > * Construct their physical plan without

Re: [PR] Remove CoalescePartitions insertion from HashJoinExec [datafusion]

2025-03-28 Thread via GitHub
ctsk commented on PR #15476: URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2761891586 Note that this does break for users of HashJoinExec that - Use the CollectLeft mode, with >1 partition on the build side AND - Construct their physical plan without running EnforceD