Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-16 Thread via GitHub
alamb merged PR #16083: URL: https://github.com/apache/datafusion/pull/16083 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-16 Thread via GitHub
alamb commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2978141974 🚀 -- I am feeling physically nervous that there are so many PRs open so starting the merge train!

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-14 Thread via GitHub
jonathanc-n commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2147448608 ## datafusion/physical-plan/src/joins/symmetric_hash_join.rs: ## @@ -818,6 +822,20 @@ where .collect(); (build_indices, probe_ind

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-14 Thread via GitHub
comphead commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2973494086 Thanks for this contribution. I'm planning to keep this PR open a little longer to see if there is any other feedback.

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-14 Thread via GitHub
comphead commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2147425621 ## datafusion/physical-plan/src/joins/symmetric_hash_join.rs: ## @@ -818,6 +822,20 @@ where .collect(); (build_indices, probe_indice

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-12 Thread via GitHub
jonathanc-n commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2142033043 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -1102,6 +1129,30 @@ where .collect() } +pub(crate) fn get_mark_indices( +range: &Range, +

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-12 Thread via GitHub
comphead commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2141896340 ## datafusion/core/tests/fuzz_cases/join_fuzz.rs: ## @@ -305,6 +305,31 @@ async fn test_left_mark_join_1k_filtered() { .await } +// todo: add JoinTestType

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-12 Thread via GitHub
jonathanc-n commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2142055608 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -1102,6 +1129,30 @@ where .collect() } +pub(crate) fn get_mark_indices( +range: &Range, +

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-12 Thread via GitHub
jonathanc-n commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2141998407 ## datafusion/core/tests/fuzz_cases/join_fuzz.rs: ## @@ -305,6 +305,31 @@ async fn test_left_mark_join_1k_filtered() { .await } +// todo: add JoinTestT

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-12 Thread via GitHub
comphead commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2141908140 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -1102,6 +1129,30 @@ where .collect() } +pub(crate) fn get_mark_indices( +range: &Range, +

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-12 Thread via GitHub
comphead commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2141905289 ## datafusion/core/tests/fuzz_cases/join_fuzz.rs: ## @@ -305,6 +305,31 @@ async fn test_left_mark_join_1k_filtered() { .await } +// todo: add JoinTestType

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-11 Thread via GitHub
jonathanc-n commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2141219041 ## datafusion/optimizer/src/optimize_projections/mod.rs: ## @@ -704,7 +704,8 @@ fn split_join_requirements( | JoinType::Left | JoinType::Righ

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-11 Thread via GitHub
ctsk commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2139539152 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -880,7 +901,7 @@ pub(crate) fn build_batch_from_indices( for column_index in column_indices { le

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-11 Thread via GitHub
ctsk commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2139536143 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -1102,6 +1129,30 @@ where .collect() } +pub(crate) fn get_mark_indices( +range: &Range, +inp

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-11 Thread via GitHub
ctsk commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2139507808 ## datafusion/optimizer/src/optimize_projections/mod.rs: ## @@ -704,7 +704,8 @@ fn split_join_requirements( | JoinType::Left | JoinType::Right

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-02 Thread via GitHub
jonathanc-n commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2931660916 This should be ready for another review. I've added fuzz tests and addressed the suggestions. cc @Dandandan @ctsk @comphead

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-01 Thread via GitHub
jonathanc-n commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2928463577 > Regarding the lack of support for RightMark joins in some join operators, I believe it would be best to return an error in the constructor of those operators if they do not sup

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-01 Thread via GitHub
jonathanc-n commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2119831572 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -1126,6 +1153,28 @@ where .collect() } +pub(crate) fn get_mark_indices( +range: &Range, +

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-01 Thread via GitHub
jonathanc-n commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2119682441 ## datafusion/sql/src/unparser/plan.rs: ## @@ -738,21 +739,38 @@ impl Unparser<'_> { let negated = match join.join_type {

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-01 Thread via GitHub
ctsk commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2119313686 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -1126,6 +1153,28 @@ where .collect() } +pub(crate) fn get_mark_indices( +range: &Range, +inp

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-06-01 Thread via GitHub
ctsk commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2927498848 Alrighty! Regarding the lack of support for RightMark joins in some join operators, I believe it would be best to return an error in the constructor of those operators if they do
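
The suggestion above (fail in the constructor when an operator does not support a join type) could look roughly like the following. This is a minimal sketch, not DataFusion code: `JoinKind`, `SketchMergeJoin`, and `try_new` are hypothetical names invented for illustration, and the error is a plain `String` rather than any real error type.

```rust
/// Hypothetical, reduced set of join types (not DataFusion's `JoinType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JoinKind {
    Inner,
    LeftMark,
    RightMark,
}

/// Hypothetical operator that has no RightMark implementation yet.
#[derive(Debug)]
struct SketchMergeJoin {
    join_type: JoinKind,
}

impl SketchMergeJoin {
    /// Reject unsupported join types at construction time, so the problem
    /// surfaces during planning instead of during execution.
    fn try_new(join_type: JoinKind) -> Result<Self, String> {
        if join_type == JoinKind::RightMark {
            return Err(format!(
                "{join_type:?} is not supported by SketchMergeJoin"
            ));
        }
        Ok(Self { join_type })
    }

    /// Accessor for the join type the operator was built with.
    fn join_type(&self) -> JoinKind {
        self.join_type
    }
}
```

With this shape, `SketchMergeJoin::try_new(JoinKind::RightMark)` returns an `Err`, while the other variants construct normally.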

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-31 Thread via GitHub
Dandandan commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2118781458 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -975,6 +996,12 @@ pub(crate) fn adjust_indices_by_join_type( // the left_indices will not be u

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-30 Thread via GitHub
comphead commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2116905907 ## datafusion/physical-plan/src/joins/sort_merge_join.rs: ## @@ -221,9 +221,11 @@ impl SortMergeJoinExec { // When output schema contains only the right

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-30 Thread via GitHub
Dandandan commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2116661763 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -1126,6 +1153,28 @@ where .collect() } +pub(crate) fn get_mark_indices( +range: &Range, +

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-30 Thread via GitHub
Dandandan commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2116652162 ## datafusion/sql/src/unparser/plan.rs: ## @@ -738,21 +739,38 @@ impl Unparser<'_> { let negated = match join.join_type {

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-30 Thread via GitHub
ctsk commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2923436003 After updating `JoinType::supports_swap` to include LeftMark/RightMark join, the join_selection rule *should* already plan right joins where appropriate. Subsequently running the sqllog
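
To make the swap idea in the comment above concrete, here is a small self-contained sketch. It is an illustration under stated assumptions, not the actual `join_selection` rule: `SketchJoinType`, `plan_sides`, and the "swap when the left input has more rows" heuristic are all hypothetical stand-ins.

```rust
/// Hypothetical, reduced set of join types (not DataFusion's `JoinType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchJoinType {
    Inner,
    Left,
    Right,
    LeftMark,
    RightMark,
}

impl SketchJoinType {
    /// The join type to use after the two inputs trade places.
    fn swap(self) -> Self {
        match self {
            SketchJoinType::Inner => SketchJoinType::Inner,
            SketchJoinType::Left => SketchJoinType::Right,
            SketchJoinType::Right => SketchJoinType::Left,
            SketchJoinType::LeftMark => SketchJoinType::RightMark,
            SketchJoinType::RightMark => SketchJoinType::LeftMark,
        }
    }

    /// In this sketch every variant is swappable, mark joins included.
    fn supports_swap(self) -> bool {
        matches!(
            self,
            SketchJoinType::Inner
                | SketchJoinType::Left
                | SketchJoinType::Right
                | SketchJoinType::LeftMark
                | SketchJoinType::RightMark
        )
    }
}

/// Assumed heuristic: keep the smaller input on the left (build) side.
/// Returns the join type to plan and whether the inputs were swapped.
fn plan_sides(
    join_type: SketchJoinType,
    left_rows: usize,
    right_rows: usize,
) -> (SketchJoinType, bool) {
    if join_type.supports_swap() && left_rows > right_rows {
        (join_type.swap(), true)
    } else {
        (join_type, false)
    }
}
```

Under these assumptions, `plan_sides(SketchJoinType::LeftMark, 1000, 10)` swaps the inputs and plans a `RightMark` join, while a small left input leaves the plan unchanged.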

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-30 Thread via GitHub
ctsk commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2923375501 Really cool work! I'll look over it in more detail over the weekend =)

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-29 Thread via GitHub
jonathanc-n commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2920977812 @alamb Are you able to add this pull request to [here](https://github.com/apache/datafusion/issues/15885) to get some eyes on it? Thanks!

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-26 Thread via GitHub
jonathanc-n commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2910279622 @2010YOUY01 @Dandandan Is it possible to take a look? Thanks!

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-20 Thread via GitHub
jonathanc-n commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2895840655 @2010YOUY01 I think I'll try to get sql queries optimized into a right mark join after support for symmetric hash join + sort merge join. Right mark is equivalent to left
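
For readers unfamiliar with mark joins, the following is a minimal sketch of the general semantics: each row from one side is emitted exactly once, together with a boolean "mark" saying whether any row on the other side matches. It is not the PR's implementation. The function names are hypothetical, rows are reduced to bare `i32` keys, and NULL handling is ignored.

```rust
use std::collections::HashSet;

/// Left mark join sketch: every left key is emitted once, paired with a mark
/// that is true when at least one right key is equal to it.
fn left_mark_join(left: &[i32], right: &[i32]) -> Vec<(i32, bool)> {
    let right_keys: HashSet<i32> = right.iter().copied().collect();
    left.iter().map(|k| (*k, right_keys.contains(k))).collect()
}

/// Right mark join sketch: the mirror image. Every right key is emitted once,
/// paired with a mark that is true when at least one left key is equal to it.
fn right_mark_join(left: &[i32], right: &[i32]) -> Vec<(i32, bool)> {
    let left_keys: HashSet<i32> = left.iter().copied().collect();
    right.iter().map(|k| (*k, left_keys.contains(k))).collect()
}
```

In this sketch, `right_mark_join(l, r)` produces the same rows as `left_mark_join(r, l)`, which is what allows a planner to choose whichever input order is cheaper.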

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-19 Thread via GitHub
2010YOUY01 commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2890335049 This is cool! I got some questions: 1. Can we test this feature through the SQL interface (some SQL with subqueries got optimized into RightMarkJoin)? Or maybe this feature is n

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-17 Thread via GitHub
jonathanc-n commented on code in PR #16083: URL: https://github.com/apache/datafusion/pull/16083#discussion_r2094393265 ## datafusion/physical-plan/src/joins/nested_loop_join.rs: ## @@ -1009,15 +1010,27 @@ fn join_left_and_right_batch( right_side_ordered, )?; -

Re: [PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-17 Thread via GitHub
jonathanc-n commented on PR #16083: URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2888794567 cc @comphead

[PR] feat: Support RightMark join for NestedLoop and Hash join [datafusion]

2025-05-17 Thread via GitHub
jonathanc-n opened a new pull request, #16083: URL: https://github.com/apache/datafusion/pull/16083 ## Which issue does this PR close? - Closes #13138. ## Rationale for this change Revamp the previous stale implementation of RightMark ##