Re: [PR] POC: Reduce `Arc` cloning on hashmap build side [datafusion]

2025-06-20 Thread via GitHub
jonathanc-n closed pull request #16380: POC: Reduce `Arc` cloning on hashmap build side URL: https://github.com/apache/datafusion/pull/16380

Re: [PR] POC: Reduce `Arc` cloning on hashmap build side [datafusion]

2025-06-12 Thread via GitHub
Dandandan commented on PR #16380: URL: https://github.com/apache/datafusion/pull/16380#issuecomment-2969139729 > I've noticed that it is possible for `interleave` to perform worse than `take` despite the `Arc` clones from `take`. This happens twice as well for `equal_row_arr` and `build_bat

Re: [PR] POC: Reduce `Arc` cloning on hashmap build side [datafusion]

2025-06-12 Thread via GitHub
Dandandan commented on code in PR #16380: URL: https://github.com/apache/datafusion/pull/16380#discussion_r2142058321 ## datafusion/physical-plan/src/joins/hash_join.rs: ## @@ -95,9 +96,11 @@ struct JoinLeftData { /// The hash table with indices into `batch` hash_map:

Re: [PR] POC: Reduce `Arc` cloning on hashmap build side [datafusion]

2025-06-12 Thread via GitHub
Dandandan commented on code in PR #16380: URL: https://github.com/apache/datafusion/pull/16380#discussion_r2142059305 ## datafusion/physical-plan/src/joins/hash_join.rs: ## @@ -991,52 +998,72 @@ async fn collect_left_input( let mut hashmap = JoinHashMap::with_capacity(num

Re: [PR] POC: Reduce `Arc` cloning on hashmap build side [datafusion]

2025-06-11 Thread via GitHub
jonathanc-n commented on code in PR #16380: URL: https://github.com/apache/datafusion/pull/16380#discussion_r2141169183 ## datafusion/physical-plan/src/joins/hash_join.rs: ## @@ -1372,15 +1407,16 @@ pub fn equal_rows_arr( // The results are then folded (combined) using the

Re: [PR] POC: Reduce `Arc` cloning on hashmap build side [datafusion]

2025-06-11 Thread via GitHub
jonathanc-n commented on PR #16380: URL: https://github.com/apache/datafusion/pull/16380#issuecomment-2964423579 I've noticed that it is possible for `interleave` to perform worse than `take` despite the `Arc` clones from `take`. This happens twice as well for `equal_row_arr` and `build_bat
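
For readers unfamiliar with the two kernels being compared, here is a minimal sketch of their access patterns in plain Rust, using `Vec<i64>` in place of Arrow arrays. The helper names `take_rows` and `interleave_rows` are hypothetical and are not the arrow-rs or DataFusion implementations. The sketch assumes that the `take` path gathers from one concatenated build-side column by row index, and that the `interleave` path gathers from the original batches by (batch index, row index) pairs. It shows only that both produce the same rows and says nothing about relative speed.

```rust
/// Gather rows from a single column by row index.
/// Hypothetical stand-in for the shape of a `take`-style kernel.
fn take_rows(values: &[i64], indices: &[usize]) -> Vec<i64> {
    indices.iter().map(|&i| values[i]).collect()
}

/// Gather rows from several batches by (batch index, row index) pairs.
/// Hypothetical stand-in for the shape of an `interleave`-style kernel.
fn interleave_rows(batches: &[&[i64]], indices: &[(usize, usize)]) -> Vec<i64> {
    indices.iter().map(|&(b, r)| batches[b][r]).collect()
}

fn main() {
    // Two build-side batches of a single column (illustrative data only).
    let batch0: Vec<i64> = vec![10, 11, 12];
    let batch1: Vec<i64> = vec![20, 21];

    // `take`-style path: concatenate the batches first, then index by global row.
    let concatenated: Vec<i64> = batch0.iter().chain(batch1.iter()).copied().collect();
    let taken = take_rows(&concatenated, &[4, 0, 2]);

    // `interleave`-style path: leave the batches separate and address each row
    // by (batch, row). Global row 4 is row 1 of batch 1, and so on.
    let interleaved = interleave_rows(
        &[batch0.as_slice(), batch1.as_slice()],
        &[(1, 1), (0, 0), (0, 2)],
    );

    // Both paths select the same rows in the same order.
    assert_eq!(taken, interleaved);
    println!("{:?}", taken);
}
```

The difference that matters for the comment above is where the work goes: the first path pays for a concatenation up front and then does a single-level lookup per row, while the second skips the concatenation but does a two-level lookup per row.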

Re: [PR] POC: Reduce `Arc` cloning on hashmap build side [datafusion]

2025-06-11 Thread via GitHub
jonathanc-n commented on code in PR #16380: URL: https://github.com/apache/datafusion/pull/16380#discussion_r2141135830 ## datafusion/physical-plan/src/joins/hash_join.rs: ## @@ -95,9 +96,11 @@ struct JoinLeftData { /// The hash table with indices into `batch` hash_map

[PR] POC: Reduce `Arc` cloning on hashmap build side [datafusion]

2025-06-11 Thread via GitHub
jonathanc-n opened a new pull request, #16380: URL: https://github.com/apache/datafusion/pull/16380 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tes