Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-05-10 Thread via GitHub
alamb commented on PR #15409: URL: https://github.com/apache/datafusion/pull/15409#issuecomment-2868765074 Thank you @wiedld and @2010YOUY01 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-05-10 Thread via GitHub
alamb merged PR #15409: URL: https://github.com/apache/datafusion/pull/15409

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-05-07 Thread via GitHub
wiedld commented on PR #15409: URL: https://github.com/apache/datafusion/pull/15409#issuecomment-2858928895 @2010YOUY01 -- comments addressed, and a conditional was removed for this reason: https://github.com/apache/datafusion/pull/15409#discussion_r2076020887

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-05-06 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2076020887 ## datafusion/datasource/src/memory.rs: ## @@ -723,6 +761,222 @@ impl MemorySourceConfig { pub fn original_schema(&self) -> SchemaRef { Arc::clone(&se

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-04-28 Thread via GitHub
2010YOUY01 commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2065594544 ## datafusion/datasource/src/memory.rs: ## @@ -723,6 +761,222 @@ impl MemorySourceConfig { pub fn original_schema(&self) -> SchemaRef { Arc::clone

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-04-28 Thread via GitHub
2010YOUY01 commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2065566166 ## datafusion/core/tests/physical_optimizer/enforce_distribution.rs: ## @@ -3471,3 +3477,102 @@ fn optimize_away_unnecessary_repartition2() -> Result<()> {

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-04-21 Thread via GitHub
wiedld commented on PR #15409: URL: https://github.com/apache/datafusion/pull/15409#issuecomment-2820043803 Took me a while to circle back to this PR. I believe it's now ready for re-review @2010YOUY01

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-04-21 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2053132567 ## datafusion/datasource/src/memory.rs: ## @@ -902,4 +1130,319 @@ mod tests { Ok(()) } + +fn batch(row_size: usize) -> RecordBatch { +le

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-04-21 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2053131373 ## datafusion/core/tests/fuzz_cases/aggregate_fuzz.rs: ## @@ -520,7 +520,9 @@ async fn group_by_string_test( let expected = compute_counts(&input, column_name)

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-04-21 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2053131713 ## datafusion/datasource/src/memory.rs: ## @@ -440,6 +443,35 @@ impl DataSource for MemorySourceConfig { } } +fn repartitioned( Review Comment:

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-04-03 Thread via GitHub
wiedld commented on PR #15409: URL: https://github.com/apache/datafusion/pull/15409#issuecomment-2777243569 Thanks for the review. Haven't had time to do the updates. Converting to draft, and will mark ready again after updates.

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-28 Thread via GitHub
2010YOUY01 commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2018210999 ## datafusion/datasource/src/memory.rs: ## @@ -902,4 +1130,319 @@ mod tests { Ok(()) } + +fn batch(row_size: usize) -> RecordBatch { +

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-28 Thread via GitHub
2010YOUY01 commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2018202146 ## datafusion/datasource/src/memory.rs: ## @@ -440,6 +443,35 @@ impl DataSource for MemorySourceConfig { } } +fn repartitioned( Review Comm

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-27 Thread via GitHub
alamb commented on PR #15409: URL: https://github.com/apache/datafusion/pull/15409#issuecomment-2759400185 @2010YOUY01 and @alan910127 -- do you have time to review this PR, as you filed / expressed interest in https://github.com/apache/datafusion/issues/15088#top

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-25 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2011766884 ## datafusion/core/tests/physical_optimizer/enforce_distribution.rs: ## @@ -3461,3 +3467,102 @@ fn optimize_away_unnecessary_repartition2() -> Result<()> {

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-25 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2011720446 ## datafusion/core/tests/fuzz_cases/aggregate_fuzz.rs: ## @@ -520,7 +520,9 @@ async fn group_by_string_test( let expected = compute_counts(&input, column_name)

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-25 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2011688314 ## datafusion/core/tests/fuzz_cases/aggregate_fuzz.rs: ## @@ -520,7 +520,9 @@ async fn group_by_string_test( let expected = compute_counts(&input, column_name)

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-25 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2011691796 ## datafusion/datasource/src/memory.rs: ## @@ -718,6 +750,181 @@ impl MemorySourceConfig { pub fn original_schema(&self) -> SchemaRef { Arc::clone(&se

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-25 Thread via GitHub
wiedld commented on PR #15409: URL: https://github.com/apache/datafusion/pull/15409#issuecomment-2750626791 TODO: I'll run benchmarks on this tmrw.

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-25 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2011678484 ## datafusion/core/tests/fuzz_cases/aggregate_fuzz.rs: ## @@ -520,7 +520,9 @@ async fn group_by_string_test( let expected = compute_counts(&input, column_name)

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-25 Thread via GitHub
wiedld commented on PR #15409: URL: https://github.com/apache/datafusion/pull/15409#issuecomment-2750536557 TODO: I'll run benchmarks on this tmrw.

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-25 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2011599235 ## datafusion/core/tests/memory_limit/mod.rs: ## @@ -455,7 +455,9 @@ async fn test_stringview_external_sort() { .with_memory_pool(Arc::new(FairSpillPool::n

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-25 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2011496396 ## datafusion/datasource/src/memory.rs: ## @@ -902,4 +1108,319 @@ mod tests { Ok(()) } + +fn batch(row_size: usize) -> RecordBatch { +le

Re: [PR] Enable repartitioning on MemTable. [datafusion]

2025-03-25 Thread via GitHub
wiedld commented on code in PR #15409: URL: https://github.com/apache/datafusion/pull/15409#discussion_r2011495377 ## datafusion/datasource/src/memory.rs: ## @@ -902,4 +1108,319 @@ mod tests { Ok(()) } + +fn batch(row_size: usize) -> RecordBatch { +le
