Re: [PR] Improve error message when ScalarValue fails to cast array [datafusion]

2025-07-03 Thread via GitHub
findepi commented on PR #16670: URL: https://github.com/apache/datafusion/pull/16670#issuecomment-3033380496 test added -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

Re: [PR] chore: Introduce ANSI support for remainder operation [datafusion-comet]

2025-07-03 Thread via GitHub
rishvin commented on PR #1971: URL: https://github.com/apache/datafusion-comet/pull/1971#issuecomment-3033380721 @andygrove : Could you please review and check if this is how you envisioned these changes?

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3032383269 > @alamb could you maybe run it on `sort_tpch10`. Perhaps the difference is different when using more cores (16 vs 10 I believe?). I have queued this up and it should run in a fe

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3032511251 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubun

Re: [PR] Support multiple ordered array_agg aggregations [datafusion]

2025-07-03 Thread via GitHub
findepi commented on PR #16625: URL: https://github.com/apache/datafusion/pull/16625#issuecomment-3032514381 We might be able to get away with just one and I agree this would be eventually preferred -- just like it was implemented earlier in this PR. As you know this didn't work for the pro

[I] [iceberg] Error loading in-memory sorter check class path [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove opened a new issue, #1982: URL: https://github.com/apache/datafusion-comet/issues/1982 ### Describe the bug I am fuzz testing Comet 0.10.0-SNAPSHOT and Iceberg 1.8.1 and running into an error for queries with `ORDER BY` clauses: ``` SELECT c5, Sqrt(c5) AS x FROM lo

Re: [PR] Fix TopK Sort incorrectly pushed down past Join with anti join [datafusion]

2025-07-03 Thread via GitHub
adriangb commented on code in PR #16641: URL: https://github.com/apache/datafusion/pull/16641#discussion_r2183060691 ## datafusion/physical-optimizer/src/enforce_sorting/sort_pushdown.rs: ## @@ -216,7 +218,42 @@ fn pushdown_sorts_helper( fn pushdown_requirement_to_children(

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3032659548 🤖: Benchmark completed Details ``` Comparing HEAD and reuse_rows Benchmark clickbench_extended.json ┏━━

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3032659676 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubun

Re: [PR] Fix TopK Sort incorrectly pushed down past Join with anti join [datafusion]

2025-07-03 Thread via GitHub
zhuqi-lucas commented on PR #16641: URL: https://github.com/apache/datafusion/pull/16641#issuecomment-3032766023 > left a minor nit to reduce indentation but otherwise approved, great job! Thank you @adriangb for patient review, addressed it in latest PR.

Re: [PR] Convert Option> to Vec [datafusion]

2025-07-03 Thread via GitHub
findepi merged PR #16615: URL: https://github.com/apache/datafusion/pull/16615

Re: [I] Convert `Option>` to `Vec` [datafusion]

2025-07-03 Thread via GitHub
findepi closed issue #12195: Convert `Option>` to `Vec` URL: https://github.com/apache/datafusion/issues/12195

Re: [PR] chore: Introduce ANSI support for remainder operation [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove commented on PR #1971: URL: https://github.com/apache/datafusion-comet/pull/1971#issuecomment-3033423639 > @andygrove : Could you please review and check if this is how you envisioned these changes? Thanks @rishvin. I took a first pass through, and this looks great. Thanks

Re: [PR] fix: sqllogictest runner label condition mismatch [datafusion]

2025-07-03 Thread via GitHub
lliangyu-lin commented on code in PR #16633: URL: https://github.com/apache/datafusion/pull/16633#discussion_r2183648969 ## datafusion/sqllogictest/bin/sqllogictests.rs: ## @@ -243,7 +243,7 @@ async fn run_test_file_substrait_round_trip( }; setup_scratch_dir(&relative_

Re: [PR] Implementation for regex_instr [datafusion]

2025-07-03 Thread via GitHub
blaginin merged PR #15928: URL: https://github.com/apache/datafusion/pull/15928

Re: [I] Add regexp function - regexp_instr() [datafusion]

2025-07-03 Thread via GitHub
blaginin closed issue #13009: Add regexp function - regexp_instr() URL: https://github.com/apache/datafusion/issues/13009

Re: [PR] Chore: Implement BloomFilterMightContain as a ScalarUDFImpl [datafusion-comet]

2025-07-03 Thread via GitHub
tglanz commented on code in PR #1954: URL: https://github.com/apache/datafusion-comet/pull/1954#discussion_r2184397234 ## native/spark-expr/src/bloom_filter/spark_bit_array.rs: ## @@ -102,8 +101,10 @@ impl SparkBitArray { } } +/// Returns the number of 64-bit words neede

Re: [PR] Cascaded spill merge and re-spill [datafusion]

2025-07-03 Thread via GitHub
2010YOUY01 commented on PR #15610: URL: https://github.com/apache/datafusion/pull/15610#issuecomment-3034531555 > This PR has a significant strength in that it works reliably even under a fairly conservative memory limit, which is impressive. I also learned a lot while reviewing it :). Howe

Re: [I] 1000x slowdown opening parquet file due to partitions [datafusion]

2025-07-03 Thread via GitHub
jatin510 commented on issue #16676: URL: https://github.com/apache/datafusion/issues/16676#issuecomment-3034536746 take

Re: [PR] Support multiple ordered array_agg aggregations [datafusion]

2025-07-03 Thread via GitHub
ozankabak commented on PR #16625: URL: https://github.com/apache/datafusion/pull/16625#issuecomment-3032265152 OK, I see. I will think about the interplay between the two and if we can somehow get away with just one.

Re: [PR] Refactor error handling to use boxed errors for DataFusionError variants [datafusion]

2025-07-03 Thread via GitHub
comphead commented on PR #16672: URL: https://github.com/apache/datafusion/pull/16672#issuecomment-3032828771 cc @crepererum There is another PR pending for `SchemaError` https://github.com/apache/datafusion/pull/16653

Re: [I] Fix shading issues with Iceberg integration [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove commented on issue #1934: URL: https://github.com/apache/datafusion-comet/issues/1934#issuecomment-3032848449 > [@andygrove](https://github.com/andygrove) [@parthchandra](https://github.com/parthchandra) Do we want to keep the current Arrow shading in Iceberg as is and put both t

Re: [I] Fix shading issues with Iceberg integration [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove closed issue #1934: Fix shading issues with Iceberg integration URL: https://github.com/apache/datafusion-comet/issues/1934

Re: [PR] Postgres: support `ADD CONSTRAINT NOT VALID` and `VALIDATE CONSTRAINT` [datafusion-sqlparser-rs]

2025-07-03 Thread via GitHub
iffyio merged PR #1908: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1908

Re: [I] Postgres NOT VALID and VALIDATE CONSTRAINT not parsed for ALTER TABLE [datafusion-sqlparser-rs]

2025-07-03 Thread via GitHub
iffyio closed issue #1907: Postgres NOT VALID and VALIDATE CONSTRAINT not parsed for ALTER TABLE URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1907

Re: [PR] Redshift alter column type no set [datafusion-sqlparser-rs]

2025-07-03 Thread via GitHub
iffyio merged PR #1912: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1912

Re: [PR] docs: Minor improvements to Spark SQL test docs [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove merged PR #1980: URL: https://github.com/apache/datafusion-comet/pull/1980

Re: [I] Fix shading issues with Iceberg integration [datafusion-comet]

2025-07-03 Thread via GitHub
huaxingao commented on issue #1934: URL: https://github.com/apache/datafusion-comet/issues/1934#issuecomment-3032844548 @andygrove @parthchandra Do we want to keep the current Arrow shading in Iceberg as is and put both the Comet and Iceberg JARs on the classpath? If that's the plan, I

Re: [PR] Add support for MySQL MEMBER OF [datafusion-sqlparser-rs]

2025-07-03 Thread via GitHub
iffyio merged PR #1917: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1917

Re: [PR] Add span for `Expr::TypedString` [datafusion-sqlparser-rs]

2025-07-03 Thread via GitHub
iffyio merged PR #1919: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1919

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-03 Thread via GitHub
zhuqi-lucas commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3032768991 > > 🤖: Benchmark completed > > Details > > ``` > > Comparing HEAD and reuse_rows > > > > Benchmark sort_tpch10.json > >

Re: [PR] chore(deps): bump tokio from 1.45.1 to 1.46.0 [datafusion]

2025-07-03 Thread via GitHub
comphead merged PR #1: URL: https://github.com/apache/datafusion/pull/1

Re: [PR] Fix TopK Sort incorrectly pushed down past Join with anti join [datafusion]

2025-07-03 Thread via GitHub
adriangb commented on PR #16641: URL: https://github.com/apache/datafusion/pull/16641#issuecomment-3032778175 > > left a minor nit to reduce indentation but otherwise approved, great job! > > Thank you @adriangb for patient review, addressed it in latest PR. Btw, I think you mea

Re: [PR] Support for Postgres's CREATE SERVER [datafusion-sqlparser-rs]

2025-07-03 Thread via GitHub
solontsev commented on PR #1914: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1914#issuecomment-3032911536 Thanks @iffyio! Done

[PR] fix: [Iceberg] Fix decimal corruption [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove opened a new pull request, #1985: URL: https://github.com/apache/datafusion-comet/pull/1985 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [PR] Convert Option> to Vec [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16615: URL: https://github.com/apache/datafusion/pull/16615#issuecomment-3033391089 > Do i understand correctly the next major is the next release going off main branch? yes > I.e. we can merge as is now? Let's do it!

Re: [PR] Support multiple ordered array_agg aggregations [datafusion]

2025-07-03 Thread via GitHub
ozankabak commented on PR #16625: URL: https://github.com/apache/datafusion/pull/16625#issuecomment-3032589803 I think we will find a solution that avoids this redundancy. Expect some feedback from me (or someone on my team) in a few days

Re: [PR] Comet 0.9.0 [datafusion-site]

2025-07-03 Thread via GitHub
kazuyukitanimura commented on code in PR #78: URL: https://github.com/apache/datafusion-site/pull/78#discussion_r2183211451 ## content/blog/2025-07-01-datafusion-comet-0.9.0.md: ## @@ -0,0 +1,176 @@ +--- +layout: post +title: Apache DataFusion Comet 0.9.0 Release +date: 2025-07-

Re: [PR] Fix TopK Sort incorrectly pushed down past Join with anti join [datafusion]

2025-07-03 Thread via GitHub
adriangb merged PR #16641: URL: https://github.com/apache/datafusion/pull/16641

[I] Postgres: Support negative scale for `NUMERIC` [datafusion-sqlparser-rs]

2025-07-03 Thread via GitHub
mvzink opened a new issue, #1923: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1923 When specifying explicit precision and scale for numerics, Postgres allows the ranges `0..=1000` for precision and `-1000..=1000` for scale. Currently, `ExactNumberInfo` uses `u64` to repres

Re: [PR] fix: [Iceberg] Fix decimal corruption [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove merged PR #1985: URL: https://github.com/apache/datafusion-comet/pull/1985

Re: [I] [iceberg] Incorrect results when reading decimal fields [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove closed issue #1983: [iceberg] Incorrect results when reading decimal fields URL: https://github.com/apache/datafusion-comet/issues/1983

Re: [I] [iceberg] Incorrect results when reading decimal fields [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove commented on issue #1983: URL: https://github.com/apache/datafusion-comet/issues/1983#issuecomment-3033259469 Fixed in https://github.com/apache/datafusion-comet/pull/1985

Re: [PR] benchmark: Support sort_tpch10 for benchmark [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16671: URL: https://github.com/apache/datafusion/pull/16671#issuecomment-3032178617 🚀

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-03 Thread via GitHub
adriangb commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2182722840 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -67,8 +69,62 @@ pub enum SchemaSource { /// Configuration for creating a [`ListingTable`] /// +/// #

Re: [PR] benchmark: Support sort_tpch10 for benchmark [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16671: URL: https://github.com/apache/datafusion/pull/16671#issuecomment-3032179243 Thanks @zhuqi-lucas and @Dandandan

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-03 Thread via GitHub
adriangb commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2182725943 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -302,11 +405,58 @@ impl ListingTableConfig { file_schema: self.file_schema,

Re: [PR] refactor filter pushdown APIs [datafusion]

2025-07-03 Thread via GitHub
adriangb commented on code in PR #16642: URL: https://github.com/apache/datafusion/pull/16642#discussion_r2182744547 ## datafusion/physical-plan/src/execution_plan.rs: ## @@ -520,10 +520,19 @@ pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync { parent_filters: Ve

[PR] Refactor error handling to use boxed errors for DataFusionError variants [datafusion]

2025-07-03 Thread via GitHub
kosiew opened a new pull request, #16672: URL: https://github.com/apache/datafusion/pull/16672 ## Which issue does this PR close? - Part of a series of PR to address #16652. ## Rationale for this change This change standardizes the internal representation of several `Dat

Re: [PR] docs: Documentation updates for 0.9.0 release [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove commented on code in PR #1981: URL: https://github.com/apache/datafusion-comet/pull/1981#discussion_r2182758032 ## docs/source/user-guide/installation.md: ## @@ -61,12 +61,13 @@ Cloud Service Providers. Comet jar files are available in [Maven Central](https://central

Re: [PR] docs: Documentation updates for 0.9.0 release [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove commented on code in PR #1981: URL: https://github.com/apache/datafusion-comet/pull/1981#discussion_r2182760995 ## docs/source/user-guide/tuning.md: ## @@ -21,18 +21,6 @@ under the License. Comet provides some tuning options to help you get the best performance from

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3032716133 🤖: Benchmark completed Details ``` Comparing HEAD and reuse_rows Benchmark sort_tpch1.json ┏━━┳

Re: [PR] Fix duplicate field name error in Join::try_new_with_project_input during physical planning [datafusion]

2025-07-03 Thread via GitHub
LiaCastaneda commented on code in PR #16454: URL: https://github.com/apache/datafusion/pull/16454#discussion_r2183300447 ## datafusion/core/src/physical_planner.rs: ## @@ -1502,6 +1521,64 @@ fn get_null_physical_expr_pair( Ok((Arc::new(null_value), physical_name)) } +///

Re: [I] Support `from_unixtime(ts, [fmt])` [datafusion]

2025-07-03 Thread via GitHub
comphead commented on issue #16577: URL: https://github.com/apache/datafusion/issues/16577#issuecomment-3032986631 > > I'd still keep it in a format of `from_unixtime(ts, [fmt])` > > That would mean either smart parsing the second arg (fmt? tz?) or breaking existing usages. I'm fine w

Re: [PR] fix: [Iceberg] Fix decimal corruption [datafusion-comet]

2025-07-03 Thread via GitHub
codecov-commenter commented on PR #1985: URL: https://github.com/apache/datafusion-comet/pull/1985#issuecomment-3033005396 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1985?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Add reproducer for tpch Q16 deserialization bug [datafusion]

2025-07-03 Thread via GitHub
NGA-TRAN commented on code in PR #16662: URL: https://github.com/apache/datafusion/pull/16662#discussion_r2183328514 ## datafusion/proto/tests/cases/roundtrip_physical_plan.rs: ## @@ -1736,3 +1737,57 @@ async fn roundtrip_physical_plan_node() { let _ = plan.execute(0, ctx

[PR] fix: [iceberg] Enable CometShuffleManager in Iceberg Spark tests [datafusion-comet]

2025-07-03 Thread via GitHub
hsiang-c opened a new pull request, #1987: URL: https://github.com/apache/datafusion-comet/pull/1987 ## Which issue does this PR close? Closes #. https://github.com/apache/datafusion-comet/issues/1685 ## Rationale for this change When we enabled Iceberg Sp

[PR] Add a note about Boxing errors in upgrade guide [datafusion]

2025-07-03 Thread via GitHub
alamb opened a new pull request, #16673: URL: https://github.com/apache/datafusion/pull/16673 ## Which issue does this PR close? - part of https://github.com/apache/datafusion/issues/16652 - related to https://github.com/apache/datafusion/pull/16653 ## Rationale for t

Re: [PR] refactor: shrink `SchemaError` [datafusion]

2025-07-03 Thread via GitHub
alamb commented on code in PR #16653: URL: https://github.com/apache/datafusion/pull/16653#discussion_r2183785894 ## datafusion/common/src/error.rs: ## @@ -1179,4 +1180,9 @@ mod test { assert_eq!(errs[1].strip_backtrace(), "Error during planning: b"); assert_eq

Re: [PR] refactor: shrink `SchemaError` [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16653: URL: https://github.com/apache/datafusion/pull/16653#issuecomment-3033743371 Doc PR: - https://github.com/apache/datafusion/pull/16673

Re: [PR] [branch-48] Set the default value of datafusion.execution.collect_statistics to true #16447 [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16659: URL: https://github.com/apache/datafusion/pull/16659#issuecomment-3033748969 I plan to merge this in tomorrow unless there is any more feedback

Re: [I] Decimal & UInt Binary operation giving wrong output [datafusion]

2025-07-03 Thread via GitHub
jatin510 commented on issue #16667: URL: https://github.com/apache/datafusion/issues/16667#issuecomment-3031632431 take

[I] Decimal & UInt Binary operation giving wrong output [datafusion]

2025-07-03 Thread via GitHub
jatin510 opened a new issue, #16667: URL: https://github.com/apache/datafusion/issues/16667 ### Describe the bug The binary operation between Unsigned Integer and Decimal type produces wrong output. ### To Reproduce Eg queries: ``` select arrow_cast(1.23,

[PR] Extend binary coercion rules to support Decimal arithmetic operations with integer(signed and unsigned) types [datafusion]

2025-07-03 Thread via GitHub
jatin510 opened a new pull request, #16668: URL: https://github.com/apache/datafusion/pull/16668 ## Which issue does this PR close? Closes: https://github.com/apache/datafusion/issues/16667 ## Rationale for this change ## What changes are included

[PR] benchmark: Support sort_tpch10 for benchmark [datafusion]

2025-07-03 Thread via GitHub
zhuqi-lucas opened a new pull request, #16671: URL: https://github.com/apache/datafusion/pull/16671 ## Which issue does this PR close? Currently we only have sort_tpch for benchmark, recently when optimizing sort, i found sort_tpch10 will show more stable result sometimes, so i added

Re: [PR] limit intermediate batch size in nested_loop_join [datafusion]

2025-07-03 Thread via GitHub
jonathanc-n commented on code in PR #16443: URL: https://github.com/apache/datafusion/pull/16443#discussion_r2182837993 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -843,24 +844,56 @@ pub(crate) fn apply_join_filter_to_indices( probe_indices: UInt32Array, filt

[I] [iceberg] Incorrect results when reading decimal fields [datafusion-comet]

2025-07-03 Thread via GitHub
andygrove opened a new issue, #1983: URL: https://github.com/apache/datafusion-comet/issues/1983 ### Describe the bug During fuzz testing, I am seeing errors reading decimals. ``` ## SQL ``` SELECT c7, cast(c7 as SMALLINT), try_cast(c7 as SMALLINT) FROM local.db.test0

Re: [PR] Implementation for regex_instr [datafusion]

2025-07-03 Thread via GitHub
Omega359 commented on PR #15928: URL: https://github.com/apache/datafusion/pull/15928#issuecomment-3032478116 Run extended tests

Re: [I] Support `from_unixtime(ts, [fmt])` [datafusion]

2025-07-03 Thread via GitHub
Omega359 commented on issue #16577: URL: https://github.com/apache/datafusion/issues/16577#issuecomment-3032483374 Likely this can be closed as there is a valid and easy alternative.

Re: [PR] limit intermediate batch size in nested_loop_join [datafusion]

2025-07-03 Thread via GitHub
UBarney commented on code in PR #16443: URL: https://github.com/apache/datafusion/pull/16443#discussion_r2182935284 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -843,24 +844,56 @@ pub(crate) fn apply_join_filter_to_indices( probe_indices: UInt32Array, filter:

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3032712630 🤖: Benchmark completed Details ``` Comparing HEAD and reuse_rows Benchmark sort_tpch10.json ┏━━

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3032712753 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubun

Re: [PR] refactor: shrink `SchemaError` [datafusion]

2025-07-03 Thread via GitHub
comphead commented on code in PR #16653: URL: https://github.com/apache/datafusion/pull/16653#discussion_r2183170224 ## datafusion/expr/src/logical_plan/builder.rs: ## @@ -2535,16 +2535,17 @@ mod tests { match plan { Err(DataFusionError::SchemaError( -

Re: [PR] Comet 0.9.0 [datafusion-site]

2025-07-03 Thread via GitHub
andygrove commented on code in PR #78: URL: https://github.com/apache/datafusion-site/pull/78#discussion_r2183172953 ## content/blog/2025-07-01-datafusion-comet-0.9.0.md: ## @@ -0,0 +1,176 @@ +--- +layout: post +title: Apache DataFusion Comet 0.9.0 Release +date: 2025-07-01 +aut

Re: [PR] Comet 0.9.0 [datafusion-site]

2025-07-03 Thread via GitHub
andygrove commented on code in PR #78: URL: https://github.com/apache/datafusion-site/pull/78#discussion_r2183173425 ## content/blog/2025-07-01-datafusion-comet-0.9.0.md: ## @@ -0,0 +1,176 @@ +--- +layout: post +title: Apache DataFusion Comet 0.9.0 Release +date: 2025-07-01 +aut

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-03 Thread via GitHub
comphead commented on code in PR #16647: URL: https://github.com/apache/datafusion/pull/16647#discussion_r2183175335 ## datafusion/physical-plan/src/sorts/stream.rs: ## @@ -105,26 +110,57 @@ impl RowCursorStream { }) .collect::>>()?; -let stre

Re: [PR] Comet 0.9.0 [datafusion-site]

2025-07-03 Thread via GitHub
andygrove commented on PR #78: URL: https://github.com/apache/datafusion-site/pull/78#issuecomment-3032821765 Thanks for the reviews @kazuyukitanimura @parthchandra @comphead. I'll plan on merging this tomorrow once the release vote passes.

Re: [PR] rustup version [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16663: URL: https://github.com/apache/datafusion/pull/16663#issuecomment-3033663294 Thanks @melroy12 -- I started the CI tests I suspect this PR will fail at least `clippy` -- you'll perhaps have to run Clippy and make the changes it suggests (note you can als

Re: [PR] refactor filter pushdown APIs [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16642: URL: https://github.com/apache/datafusion/pull/16642#issuecomment-3033664854 Looks like has one more failure (and maybe we can merge up too)

Re: [PR] Improve error message when ScalarValue fails to cast array [datafusion]

2025-07-03 Thread via GitHub
alamb merged PR #16670: URL: https://github.com/apache/datafusion/pull/16670 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] Improve error message when ScalarValue fails to cast array [datafusion]

2025-07-03 Thread via GitHub
alamb commented on code in PR #16670: URL: https://github.com/apache/datafusion/pull/16670#discussion_r2183767743 ## datafusion/common/src/scalar/mod.rs: ## @@ -895,17 +919,8 @@ fn dict_from_values( } macro_rules! typed_cast_tz { -($array:expr, $index:expr, $ARRAYTYPE:id

Re: [PR] Add comments to ClickBench queries about setting binary_as_string [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16605: URL: https://github.com/apache/datafusion/pull/16605#issuecomment-3033684735 @Dandandan can I trouble you for a review of this PR (it makes no code changes, just comments) -- This is an automated message from the Apache Git Service. To respond to the message

Re: [PR] Add an example of embedding indexes inside a parquet file [datafusion]

2025-07-03 Thread via GitHub
alamb merged PR #16395: URL: https://github.com/apache/datafusion/pull/16395 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [I] Refactor statement execution logic in `datafusion-cli` [datafusion]

2025-07-03 Thread via GitHub
alamb closed issue #16559: Refactor statement execution logic in `datafusion-cli` URL: https://github.com/apache/datafusion/issues/16559 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] `datafusion-cli`: Refactor statement execution logic [datafusion]

2025-07-03 Thread via GitHub
alamb merged PR #16634: URL: https://github.com/apache/datafusion/pull/16634 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] Add an example of embedding indexes inside a parquet file [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16395: URL: https://github.com/apache/datafusion/pull/16395#issuecomment-3033685921 This is so great -- now we just need to write up a blog post 🎣 Thanks again @zhuqi-lucas -- this is going to be great -- This is an automated message from the Apache Git

Re: [I] Add an example of embedding indexes *inside* a parquet file [datafusion]

2025-07-03 Thread via GitHub
alamb closed issue #16374: Add an example of embedding indexes *inside* a parquet file URL: https://github.com/apache/datafusion/issues/16374 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

Re: [PR] `datafusion-cli`: Refactor statement execution logic [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16634: URL: https://github.com/apache/datafusion/pull/16634#issuecomment-3033686652 Thanks again @liamzwbao -- keeping the code clean 💯 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

Re: [PR] Update to Rust 1.88 [datafusion]

2025-07-03 Thread via GitHub
melroy12 commented on PR #16663: URL: https://github.com/apache/datafusion/pull/16663#issuecomment-3033691052 > Thanks @melroy12 -- I started the CI tests > > I suspect this PR will fail at least `clippy` -- you'll perhaps have to run Clippy and make the changes it suggests (note you

Re: [PR] refactor: shrink `SchemaError` [datafusion]

2025-07-03 Thread via GitHub
alamb commented on PR #16653: URL: https://github.com/apache/datafusion/pull/16653#issuecomment-3033688762 I think technically this is an API change so it would be nice to leave a note in the upgrade guide. I'll make a PR -- This is an automated message from the Apache Git Service. To respon

Re: [PR] refactor filter pushdown APIs [datafusion]

2025-07-03 Thread via GitHub
adriangb commented on PR #16642: URL: https://github.com/apache/datafusion/pull/16642#issuecomment-3033712468 > Looks like this has one more failure (and maybe we can merge up too) done! -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] Update to Rust 1.88 [datafusion]

2025-07-03 Thread via GitHub
melroy12 commented on PR #16663: URL: https://github.com/apache/datafusion/pull/16663#issuecomment-3033695701 > Thanks @melroy12 -- I started the CI tests > > I suspect this PR will fail at least `clippy` -- you'll perhaps have to run Clippy and make the changes it suggests (note you

Re: [I] Support Push down expression evaluation in `TableProviders` [datafusion]

2025-07-03 Thread via GitHub
adriangb commented on issue #14993: URL: https://github.com/apache/datafusion/issues/14993#issuecomment-3033503866 I started looking into this and where it gets messy is: 1. Partition columns. I think this needs a rethink. I suggest pushing partition column generation down into the actual

Re: [PR] Fix discrepancy in Float64 to timestamp(9) casts [datafusion]

2025-07-03 Thread via GitHub
alamb commented on code in PR #16639: URL: https://github.com/apache/datafusion/pull/16639#discussion_r2183689441 ## datafusion/sqllogictest/test_files/timestamps.slt: ## @@ -394,12 +503,12 @@ SELECT COUNT(*) FROM ts_data_secs where ts > to_timestamp_seconds('2020-09-08 12 que

Re: [PR] Support for Postgres `CREATE SERVER` [datafusion-sqlparser-rs]

2025-07-03 Thread via GitHub
alamb commented on PR #1914: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1914#issuecomment-3033540449 Thank you @iffyio for being a coding machine -- it is pretty amazing to see all this code go in -- it's like we don't really have any idea how many crazy variants of SQL th
