Re: [PR] Add reproducer for tpch Q16 deserialization bug [datafusion]

2025-07-02 Thread via GitHub
gabotechs commented on code in PR #16662: URL: https://github.com/apache/datafusion/pull/16662#discussion_r2181971781 ## datafusion/proto/tests/cases/roundtrip_physical_plan.rs: ## @@ -1736,3 +1737,57 @@ async fn roundtrip_physical_plan_node() { let _ = plan.execute(0, ct

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2181945043 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -894,6 +1047,81 @@ impl ListingTable { self.schema_source } +/// Set the [`SchemaAdapt

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2181919123 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -1169,8 +1399,10 @@ impl ListingTable { self.options.collect_stat, inexact_stats

[PR] Add support for GRANT ON ALL VIEWS IN SCHEMA [datafusion-sqlparser-rs]

2025-07-02 Thread via GitHub
yoavcloud opened a new pull request, #1922: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1922 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2181911243 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -894,6 +1047,81 @@ impl ListingTable { self.schema_source } +/// Set the [`SchemaAdapt

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2181787405 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -894,6 +1047,81 @@ impl ListingTable { self.schema_source } +/// Set the [`SchemaAdapt

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2181783094 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -302,11 +405,58 @@ impl ListingTableConfig { file_schema: self.file_schema,

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2181758515 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -894,6 +1047,81 @@ impl ListingTable { self.schema_source } +/// Set the [`SchemaAdapt

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2181718449 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -123,25 +176,72 @@ impl ListingTableConfig { /// /// If the schema is provided, it must contain

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2181741126 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -67,8 +69,62 @@ pub enum SchemaSource { /// Configuration for creating a [`ListingTable`] /// +/// # S

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2181672400 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -67,8 +69,62 @@ pub enum SchemaSource { /// Configuration for creating a [`ListingTable`] /// +/// # S

Re: [PR] feat: implement predicate adaptation missing fields of structs [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16589: URL: https://github.com/apache/datafusion/pull/16589#discussion_r2181554365 ## datafusion/physical-expr-adapter/src/schema_rewriter.rs: ## @@ -234,233 +333,141 @@ mod tests { let result = rewriter.rewrite(column_expr).unwrap();

Re: [PR] feat: implement predicate adaptation missing fields of structs [datafusion]

2025-07-02 Thread via GitHub
kosiew commented on code in PR #16589: URL: https://github.com/apache/datafusion/pull/16589#discussion_r2181563813 ## datafusion/physical-expr-adapter/src/schema_rewriter.rs: ## @@ -234,233 +333,141 @@ mod tests { let result = rewriter.rewrite(column_expr).unwrap();

[I] TPC-H Q16 fails during deserialization [datafusion]

2025-07-02 Thread via GitHub
NGA-TRAN opened a new issue, #16665: URL: https://github.com/apache/datafusion/issues/16665 ### Describe the bug Datadog is working on building a distributed version of DataFusion, which requires query serialization and deserialization. While testing with TPC-H queries, we found that

Re: [PR] Add reproducer for tpch Q16 deserialization bug [datafusion]

2025-07-02 Thread via GitHub
NGA-TRAN commented on code in PR #16662: URL: https://github.com/apache/datafusion/pull/16662#discussion_r2181499476 ## datafusion/proto/tests/cases/roundtrip_physical_plan.rs: ## @@ -1736,3 +1737,57 @@ async fn roundtrip_physical_plan_node() { let _ = plan.execute(0, ctx

Re: [PR] Fix TopK Sort incorrectly pushed down past Join with anti join [datafusion]

2025-07-02 Thread via GitHub
zhuqi-lucas commented on code in PR #16641: URL: https://github.com/apache/datafusion/pull/16641#discussion_r2181487323 ## datafusion/physical-optimizer/src/enforce_sorting/sort_pushdown.rs: ## @@ -216,7 +218,25 @@ fn pushdown_sorts_helper( fn pushdown_requirement_to_children(

Re: [I] Add support for native Parquet writes [datafusion-comet]

2025-07-02 Thread via GitHub
andygrove commented on issue #1625: URL: https://github.com/apache/datafusion-comet/issues/1625#issuecomment-3030362983 > Hello [@andygrove](https://github.com/andygrove) , this sounds like an interesting problem I'd be keen to work on. I think I might need some guidance to navigate it eff

Re: [PR] Fix TopK Sort incorrectly pushed down past Join with anti join [datafusion]

2025-07-02 Thread via GitHub
zhuqi-lucas commented on code in PR #16641: URL: https://github.com/apache/datafusion/pull/16641#discussion_r2181454354 ## datafusion/physical-optimizer/src/enforce_sorting/sort_pushdown.rs: ## @@ -216,7 +218,25 @@ fn pushdown_sorts_helper( fn pushdown_requirement_to_children(

Re: [PR] limit intermediate batch size in nested_loop_join [datafusion]

2025-07-02 Thread via GitHub
UBarney commented on code in PR #16443: URL: https://github.com/apache/datafusion/pull/16443#discussion_r2181428292 ## datafusion/physical-plan/src/joins/nested_loop_join.rs: ## @@ -883,44 +1000,63 @@ impl NestedLoopJoinStream { let visited_left_side = left_data.bitmap(

Re: [PR] Simple Functions Preview [datafusion]

2025-07-02 Thread via GitHub
github-actions[bot] commented on PR #14668: URL: https://github.com/apache/datafusion/pull/14668#issuecomment-3030282290 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [I] Overflow happened on: `Long.MinValue div -1` [datafusion-comet]

2025-07-02 Thread via GitHub
coderfender commented on issue #1477: URL: https://github.com/apache/datafusion-comet/issues/1477#issuecomment-3030246512 I can actually get a successful output (in parity with Spark) if I remove the `ORDER BY` clause. However, the query fails when I add the `ORDER BY`. @wForget no

Re: [PR] feat: Add from_unixtime support [datafusion-comet]

2025-07-02 Thread via GitHub
kazuyukitanimura merged PR #1943: URL: https://github.com/apache/datafusion-comet/pull/1943

Re: [PR] feat: Add from_unixtime support [datafusion-comet]

2025-07-02 Thread via GitHub
kazuyukitanimura commented on PR #1943: URL: https://github.com/apache/datafusion-comet/pull/1943#issuecomment-3029907146 Merged, thanks @andygrove

Re: [PR] Revert Finalize support for `RightMark` join + `Mark` join [datafusion]

2025-07-02 Thread via GitHub
comphead commented on PR #16597: URL: https://github.com/apache/datafusion/pull/16597#issuecomment-3029629138 > @alamb @comphead How do you run extended tests? I'm not sure if you can access it https://github.com/apache/datafusion/actions/workflows/extended.yml

Re: [PR] limit intermediate batch size in nested_loop_join [datafusion]

2025-07-02 Thread via GitHub
jonathanc-n commented on code in PR #16443: URL: https://github.com/apache/datafusion/pull/16443#discussion_r2181125973 ## datafusion/physical-plan/src/joins/nested_loop_join.rs: ## @@ -828,13 +828,127 @@ impl NestedLoopJoinStream { handle_state!(self.proces

[PR] Fix Query Planner able to find struct field with capital letters [datafusion]

2025-07-02 Thread via GitHub
dttung2905 opened a new pull request, #16664: URL: https://github.com/apache/datafusion/pull/16664 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/16648 ## What changes are included in this PR? My way of fixing is to follow exact ma

Re: [PR] limit intermediate batch size in nested_loop_join [datafusion]

2025-07-02 Thread via GitHub
jonathanc-n commented on code in PR #16443: URL: https://github.com/apache/datafusion/pull/16443#discussion_r2181102938 ## datafusion/physical-plan/src/joins/nested_loop_join.rs: ## @@ -828,13 +833,127 @@ impl NestedLoopJoinStream { handle_state!(self.proces

Re: [I] Support `from_unixtime(ts, [fmt])` [datafusion]

2025-07-02 Thread via GitHub
kazuyukitanimura commented on issue #16577: URL: https://github.com/apache/datafusion/issues/16577#issuecomment-3029568843 `from_unixtime` currently returns timestamp type. Providing `fmt` means we need to return a String. I guess we can just use ``` to_char(from_unixtime(expres

[PR] rustup version [datafusion]

2025-07-02 Thread via GitHub
melroy12 opened a new pull request, #16663: URL: https://github.com/apache/datafusion/pull/16663 ## Which issue does this PR close? - Closes #16655. ## Rationale for this change Improve build times. ## What changes are included in this PR? Rust 1.88.0 is proper

Re: [I] Update workspace to use Rust 1.88 [datafusion]

2025-07-02 Thread via GitHub
melroy12 commented on issue #16655: URL: https://github.com/apache/datafusion/issues/16655#issuecomment-3029461385 take

Re: [PR] Revert Finalize support for `RightMark` join + `Mark` join [datafusion]

2025-07-02 Thread via GitHub
jonathanc-n commented on PR #16597: URL: https://github.com/apache/datafusion/pull/16597#issuecomment-3029444790 @alamb @comphead How do you run extended tests?

Re: [I] Physical plan pushdown for volatile predicates [datafusion]

2025-07-02 Thread via GitHub
theirix commented on issue #16545: URL: https://github.com/apache/datafusion/issues/16545#issuecomment-3029427291 @findepi could you please tell if this behaviour makes sense? If not, I could try fixing the physical plan as in #13268, where you have a review

Re: [PR] chore(deps): Update sqlparser to 0.56 [datafusion]

2025-07-02 Thread via GitHub
Dimchikkk commented on code in PR #16456: URL: https://github.com/apache/datafusion/pull/16456#discussion_r2181032563 ## Cargo.toml: ## @@ -167,7 +167,7 @@ recursive = "0.1.1" regex = "1.8" rstest = "0.25.0" serde_json = "1" -sqlparser = { version = "0.56.0", default-features

Re: [I] Update workspace to use Rust 1.88 [datafusion]

2025-07-02 Thread via GitHub
melroy12 commented on issue #16655: URL: https://github.com/apache/datafusion/issues/16655#issuecomment-3029407142 take

Re: [PR] feat: support literal for ARRAY top level [datafusion-comet]

2025-07-02 Thread via GitHub
comphead commented on PR #1978: URL: https://github.com/apache/datafusion-comet/pull/1978#issuecomment-3029399448 > Per the Spark documentation these are the literals supported by Spark - https://spark.apache.org/docs/latest/sql-ref-literals.html Do we have a specific example we want to su

Re: [PR] feat: support literal for ARRAY top level [datafusion-comet]

2025-07-02 Thread via GitHub
parthchandra commented on PR #1978: URL: https://github.com/apache/datafusion-comet/pull/1978#issuecomment-3029349482 Per the Spark documentation these are the literals supported by Spark - https://spark.apache.org/docs/latest/sql-ref-literals.html Do we have a specific example we want t

[PR] Add reproducer for tpch Q16 deserialization bug [datafusion]

2025-07-02 Thread via GitHub
NGA-TRAN opened a new pull request, #16662: URL: https://github.com/apache/datafusion/pull/16662 ## Which issue does this PR close? Reproducing a bug ## Rationale for this change Datadog is working on building a distributed version of DataFusion, which requires q

Re: [PR] Implementation for regex_instr [datafusion]

2025-07-02 Thread via GitHub
Omega359 commented on PR #15928: URL: https://github.com/apache/datafusion/pull/15928#issuecomment-3029308432 Run extended tests

Re: [I] Add `SessionConfig` reference to `ScalarFunctionArgs` [datafusion]

2025-07-02 Thread via GitHub
alamb commented on issue #13519: URL: https://github.com/apache/datafusion/issues/13519#issuecomment-3029293964 > In light of a push to reduce breaking changes like https://github.com/apache/datafusion/issues/16622 (and https://github.com/apache/datafusion/pull/16078, https://github.com/ap

Re: [PR] POC: Add `ConfigOptions` to ExecutionProps when execution is started [datafusion]

2025-07-02 Thread via GitHub
alamb commented on code in PR #16661: URL: https://github.com/apache/datafusion/pull/16661#discussion_r2180950730 ## datafusion/execution/src/config.rs: ## @@ -152,7 +157,7 @@ impl SessionConfig { /// assert_eq!(config.options().execution.batch_size, 1024); /// ```

[PR] POC: Add `ConfigOptions` to ExecutionProps when execution is started [datafusion]

2025-07-02 Thread via GitHub
alamb opened a new pull request, #16661: URL: https://github.com/apache/datafusion/pull/16661 ## Which issue does this PR close? - This is a possible alternative to https://github.com/apache/datafusion/pull/16573 from @findepi I think @Omega359 had other PRs as well Re

Re: [PR] Comet 0.9.0 [datafusion-site]

2025-07-02 Thread via GitHub
parthchandra commented on code in PR #78: URL: https://github.com/apache/datafusion-site/pull/78#discussion_r2180932696 ## content/blog/2025-07-01-datafusion-comet-0.9.0.md: ## @@ -0,0 +1,176 @@ +--- +layout: post +title: Apache DataFusion Comet 0.9.0 Release +date: 2025-07-01 +

Re: [PR] Expose execution time zone to UDFs [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16573: URL: https://github.com/apache/datafusion/pull/16573#issuecomment-3029236099 > It looks we didn't reach a conclusion here yet. There is "just a String" option as currently implemented in this PR. There is "a new struct passed by reference (or Arc)" option as pr

Re: [PR] feat: Add a configuration to make parquet encryption optional [datafusion]

2025-07-02 Thread via GitHub
alamb commented on code in PR #16649: URL: https://github.com/apache/datafusion/pull/16649#discussion_r2180912655 ## datafusion/datasource-parquet/src/file_format.rs: ## @@ -350,15 +352,17 @@ impl FileFormat for ParquetFormat { Some(time_unit) => Some(parse_coerce_

Re: [PR] Implementation for regex_instr [datafusion]

2025-07-02 Thread via GitHub
Omega359 commented on PR #15928: URL: https://github.com/apache/datafusion/pull/15928#issuecomment-302984 Clippy failures related to rand update (I think https://github.com/apache/datafusion/pull/16062)

Re: [PR] fix: The inconsistency between scalar and array on the cast decimal to timestamp [datafusion]

2025-07-02 Thread via GitHub
findepi commented on code in PR #16539: URL: https://github.com/apache/datafusion/pull/16539#discussion_r2180903190 ## datafusion/common/src/scalar/mod.rs: ## @@ -3069,7 +3069,14 @@ impl ScalarValue { ScalarValue::Decimal128(Some(decimal_value), _, scale),

Re: [PR] Support multiple ordered array_agg aggregations [datafusion]

2025-07-02 Thread via GitHub
findepi commented on PR #16625: URL: https://github.com/apache/datafusion/pull/16625#issuecomment-3029193406 I think I found a solution that avoids any plan changes for previously working queries and applies changes only when a query would otherwise fail. PTAL

[PR] feat: Support `PiecewiseMergeJoin` to speed up single range predicate joins [datafusion]

2025-07-02 Thread via GitHub
jonathanc-n opened a new pull request, #16660: URL: https://github.com/apache/datafusion/pull/16660 ## Rationale for this change `PiecewiseMergeJoin` is a nice precursor to the implementation of ASOF, inequality, etc. joins (multiple range predicates). `PiecewiseMergeJoin` is specialize

Re: [PR] Convert Option> to Vec [datafusion]

2025-07-02 Thread via GitHub
findepi commented on code in PR #16615: URL: https://github.com/apache/datafusion/pull/16615#discussion_r2180825074 ## datafusion/expr/src/expr_fn.rs: ## @@ -821,7 +821,7 @@ impl ExprFuncBuilder { let fun_expr = match fun { ExprFuncKind::Aggregate(mut uda

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
Dandandan commented on code in PR #16647: URL: https://github.com/apache/datafusion/pull/16647#discussion_r2180819618 ## datafusion/physical-plan/src/sorts/stream.rs: ## @@ -105,26 +110,53 @@ impl RowCursorStream { }) .collect::>>()?; -let str

Re: [PR] fix: sqllogictest runner label condition mismatch [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16633: URL: https://github.com/apache/datafusion/pull/16633#issuecomment-3029112830 @gabotechs do you have time to review this PR?

Re: [PR] Add PhysicalExpr optimizer and cast unwrapping [datafusion]

2025-07-02 Thread via GitHub
alamb merged PR #16530: URL: https://github.com/apache/datafusion/pull/16530

Re: [PR] Add PhysicalExpr optimizer and cast unwrapping [datafusion]

2025-07-02 Thread via GitHub
adriangb commented on PR #16530: URL: https://github.com/apache/datafusion/pull/16530#issuecomment-3029145280 Thanks @alamb!

[PR] Make `GenericDialect` support trailing commas in projections [datafusion-sqlparser-rs]

2025-07-02 Thread via GitHub
simonvandel opened a new pull request, #1921: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1921 Similar to https://github.com/apache/datafusion-sqlparser-rs/pull/1911. The docs for GenericDialect say that it can be permissive, so I thought it could support trailing com

Re: [PR] Improve display format of BoundedWindowAggExec [datafusion]

2025-07-02 Thread via GitHub
alamb commented on code in PR #16645: URL: https://github.com/apache/datafusion/pull/16645#discussion_r2180842064 ## datafusion/physical-plan/src/windows/bounded_window_agg_exec.rs: ## @@ -262,9 +262,9 @@ impl DisplayAs for BoundedWindowAggExec { .iter()

Re: [PR] Fix discrepancy in Float64 to timestamp(9) casts [datafusion]

2025-07-02 Thread via GitHub
Omega359 commented on PR #16639: URL: https://github.com/apache/datafusion/pull/16639#issuecomment-3029143745 Ok. Well considering I couldn't even figure out how to cast a float to a timestamp in postgres without using `to_timestamp` and duckdb outright says it's not implemented (see below)

Re: [PR] Add SchemaAdapterFactory Support for ListingTable with Schema Evolution and Mapping [datafusion]

2025-07-02 Thread via GitHub
adriangb commented on code in PR #16583: URL: https://github.com/apache/datafusion/pull/16583#discussion_r2180848279 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -67,8 +69,62 @@ pub enum SchemaSource { /// Configuration for creating a [`ListingTable`] /// +/// #

Re: [PR] Convert Option> to Vec [datafusion]

2025-07-02 Thread via GitHub
findepi commented on PR #16615: URL: https://github.com/apache/datafusion/pull/16615#issuecomment-3029085531 @alamb this is a breaking change, right? Is there a label / process to be used for it?

Re: [PR] Support multiple ordered array_agg aggregations [datafusion]

2025-07-02 Thread via GitHub
findepi commented on code in PR #16625: URL: https://github.com/apache/datafusion/pull/16625#discussion_r2180865670 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -6306,7 +6356,7 @@ logical_plan physical_plan 01)AggregateExec: mode=Final, gby=[], aggr=[first_value

Re: [PR] Fix discrepancy in Float64 to timestamp(9) casts [datafusion]

2025-07-02 Thread via GitHub
findepi commented on PR #16639: URL: https://github.com/apache/datafusion/pull/16639#issuecomment-3029028385 @Omega359 thanks. It totally escaped my notice that `to_timestamp` also changed, not only the casts (which are part of the same test).

Re: [PR] Convert Option> to Vec [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16615: URL: https://github.com/apache/datafusion/pull/16615#issuecomment-3029099256 > @alamb this is a breaking change, right? is there a label / process to be used for it? Probably -- we normally add the api change label, which I just did This is the sta

Re: [PR] Fix discrepancy in Float64 to timestamp(9) casts [datafusion]

2025-07-02 Thread via GitHub
findepi commented on PR #16639: URL: https://github.com/apache/datafusion/pull/16639#issuecomment-3029132694 @Omega359 I am not exactly a fan of the cast semantics. If I were to choose, I would choose it to be different. Note that it's pre-existing though: Consider - cast source

Re: [PR] fix: sqllogictest runner label condition mismatch [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16633: URL: https://github.com/apache/datafusion/pull/16633#issuecomment-3029113178 Thank you @lliangyu-lin 🙏

Re: [PR] Fix duplicate field name error in Join::try_new_with_project_input during physical planning [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16454: URL: https://github.com/apache/datafusion/pull/16454#issuecomment-3029116168 @gabotechs do you have time to review this PR?

Re: [PR] Implementation for regex_instr [datafusion]

2025-07-02 Thread via GitHub
Omega359 commented on PR #15928: URL: https://github.com/apache/datafusion/pull/15928#issuecomment-3029112924 Run extended tests

Re: [PR] Implementation for regex_instr [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #15928: URL: https://github.com/apache/datafusion/pull/15928#issuecomment-3029119907 > Run extended tests It works!

Re: [PR] Fix discrepancy in Float64 to timestamp(9) casts [datafusion]

2025-07-02 Thread via GitHub
Omega359 commented on PR #16639: URL: https://github.com/apache/datafusion/pull/16639#issuecomment-3029120750 ```cast(123456789.123456789 as timestamp) => 1970-01-01T00:00:00.123456789``` That strikes me as wrong.

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
Dandandan commented on code in PR #16647: URL: https://github.com/apache/datafusion/pull/16647#discussion_r2180844635 ## datafusion/physical-plan/src/sorts/stream.rs: ## @@ -105,26 +110,53 @@ impl RowCursorStream { }) .collect::>>()?; -let str

Re: [PR] `datafusion-cli`: Refactor statement execution logic [datafusion]

2025-07-02 Thread via GitHub
alamb commented on code in PR #16634: URL: https://github.com/apache/datafusion/pull/16634#discussion_r2180844132 ## datafusion-cli/src/exec.rs: ## @@ -228,25 +227,43 @@ pub(super) async fn exec_and_print( let statements = DFParser::parse_sql_with_dialect(&sql, dialect.as

Re: [PR] Convert Option> to Vec [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16615: URL: https://github.com/apache/datafusion/pull/16615#issuecomment-3029100794 We have a long history of releasing breaking API changes, so I think it is best just to use your judgement here

[I] Datafusion can't seem to cast evolving structs [datafusion]

2025-07-02 Thread via GitHub
TheBuilderJR opened a new issue, #14757: URL: https://github.com/apache/datafusion/issues/14757 ### Describe the bug I'd expect as I add fields to structs, I should be able to cast one into another. You can see in the repro below this doesn't seem to be allowed: ### To Reproduc

Re: [PR] Per file filter evaluation [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #15057: URL: https://github.com/apache/datafusion/pull/15057#issuecomment-3029071696 Sorry @adriangb -- I lost track of this PR -- I will put it on my review queue for tomorrow

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
Dandandan commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3029069816 > This is quite cool @Dandandan > > My only real concern is that this code will be tricky to maintain and could easily get reverted / regressed as part of a follow on change

Re: [PR] Update all spark SLT files [datafusion]

2025-07-02 Thread via GitHub
findepi merged PR #16637: URL: https://github.com/apache/datafusion/pull/16637

Re: [PR] Fix discrepancy in Float64 to timestamp(9) casts [datafusion]

2025-07-02 Thread via GitHub
findepi commented on PR #16639: URL: https://github.com/apache/datafusion/pull/16639#issuecomment-3029062215 I now realize that's exactly what @jatin510 pointed out earlier; I just didn't understand it. I pushed a commit restoring the `to_timestamp(double)` behavior to whatever it was bef

Re: [I] Push Dynamic Join Predicates into Scan ("Sideways Information Passing", etc) [datafusion]

2025-07-02 Thread via GitHub
alamb commented on issue #7955: URL: https://github.com/apache/datafusion/issues/7955#issuecomment-3029058997 @adriangb says https://github.com/apache/datafusion/pull/16445#issuecomment-3026127559: > Btw here's an article that explains how DuckDB does join filter pushdown. It sounds

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
alamb commented on code in PR #16647: URL: https://github.com/apache/datafusion/pull/16647#discussion_r2180789490 ## datafusion/physical-plan/src/sorts/stream.rs: ## @@ -88,6 +90,9 @@ pub struct RowCursorStream { streams: FusedStreams, /// Tracks the memory used by `co

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028914931 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubun

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028969141 🤖: Benchmark completed Details ``` Comparing HEAD and reuse_rows Benchmark sort_tpch.json ┏━━┳━

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028965756 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubun

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028965671 🤖: Benchmark completed Details ``` Comparing HEAD and reuse_rows Benchmark clickbench_extended.json ┏━━

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
Dandandan commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028828620 > 🤖: Benchmark completed > > Details > > ``` > Comparing HEAD and reuse_rows > > Benchmark sort_tpch.json > > ┏

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028800817 🤖: Benchmark completed Details ``` Comparing HEAD and reuse_rows Benchmark sort_tpch.json ┏━━┳━

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028797204 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubun

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028797121 🤖: Benchmark completed Details ``` Comparing HEAD and reuse_rows Benchmark clickbench_extended.json ┏━━

Re: [PR] docs: Minor improvements to Spark SQL test docs [datafusion-comet]

2025-07-02 Thread via GitHub
codecov-commenter commented on PR #1980: URL: https://github.com/apache/datafusion-comet/pull/1980#issuecomment-3028758895 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1980?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] refactor filter pushdown APIs [datafusion]

2025-07-02 Thread via GitHub
alamb commented on code in PR #16642: URL: https://github.com/apache/datafusion/pull/16642#discussion_r2180625134 ## datafusion/physical-plan/src/filter_pushdown.rs: ## @@ -317,24 +152,74 @@ impl FilterPushdownPropagation { } #[derive(Debug, Clone)] -struct ChildFilterDescri

Re: [PR] Implementation for regex_instr [datafusion]

2025-07-02 Thread via GitHub
nirnayroy commented on code in PR #15928: URL: https://github.com/apache/datafusion/pull/15928#discussion_r2180625721 ## datafusion/functions/src/regex/regexpinstr.rs: ## @@ -0,0 +1,826 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

Re: [PR] WIP Blog post for Datafusion 47.0.0 [datafusion-site]

2025-07-02 Thread via GitHub
alamb commented on code in PR #70: URL: https://github.com/apache/datafusion-site/pull/70#discussion_r2180593236 ## content/blog/2025-04-28-datafusion-47.0.0.md: ## @@ -0,0 +1,240 @@ +--- +layout: post +title: Apache DataFusion 45.0.0 Released +date: 2025-02-20 +author: pmc +cat

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028641060 > @alamb may I request some benchmark run? I really need to figure out how to script this automatically. I will see if I can get claude to do something for me

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
alamb commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028639856 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubun

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
Dandandan commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028611122 @alamb may I request some benchmark run?

Re: [PR] Reuse Rows allocation in RowCursorStream [datafusion]

2025-07-02 Thread via GitHub
Dandandan commented on PR #16647: URL: https://github.com/apache/datafusion/pull/16647#issuecomment-3028609059 > I believe we can increase the in-place memory for the sorting benchmark here; the default is 1MB. > > The result will be largely affected by the in-place sort memory buffer f

Re: [PR] Comet 0.9.0 [datafusion-site]

2025-07-02 Thread via GitHub
kazuyukitanimura commented on code in PR #78: URL: https://github.com/apache/datafusion-site/pull/78#discussion_r2180507437 ## content/blog/2025-07-01-datafusion-comet-0.9.0.md: ## @@ -0,0 +1,176 @@ +--- +layout: post +title: Apache DataFusion Comet 0.9.0 Release +date: 2025-07-
