Re: [PR] chore: Re-enable GitHub discussions [datafusion-comet]

2025-03-15 Thread via GitHub
codecov-commenter commented on PR #1535: URL: https://github.com/apache/datafusion-comet/pull/1535#issuecomment-2726766217 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1535?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Fix wildcard dataframe case [datafusion]

2025-03-15 Thread via GitHub
jayzhan211 commented on code in PR #15230: URL: https://github.com/apache/datafusion/pull/15230#discussion_r1995599910 ## datafusion/sql/src/select.rs: ## @@ -741,8 +722,17 @@ impl SqlToRel<'_, S> { } /// Wrap a plan in a projection -fn project(&self, input: Logi

Re: [PR] feat: Attach `Diagnostic` to more than one column errors in scalar_subquery and in_subquery [datafusion]

2025-03-15 Thread via GitHub
changsun20 commented on PR #15143: URL: https://github.com/apache/datafusion/pull/15143#issuecomment-2725384326 Thank you all for your help! Looking forward to contributing more to the project soon. -- This is an automated message from the Apache Git Service. To respond to the message, pl

[I] Add `ctx = SessionContext()` to __init__ [datafusion-python]

2025-03-15 Thread via GitHub
deanm opened a new issue, #1065: URL: https://github.com/apache/datafusion-python/issues/1065 That way there'd just be one line ```python from datafusion import ctx, col, lit, functions as F ``` ![Image](https://github.com/user-attachments/assets/8047b372-6002-40b4-8e

[PR] [branch-46] Update ring to v0.17.13 (#15063) [datafusion]

2025-03-15 Thread via GitHub
alamb opened a new pull request, #15228: URL: https://github.com/apache/datafusion/pull/15228 - Part of https://github.com/apache/datafusion/issues/15151 Rationale: get a clean CI run on 46 Changes: - Backport https://github.com/apache/datafusion/pull/15063

Re: [I] distinct_query_sql benchmark is failing [datafusion]

2025-03-15 Thread via GitHub
getChan commented on issue #15213: URL: https://github.com/apache/datafusion/issues/15213#issuecomment-2724220964 take

Re: [PR] Implement tree explain for InterleaveExec [datafusion]

2025-03-15 Thread via GitHub
alamb merged PR #15219: URL: https://github.com/apache/datafusion/pull/15219

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on code in PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#discussion_r1996051635 ## native/core/src/parquet/mod.rs: ## @@ -645,65 +653,42 @@ pub unsafe extern "system" fn Java_org_apache_comet_parquet_Native_initRecordBat .g

Re: [I] Parse MySQL `SET GLOBAL` variables [datafusion-sqlparser-rs]

2025-03-15 Thread via GitHub
iffyio closed issue #1694: Parse MySQL `SET GLOBAL` variables URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1694

Re: [I] Migrate dataframe tests to `insta` [datafusion]

2025-03-15 Thread via GitHub
jsai28 commented on issue #15245: URL: https://github.com/apache/datafusion/issues/15245#issuecomment-2726750730 take

[I] `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove opened a new issue, #1536: URL: https://github.com/apache/datafusion-comet/issues/1536 ### Describe the bug `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set. In `CometScanRule` we add `CometScanExec` as a placeholder for a `native_dataf

Re: [PR] build: bump spark version to 3.3.4, 3.4.4, 3.5.4 for spark-3.3, spark-3.4 and spark-3.5 [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on PR #1243: URL: https://github.com/apache/datafusion-comet/pull/1243#issuecomment-2726743649 I will close this PR since it has been inactive for a while. We have dropped Spark 3.3.x support and upgraded our 3.5 support to 3.5.4. We have not updated 3.4.x yet.

Re: [I] Implement tree explain for `CoalescePartitionsExec` [datafusion]

2025-03-15 Thread via GitHub
alamb closed issue #15195: Implement tree explain for `CoalescePartitionsExec` URL: https://github.com/apache/datafusion/issues/15195

Re: [PR] feat: Add `datafusion-spark` crate [datafusion]

2025-03-15 Thread via GitHub
andygrove commented on code in PR #15168: URL: https://github.com/apache/datafusion/pull/15168#discussion_r1995461595 ## datafusion/sqllogictest/test_files/spark/math/expm1.slt: ## @@ -0,0 +1,32 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contribu

Re: [PR] Simpler to see expressions in tree explain mode [datafusion]

2025-03-15 Thread via GitHub
alamb commented on code in PR #15163: URL: https://github.com/apache/datafusion/pull/15163#discussion_r1996154726 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -739,43 +736,42 @@ physical_plan 01)┌───┐ 02)│ ProjectionExec │ 0

Re: [PR] Simpler to see expressions in explain `tree` mode [datafusion]

2025-03-15 Thread via GitHub
alamb commented on code in PR #15163: URL: https://github.com/apache/datafusion/pull/15163#discussion_r1996802719 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -704,29 +704,26 @@ physical_plan 01)┌───┐ 02)│ ProjectionExec │ 0

Re: [PR] Update ring to v0.17.13 [datafusion]

2025-03-15 Thread via GitHub
alamb commented on PR #15063: URL: https://github.com/apache/datafusion/pull/15063#issuecomment-2724652379 Backport PR: - https://github.com/apache/datafusion/pull/15228

Re: [PR] chore: add an "expr_planners" method to SessionState [datafusion]

2025-03-15 Thread via GitHub
niebayes commented on code in PR #15119: URL: https://github.com/apache/datafusion/pull/15119#discussion_r1988269733 ## datafusion/core/src/execution/context/mod.rs: ## @@ -1632,7 +1632,7 @@ impl FunctionRegistry for SessionContext { } fn expr_planners(&self) -> Vec>

Re: [PR] feat: rand expression support [datafusion-comet]

2025-03-15 Thread via GitHub
akupchinskiy commented on PR #1199: URL: https://github.com/apache/datafusion-comet/pull/1199#issuecomment-2704600195 > @akupchinskiy do you plan to resolve the conflicts? Yeah, thanks for the reminder. Will do it tomorrow

Re: [PR] Improve collection during repr and repr_html [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on PR #1036: URL: https://github.com/apache/datafusion-python/pull/1036#issuecomment-2726455740 @konjac would you mind reviewing this? I pulled a portion of your code/idea into it so that this can supersede #1015

Re: [PR] test: add pytest asyncio tests [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer merged PR #1063: URL: https://github.com/apache/datafusion-python/pull/1063

Re: [PR] enhance sql-using-python-udf example [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on PR #1054: URL: https://github.com/apache/datafusion-python/pull/1054#issuecomment-2726455179 I'm not quite sure what was lacking in the examples before. For the sql example, I think all of the try/except blocks lead to more complexity than we want for a user example.

Re: [PR] feat: expose regex_count function [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer merged PR #1066: URL: https://github.com/apache/datafusion-python/pull/1066

Re: [PR] Simpler to see expressions in explain `tree` mode [datafusion]

2025-03-15 Thread via GitHub
alamb commented on PR #15163: URL: https://github.com/apache/datafusion/pull/15163#issuecomment-2726481038 Thanks again @irenjj

[I] Migrate user_defined tests to insta [datafusion]

2025-03-15 Thread via GitHub
blaginin opened a new issue, #15247: URL: https://github.com/apache/datafusion/issues/15247 In https://github.com/apache/datafusion/issues/15178, we're switching hard-coded constants in tests to `insta.` This issue targets updating **user_defined** (`datafusion/core/tests/user_defin

Re: [I] Graphviz PhysicalPlan [datafusion]

2025-03-15 Thread via GitHub
alamb closed issue #219: Graphviz PhysicalPlan URL: https://github.com/apache/datafusion/issues/219

Re: [I] Support different `EXPLAIN` formats via SQL [datafusion]

2025-03-15 Thread via GitHub
alamb closed issue #15021: Support different `EXPLAIN` formats via SQL URL: https://github.com/apache/datafusion/issues/15021

Re: [I] Improve RepartitionExec for better query performance [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #7001: URL: https://github.com/apache/datafusion/issues/7001#issuecomment-2726483812 > [@alamb](https://github.com/alamb) [@ozankabak](https://github.com/ozankabak) What are the ways to make RoundRobin more NUMA aware. I could come up with only this approach I was r

[PR] chore: [FOLLOWUP] Drop support for Spark 3.3 (EOL) [datafusion-comet]

2025-03-15 Thread via GitHub
kazuyukitanimura opened a new pull request, #1534: URL: https://github.com/apache/datafusion-comet/pull/1534 ## Which issue does this PR close? Part of #646 ## Rationale for this change follow up on #1529 ## What changes are included in this PR? cleaned up S

Re: [I] Migrate physical plan tests to `insta` [datafusion]

2025-03-15 Thread via GitHub
Shreyaskr1409 commented on issue #15248: URL: https://github.com/apache/datafusion/issues/15248#issuecomment-2726571067 take

[PR] Make `ListingTableUrl::try_new` public [datafusion]

2025-03-15 Thread via GitHub
linhr opened a new pull request, #15250: URL: https://github.com/apache/datafusion/pull/15250 ## Which issue does this PR close? N/A ## Rationale for this change I spent some time looking into #7393. It seems the simple cases can be supported in a few lines of code (e.g.

[PR] Add debug logging for default catalog overwrite in SessionState build [datafusion]

2025-03-15 Thread via GitHub
byte-sourcerer opened a new pull request, #15251: URL: https://github.com/apache/datafusion/pull/15251 ## Rationale for this change Overwriting the default catalog when building the `SessionState` can be surprising. ## What changes are included in this PR? Add logs for t

Re: [PR] Implement tree explain for CoalescePartitionsExec [datafusion]

2025-03-15 Thread via GitHub
alamb commented on PR #15225: URL: https://github.com/apache/datafusion/pull/15225#issuecomment-2726479559 Thanks again @Shreyaskr1409 !

Re: [PR] feat/improve ruff test coverage [datafusion-python]

2025-03-15 Thread via GitHub
kevinjqliu commented on code in PR #1055: URL: https://github.com/apache/datafusion-python/pull/1055#discussion_r1991919996 ## uv.lock: ## @@ -351,7 +351,6 @@ wheels = [ [[package]] name = "datafusion" -version = "44.0.0" Review Comment: I can't find any reasons why this

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub
joroKr21 commented on code in PR #15149: URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996627956 ## datafusion/expr/src/type_coercion/functions.rs: ## @@ -364,98 +366,73 @@ fn get_valid_types( return Ok(vec![vec![]]); } -let ar

Re: [I] `core_expressions` feature is broken in the `datafusion-functions` [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #15207: URL: https://github.com/apache/datafusion/issues/15207#issuecomment-2725280153 > I had no clue that feature even existed :/ Me neither -- maybe we just cargo culted it 🤷

[PR] support run mutiple queries in TPC-H benchmark [datafusion-ray]

2025-03-15 Thread via GitHub
zhangx opened a new pull request, #82: URL: https://github.com/apache/datafusion-ray/pull/82 **Changes** - Support running multiple queries in tpcbench.py for both DataFusion on Ray and DataFusion without Ray, so we can run all queries in tpch q15. - Add --query argument for tpcbench.p

Re: [PR] feat: rand expression support [datafusion-comet]

2025-03-15 Thread via GitHub
akupchinskiy commented on PR #1199: URL: https://github.com/apache/datafusion-comet/pull/1199#issuecomment-2706942272 @kazuyukitanimura could you trigger the workflow?

Re: [I] Change mapping of SQL `VARCHAR` from `Utf8` to `Utf8View` [datafusion]

2025-03-15 Thread via GitHub
zhuqi-lucas commented on issue #15096: URL: https://github.com/apache/datafusion/issues/15096#issuecomment-2724484583 New sub_task: Support optimizer project rule for `Utf8View` datatype combined with `Utf8` datatype

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-03-15 Thread via GitHub
alamb commented on PR #14689: URL: https://github.com/apache/datafusion/pull/14689#issuecomment-2710269778 Update -- I ran the clickbench benchmarks on 45 and 46 and did not see any difference in performance I wonder if some/all of your plans went from having `count(1)` to having `co

Re: [PR] feat: Add `datafusion-spark` crate [datafusion]

2025-03-15 Thread via GitHub
shehabgamin commented on code in PR #15168: URL: https://github.com/apache/datafusion/pull/15168#discussion_r1994892781 ## datafusion/sqllogictest/test_files/spark/math/expm1.slt: ## @@ -0,0 +1,32 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contri

Re: [PR] Add upgrade notes for array signatures [datafusion]

2025-03-15 Thread via GitHub
jkosh44 commented on code in PR #15237: URL: https://github.com/apache/datafusion/pull/15237#discussion_r1996444071 ## docs/source/library-user-guide/upgrading.md: ## @@ -212,4 +212,79 @@ To include special characters (such as newlines via `\n`) you can use an `E` lit Elapsed

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub
alamb commented on code in PR #15149: URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996129842 ## datafusion/sqllogictest/test_files/array.slt: ## @@ -1204,7 +1204,7 @@ select array_element([1, 2], NULL); NULL -query I +query ? Review Comment: FYI

Re: [PR] chore: Upgrade `rand` crate and some other minor crates [datafusion]

2025-03-15 Thread via GitHub
comphead commented on PR #14967: URL: https://github.com/apache/datafusion/pull/14967#issuecomment-2725959015 Depends on https://github.com/apache/arrow-rs/pull/7126.

Re: [PR] chore: [FOLLOWUP] Drop support for Spark 3.3 (EOL) [datafusion-comet]

2025-03-15 Thread via GitHub
codecov-commenter commented on PR #1534: URL: https://github.com/apache/datafusion-comet/pull/1534#issuecomment-2726746954 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1534?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

[PR] chore: Re-enable GitHub discussions [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove opened a new pull request, #1535: URL: https://github.com/apache/datafusion-comet/pull/1535 ## Which issue does this PR close? Closes https://github.com/apache/datafusion-comet/issues/1533 ## Rationale for this change ## What changes are included

Re: [I] `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on issue #1536: URL: https://github.com/apache/datafusion-comet/issues/1536#issuecomment-2726748501 @viirya Do you remember why the design is like this?

Re: [PR] fix: Unconditonally wrap UNION BY NAME input nodes w/ `Projection` [datafusion]

2025-03-15 Thread via GitHub
Omega359 commented on PR #15242: URL: https://github.com/apache/datafusion/pull/15242#issuecomment-2726752609 I had issues with github this morning so here is a patch to get tests running: ``` Index: datafusion/sql/tests/sql_integration.rs IDEA additional info: Subsystem: com.int

Re: [PR] fix: Unconditonally wrap UNION BY NAME input nodes w/ `Projection` [datafusion]

2025-03-15 Thread via GitHub
Omega359 commented on PR #15242: URL: https://github.com/apache/datafusion/pull/15242#issuecomment-2726732069 > Was trying to add all the queries from the issue, but ran into problems with `normalize::convert_batches` during the SLTs 🤔 Which of the queries had issues?

[I] [tree explain] Simplify display format of `AggregateFunctionExpr` [datafusion]

2025-03-15 Thread via GitHub
irenjj opened a new issue, #15252: URL: https://github.com/apache/datafusion/issues/15252 The output of `AggregateExec` also seems to contain redundant information. I debugged the code and found that the name of `AggregateFunctionExpr` is constructed in `create_aggregat

Re: [I] [tree explain] Simplify display format of `AggregateFunctionExpr` [datafusion]

2025-03-15 Thread via GitHub
irenjj commented on issue #15252: URL: https://github.com/apache/datafusion/issues/15252#issuecomment-2726764448 take

[PR] chore: Drop support for Spark 3.3 (EOL) [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove opened a new pull request, #1529: URL: https://github.com/apache/datafusion-comet/pull/1529 ## Which issue does this PR close? Closes https://github.com/apache/datafusion-comet/issues/646 ## Rationale for this change Spark 3.3 is EOL. Let's drop

Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-15 Thread via GitHub
jayzhan211 commented on code in PR #15201: URL: https://github.com/apache/datafusion/pull/15201#discussion_r1996487546 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1563,8 +1563,12 @@ async fn with_column_join_same_columns() -> Result<()> { \n Limit: skip=0, fetch=

Re: [I] Change naming of rust exposed structs to ease debugging [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer closed issue #853: Change naming of rust exposed structs to ease debugging URL: https://github.com/apache/datafusion-python/issues/853

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub
jayzhan211 commented on code in PR #15149: URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996470664 ## datafusion/expr-common/src/signature.rs: ## @@ -865,6 +867,39 @@ impl Signature { volatility, } } + +/// Specialized Signature

Re: [PR] implement tree explain for CoalesceBatchesExec [datafusion]

2025-03-15 Thread via GitHub
irenjj commented on code in PR #15194: URL: https://github.com/apache/datafusion/pull/15194#discussion_r1992587634 ## datafusion/physical-plan/src/coalesce_batches.rs: ## @@ -117,14 +117,17 @@ impl DisplayAs for CoalesceBatchesExec { self.target_batch_size,

Re: [PR] Improve parsing `extra_info` in tree explain [datafusion]

2025-03-15 Thread via GitHub
alamb merged PR #15125: URL: https://github.com/apache/datafusion/pull/15125

Re: [I] Apply additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub
Spaarsh commented on issue #1056: URL: https://github.com/apache/datafusion-python/issues/1056#issuecomment-2710921582 Sure. I'll begin the work as soon as it is merged to `main`.

Re: [PR] Improve explain tree formatting for longer lines / word wrap [datafusion]

2025-03-15 Thread via GitHub
irenjj commented on code in PR #15031: URL: https://github.com/apache/datafusion/pull/15031#discussion_r1983299366 ## datafusion/physical-plan/Cargo.toml: ## @@ -63,6 +63,8 @@ log = { workspace = true } parking_lot = { workspace = true } pin-project-lite = "^0.2.7" tokio = {

Re: [PR] feat: add `register_metadata` function for `GroupsAccumulator` [datafusion]

2025-03-15 Thread via GitHub
jayzhan211 commented on code in PR #15022: URL: https://github.com/apache/datafusion/pull/15022#discussion_r1988229738 ## datafusion/expr-common/src/groups_accumulator.rs: ## @@ -251,3 +261,18 @@ pub trait GroupsAccumulator: Send { /// compute, not `O(num_groups)` fn s

Re: [I] Introduce ProjectionMask To Allow Nested Projection Pushdown [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #2581: URL: https://github.com/apache/datafusion/issues/2581#issuecomment-2704070949 Related ticket: - https://github.com/apache/datafusion/issues/14993

Re: [PR] chore: revert "Upgrade to Spark 3.5.4 (#1471)" [datafusion-comet]

2025-03-15 Thread via GitHub
kazuyukitanimura commented on PR #1493: URL: https://github.com/apache/datafusion-comet/pull/1493#issuecomment-2711531218 @andygrove I was going to put a rpad fix in https://github.com/apache/datafusion-comet/pull/1482/files but if we decide to revert, I can separate the PR. Let me know.

Re: [I] Expose to `GroupsAccumulator` whether all the groups are sorted [datafusion]

2025-03-15 Thread via GitHub
vlad-arista commented on issue #14991: URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2724763313 I think the issue here is not `LazyMemoryExec` but `UnnestExec` which doesn't propagate output ordering of the columns not involved in unnesting (judging by the implementati

Re: [PR] Renaming Internal Structs [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on PR #1059: URL: https://github.com/apache/datafusion-python/pull/1059#issuecomment-2724609734 Thank you for all the work on this.

Re: [I] [DISCUSS] Release DataFusion `46.0.1` Patch or `46.1.0` minor release (March 2025) [datafusion]

2025-03-15 Thread via GitHub
xudong963 commented on issue #15151: URL: https://github.com/apache/datafusion/issues/15151#issuecomment-2717222611 After all related issues are solved, should we go through the process of releasing again? Such as voting in dev list.

Re: [PR] Add all missing table options to be handled in any order [datafusion-sqlparser-rs]

2025-03-15 Thread via GitHub
tomershaniii commented on code in PR #1747: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1747#discussion_r1991268759 ## src/ast/dml.rs: ## @@ -138,6 +143,30 @@ pub struct CreateTable { pub engine: Option, pub comment: Option, pub auto_increment_off

Re: [PR] chore: Upgrade `rand` crate and some other minor crates [datafusion]

2025-03-15 Thread via GitHub
comphead commented on code in PR #14967: URL: https://github.com/apache/datafusion/pull/14967#discussion_r1996412978 ## datafusion/core/tests/parquet/filter_pushdown.rs: ## @@ -65,7 +65,12 @@ fn generate_file(tempdir: &TempDir, props: WriterProperties) -> TestParquetFile t

Re: [PR] Support `EXPLAIN ... FORMAT ...` [datafusion]

2025-03-15 Thread via GitHub
alamb merged PR #15166: URL: https://github.com/apache/datafusion/pull/15166

Re: [I] Support for generating JSON formatted substrait plan [datafusion-python]

2025-03-15 Thread via GitHub
swayam0322 commented on issue #508: URL: https://github.com/apache/datafusion-python/issues/508#issuecomment-2724995989 take

[PR] Improve feature flag CI coverage `datafusion` and `datafusion-functions` [datafusion]

2025-03-15 Thread via GitHub
alamb opened a new pull request, #15203: URL: https://github.com/apache/datafusion/pull/15203 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/15155 - Follow on to https://github.com/apache/datafusion/pull/15156 ## Rationale for

Re: [I] Implement tree explain for `CoalescePartitionsExec` [datafusion]

2025-03-15 Thread via GitHub
Standing-Man commented on issue #15195: URL: https://github.com/apache/datafusion/issues/15195#issuecomment-2723024655 > [@Standing-Man](https://github.com/Standing-Man) is this issue or what? this doesn't have any description for the above-mentioned issue You can check the details i

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on code in PR #1062: URL: https://github.com/apache/datafusion-python/pull/1062#discussion_r1996771284 ## examples/python-udf-comparisons.py: ## @@ -163,9 +163,9 @@ def udf_using_pyarrow_compute_impl( resultant_arr = pc.and_(resultant_arr, filtered_

Re: [PR] chore: [FOLLOWUP] Drop support for Spark 3.3 (EOL) [datafusion-comet]

2025-03-15 Thread via GitHub
kazuyukitanimura commented on code in PR #1534: URL: https://github.com/apache/datafusion-comet/pull/1534#discussion_r1997219275 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -438,7 +438,7 @@ class CometSparkSessionExtensions op

Re: [I] `FROM` first in `SELECT` statements [datafusion-sqlparser-rs]

2025-03-15 Thread via GitHub
iffyio closed issue #1400: `FROM` first in `SELECT` statements URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1400

Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-15 Thread via GitHub
alamb commented on code in PR #15201: URL: https://github.com/apache/datafusion/pull/15201#discussion_r1995496219 ## datafusion/sqllogictest/test_files/ddl.slt: ## @@ -855,3 +855,29 @@ DROP TABLE t1; statement ok DROP TABLE t2; + +statement count 0 +create table t(a int) as

Re: [I] Migrate datasource tests to `insta` [datafusion]

2025-03-15 Thread via GitHub
shruti2522 commented on issue #15246: URL: https://github.com/apache/datafusion/issues/15246#issuecomment-2726938676 take

Re: [I] Weekly Plan (Andrew Lamb) March 3, 2025 [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #14978: URL: https://github.com/apache/datafusion/issues/14978#issuecomment-2710258876 - Next week https://github.com/apache/datafusion/issues/15121

Re: [PR] Fix wildcard dataframe case [datafusion]

2025-03-15 Thread via GitHub
jayzhan211 commented on code in PR #15230: URL: https://github.com/apache/datafusion/pull/15230#discussion_r1996497940 ## datafusion/sql/src/select.rs: ## @@ -741,8 +722,17 @@ impl SqlToRel<'_, S> { } /// Wrap a plan in a projection -fn project(&self, input: Logi

Re: [PR] Simpler to see expressions in explain `tree` mode [datafusion]

2025-03-15 Thread via GitHub
irenjj commented on code in PR #15163: URL: https://github.com/apache/datafusion/pull/15163#discussion_r1996599515 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -704,29 +704,26 @@ physical_plan 01)┌───┐ 02)│ ProjectionExec │

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub
jayzhan211 commented on code in PR #15149: URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996515968 ## datafusion/expr/src/type_coercion/functions.rs: ## @@ -364,98 +366,73 @@ fn get_valid_types( return Ok(vec![vec![]]); } -let

Re: [I] Consider only runnning datafusion-cli tests for linux (not mac) [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #15226: URL: https://github.com/apache/datafusion/issues/15226#issuecomment-2726364814 wow a few hours from file to finish 🚀 thanks @xudong963 and @zhuqi-lucas

Re: [PR] Minor: exclude datafusion-cli testing for mac [datafusion]

2025-03-15 Thread via GitHub
alamb commented on PR #15240: URL: https://github.com/apache/datafusion/pull/15240#issuecomment-2726364940 FYI @comphead

Re: [I] Browser-accessible official DataFusion playground / DataFusion fiddle [datafusion]

2025-03-15 Thread via GitHub
XiangpengHao commented on issue #13818: URL: https://github.com/apache/datafusion/issues/13818#issuecomment-2725518898 > Thanks a lot I will look into the implementation of [parquet-viewer](https://parquet-viewer.xiangpeng.systems/) about how the caching is done. This is the in-memor

Re: [I] Add pytest-asyncio unit tests [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer closed issue #991: Add pytest-asyncio unit tests URL: https://github.com/apache/datafusion-python/issues/991 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubs

Re: [PR] Add decorator for udwf [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer merged PR #1061: URL: https://github.com/apache/datafusion-python/pull/1061 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...

Re: [I] Add decorator for udwf [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer closed issue #1057: Add decorator for udwf URL: https://github.com/apache/datafusion-python/issues/1057 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

Re: [I] Support for generating JSON formatted substrait plan [datafusion-python]

2025-03-15 Thread via GitHub
kosiew commented on issue #508: URL: https://github.com/apache/datafusion-python/issues/508#issuecomment-2726292346 hmm... Strange that github actions did not trigger: https://github.com/user-attachments/assets/b9525797-1a43-4bb9-932c-e46b4bb13386 -- This is an automated

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub
joroKr21 commented on code in PR #15149: URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996626461 ## datafusion/expr-common/src/signature.rs: ## @@ -865,6 +867,39 @@ impl Signature { volatility, } } + +/// Specialized Signature f

Re: [PR] Update version to 46.0.1, add CHANGELOG (#15243) [datafusion]

2025-03-15 Thread via GitHub
alamb commented on PR #15244: URL: https://github.com/apache/datafusion/pull/15244#issuecomment-2726365739 🤔 something seems to be wrong here -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] Update version to 46.0.1, add CHANGELOG [datafusion]

2025-03-15 Thread via GitHub
alamb commented on PR #15243: URL: https://github.com/apache/datafusion/pull/15243#issuecomment-2726365405 Thank you @xudong963 ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

Re: [PR] Simpler to see expressions in explain `tree` mode [datafusion]

2025-03-15 Thread via GitHub
irenjj commented on code in PR #15163: URL: https://github.com/apache/datafusion/pull/15163#discussion_r1996678781 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -704,29 +704,26 @@ physical_plan 01)┌───┐ 02)│ ProjectionExec │

Re: [I] Support for generating JSON formatted substrait plan [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on issue #508: URL: https://github.com/apache/datafusion-python/issues/508#issuecomment-2726462662 I'm not sure why the job was skipped: https://github.com/apache/datafusion-python/actions/runs/13870548324/job/38816616372 I assigned it though -- This is an auto

Re: [I] Avoid casting columns when comparing ints and strings [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #15035: URL: https://github.com/apache/datafusion/issues/15035#issuecomment-2726478863 > month_id is integer and "2024" is utf8. > > In `type_coercion`, we cast month_id to utf8 based on the coercion rule. `CAST(foo.month_id AS Utf8) = Utf8("2024")` > >

[I] Migrate datasource tests to `insta` [datafusion]

2025-03-15 Thread via GitHub
blaginin opened a new issue, #15246: URL: https://github.com/apache/datafusion/issues/15246 In https://github.com/apache/datafusion/issues/15178, we're switching hard-coded constants in tests to `insta`. This issue targets updating **datasource tests** (`datafusion/core/src/datasour

Re: [I] Enable `used_underscore_binding` clippy lint [datafusion]

2025-03-15 Thread via GitHub
alamb closed issue #14649: Enable `used_underscore_binding` clippy lint URL: https://github.com/apache/datafusion/issues/14649 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

Re: [PR] Implement tree explain for CoalescePartitionsExec [datafusion]

2025-03-15 Thread via GitHub
alamb merged PR #15225: URL: https://github.com/apache/datafusion/pull/15225 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [I] [Epic] Add snapshot tests (migrate to `insta` for tests) [datafusion]

2025-03-15 Thread via GitHub
blaginin commented on issue #15178: URL: https://github.com/apache/datafusion/issues/15178#issuecomment-2726480816 Those places may be good to check https://github.com/user-attachments/assets/1dfd4de9-131f-4244-8f08-8647fe3c2d13 -- This is an automated message from the Apache

Re: [PR] Simpler to see expressions in explain `tree` mode [datafusion]

2025-03-15 Thread via GitHub
alamb merged PR #15163: URL: https://github.com/apache/datafusion/pull/15163 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi
