Re: [PR] Support binary temporal arithmetic with integers [datafusion]

2024-12-12 Thread via GitHub
milevin commented on code in PR #13741: URL: https://github.com/apache/datafusion/pull/13741#discussion_r1881598760 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -1607,6 +1629,18 @@ mod tests { }}; } +/// Test coercion rules for assymetric bina

Re: [PR] Support binary temporal arithmetic with integers [datafusion]

2024-12-12 Thread via GitHub
milevin commented on code in PR #13741: URL: https://github.com/apache/datafusion/pull/13741#discussion_r1881596978 ## datafusion/expr/src/expr_schema.rs: ## @@ -453,6 +455,26 @@ impl ExprSchemable for Expr { } _ => Ok(Expr::Cast(Cast::new(Box::

Re: [PR] Support binary temporal arithmetic with integers [datafusion]

2024-12-12 Thread via GitHub
milevin commented on code in PR #13741: URL: https://github.com/apache/datafusion/pull/13741#discussion_r1881600889 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -1607,6 +1629,18 @@ mod tests { }}; } +/// Test coercion rules for assymetric bina

Re: [PR] Support binary temporal arithmetic with integers [datafusion]

2024-12-12 Thread via GitHub
milevin commented on code in PR #13741: URL: https://github.com/apache/datafusion/pull/13741#discussion_r1881601707 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -1869,6 +1905,51 @@ mod tests { Operator::Multiply, DataType::Float64

Re: [PR] Support binary temporal arithmetic with integers [datafusion]

2024-12-12 Thread via GitHub
findepi commented on code in PR #13741: URL: https://github.com/apache/datafusion/pull/13741#discussion_r1881583324 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -1449,6 +1455,22 @@ fn null_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option { } }

Re: [PR] Support binary temporal arithmetic with integers [datafusion]

2024-12-12 Thread via GitHub
milevin commented on code in PR #13741: URL: https://github.com/apache/datafusion/pull/13741#discussion_r1881606531 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -1449,6 +1455,22 @@ fn null_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option { } }

[PR] Support binary temporal arithmetic with integers [datafusion]

2024-12-12 Thread via GitHub
milevin opened a new pull request, #13741: URL: https://github.com/apache/datafusion/pull/13741 ## Which issue does this PR close? Closes #12342 ## Rationale for this change ``` > ./target/debug/datafusion-cli -c "select 4 + to_date('1970-01-03');" DataF

[PR] Update bzip2 requirement from 0.4.3 to 0.5.0 [datafusion]

2024-12-12 Thread via GitHub
dependabot[bot] opened a new pull request, #13740: URL: https://github.com/apache/datafusion/pull/13740 Updates the requirements on [bzip2](https://github.com/trifectatechfoundation/bzip2-rs) to permit the latest version. Release notes Sourced from https://github.com/trifectatechf

Re: [I] [BUG] Error when adding Date32 and Int64 [datafusion]

2024-12-12 Thread via GitHub
milevin commented on issue #12342: URL: https://github.com/apache/datafusion/issues/12342#issuecomment-2538336030 I put up a PR, but it's not perfect as I describe in its summary. Would love to get feedback and suggestions from the experts! -- This is an automated message from the Apache

Re: [PR] Document SQL dialect guidance [datafusion]

2024-12-12 Thread via GitHub
findepi commented on code in PR #13706: URL: https://github.com/apache/datafusion/pull/13706#discussion_r1881730569 ## docs/source/user-guide/sql/dialect.md: ## @@ -0,0 +1,53 @@ + + +# SQL Dialect + +The included SQL supported in Apache DataFusion mostly follows the [PostgreSQL

Re: [I] [DISCUSSION] More SqlLogicTest test coverage for queries, including join queries [datafusion]

2024-12-12 Thread via GitHub
findepi commented on issue #13470: URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2538400316 thank you @Omega359 for your work on this! > The result is that everything runs however a LOT of queries are skipped for DF for various reasons. that's totally unde

Re: [PR] [WIP] Create memory table with target partitions [datafusion]

2024-12-12 Thread via GitHub
demetribu closed pull request #13719: [WIP] Create memory table with target partitions URL: https://github.com/apache/datafusion/pull/13719

[I] Allow to filter null in `array_agg` [datafusion]

2024-12-12 Thread via GitHub
rluvaton opened a new issue, #13742: URL: https://github.com/apache/datafusion/issues/13742 ### Is your feature request related to a problem or challenge? Yes, I want nulls to be filtered from `array_agg` when I specify `with_ignore_nulls: true` in `AggregateExprBuilder` to have behav

Re: [I] Create memory table with target partitions [datafusion]

2024-12-12 Thread via GitHub
demetribu commented on issue #12905: URL: https://github.com/apache/datafusion/issues/12905#issuecomment-2538490093 Unassigning, no longer working on this.

Re: [PR] Support binary temporal arithmetic with integers [datafusion]

2024-12-12 Thread via GitHub
findepi commented on code in PR #13741: URL: https://github.com/apache/datafusion/pull/13741#discussion_r1881988070 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -186,6 +186,12 @@ fn signature(lhs: &DataType, op: &Operator, rhs: &DataType) -> Result

Re: [PR] Implement predicate pruning for `like` expressions (prefix matching) [datafusion]

2024-12-12 Thread via GitHub
alamb commented on code in PR #12978: URL: https://github.com/apache/datafusion/pull/12978#discussion_r1882170862 ## datafusion/core/src/physical_optimizer/pruning.rs: ## @@ -3443,6 +3583,425 @@ mod tests { ); } +#[test] +fn test_increment_utf8() { +

Re: [PR] Add configurable normalization for configuration options and preserve case for S3 paths [datafusion]

2024-12-12 Thread via GitHub
findepi commented on PR #13576: URL: https://github.com/apache/datafusion/pull/13576#issuecomment-2539027475 I like this example, thank you @berkaysynnada. If an option accepts a file name (eg to write data to), would we want to normalize it? Maybe we just shouldn't.

[I] Allow overriding SQL path base for benchmarks [datafusion]

2024-12-12 Thread via GitHub
m-mueller678 opened a new issue, #13744: URL: https://github.com/apache/datafusion/issues/13744 ### Is your feature request related to a problem or challenge? I am trying to run datafusion benchmarks on the hermit unikernel. Hermit does not support changing the current working directo

Re: [I] avro_to_arrow: Support in memory apache_avro Value's [datafusion]

2024-12-12 Thread via GitHub
mdroogh commented on issue #7690: URL: https://github.com/apache/datafusion/issues/7690#issuecomment-2539205259 So I had a look, and I think @alamb is spot on here: > It might also make sense to look into the upstream arrow crate The arrow crate does not even use the apache-avro

Re: [PR] `TypeSignatureClass` for mixed type function signature [datafusion]

2024-12-12 Thread via GitHub
goldmedal commented on code in PR #13372: URL: https://github.com/apache/datafusion/pull/13372#discussion_r1882517905 ## datafusion/expr-common/src/signature.rs: ## @@ -138,6 +141,48 @@ pub enum TypeSignature { NullAry, } +impl TypeSignature { +#[inline] +pub fn

Re: [PR] Add related source code locations to errors [datafusion]

2024-12-12 Thread via GitHub
alamb commented on code in PR #13664: URL: https://github.com/apache/datafusion/pull/13664#discussion_r1882461644 ## datafusion/common/src/column.rs: ## @@ -254,6 +305,23 @@ impl Column { .collect(), }) } + +/// Attaches a [`Span`] to the [`Col

Re: [I] TPCH DataGen Not working [datafusion-comet]

2024-12-12 Thread via GitHub
rajatma1993 commented on issue #1157: URL: https://github.com/apache/datafusion-comet/issues/1157#issuecomment-2539494866 Hi, I tried the options provided above, but the issue is still the same. I am using JDK 17 for this; could this be the reason? Is JDK 17 compatible with this Data

Re: [I] Evaluate vectorized hash table for group aggregation [datafusion]

2024-12-12 Thread via GitHub
alamb commented on issue #7095: URL: https://github.com/apache/datafusion/issues/7095#issuecomment-2539506517 > I don't think there is a 'trivial' way to outperform HashBrown. I suspect that any performance improvements they achieved are due to factors other than having a better hashing mec

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-12-12 Thread via GitHub
alamb commented on PR #13590: URL: https://github.com/apache/datafusion/pull/13590#issuecomment-2539511044 Thank you for the effort @leoyvens -- I am sorry we were not able to find an improvement to work on

Re: [I] [DISCUSSION] More SqlLogicTest test coverage for queries, including join queries [datafusion]

2024-12-12 Thread via GitHub
crepererum commented on issue #13470: URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2539537716 Did anyone check the license of the sqlite test suite? Because if https://www.sqlite.org/sqllogictest/dir?ci=tip doesn't specify a license and https://www.sqlite.org/copyrigh

Re: [I] SQL/PGQ or even GQL support [datafusion]

2024-12-12 Thread via GitHub
georgiy-belyanin commented on issue #13545: URL: https://github.com/apache/datafusion/issues/13545#issuecomment-2538784756 As mentioned before, SQL/PGQ could be expressed with relational algebra operations plus recursion. Though, as it's mentioned in the issue on recursive CTEs (#462)

Re: [PR] Add configurable normalization for configuration options and preserve case for S3 paths [datafusion]

2024-12-12 Thread via GitHub
berkaysynnada commented on PR #13576: URL: https://github.com/apache/datafusion/pull/13576#issuecomment-2538929543 > If the option has hard-coded normalization (eg lowercasing), it means there is no difference between certain values (eg those differing in case). Why would a user be concerne

Re: [PR] Minor: make unsupported `nanosecond` part a real (not internal) error [datafusion]

2024-12-12 Thread via GitHub
alamb commented on PR #13733: URL: https://github.com/apache/datafusion/pull/13733#issuecomment-2539036584 > > ... unsupported nanosecond part > > nanosecond part or nanosecond unit? I double checked @findepi and indeed this is the nanosecond part, so I improved the message in

Re: [PR] Implement predicate pruning for `like` expressions (prefix matching) [datafusion]

2024-12-12 Thread via GitHub
findepi commented on code in PR #12978: URL: https://github.com/apache/datafusion/pull/12978#discussion_r1882188269 ## datafusion/core/src/physical_optimizer/pruning.rs: ## @@ -3443,6 +3583,425 @@ mod tests { ); } +#[test] +fn test_increment_utf8() { +

Re: [PR] Consolidate `MapAccess`, and `Subscript` into `CompoundExpr` to handle the complex field access chain [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
goldmedal commented on code in PR #1551: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1551#discussion_r1882455970 ## src/parser/mod.rs: ## @@ -1427,6 +1426,112 @@ impl<'a> Parser<'a> { } } +/// Try to parse an [Expr::CompoundExpr] like `a.b.c`

Re: [PR] Add related source code locations to errors [datafusion]

2024-12-12 Thread via GitHub
alamb commented on PR #13664: URL: https://github.com/apache/datafusion/pull/13664#issuecomment-2539384317 Sorry for my radio silence here -- I am reviewing this now. Thank you for your patience

Re: [PR] Implement predicate pruning for `like` expressions (prefix matching) [datafusion]

2024-12-12 Thread via GitHub
findepi commented on code in PR #12978: URL: https://github.com/apache/datafusion/pull/12978#discussion_r1882454335 ## datafusion/core/src/physical_optimizer/pruning.rs: ## @@ -1610,6 +1629,127 @@ fn build_statistics_expr( Ok(statistics_expr) } +/// Convert `column LIKE

Re: [I] [DISCUSSION] More SqlLogicTest test coverage for queries, including join queries [datafusion]

2024-12-12 Thread via GitHub
Omega359 commented on issue #13470: URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2539395881 > > The result is that everything runs however a LOT of queries are skipped for DF for various reasons. > > that's totally understandable! what about having them with `qu

Re: [PR] Add arrow cast [datafusion-python]

2024-12-12 Thread via GitHub
timsaucer commented on PR #962: URL: https://github.com/apache/datafusion-python/pull/962#issuecomment-2538772977 So doing a little testing to see if this is necessary: ``` from datafusion import SessionContext, col, lit import pyarrow as pa import datetime ctx = SessionCo

Re: [PR] Minor: Add doc example to RecordBatchStreamAdapter [datafusion]

2024-12-12 Thread via GitHub
alamb commented on PR #13725: URL: https://github.com/apache/datafusion/pull/13725#issuecomment-2539047633 🚀

Re: [PR] `TypeSignatureClass` for mixed type function signature [datafusion]

2024-12-12 Thread via GitHub
jayzhan211 commented on PR #13372: URL: https://github.com/apache/datafusion/pull/13372#issuecomment-2539065729 > Thank you @jayzhan211 -- I know this has been a long process. Sorry for the slower pace but I think as DataFusion attempts to be more stable, more attention is warranted for pot

Re: [PR] Minor: Add doc example to RecordBatchStreamAdapter [datafusion]

2024-12-12 Thread via GitHub
alamb merged PR #13725: URL: https://github.com/apache/datafusion/pull/13725

Re: [PR] Update version to 0.53.0 and add release notes [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on code in PR #1592: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1592#discussion_r1882224477 ## changelog/0.53.0.md: ## @@ -0,0 +1,95 @@ + + +# sqlparser-rs 0.53.0 Changelog + +This release consists of 47 commits from 16 contributors. See credits

Re: [PR] Update version to 0.53.0 and add release notes [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb merged PR #1592: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1592

Re: [PR] `TypeSignatureClass` for mixed type function signature [datafusion]

2024-12-12 Thread via GitHub
jayzhan211 commented on code in PR #13372: URL: https://github.com/apache/datafusion/pull/13372#discussion_r1882223539 ## datafusion/expr-common/src/signature.rs: ## @@ -138,6 +141,48 @@ pub enum TypeSignature { NullAry, } +impl TypeSignature { +#[inline] +pub fn

Re: [PR] `TypeSignatureClass` for mixed type function signature [datafusion]

2024-12-12 Thread via GitHub
jayzhan211 commented on code in PR #13372: URL: https://github.com/apache/datafusion/pull/13372#discussion_r1882200657 ## datafusion/sqllogictest/test_files/expr.slt: ## @@ -1100,24 +1100,23 @@ SELECT date_part('microsecond', timestamp '2020-09-08T12:00:12.12345678+00:00') que

Re: [I] [DISCUSSION] More SqlLogicTest test coverage for queries, including join queries [datafusion]

2024-12-12 Thread via GitHub
Omega359 commented on issue #13470: URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2539425397 After thinking about it a bit I think I could find a way to switch the results by using the column type count in `query II` to determine how many columns should exist and then

Re: [I] SQL/PGQ or even GQL support [datafusion]

2024-12-12 Thread via GitHub
gsvgit commented on issue #13545: URL: https://github.com/apache/datafusion/issues/13545#issuecomment-2538959258 > Implementing one of the graph querying languages using this feature may require performance improvements to make consequent JOINs execute in decent time. [On the Optimiz

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-12-12 Thread via GitHub
leoyvens commented on PR #13590: URL: https://github.com/apache/datafusion/pull/13590#issuecomment-2538963790 > Use rounding in the SLT engine for all floats. Don't use rounding for decimal. In that case the potential valuable changes would be: 1. Somehow drop the bigdecimal depend

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-12-12 Thread via GitHub
leoyvens closed pull request #13590: More rigorous treatment of floats in tests URL: https://github.com/apache/datafusion/pull/13590

Re: [PR] Add tests for date_part on columns + timestamps with / without timezones [datafusion]

2024-12-12 Thread via GitHub
jayzhan211 commented on PR #13732: URL: https://github.com/apache/datafusion/pull/13732#issuecomment-2538830582 I tested with #13372 and all tests here pass

Re: [I] Allow to filter null in `array_agg` [datafusion]

2024-12-12 Thread via GitHub
findepi commented on issue #13742: URL: https://github.com/apache/datafusion/issues/13742#issuecomment-2538750719 AFAICT, the `with_ignore_nulls: true` is currently... ignored? Given it's part of the "logical plan", it should be obeyed. It's probably part of LP because the LP is half-way

Re: [PR] Update to apache-avro 0.17, fix compatibility changes schema handling [datafusion]

2024-12-12 Thread via GitHub
alamb merged PR #13727: URL: https://github.com/apache/datafusion/pull/13727

Re: [PR] Update to apache-avro 0.17, fix compatibility changes schema handling [datafusion]

2024-12-12 Thread via GitHub
alamb commented on PR #13727: URL: https://github.com/apache/datafusion/pull/13727#issuecomment-2539047084 Thanks again @mdroogh

Re: [PR] Update apache-avro requirement from 0.16 to 0.17 [datafusion]

2024-12-12 Thread via GitHub
alamb closed pull request #13588: Update apache-avro requirement from 0.16 to 0.17 URL: https://github.com/apache/datafusion/pull/13588

Re: [PR] Implement predicate pruning for `like` expressions (prefix matching) [datafusion]

2024-12-12 Thread via GitHub
adriangb commented on code in PR #12978: URL: https://github.com/apache/datafusion/pull/12978#discussion_r1882193146 ## datafusion/core/src/physical_optimizer/pruning.rs: ## @@ -3443,6 +3583,425 @@ mod tests { ); } +#[test] +fn test_increment_utf8() { +

Re: [PR] Update apache-avro requirement from 0.16 to 0.17 [datafusion]

2024-12-12 Thread via GitHub
dependabot[bot] commented on PR #13588: URL: https://github.com/apache/datafusion/pull/13588#issuecomment-2539047080 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version

Re: [PR] Add configurable normalization for configuration options and preserve case for S3 paths [datafusion]

2024-12-12 Thread via GitHub
ozankabak commented on PR #13576: URL: https://github.com/apache/datafusion/pull/13576#issuecomment-2539052939 The config option is basically an escape hatch for users in case they find themselves in a situation where the normalization we enforce is inapplicable in their (possibly edge) cas

Re: [I] [DISCUSSION] More SqlLogicTest test coverage for queries, including join queries [datafusion]

2024-12-12 Thread via GitHub
alamb commented on issue #13470: URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2539538377 > I'm going to push all of this to a df branch in my github account. I would really really like some assistance with evaluating the files and submitting tickets for any issues tha

Re: [I] [DISCUSSION] More SqlLogicTest test coverage for queries, including join queries [datafusion]

2024-12-12 Thread via GitHub
alamb commented on issue #13470: URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2539538771 I will look into the copyright issue

Re: [I] [DISCUSSION] More SqlLogicTest test coverage for queries, including join queries [datafusion]

2024-12-12 Thread via GitHub
alamb commented on issue #13470: URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2539560959 https://github.com/hydromatic/sql-logic-test > Yes, though that wasn't where I grabbed them from. I think I grabbed the files from https://github.com/hydromatic/sql-logic-te

Re: [I] [DISCUSSION] More SqlLogicTest test coverage for queries, including join queries [datafusion]

2024-12-12 Thread via GitHub
alamb commented on issue #13470: URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2539563921 > Even reading it during tests over the internet might be a bit of a gray zone. At least according to my understanding of US copyright law, reading public files from the in

Re: [PR] Add configurable normalization for configuration options and preserve case for S3 paths [datafusion]

2024-12-12 Thread via GitHub
blaginin commented on PR #13576: URL: https://github.com/apache/datafusion/pull/13576#issuecomment-2539236344 Feels like the right move is to add a warning to that option, see if we get any requests or issues, and if not, just remove it completely. I also don’t think this option shoul

[PR] chore: Move remaining expressions to spark-expr crate + some minor refactoring [datafusion-comet]

2024-12-12 Thread via GitHub
andygrove opened a new pull request, #1165: URL: https://github.com/apache/datafusion-comet/pull/1165 ## Which issue does this PR close? Closes https://github.com/apache/datafusion-comet/issues/659 ## Rationale for this change This PR moves the final expre

Re: [I] Release sqlparser-rs version `0.53.0` / sqlparser_derive `0.3.0` [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on issue #1517: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1517#issuecomment-2539119427 BTW I have verified this release works with DataFusion, see PR: - https://github.com/apache/datafusion/pull/13546

[PR] Add Apache license header to spans.rs [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb opened a new pull request, #1594: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1594 - part of https://github.com/apache/datafusion-sqlparser-rs/issues/1517 While trying to make a release candidate, the license check failed ``` Running rat license checker on

Re: [PR] Add Apache license header to spans.rs [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on PR #1594: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1594#issuecomment-2539110199 I plan to cherry-pick this commit to a release branch so it doesn't block the release

Re: [PR] Reduce token cloning [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on code in PR #1587: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1587#discussion_r1882362868 ## src/parser/mod.rs: ## @@ -3516,37 +3578,50 @@ impl<'a> Parser<'a> { Ok(keyword) } else { let keywords: Vec = keywor

Re: [PR] Implement `Spanned` to retrieve source locations on AST nodes [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on PR #1435: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1435#issuecomment-2539280176 BTW here is a PR from @davisp that recovers all the performance lost adding tokens (and then some) ❤️ - https://github.com/apache/datafusion-sqlparser-rs/pull/1587

Re: [PR] Add configurable normalization for configuration options and preserve case for S3 paths [datafusion]

2024-12-12 Thread via GitHub
berkaysynnada commented on PR #13576: URL: https://github.com/apache/datafusion/pull/13576#issuecomment-2538843000 > I feel we should deprecate and remove the global config, in favor of per-value normalization. @berkaysynnada What's the use-case to have per-value normalization _**and**_ a g

Re: [PR] `TypeSignatureClass` for mixed type function signature [datafusion]

2024-12-12 Thread via GitHub
alamb commented on code in PR #13372: URL: https://github.com/apache/datafusion/pull/13372#discussion_r1882165666 ## datafusion/sqllogictest/test_files/expr.slt: ## @@ -1100,24 +1100,23 @@ SELECT date_part('microsecond', timestamp '2020-09-08T12:00:12.12345678+00:00') query er

Re: [PR] POC to show performance improvements of not copying token [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on PR #1561: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1561#issuecomment-2539288785 > > Would you be willing to start making some smaller PRs with parts of your findings? > > I've opened #1587 for the token cloning work and #1588 for investigating w

Re: [PR] Find keywords using perfect hashing [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on code in PR #1590: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1590#discussion_r1882396934 ## build.rs: ## @@ -0,0 +1,101 @@ +use std::env; +use std::fs::File; +use std::io::{BufWriter, Write}; +use std::path::Path; + +fn read_keywords() -> Vec

Re: [PR] Slightly faster keyword lookups [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on code in PR #1591: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1591#discussion_r1882401509 ## src/keywords.rs: ## @@ -973,3 +973,61 @@ pub const RESERVED_FOR_IDENTIFIER: &[Keyword] = &[ Keyword::STRUCT, Keyword::TRIM, ]; + +pub const

Re: [PR] Find keywords using perfect hashing [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on PR #1590: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1590#issuecomment-2539333487 As much as I love not rewriting code, I am quite loath to take on a new dependency in this crate. My concerns are 1. The binary / dependency size 2. (mostly) that eve

Re: [I] SQL/PGQ or even GQL support [datafusion]

2024-12-12 Thread via GitHub
gsvgit commented on issue #13545: URL: https://github.com/apache/datafusion/issues/13545#issuecomment-2538644017 Possible steps. * [Extend SQL parser with PGQ support](https://github.com/apache/datafusion-sqlparser-rs/issues/1572) Independent task. Technical. * Express PGQ in terms of

Re: [PR] Add tests for date_part on columns + timestamps with / without timezones [datafusion]

2024-12-12 Thread via GitHub
findepi commented on code in PR #13732: URL: https://github.com/apache/datafusion/pull/13732#discussion_r1881941848 ## datafusion/sqllogictest/test_files/expr/date_part.slt: ## @@ -876,3 +1070,5 @@ true query error DataFusion error: Internal error: unit Nanosecond not support

Re: [PR] Add configurable normalization for configuration options and preserve case for S3 paths [datafusion]

2024-12-12 Thread via GitHub
findepi commented on PR #13576: URL: https://github.com/apache/datafusion/pull/13576#issuecomment-2538856480 If the option has hard-coded normalization (eg lowercasing), it means there is no difference between certain values (eg those differing in case). Why would a user be concerned about

Re: [PR] Add configurable normalization for configuration options and preserve case for S3 paths [datafusion]

2024-12-12 Thread via GitHub
findepi commented on PR #13576: URL: https://github.com/apache/datafusion/pull/13576#issuecomment-2538734612 I feel we should deprecate and remove the global config, in favor of per-value normalization. @berkaysynnada What's the use-case to have per-value normalization _**and**_ a globa

[PR] Round floats but not decimals in SqlLogicTests [datafusion]

2024-12-12 Thread via GitHub
findepi opened a new pull request, #13743: URL: https://github.com/apache/datafusion/pull/13743 ## Which issue does this PR close? Closes #. ## Rationale for this change - stop rounding decimal values in SLT. It's the very nature of decimal arithmetics that i

Re: [PR] Round floats but not decimals in SqlLogicTests [datafusion]

2024-12-12 Thread via GitHub
findepi commented on PR #13743: URL: https://github.com/apache/datafusion/pull/13743#issuecomment-2539139118 cc @gliga

Re: [PR] Implement predicate pruning for `like` expressions (prefix matching) [datafusion]

2024-12-12 Thread via GitHub
findepi commented on code in PR #12978: URL: https://github.com/apache/datafusion/pull/12978#discussion_r1881970319 ## datafusion/core/src/physical_optimizer/pruning.rs: ## @@ -3443,6 +3583,425 @@ mod tests { ); } +#[test] +fn test_increment_utf8() { +

Re: [PR] Implement predicate pruning for `like` expressions (prefix matching) [datafusion]

2024-12-12 Thread via GitHub
findepi commented on code in PR #12978: URL: https://github.com/apache/datafusion/pull/12978#discussion_r1881968871 ## datafusion/core/src/physical_optimizer/pruning.rs: ## @@ -1610,6 +1629,127 @@ fn build_statistics_expr( Ok(statistics_expr) } +/// Convert `column LIKE

[I] Limit together with pushdown_filters [datafusion]

2024-12-12 Thread via GitHub
bchalk101 opened a new issue, #13745: URL: https://github.com/apache/datafusion/issues/13745 ### Describe the bug I am trying to load a parquet dataset, using both a limit and filter. When combining this with the `pushdown_filters` config, no data is found. If I either remove the

Re: [PR] Implement predicate pruning for `like` expressions (prefix matching) [datafusion]

2024-12-12 Thread via GitHub
etseidl commented on code in PR #12978: URL: https://github.com/apache/datafusion/pull/12978#discussion_r1882425151 ## datafusion/core/src/physical_optimizer/pruning.rs: ## @@ -3443,6 +3583,425 @@ mod tests { ); } +#[test] +fn test_increment_utf8() { +

Re: [PR] Implement predicate pruning for `like` expressions (prefix matching) [datafusion]

2024-12-12 Thread via GitHub
etseidl commented on code in PR #12978: URL: https://github.com/apache/datafusion/pull/12978#discussion_r1882429087 ## datafusion/core/src/physical_optimizer/pruning.rs: ## @@ -1610,6 +1629,127 @@ fn build_statistics_expr( Ok(statistics_expr) } +/// Convert `column LIKE

Re: [I] Expose a method for mutating Parser::index [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
niebayes commented on issue #1593: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1593#issuecomment-2539359094 @alamb Sure!

Re: [PR] Add sum statistics and PhysicalExpr::column_statistics [datafusion]

2024-12-12 Thread via GitHub
gatesn commented on PR #13736: URL: https://github.com/apache/datafusion/pull/13736#issuecomment-2539143271 Is there any combined script to run all the linting checks at once? I don't want to burn all your CI credits!

[I] Verification script fails on `cargo publish --dry-run` when there are changes to derive [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb opened a new issue, #1596: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1596 * part of https://github.com/apache/datafusion-sqlparser-rs/issues/1517 When you run `./dev/release/verify_release.sh 0.53.0 2` I got the following error: ``` + find -inam

[PR] Run cargo fmt in derive crate [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb opened a new pull request, #1595: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1595 - part of https://github.com/apache/datafusion-sqlparser-rs/issues/1517 While running the release verification script with rc1 the cargo fmt check failed: ``` + cargo fmt

Re: [I] sqlite analyze don't require a table keyword?? [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on issue #1583: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1583#issuecomment-2539179391 I am not sure what you are asking here. If a syntax you are looking for is not supported, we would welcome a PR to add support and tests.

Re: [PR] chore: Move string kernels and expressions to spark-expr crate [datafusion-comet]

2024-12-12 Thread via GitHub
andygrove merged PR #1164: URL: https://github.com/apache/datafusion-comet/pull/1164

Re: [I] SQL/PGQ support [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on issue #1572: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1572#issuecomment-2539180419 I don't know of any plans yet, but it seems like a reasonable thing to add to me

Re: [I] The syntax of mysql RENAME TABLE tb1 TO tb2 is not supported. [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on issue #1582: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1582#issuecomment-2539181882 Thanks for the report @charmfocus . We would welcome a PR with tests to improve support

Re: [I] Expose a method for mutating Parser::index [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on issue #1593: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1593#issuecomment-2539194226 > We propose adding and exposing a new method in sqlparser's Parser, such as set_index or index_mut, to allow users to mutate the index. I think this makes sense

Re: [I] MySQL's `CREATE TABLE ... SELECT` syntax [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb closed issue #1509: MySQL's `CREATE TABLE ... SELECT` syntax URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1509

Re: [I] MySQL's `CREATE TABLE ... SELECT` syntax [datafusion-sqlparser-rs]

2024-12-12 Thread via GitHub
alamb commented on issue #1509: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1509#issuecomment-2539196412 I believe this is fixed in https://github.com/apache/datafusion-sqlparser-rs/pull/1515

Re: [I] Evaluate vectorized hash table for group aggregation [datafusion]

2024-12-12 Thread via GitHub
Dandandan commented on issue #7095: URL: https://github.com/apache/datafusion/issues/7095#issuecomment-2539778985 > > I don't think there is a 'trivial' way to outperform HashBrown. I suspect that any performance improvements they achieved are due to factors other than having a better hashi

Re: [I] [DISCUSSION] More SqlLogicTest test coverage for queries, including join queries [datafusion]

2024-12-12 Thread via GitHub
Omega359 commented on issue #13470: URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2539863013 > Did anyone check the license of the sqlite test suite? Because if https://www.sqlite.org/sqllogictest/dir?ci=tip doesn't specify a license and https://www.sqlite.org/copyrigh

Re: [PR] Optimize performance of `initcap` function (~2x faster) [datafusion]

2024-12-12 Thread via GitHub
alamb commented on code in PR #13691: URL: https://github.com/apache/datafusion/pull/13691#discussion_r1882985670 ## datafusion/functions/src/string/initcap.rs: ## @@ -132,21 +132,22 @@ fn initcap_utf8view(args: &[ArrayRef]) -> Result { Ok(Arc::new(result) as ArrayRef) }

Re: [PR] Optimize performance of `initcap` function (~2x faster) [datafusion]

2024-12-12 Thread via GitHub
alamb commented on PR #13691: URL: https://github.com/apache/datafusion/pull/13691#issuecomment-2540154941 Thanks again @tlm365 @jayzhan211 @Dandandan @Weijun-H and @Dandandan (quite a distinguished list of contributors!)

Re: [PR] chore: Add ignored tests for reading complex types from Parquet [datafusion-comet]

2024-12-12 Thread via GitHub
viirya commented on code in PR #1167: URL: https://github.com/apache/datafusion-comet/pull/1167#discussion_r1882986634 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -2195,6 +2195,133 @@ class CometExpressionSuite extends CometTestBase with AdaptiveS

[PR] Minor: Add documentation explaining that initcap only works for ASCII [datafusion]

2024-12-12 Thread via GitHub
alamb opened a new pull request, #13749: URL: https://github.com/apache/datafusion/pull/13749 ## Which issue does this PR close? - Related to https://github.com/apache/datafusion/pull/13691 ## Rationale for this change @tlm365 says: https://github.com/apache/datafusion/
