Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-30 Thread via GitHub
wiedld commented on code in PR #14356: URL: https://github.com/apache/datafusion/pull/14356#discussion_r1936790068 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2645,6 +2643,106 @@ pub struct Union { pub schema: DFSchemaRef, } +impl Union { +/// Constructs new

Re: [PR] Support within group syntax for existing aggregate functions [datafusion]

2025-01-30 Thread via GitHub
Garamda commented on code in PR #13511: URL: https://github.com/apache/datafusion/pull/13511#discussion_r1929539094 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -77,36 +77,38 @@ SELECT approx_distinct(c9) count_c9, approx_distinct(cast(c9 as varchar)) count_ #

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-30 Thread via GitHub
wiedld commented on code in PR #14356: URL: https://github.com/apache/datafusion/pull/14356#discussion_r1936767438 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2645,6 +2643,106 @@ pub struct Union { pub schema: DFSchemaRef, } +impl Union { +/// Constructs new

Re: [I] [DISCUSSION] Add separate crate to cover spark builtin functions [datafusion]

2025-01-30 Thread via GitHub
shehabgamin commented on issue #5600: URL: https://github.com/apache/datafusion/issues/5600#issuecomment-2626525138 > > I will say that we have encountered numerous problems relying on downstream DataFusion-based crates,...The issue isn't with the crates themselves but arises when it's time

Re: [PR] Add RETURNS TABLE() support for CREATE FUNCTION in Postgresql [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
remysaissy commented on PR #1687: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1687#issuecomment-2626465386 Yes, I’ll do it today. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [I] Querying Parquet file specifically with a predicate returns invalid data error but works in other situations [datafusion]

2025-01-30 Thread via GitHub
senyosimpson commented on issue #14281: URL: https://github.com/apache/datafusion/issues/14281#issuecomment-2626439477 > Since `0x02` is not supported in arrow-rs What's interesting here is that function seems to be a helper function. `0x02` is handled here when reading a field for ex

Re: [PR] Fix Type Coercion for UDF Arguments [datafusion]

2025-01-30 Thread via GitHub
alamb commented on code in PR #14268: URL: https://github.com/apache/datafusion/pull/14268#discussion_r1936187986 ## datafusion/sqllogictest/test_files/expr.slt: ## @@ -571,8 +601,10 @@ select repeat('-1.2', arrow_cast(3, 'Int32')); -1.2-1.2-1.2 -query error DataFusion

Re: [I] Improve `repeat` so it errors if the second argument is a non integer [datafusion]

2025-01-30 Thread via GitHub
jonathanc-n commented on issue #14376: URL: https://github.com/apache/datafusion/issues/14376#issuecomment-2625428761 take

Re: [PR] Fix Type Coercion for UDF Arguments [datafusion]

2025-01-30 Thread via GitHub
alamb commented on PR #14268: URL: https://github.com/apache/datafusion/pull/14268#issuecomment-2625430773 @jayzhan211 and @shehabgamin what is the status of this PR? It looks to me like there are some unresolved comments It looks like there are some unresolved comments like https:

Re: [I] Improve `repeat` so it errors if the second argument is a non integer [datafusion]

2025-01-30 Thread via GitHub
alamb commented on issue #14376: URL: https://github.com/apache/datafusion/issues/14376#issuecomment-2625439850 FYI @jonathanc-n it is not clear to me what should happen with https://github.com/apache/datafusion/pull/14268 (if it will be merged or not)

Re: [PR] ignore: Test impact of proposed decimal/float type coercion changes upstream [datafusion-comet]

2025-01-30 Thread via GitHub
andygrove closed pull request #1355: ignore: Test impact of proposed decimal/float type coercion changes upstream URL: https://github.com/apache/datafusion-comet/pull/1355

[I] Improve `repeat` so it errors if the second argument is a non integer [datafusion]

2025-01-30 Thread via GitHub
alamb opened a new issue, #14376: URL: https://github.com/apache/datafusion/issues/14376 ### Is your feature request related to a problem or challenge? Prior to https://github.com/apache/datafusion/pull/14268, the following query would error: ```sql select repeat('-1.2', 3.2

[PR] fix: disable ArrayRemove by default as described in the doc [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura opened a new pull request, #1356: URL: https://github.com/apache/datafusion-comet/pull/1356 ## Which issue does this PR close? Closes #. ## Rationale for this change The doc says ArrayRemove is experimental https://github.com/apache/datafusion-comet/blob

Re: [PR] add manual trigger for extended tests in pull requests [datafusion]

2025-01-30 Thread via GitHub
alamb commented on PR #14331: URL: https://github.com/apache/datafusion/pull/14331#issuecomment-2625442484 I looked more at what happened with this PR: - https://github.com/alamb/datafusion/pull/28 > I am going to wait to see what happens on success and then I will purposely introd

[PR] Alamb/string view coercion [datafusion]

2025-01-30 Thread via GitHub
alamb opened a new pull request, #14377: URL: https://github.com/apache/datafusion/pull/14377 ## Which issue does this PR close? - closes https://github.com/apache/datafusion/issues/13359 - closes https://github.com/apache/datafusion/pull/13366 - closes https://github.com/apac

Re: [I] String sqllogictest error when running the test with `complete` [datafusion]

2025-01-30 Thread via GitHub
alamb closed issue #12752: String sqllogictest error when running the test with `complete` URL: https://github.com/apache/datafusion/issues/12752

Re: [PR] Test for string / numeric coercion [datafusion]

2025-01-30 Thread via GitHub
alamb closed pull request #13606: Test for string / numeric coercion URL: https://github.com/apache/datafusion/pull/13606

Re: [I] String sqllogictest error when running the test with `complete` [datafusion]

2025-01-30 Thread via GitHub
alamb commented on issue #12752: URL: https://github.com/apache/datafusion/issues/12752#issuecomment-2625515646 @logan-keede fixed this upstream last week: https://github.com/risinglightdb/sqllogictest-rs/pull/249

Re: [PR] Test for string / numeric coercion [datafusion]

2025-01-30 Thread via GitHub
alamb commented on PR #13606: URL: https://github.com/apache/datafusion/pull/13606#issuecomment-2625516761 - Superseded by https://github.com/apache/datafusion/pull/14377

Re: [PR] Support `Utf8View` to `numeric` coercion [datafusion]

2025-01-30 Thread via GitHub
Omega359 commented on PR #14377: URL: https://github.com/apache/datafusion/pull/14377#issuecomment-2625565209 Looking good, thx for putting this together

Re: [I] get_wider_type in binary.rs does not support utf8view [datafusion]

2025-01-30 Thread via GitHub
alamb commented on issue #13360: URL: https://github.com/apache/datafusion/issues/13360#issuecomment-2625588251 It turns out the only user of `get_wider_type` as noted by @jayzhan211 and @Omega359 in https://github.com/apache/datafusion/pull/13370#discussion_r1843956046 and in fact `get_

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-30 Thread via GitHub
findepi commented on code in PR #14356: URL: https://github.com/apache/datafusion/pull/14356#discussion_r1936295124 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2645,6 +2643,106 @@ pub struct Union { pub schema: DFSchemaRef, } +impl Union { +/// Constructs new

Re: [PR] Make numeric literal underscore test dialect agnostic [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
alamb merged PR #1685: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1685

Re: [PR] bug: Fix NULL handling in array_slice, introduce `NullHandling` enum to `Signature` [datafusion]

2025-01-30 Thread via GitHub
alamb commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2625816267 I marked this PR as an API change and updated the title to reflect it. I suggest we wait until we cut the release to merge - #14008 I hope to make a RC in the next few days

Re: [PR] fix: expressions doc for ArrayRemove [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura commented on PR #1356: URL: https://github.com/apache/datafusion-comet/pull/1356#issuecomment-2625822827 Thank you @andygrove

Re: [PR] [WIP] Introduce the "parser" feature to gate the SQL text processing and leaving only AST and other support types [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
alamb commented on PR #1691: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1691#issuecomment-2625826803 > I tried, but there are non-trivial dependencies on the ast:: and dynamic dispatch to the dialect trait. I think it's reasonable to share the ast:: types instead of abstra

Re: [PR] [WIP] Introduce the "parser" feature to gate the SQL text processing and leaving only AST and other support types [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
felipecrv commented on PR #1691: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1691#issuecomment-2625815117 > > This crate can become a very lightweight dependency to datafusion (and other projects) that have their own SQL parser but need to use datafusion-sqlparser-rs AST ty

Re: [PR] fix: expressions doc for ArrayRemove [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura merged PR #1356: URL: https://github.com/apache/datafusion-comet/pull/1356

Re: [PR] [WIP] Introduce the "parser" feature to gate the SQL text processing and leaving only AST and other support types [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
alamb commented on PR #1691: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1691#issuecomment-2625799920 > This crate can become a very lightweight dependency to datafusion (and other projects) that have their own SQL parser but need to use datafusion-sqlparser-rs AST types to

Re: [PR] [WIP] Introduce the "parser" feature to gate the SQL text processing and leaving only AST and other support types [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
felipecrv commented on PR #1691: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1691#issuecomment-2625844582 > > I tried, but there are non-trivial dependencies on the ast:: and dynamic dispatch to the dialect trait. I think it's reasonable to share the ast:: types instead of

Re: [PR] Fix Type Coercion for UDF Arguments [datafusion]

2025-01-30 Thread via GitHub
jayzhan211 commented on PR #14268: URL: https://github.com/apache/datafusion/pull/14268#issuecomment-2626205413 > > > @jayzhan211 and @shehabgamin what is the status of this PR? It looks to me like there are some unresolved comments > > > It looks like there are some unresolved comments

Re: [PR] Support `array_concat` for `Utf8View` [datafusion]

2025-01-30 Thread via GitHub
alamb commented on code in PR #14378: URL: https://github.com/apache/datafusion/pull/14378#discussion_r1936297339 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -869,54 +867,6 @@ fn get_wider_decimal_type( } } -/// Returns the wider type among arguments `lh

[PR] Support `array_concat` for `Utf8View` [datafusion]

2025-01-30 Thread via GitHub
alamb opened a new pull request, #14378: URL: https://github.com/apache/datafusion/pull/14378 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/13360 ## Rationale for this change As part of completing https://github.com/apache/dat

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-30 Thread via GitHub
findepi commented on code in PR #14356: URL: https://github.com/apache/datafusion/pull/14356#discussion_r1936297998 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2645,6 +2643,106 @@ pub struct Union { pub schema: DFSchemaRef, } +impl Union { +/// Constructs new

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-30 Thread via GitHub
findepi commented on code in PR #14356: URL: https://github.com/apache/datafusion/pull/14356#discussion_r1936298445 ## datafusion/sqllogictest/test_files/union.slt: ## @@ -836,3 +836,18 @@ physical_plan # Clean up after the test statement ok drop table aggregate_test_100; + +

Re: [PR] fix: Support `Utf8View` in `numeric_string_coercion` [datafusion]

2025-01-30 Thread via GitHub
alamb commented on PR #13366: URL: https://github.com/apache/datafusion/pull/13366#issuecomment-2625517576 I have added some additional tests to this code and made a new PR here - https://github.com/apache/datafusion/pull/14377

Re: [PR] fix: expressions doc for ArrayRemove [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura commented on PR #1356: URL: https://github.com/apache/datafusion-comet/pull/1356#issuecomment-2625755924 Thank you @andygrove, changed this PR for updating doc only

Re: [I] Prepared physical plan reusage [datafusion]

2025-01-30 Thread via GitHub
alamb commented on issue #14342: URL: https://github.com/apache/datafusion/issues/14342#issuecomment-2625759843 > So, what do you think about an idea to extract metrics to the some place like TaskContext and associate them with physical plan nodes some way? I think this will be challe

Re: [PR] Fix `CREATE FUNCTION` round trip for Hive dialect [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
alamb merged PR #1693: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1693

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-30 Thread via GitHub
chenkovsky commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2625954231 > So you're worried that if a Parquet file has this special metadata on a field we will wrongly interpret it as a system column? Or are you saying that's a good thing / the goal?

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-30 Thread via GitHub
chenkovsky commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2625942249 > I don't think (2) or (3) are real concerns. Things manipulating metadata should already be careful about not clobbering existing metadata and I really, really don't think the pe

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-30 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2625952339 So you're worried that if a Parquet file has this special metadata on a field we will wrongly interpret it as a system column? Or are you saying that's a good thing / the goal?

Re: [PR] bug: Fix NULL handling in array_slice, introduce `NullHandling` enum to `Signature` [datafusion]

2025-01-30 Thread via GitHub
jkosh44 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1936605304 ## datafusion/physical-expr/src/scalar_function.rs: ## @@ -186,6 +187,15 @@ impl PhysicalExpr for ScalarFunctionExpr { .map(|e| e.evaluate(batch))

Re: [PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-01-30 Thread via GitHub
codecov-commenter commented on PR #1359: URL: https://github.com/apache/datafusion-comet/pull/1359#issuecomment-2626183618 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1359?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] [DISCUSSION] Add separate crate to cover spark builtin functions [datafusion]

2025-01-30 Thread via GitHub
kazuyukitanimura commented on issue #5600: URL: https://github.com/apache/datafusion/issues/5600#issuecomment-2626192955 Echoing @andygrove 's point. Also if we move the spark-expr to DataFusion core, release management might get harder. E.g. we may want to fix spark-expr bugs quickly, but

Re: [PR] Fully support LIKE/NLIKE with Utf8View [datafusion]

2025-01-30 Thread via GitHub
alamb commented on code in PR #14379: URL: https://github.com/apache/datafusion/pull/14379#discussion_r1936332144 ## datafusion/sql/src/expr/mod.rs: ## @@ -819,10 +820,6 @@ impl SqlToRel<'_, S> { return not_impl_err!("ANY in LIKE expression"); } le

Re: [I] binary_to_string_coercion in binary.rs does not support utf8view [datafusion]

2025-01-30 Thread via GitHub
alamb commented on issue #13361: URL: https://github.com/apache/datafusion/issues/13361#issuecomment-2625656602 I see now that @jonathanc-n proposes some additions in - https://github.com/apache/datafusion/pull/13370#discussion_r1936327292 However I am not sure how important those

[I] binary_to_string_coercion in binary.rs does not support utf8view [datafusion]

2025-01-30 Thread via GitHub
Omega359 opened a new issue, #13361: URL: https://github.com/apache/datafusion/issues/13361 ### Describe the bug The coercion rules defined in binary_to_string_coercion do not account for utf8view ### To Reproduce _No response_ ### Expected behavior binary

Re: [PR] feat: Support `Utf8View` for `get_wider_type` + `binary_to_string_coercion` functions [datafusion]

2025-01-30 Thread via GitHub
alamb commented on code in PR #13370: URL: https://github.com/apache/datafusion/pull/13370#discussion_r1936327292 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -1172,16 +1174,22 @@ fn binary_to_string_coercion( match (lhs_type, rhs_type) { (Binary, U

Re: [PR] Feat: support array_compact function [datafusion-comet]

2025-01-30 Thread via GitHub
codecov-commenter commented on PR #1321: URL: https://github.com/apache/datafusion-comet/pull/1321#issuecomment-2625927936 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1321?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Provide user-defined invariants for logical node extensions. [datafusion]

2025-01-30 Thread via GitHub
wiedld commented on code in PR #14329: URL: https://github.com/apache/datafusion/pull/14329#discussion_r1936635870 ## datafusion/expr/src/logical_plan/extension.rs: ## @@ -54,6 +57,22 @@ pub trait UserDefinedLogicalNode: fmt::Debug + Send + Sync { /// Return the output sche

Re: [PR] minor: update fuzz dependency [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura commented on PR #1357: URL: https://github.com/apache/datafusion-comet/pull/1357#issuecomment-2626242018 Thanks @andygrove

Re: [PR] minor: update fuzz dependency [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura merged PR #1357: URL: https://github.com/apache/datafusion-comet/pull/1357

Re: [PR] bug: Fix NULL handling in array_slice, introduce `NullHandling` enum to `Signature` [datafusion]

2025-01-30 Thread via GitHub
jayzhan211 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1936645403 ## datafusion/physical-expr/src/scalar_function.rs: ## @@ -186,6 +187,15 @@ impl PhysicalExpr for ScalarFunctionExpr { .map(|e| e.evaluate(batch))

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-30 Thread via GitHub
ozankabak commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1936681167 ## datafusion/sqllogictest/test_files/window.slt: ## @@ -5452,3 +5452,89 @@ order by c1, c2, rank1, rank2; statement ok drop table t1; + + +# Set-Monotonic W

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-30 Thread via GitHub
ozankabak merged PR #14271: URL: https://github.com/apache/datafusion/pull/14271

Re: [PR] Fix `CREATE FUNCTION` round trip for Hive dialect [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
iffyio commented on PR #1693: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1693#issuecomment-2626368069 Thanks @alamb!

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-30 Thread via GitHub
alamb commented on code in PR #14356: URL: https://github.com/apache/datafusion/pull/14356#discussion_r1936301257 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2645,6 +2643,106 @@ pub struct Union { pub schema: DFSchemaRef, } +impl Union { +/// Constructs new U

Re: [I] Type Coercion fails for List with inner type struct which has large/view types [datafusion]

2025-01-30 Thread via GitHub
alamb commented on issue #14154: URL: https://github.com/apache/datafusion/issues/14154#issuecomment-2626099386 I tried to make a datafusion only reproducer but it turns out I can't create LargeLists of structs via SQL 🤔 I'll have to think about how to do so a bit more tomorrow...

Re: [PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-01-30 Thread via GitHub
parthchandra commented on code in PR #1359: URL: https://github.com/apache/datafusion-comet/pull/1359#discussion_r1936547960 ## native/core/src/parquet/parquet_support.rs: ## @@ -1861,6 +1864,40 @@ fn trim_end(s: &str) -> &str { } } +#[cfg(not(feature = "hdfs"))] +pub(cr

Re: [I] [DISCUSSION]: Unified approach for joins to output batches close to `batch_size` [datafusion]

2025-01-30 Thread via GitHub
comphead commented on issue #14238: URL: https://github.com/apache/datafusion/issues/14238#issuecomment-2626102317 Thanks @korowa for running the benchmarks. 🚀 Speaking to `Coalescer` It is probably worth thinking about whether we want to have a unified approach, like calling the coalescer on t

Re: [PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-01-30 Thread via GitHub
comphead commented on code in PR #1359: URL: https://github.com/apache/datafusion-comet/pull/1359#discussion_r1936551696 ## native/core/src/execution/planner.rs: ## @@ -1220,7 +1217,7 @@ impl PhysicalPlanner { // TODO: I think we can remove partition_count in th

Re: [I] [DISCUSSION] Add separate crate to cover spark builtin functions [datafusion]

2025-01-30 Thread via GitHub
shehabgamin commented on issue #5600: URL: https://github.com/apache/datafusion/issues/5600#issuecomment-2626095699 > [@shehabgamin](https://github.com/shehabgamin) if we did that, would you be willing to help implement / upstream some of your implementations and tests? Yes! @a

Re: [PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-01-30 Thread via GitHub
comphead commented on code in PR #1359: URL: https://github.com/apache/datafusion-comet/pull/1359#discussion_r1936553571 ## native/core/src/parquet/parquet_support.rs: ## @@ -1861,6 +1864,40 @@ fn trim_end(s: &str) -> &str { } } +#[cfg(not(feature = "hdfs"))] +pub(crate)

Re: [PR] Improve speed of `median` by implementing special `GroupsAccumulator` [datafusion]

2025-01-30 Thread via GitHub
Rachelint commented on PR #13681: URL: https://github.com/apache/datafusion/pull/13681#issuecomment-2626112366 > Will plan to merge this one tomorrow if there is not anyone else who would like time to review Thanks all for reviewing!

Re: [PR] Improve speed of `median` by implementing special `GroupsAccumulator` [datafusion]

2025-01-30 Thread via GitHub
Rachelint merged PR #13681: URL: https://github.com/apache/datafusion/pull/13681

Re: [I] Improve performance of `median` function [datafusion]

2025-01-30 Thread via GitHub
Rachelint closed issue #13550: Improve performance of `median` function URL: https://github.com/apache/datafusion/issues/13550

Re: [PR] minor: update fuzz dependency [datafusion-comet]

2025-01-30 Thread via GitHub
codecov-commenter commented on PR #1357: URL: https://github.com/apache/datafusion-comet/pull/1357#issuecomment-2626116353 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1357?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Fix incorrect searched CASE optimization [datafusion]

2025-01-30 Thread via GitHub
findepi commented on code in PR #14349: URL: https://github.com/apache/datafusion/pull/14349#discussion_r1936291719 ## datafusion/sqllogictest/test_files/case.slt: ## @@ -289,12 +289,22 @@ query B select case when a=1 then false end from foo; false -false -false -false -

Re: [I] Invalid query result when searched CASE has nullable condition and boolean result [datafusion]

2025-01-30 Thread via GitHub
findepi closed issue #14343: Invalid query result when searched CASE has nullable condition and boolean result URL: https://github.com/apache/datafusion/issues/14343

Re: [PR] Fix incorrect searched CASE optimization [datafusion]

2025-01-30 Thread via GitHub
findepi merged PR #14349: URL: https://github.com/apache/datafusion/pull/14349

Re: [I] User Defined Coercion Rules [datafusion]

2025-01-30 Thread via GitHub
alamb commented on issue #14296: URL: https://github.com/apache/datafusion/issues/14296#issuecomment-2625773366 > In particular, the coercion rules should be applied by the analyzer, during the logical plan construction. I agree with this goal 100%

Re: [I] [DISCUSSION] Add separate crate to cover spark builtin functions [datafusion]

2025-01-30 Thread via GitHub
alamb commented on issue #5600: URL: https://github.com/apache/datafusion/issues/5600#issuecomment-2625779458 > I love the idea of collaborating on Spark compatible `UDF`s. > > As of writing, `243/402` Spark functions doc-tests pass on Sail. We haven't focused on performance yet and i

[PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-01-30 Thread via GitHub
comphead opened a new pull request, #1359: URL: https://github.com/apache/datafusion-comet/pull/1359 ## Which issue does this PR close? Closes #1337. ## Rationale for this change ## What changes are included in this PR? ## How are these chan

Re: [PR] [WIP] Introduce the "parser" feature to gate the SQL text processing and leaving only AST and other support types [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
alamb commented on PR #1691: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1691#issuecomment-2626048512 > But then sqlparser would depend on datafusion and not just the other way around. I meant literally copy/pasting (or some variant thereof) of the relevant s

Re: [PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-01-30 Thread via GitHub
comphead commented on code in PR #1359: URL: https://github.com/apache/datafusion-comet/pull/1359#discussion_r1936518217 ## native/core/Cargo.toml: ## @@ -77,6 +77,7 @@ datafusion-comet-proto = { workspace = true } object_store = { workspace = true } url = { workspace = true }

Re: [PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-01-30 Thread via GitHub
comphead commented on code in PR #1359: URL: https://github.com/apache/datafusion-comet/pull/1359#discussion_r1936518708 ## native/core/src/execution/planner.rs: ## @@ -1220,7 +1217,7 @@ impl PhysicalPlanner { // TODO: I think we can remove partition_count in th

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-30 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-262595 See also https://github.com/apache/datafusion/issues/12736

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-30 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2625961004 > > So you're worried that if a Parquet file has this special metadata on a field we will wrongly interpret it as a system column? Or are you saying that's a good thing / the goal?

Re: [PR] bug: Fix NULL handling in array_slice, introduce `NullHandling` enum to `Signature` [datafusion]

2025-01-30 Thread via GitHub
jkosh44 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2625975316 There's still the open question of how window and aggregate functions should treat `NullBehavior::Propagate`. Table functions don't use the `Signature` struct, so we can ignore them

Re: [I] [DISCUSSION] Add separate crate to cover spark builtin functions [datafusion]

2025-01-30 Thread via GitHub
andygrove commented on issue #5600: URL: https://github.com/apache/datafusion/issues/5600#issuecomment-2626014384 I almost started a conversation about this but held back. Moving this crate upstream has a lot of value, and I support doing so. However, assuming that most DataFusion con

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-30 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2626122404 One thing I didn't understand from your PR @chenkovsky: you got it to work only modifying TableScan? I had trouble understanding when I was grokking your PR. If you have a better way

Re: [PR] Add support for data type specific methods [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
github-actions[bot] closed pull request #1535: Add support for data type specific methods URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1535

Re: [PR] POC: Eliminate unnecessary group by keys (q35 in clickbench 1.35x faster) [datafusion]

2025-01-30 Thread via GitHub
github-actions[bot] commented on PR #13617: URL: https://github.com/apache/datafusion/pull/13617#issuecomment-2626123488 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-30 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2626145613 For now I pushed https://github.com/apache/datafusion/pull/14362/commits/c2b58ee1632b7090188f4f7d5af6c78fbf40462e which has a test that I think covers all of the cases discussed.

[I] Parse MySQL `SET GLOBAL` variables [datafusion-sqlparser-rs]

2025-01-30 Thread via GitHub
mvzink opened a new issue, #1694: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1694 In addition to `SESSION` and `LOCAL` qualifiers, MySQL allows `GLOBAL` for modifying system variables: ``` mysql> SET GLOBAL max_connections = 1000; Query OK, 0 rows affected (0.

[PR] minor: update fuzz dependency [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura opened a new pull request, #1357: URL: https://github.com/apache/datafusion-comet/pull/1357 ## Which issue does this PR close? Closes #. ## Rationale for this change fuzz comet dependency version was hardcoded ## What changes are included in this P

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-30 Thread via GitHub
chenkovsky commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2625969618 > > > So you're worried that if a Parquet file has this special metadata on a field we will wrongly interpret it as a system column? Or are you saying that's a good thing / the go

Re: [PR] minor: commit compatibility doc [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura commented on PR #1358: URL: https://github.com/apache/datafusion-comet/pull/1358#issuecomment-2626041757 cc @andygrove

[PR] minor: commit compatibility doc [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura opened a new pull request, #1358: URL: https://github.com/apache/datafusion-comet/pull/1358 Leftover from #1349

Re: [PR] minor: update fuzz dependency [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura commented on PR #1357: URL: https://github.com/apache/datafusion-comet/pull/1357#issuecomment-2626041966 cc @andygrove

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-30 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2625987325 I see. That makes sense for something like row_id: once you write it to a new file it may not be valid anymore. But for other metadata you might want to round trip it as a system co

Re: [PR] minor: commit compatibility doc [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura commented on PR #1358: URL: https://github.com/apache/datafusion-comet/pull/1358#issuecomment-2626045460 Thanks @andygrove

Re: [PR] Fix Type Coercion for UDF Arguments [datafusion]

2025-01-30 Thread via GitHub
jayzhan211 commented on PR #14268: URL: https://github.com/apache/datafusion/pull/14268#issuecomment-2626045546 > @jayzhan211 and @shehabgamin what is the status of this PR? It looks to me like there are some unresolved comments > > It looks like there are some unresolved comments l

Re: [PR] bug: Fix NULL handling in array_slice, introduce `NullHandling` enum to `Signature` [datafusion]

2025-01-30 Thread via GitHub
jayzhan211 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2626019855 Implementing it as a method in the trait and calling it inside the 'invoke' methods requires more changes. However, I think it makes more sense now given null handling should be the logic

Re: [PR] feat: add experimental remote HDFS support for native DataFusion reader [datafusion-comet]

2025-01-30 Thread via GitHub
comphead commented on code in PR #1359: URL: https://github.com/apache/datafusion-comet/pull/1359#discussion_r1936519276 ## native/core/src/parquet/parquet_support.rs: ## @@ -1861,6 +1864,40 @@ fn trim_end(s: &str) -> &str { } } +#[cfg(not(feature = "hdfs"))] +pub(crate)

Re: [PR] minor: commit compatibility doc [datafusion-comet]

2025-01-30 Thread via GitHub
kazuyukitanimura merged PR #1358: URL: https://github.com/apache/datafusion-comet/pull/1358
