[PR] Use LogicalType for TypeSignature `Coercible` and `String` [datafusion]

2024-11-03 Thread via GitHub
jayzhan211 opened a new pull request, #13240: URL: https://github.com/apache/datafusion/pull/13240 ## Which issue does this PR close? Closes #. ## Rationale for this change An attempt to use logical type for TypeSignature ## What changes are include

Re: [I] Implement nested join optimization [datafusion]

2024-11-03 Thread via GitHub
maruschin commented on issue #3843: URL: https://github.com/apache/datafusion/issues/3843#issuecomment-2453801494 Hi, is there any progress? I can take the task for initial development. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[PR] Introduce `full_qualified_col` option for the unparser dialect [datafusion]

2024-11-03 Thread via GitHub
goldmedal opened a new pull request, #13241: URL: https://github.com/apache/datafusion/pull/13241 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested?

Re: [PR] Deprecate `LexOrderingRef` and `LexRequirementRef` [datafusion]

2024-11-03 Thread via GitHub
alamb commented on code in PR #13233: URL: https://github.com/apache/datafusion/pull/13233#discussion_r1826957395 ## datafusion/physical-expr-common/src/sort_expr.rs: ## @@ -352,20 +346,26 @@ pub struct LexOrdering { pub inner: Vec, } +impl AsRef for LexOrdering { +f

Re: [I] [Python] Support pathlib.Path arguments for ExecutionContext.register_* methods [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #491: URL: https://github.com/apache/datafusion/issues/491#issuecomment-2453457353 Obsolete, now that `datafusion-python` is an external git repo.

Re: [I] [Ballista] Improve task and job metadata [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #472: URL: https://github.com/apache/datafusion/issues/472#issuecomment-2453458575 This issue should be closed now that Ballista no longer resides in this git repo.

Re: [I] [substrait] handle AggregateRel grouping_expressions field [datafusion]

2024-11-03 Thread via GitHub
alamb closed issue #12957: [substrait] handle AggregateRel grouping_expressions field URL: https://github.com/apache/datafusion/issues/12957

Re: [PR] feat(substrait): AggregateRel grouping_expressions support [datafusion]

2024-11-03 Thread via GitHub
alamb merged PR #13173: URL: https://github.com/apache/datafusion/pull/13173

Re: [PR] Apply projection to `Statistics` in `FilterExec` [datafusion]

2024-11-03 Thread via GitHub
alamb commented on PR #13187: URL: https://github.com/apache/datafusion/pull/13187#issuecomment-2453403770 > Thanks for the fix. LGTM! > > I wonder if we could add some debug_assert or something to catch bugs of this sort in some general way. This is an interesting idea @eejbyfeld

Re: [PR] Apply projection to `Statistics` in `FilterExec` [datafusion]

2024-11-03 Thread via GitHub
alamb commented on PR #13187: URL: https://github.com/apache/datafusion/pull/13187#issuecomment-2453403860 Thank you for the reviews @Dandandan and @eejbyfeldt

Re: [PR] Hashmap type alias [datafusion]

2024-11-03 Thread via GitHub
Dandandan commented on code in PR #13236: URL: https://github.com/apache/datafusion/pull/13236#discussion_r1827033898 ## datafusion/sql/src/unparser/rewrite.rs: ## @@ -15,15 +15,12 @@ // specific language governing permissions and limitations // under the License. -use std::

Re: [I] [Python] Support pathlib.Path arguments for ExecutionContext.register_* methods [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #491: URL: https://github.com/apache/datafusion/issues/491#issuecomment-2453514155 Closing this. Thanks @drauschenbach.

Re: [I] Review the contract between DataFusion and Arrow [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #196: URL: https://github.com/apache/datafusion/issues/196#issuecomment-2453514109 Closing this. Thanks @drauschenbach.

Re: [I] : Tracking issue for big endian platforms [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #111: : Tracking issue for big endian platforms URL: https://github.com/apache/datafusion/issues/111

Re: [I] : Tracking issue for big endian platforms [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #111: URL: https://github.com/apache/datafusion/issues/111#issuecomment-2453513953 Not relevant to this repo.

Re: [I] Using string datatype from python raises "Exception: The type 13 is not valid" [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #693: Using string datatype from python raises "Exception: The type 13 is not valid" URL: https://github.com/apache/datafusion/issues/693

Re: [I] Add support of HDFS as remote object store [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #1060: Add support of HDFS as remote object store URL: https://github.com/apache/datafusion/issues/1060

Re: [I] Add support of HDFS as remote object store [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #1060: URL: https://github.com/apache/datafusion/issues/1060#issuecomment-2453514413 Closing this. Thanks @drauschenbach.

Re: [I] [Ballista] Improve task and job metadata [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #472: [Ballista] Improve task and job metadata URL: https://github.com/apache/datafusion/issues/472

Re: [I] [Ballista] Improve task and job metadata [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #472: URL: https://github.com/apache/datafusion/issues/472#issuecomment-2453514186 Closing this. Thanks @drauschenbach.

Re: [I] Review the contract between DataFusion and Arrow [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #196: Review the contract between DataFusion and Arrow URL: https://github.com/apache/datafusion/issues/196

Re: [I] Evaluate pyo3 abi3 wheel limitations for datafusion python binding [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #955: URL: https://github.com/apache/datafusion/issues/955#issuecomment-2453514339 Closing this. Thanks @drauschenbach.

Re: [I] [Python] Support pathlib.Path arguments for ExecutionContext.register_* methods [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #491: [Python] Support pathlib.Path arguments for ExecutionContext.register_* methods URL: https://github.com/apache/datafusion/issues/491

Re: [PR] Simplify `EXPR LIKE 'constant'` to `expr = 'constant'` [datafusion]

2024-11-03 Thread via GitHub
adriangb commented on PR #13061: URL: https://github.com/apache/datafusion/pull/13061#issuecomment-2453610034 @alamb @findepi I redid this PR now to only handle the equality cases and close the issue. I do still think it'd be nice to figure out how to rewrite to startswith but it see

Re: [PR] Introduce a HashMap type alias [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on code in PR #13236: URL: https://github.com/apache/datafusion/pull/13236#discussion_r1827090665 ## datafusion/sql/src/unparser/rewrite.rs: ## @@ -15,15 +15,12 @@ // specific language governing permissions and limitations // under the License. -use s

Re: [PR] Handle type coercion in signature for `ApproxPercentileCont` [datafusion]

2024-11-03 Thread via GitHub
github-actions[bot] commented on PR #12274: URL: https://github.com/apache/datafusion/pull/12274#issuecomment-2453702633 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] Fail on optimization cycles [datafusion]

2024-11-03 Thread via GitHub
github-actions[bot] commented on PR #11288: URL: https://github.com/apache/datafusion/pull/11288#issuecomment-2453702705 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] feat: support normalized expr in CSE [datafusion]

2024-11-03 Thread via GitHub
zhuliquan closed pull request #13235: feat: support normalized expr in CSE URL: https://github.com/apache/datafusion/pull/13235

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-11-03 Thread via GitHub
berkaysynnada commented on code in PR #13130: URL: https://github.com/apache/datafusion/pull/13130#discussion_r1827285629 ## datafusion/functions/src/datetime/from_unixtime.rs: ## @@ -59,28 +63,63 @@ impl ScalarUDFImpl for FromUnixtimeFunc { &self.signature } +

Re: [PR] Fix license header [datafusion]

2024-11-03 Thread via GitHub
waynexia commented on PR #12008: URL: https://github.com/apache/datafusion/pull/12008#issuecomment-2453990662 Almost forgot this ticket... Thanks to that great reminder 🤣 Renew the context: - About the consistency between two different tools: - I've replaced all the usages of

Re: [I] [Python]: Expose Dataframe.schema() to Python binding [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #1007: URL: https://github.com/apache/datafusion/issues/1007#issuecomment-2453471001 This appears to be resolved. In `datafusion-python`, I can call `df.schema()`.

Re: [I] Creating dataframe with Recordbatch using pyarrow.Table.to_batches gives "type16 not valid error" when schema includes date32[day] type [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #949: URL: https://github.com/apache/datafusion/issues/949#issuecomment-2453473453 This issue appears to have been resolved. ```python import datafusion import pyarrow import datetime ctx = datafusion.SessionContext() batch =

Re: [I] Evaluate pyo3 abi3 wheel limitations for datafusion python binding [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #955: URL: https://github.com/apache/datafusion/issues/955#issuecomment-2453471535 This issue seems obsolete, now that `datafusion-python` has relocated to an external git repo.

Re: [PR] Minor: Improve documentation about `OnceAsync` [datafusion]

2024-11-03 Thread via GitHub
korowa commented on code in PR #13223: URL: https://github.com/apache/datafusion/pull/13223#discussion_r1827012220 ## datafusion/physical-plan/src/joins/hash_join.rs: ## @@ -314,6 +316,11 @@ pub struct HashJoinExec { /// if there is a projection, the schema isn't the same a

Re: [I] Refactor the hash_aggregate [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #839: URL: https://github.com/apache/datafusion/issues/839#issuecomment-2453475452 If I'm reading the code correctly, this specific reference to code no longer exists.

Re: [I] persistence for `ExecutionContextState`? [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #755: URL: https://github.com/apache/datafusion/issues/755#issuecomment-2453478499 FYI `ExecutionContextState` is called `SessionState` nowadays. I briefly added `#[derive(..., Deserialize)]` to it and a few other things, and this looks like a non-trivial cha

[PR] feat: support normalized expr in CSE [datafusion]

2024-11-03 Thread via GitHub
zhuliquan opened a new pull request, #13235: URL: https://github.com/apache/datafusion/pull/13235 ## Which issue does this PR close? ## Rationale for this change I notice that some expressions are semantically equivalent (e.g. `a + b` and `b + a` are equivalent). I think their

Re: [PR] Minor: Improve documentation about `OnceAsync` [datafusion]

2024-11-03 Thread via GitHub
korowa commented on code in PR #13223: URL: https://github.com/apache/datafusion/pull/13223#discussion_r1827020010 ## datafusion/physical-plan/src/joins/cross_join.rs: ## @@ -46,16 +46,23 @@ use datafusion_physical_expr::equivalence::join_equivalence_properties; use async_trai

Re: [PR] fix bugs explain with non-correlated query [datafusion]

2024-11-03 Thread via GitHub
Lordworms commented on PR #13210: URL: https://github.com/apache/datafusion/pull/13210#issuecomment-2453603916 > Thank you @Lordworms -- I played around with this and I think I found another way to do it: > > [Lordworms#2](https://github.com/Lordworms/arrow-datafusion/pull/2) >

[I] Overflow in Join Processing for Large Ranges [datafusion]

2024-11-03 Thread via GitHub
demetribu opened a new issue, #13237: URL: https://github.com/apache/datafusion/issues/13237 ### Describe the bug The error occurs when performing an inner join on two large `unnest(range(...))` datasets in DataFusion. ``` ./datafusion-cli/target/debug/datafusion-cli -m 512

Re: [PR] Deprecate invoke and invoke_no_args in favor of invoke_batch [datafusion]

2024-11-03 Thread via GitHub
findepi commented on PR #13174: URL: https://github.com/apache/datafusion/pull/13174#issuecomment-2453554138 > I think we could file a ticket (good first issue) for migrating the rest of the built in functions (removing the #allow_deprecated) https://github.com/apache/datafusion/issue

Re: [PR] Improve push down filter of join [datafusion]

2024-11-03 Thread via GitHub
JasonLi-cn commented on code in PR #13184: URL: https://github.com/apache/datafusion/pull/13184#discussion_r1827195991 ## datafusion/optimizer/src/push_down_filter.rs: ## @@ -429,41 +434,63 @@ fn push_down_all_join( let mut keep_predicates = vec![]; let mut join_condit

Re: [PR] Improve push down filter of join [datafusion]

2024-11-03 Thread via GitHub
JasonLi-cn commented on PR #13184: URL: https://github.com/apache/datafusion/pull/13184#issuecomment-2453835952 > For anyone following along, this PR appears to have had some correctness issues so @eejbyfeldt reverted it #13229 > > @JasonLi-cn are you willing to create a new PR that w

[PR] feat: basic support for executing prepared statements [datafusion]

2024-11-03 Thread via GitHub
jonahgao opened a new pull request, #13242: URL: https://github.com/apache/datafusion/pull/13242 ## Which issue does this PR close? Closes #4549. ## Rationale for this change Store the logical plan of the prepared statement in the session state. During `EXECUTE`, ret

Re: [PR] Introduce `full_qualified_col` option for the unparser dialect [datafusion]

2024-11-03 Thread via GitHub
goldmedal commented on PR #13241: URL: https://github.com/apache/datafusion/pull/13241#issuecomment-2453923228 cc @sgrebnov @phillipleblanc

Re: [PR] feat: basic support for executing prepared statements [datafusion]

2024-11-03 Thread via GitHub
jonahgao commented on code in PR #13242: URL: https://github.com/apache/datafusion/pull/13242#discussion_r1827248287 ## datafusion/core/tests/sql/select.rs: ## @@ -176,27 +175,6 @@ async fn prepared_statement_type_coercion() -> Result<()> { Ok(()) } -#[tokio::test] -asyn

Re: [PR] feat: basic support for executing prepared statements [datafusion]

2024-11-03 Thread via GitHub
jonahgao commented on code in PR #13242: URL: https://github.com/apache/datafusion/pull/13242#discussion_r1827252251 ## datafusion/core/tests/sql/select.rs: ## @@ -106,9 +105,9 @@ async fn test_prepare_statement() -> Result<()> { let ctx = create_ctx_with_partition(&tmp_dir

Re: [PR] Convert nth_value builtIn function to UDWF [datafusion]

2024-11-03 Thread via GitHub
jcsherin commented on PR #13201: URL: https://github.com/apache/datafusion/pull/13201#issuecomment-2453934188 > External error: query failed: DataFusion error: Arrow error: Invalid argument error: It is not possible to concatenate arrays of different data types. In the built-in (olde

Re: [PR] Add support for SHOW DATABASES/SCHEMAS/TABLES/VIEWS in Hive [datafusion-sqlparser-rs]

2024-11-03 Thread via GitHub
yoavcloud commented on PR #1487: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1487#issuecomment-2453521842 Thank you!

Re: [I] Move SortExec partition check to constructor [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #179: Move SortExec partition check to constructor URL: https://github.com/apache/datafusion/issues/179

Re: [I] Support pruning on string columns using starts_with [datafusion]

2024-11-03 Thread via GitHub
adriangb commented on issue #507: URL: https://github.com/apache/datafusion/issues/507#issuecomment-2453539740 Thanks for the idea. We are close to implementing this in #12978. It adds support for pushing down `like` but notably does _not_ add support for `starts_with` as that will be done

Re: [PR] feat(substrait): AggregateRel grouping_expressions support [datafusion]

2024-11-03 Thread via GitHub
alamb commented on PR #13173: URL: https://github.com/apache/datafusion/pull/13173#issuecomment-2453397754 I am going to merge this PR to keep the code flowing. If there are any additional issues found as part of @vbarua 's review we can address them as a follow on PR. Thanks again @

Re: [PR] Deprecate `LexOrderingRef` and `LexRequirementRef` [datafusion]

2024-11-03 Thread via GitHub
jatin510 commented on PR #13233: URL: https://github.com/apache/datafusion/pull/13233#issuecomment-2453433695 Thanks! Made the suggested changes. We can come up with a follow-up PR for more refinements! @alamb

Re: [I] Move filter_push_down::split_members to be reused outside of DataFusion [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #382: URL: https://github.com/apache/datafusion/issues/382#issuecomment-2453461040 This issue seems obsolete, given that no `split_members` function exists anymore, so IOx must have worked around that.

Re: [PR] Support timestamp(n) SQL type [datafusion]

2024-11-03 Thread via GitHub
caicancai commented on code in PR #13231: URL: https://github.com/apache/datafusion/pull/13231#discussion_r1826981147 ## datafusion/sqllogictest/test_files/timestamps.slt: ## @@ -402,6 +402,41 @@ SELECT COUNT(*) FROM ts_data_secs where ts > from_unixtime(1599566400) 2

Re: [PR] Support vectorized append and compare for multi group by [datafusion]

2024-11-03 Thread via GitHub
alamb commented on PR #12996: URL: https://github.com/apache/datafusion/pull/12996#issuecomment-2453401637 BTW I think this code is fairly well covered by the aggregate fuzz tester (also added by @Rachelint :)) Also, @LeslieKid is adding additional data type coverage which is great:

Re: [I] Potential performance regression for TPCH q18 [datafusion]

2024-11-03 Thread via GitHub
devanbenz commented on issue #13188: URL: https://github.com/apache/datafusion/issues/13188#issuecomment-2453401410 take

Re: [PR] fix bugs explain with non-correlated query [datafusion]

2024-11-03 Thread via GitHub
alamb commented on PR #13210: URL: https://github.com/apache/datafusion/pull/13210#issuecomment-2453409118 I merged up from main to try and get the CI passing (I won't force push this PR again!)

Re: [PR] fix bugs explain with non-correlated query [datafusion]

2024-11-03 Thread via GitHub
alamb commented on code in PR #13210: URL: https://github.com/apache/datafusion/pull/13210#discussion_r1826970119 ## datafusion/core/src/physical_planner.rs: ## @@ -1792,11 +1792,19 @@ impl DefaultPhysicalPlanner { Err(e) => return Err(e),

Re: [PR] Support vectorized append and compare for multi group by [datafusion]

2024-11-03 Thread via GitHub
alamb commented on PR #12996: URL: https://github.com/apache/datafusion/pull/12996#issuecomment-2453401222 > 🤔 I personally prefer the second one? What do you think about it @alamb ? I think this makes sense -- thank you

Re: [PR] Do not push down filter through distinct on [datafusion]

2024-11-03 Thread via GitHub
alamb commented on PR #12943: URL: https://github.com/apache/datafusion/pull/12943#issuecomment-2453400951 Marking as a draft as I think this PR is no longer waiting on feedback (and I am trying to work down the review queue). Please mark it as ready for review when it is ready for an

Re: [PR] Deprecate `LexOrderingRef` and `LexRequirementRef` [datafusion]

2024-11-03 Thread via GitHub
jatin510 commented on code in PR #13233: URL: https://github.com/apache/datafusion/pull/13233#discussion_r1826985958 ## datafusion/physical-plan/src/windows/bounded_window_agg_exec.rs: ## @@ -1553,7 +1552,7 @@ mod tests { Arc::new(BuiltInWindowExpr::new(

Re: [I] Add support of HDFS as remote object store [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #1060: URL: https://github.com/apache/datafusion/issues/1060#issuecomment-2453469825 Obsoleted by #1062 where consensus led to this feature landing in https://github.com/datafusion-contrib/hdfs-native-object-store.

Re: [PR] Convert nth_value builtIn function to UDWF [datafusion]

2024-11-03 Thread via GitHub
buraksenn commented on PR #13201: URL: https://github.com/apache/datafusion/pull/13201#issuecomment-2453567445 Wanted to update here. I think I'm almost finished but probably encountered a side effect. This query fails in slt file: ``` External error: query failed: DataFusion error: Ar

Re: [PR] Support 'NULL' as Null in csv parser. [datafusion]

2024-11-03 Thread via GitHub
dhegberg commented on code in PR #13228: URL: https://github.com/apache/datafusion/pull/13228#discussion_r1827063553 ## datafusion/core/Cargo.toml: ## @@ -127,6 +127,7 @@ parquet = { workspace = true, optional = true, default-features = true } paste = "1.0.15" pin-project-lit

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-11-03 Thread via GitHub
alamb commented on code in PR #13130: URL: https://github.com/apache/datafusion/pull/13130#discussion_r1826963449 ## datafusion/functions/src/datetime/from_unixtime.rs: ## @@ -93,12 +124,59 @@ fn get_from_unixtime_doc() -> &'static Documentation { Documentation::builder

Re: [PR] Implement predicate pruning for `like` expressions [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on PR #12978: URL: https://github.com/apache/datafusion/pull/12978#issuecomment-2453453900 This PR should possibly indicate that it Closes #507.

Re: [I] Using string datatype from python raises "Exception: The type 13 is not valid" [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #693: URL: https://github.com/apache/datafusion/issues/693#issuecomment-2453448921 This issue seems obsolete now. __To Verify__ ```python import datafusion import pyarrow f = datafusion.functions batch = pyarrow.RecordBatch.fr

Re: [PR] allow passing in metadata_size_hint on a per-file basis [datafusion]

2024-11-03 Thread via GitHub
adriangb commented on PR #13213: URL: https://github.com/apache/datafusion/pull/13213#issuecomment-2453594954 @alamb I added a test in b3890a8. It doesn't test below `create_reader` because that part was not touched and it starts getting complicated (tracking ObjectStore calls, etc.) so I t

Re: [I] switch to using AsyncBencher for datafusion benches? [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #563: URL: https://github.com/apache/datafusion/issues/563#issuecomment-2453451890 I suggest closing this issue because the questions were not populated, so I'm not clear what the intended benefit would be.

Re: [PR] Deprecate invoke and invoke_no_args in favor of invoke_batch [datafusion]

2024-11-03 Thread via GitHub
alamb commented on code in PR #13174: URL: https://github.com/apache/datafusion/pull/13174#discussion_r1826968605 ## datafusion/expr/src/udf.rs: ## @@ -489,19 +493,40 @@ pub trait ScalarUDFImpl: Debug + Send + Sync { /// Invoke the function with `args` and the number of r

Re: [I] Regression: "Internal error: Only intervals with the same data type are comparable, lhs:Float32, rhs:UInt64." [datafusion]

2024-11-03 Thread via GitHub
alamb closed issue #13186: Regression: "Internal error: Only intervals with the same data type are comparable, lhs:Float32, rhs:UInt64." URL: https://github.com/apache/datafusion/issues/13186

Re: [PR] fix bugs for explain supporting non-correlated subquery [datafusion]

2024-11-03 Thread via GitHub
alamb commented on PR #13116: URL: https://github.com/apache/datafusion/pull/13116#issuecomment-2453407410 New PR: https://github.com/apache/datafusion/pull/13210

Re: [PR] Apply projection to `Statistics` in `FilterExec` [datafusion]

2024-11-03 Thread via GitHub
alamb merged PR #13187: URL: https://github.com/apache/datafusion/pull/13187

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-11-03 Thread via GitHub
buraksenn commented on code in PR #13130: URL: https://github.com/apache/datafusion/pull/13130#discussion_r1826990025 ## datafusion/functions/src/datetime/from_unixtime.rs: ## @@ -93,12 +124,59 @@ fn get_from_unixtime_doc() -> &'static Documentation { Documentation::bui

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-11-03 Thread via GitHub
buraksenn commented on code in PR #13130: URL: https://github.com/apache/datafusion/pull/13130#discussion_r1826989925 ## datafusion/functions/src/datetime/from_unixtime.rs: ## @@ -93,12 +124,59 @@ fn get_from_unixtime_doc() -> &'static Documentation { Documentation::bui

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-11-03 Thread via GitHub
buraksenn commented on PR #13130: URL: https://github.com/apache/datafusion/pull/13130#issuecomment-2453439946 > Thanks @buraksenn -- I am marking this PR as draft as it is no longer waiting on review (and I am trying to clear the review queue) > > I did merge again from main to fix t

Re: [I] Review the contract between DataFusion and Arrow [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #196: URL: https://github.com/apache/datafusion/issues/196#issuecomment-2453440137 It's worth considering whether this issue is obsolete, now that the project depends on an external `arrow` crate.

[PR] Hashmap type alias [datafusion]

2024-11-03 Thread via GitHub
drauschenbach opened a new pull request, #13236: URL: https://github.com/apache/datafusion/pull/13236 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tes

Re: [PR] feat: Add `Time`/`Interval`/`Decimal` in aggregate fuzz testing [datafusion]

2024-11-03 Thread via GitHub
alamb commented on PR #13226: URL: https://github.com/apache/datafusion/pull/13226#issuecomment-2453391721 This PR looks great -- thank you @LeslieKid

Re: [PR] Support 'NULL' as Null in csv parser. [datafusion]

2024-11-03 Thread via GitHub
alamb commented on code in PR #13228: URL: https://github.com/apache/datafusion/pull/13228#discussion_r1826960596 ## datafusion/core/src/datasource/file_format/csv.rs: ## @@ -454,7 +455,8 @@ impl CsvFormat { .has_header

Re: [PR] Support timestamp(n) SQL type [datafusion]

2024-11-03 Thread via GitHub
alamb commented on code in PR #13231: URL: https://github.com/apache/datafusion/pull/13231#discussion_r1826960764 ## datafusion/sqllogictest/test_files/timestamps.slt: ## @@ -402,6 +402,41 @@ SELECT COUNT(*) FROM ts_data_secs where ts > from_unixtime(1599566400) 2 +que

Re: [PR] fix bugs explain with non-correlated query [datafusion]

2024-11-03 Thread via GitHub
alamb commented on code in PR #13210: URL: https://github.com/apache/datafusion/pull/13210#discussion_r1826973882 ## datafusion/core/src/physical_planner.rs: ## @@ -1797,11 +1797,19 @@ impl DefaultPhysicalPlanner { Err(e) => return Err(e),

Re: [PR] Add support for SHOW DATABASES/SCHEMAS/TABLES/VIEWS in Hive [datafusion-sqlparser-rs]

2024-11-03 Thread via GitHub
alamb commented on PR #1487: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1487#issuecomment-2453421898 > @iffyio @alamb why isn't this merged? Sorry -- am behind.

Re: [PR] Add support for SHOW DATABASES/SCHEMAS/TABLES/VIEWS in Hive [datafusion-sqlparser-rs]

2024-11-03 Thread via GitHub
alamb merged PR #1487: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1487

Re: [I] Add optional rust features for functions in library to keep dependencies down [datafusion]

2024-11-03 Thread via GitHub
drauschenbach commented on issue #146: URL: https://github.com/apache/datafusion/issues/146#issuecomment-2453446642 This issue looks obsolete. Feature flags such as `avro` are now used to control dependency scope.

Re: [I] Move filter_push_down::split_members to be reused outside of DataFusion [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #382: URL: https://github.com/apache/datafusion/issues/382#issuecomment-2453514776 Closing this. Thanks @drauschenbach.

Re: [I] Move filter_push_down::split_members to be reused outside of DataFusion [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #382: Move filter_push_down::split_members to be reused outside of DataFusion URL: https://github.com/apache/datafusion/issues/382

Re: [I] [Python]: Expose Dataframe.schema() to Python binding [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #1007: [Python]: Expose Dataframe.schema() to Python binding URL: https://github.com/apache/datafusion/issues/1007

Re: [I] Add optional rust features for functions in library to keep dependencies down [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #146: Add optional rust features for functions in library to keep dependencies down URL: https://github.com/apache/datafusion/issues/146

Re: [I] Add optional rust features for functions in library to keep dependencies down [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #146: URL: https://github.com/apache/datafusion/issues/146#issuecomment-2453514638 Closing this. Thanks @drauschenbach.

Re: [I] Using string datatype from python raises "Exception: The type 13 is not valid" [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #693: URL: https://github.com/apache/datafusion/issues/693#issuecomment-2453514524 Closing this. Thanks @drauschenbach.

Re: [I] Evaluate pyo3 abi3 wheel limitations for datafusion python binding [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #955: Evaluate pyo3 abi3 wheel limitations for datafusion python binding URL: https://github.com/apache/datafusion/issues/955

Re: [I] Creating dataframe with Recordbatch using pyarrow.Table.to_batches gives "type16 not valid error" when schema includes date32[day] type [datafusion]

2024-11-03 Thread via GitHub
andygrove closed issue #949: Creating dataframe with Recordbatch using pyarrow.Table.to_batches gives "type16 not valid error" when schema includes date32[day] type URL: https://github.com/apache/datafusion/issues/949

Re: [I] Creating dataframe with Recordbatch using pyarrow.Table.to_batches gives "type16 not valid error" when schema includes date32[day] type [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #949: URL: https://github.com/apache/datafusion/issues/949#issuecomment-2453514970 Closing this. Thanks @drauschenbach.

Re: [I] Refactor the hash_aggregate [datafusion]

2024-11-03 Thread via GitHub
andygrove commented on issue #839: URL: https://github.com/apache/datafusion/issues/839#issuecomment-2453515002 Closing this. Thanks @drauschenbach.
