Samyak2 commented on PR #16500:
URL: https://github.com/apache/datafusion/pull/16500#issuecomment-3039543288
I have added asserts for metrics in the existing join tests. The ones in
hash_join and cross_join are working. The asserts in nested_loop_join are
currently failing due to a mismatch
2010YOUY01 opened a new issue, #16689:
URL: https://github.com/apache/datafusion/issues/16689
### Describe the bug
datafusion-cli is compiled from the latest main (commit 25c2a079fc)
```
> create table tt(v1 decimal(50,2));
0 row(s) fetched.
Elapsed 0.002 seconds.
alamb commented on issue #14757:
URL: https://github.com/apache/datafusion/issues/14757#issuecomment-3039034005
Thank you for your attention to this @kosiew
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
UR
alamb commented on PR #79:
URL: https://github.com/apache/datafusion-site/pull/79#issuecomment-3039045610
Thanks @zhuqi-lucas -- I will keep looking at this later today
alamb commented on issue #16565:
URL: https://github.com/apache/datafusion/issues/16565#issuecomment-3039045185
> Implementation considerations: We would need to extend CastExpr and
TryCastExpr to detect when the input/output types are structs and invoke the
struct-aware casting logic appro
swaingotnochill commented on issue #16365:
URL: https://github.com/apache/datafusion/issues/16365#issuecomment-3039046493
I observed a significant performance improvement just testing on the master
branch without any metadata caching as compared to before.
```
➜ datafusion git:(m
adriangb commented on issue #16565:
URL: https://github.com/apache/datafusion/issues/16565#issuecomment-3039050784
I agree with you @alamb.
The issue I see with both approaches is going to be nested types: once you
call the arrow cast kernel you can't take back control, so this approa
jonathanc-n commented on PR #16660:
URL: https://github.com/apache/datafusion/pull/16660#issuecomment-3039228697
cc @alamb @ozankabak Seems like you guys were part of the discussion for
range joins, this is a nice start to it? @Dandandan @comphead might be
interested?
blaginin commented on PR #16644:
URL: https://github.com/apache/datafusion/pull/16644#issuecomment-3039402927
@findepi hey, FYI since you suggested the change, in case you want to approve
/ merge
kevinjqliu commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3039741826
Just read the post. Coming back here to thank everyone for the contribution!
I always learn so much from these.
I think it would be great to have a comment section for the b
kevinjqliu opened a new issue, #80:
URL: https://github.com/apache/datafusion-site/issues/80
I think it's beneficial to the community to add a comment section to the
datafusion blogs. Personally, I would like to have follow-up discussions on the
blog post topic.
I found a possible so
Samyak2 commented on PR #16500:
URL: https://github.com/apache/datafusion/pull/16500#issuecomment-3039652143
Actually, I see the same behavior on latest main. Looks like the output_rows
metric in nested loop join is currently wrong?
jonathanc-n commented on code in PR #16443:
URL: https://github.com/apache/datafusion/pull/16443#discussion_r2181119355
##
datafusion/physical-plan/src/joins/nested_loop_join.rs:
##
@@ -828,13 +833,127 @@ impl NestedLoopJoinStream {
handle_state!(self.proces
fvj commented on PR #16687:
URL: https://github.com/apache/datafusion/pull/16687#issuecomment-3038866616
Weird, it worked on my machine (tm). Let me investigate! Should I close the
PR for now and reopen it once I have fixed this?
alamb commented on PR #16690:
URL: https://github.com/apache/datafusion/pull/16690#issuecomment-3038917238
I see some failures in the row_group_pruning tests
```
$ cargo test --package datafusion --test parquet_config
parquet::row_group_pruning
```
```
failures:
fvj commented on PR #16687:
URL: https://github.com/apache/datafusion/pull/16687#issuecomment-3038925425
The latest patch should include the missing files:
```
$ docker build . -t datafusion-devcontainer:latest &>/dev/null && docker run
--rm -it datafusion-devcontainer:latest bash -c "
alamb commented on PR #16674:
URL: https://github.com/apache/datafusion/pull/16674#issuecomment-3038992838
Thank you @Standing-Man and @2010YOUY01
alamb commented on PR #15928:
URL: https://github.com/apache/datafusion/pull/15928#issuecomment-3039007195
(emoji)
alamb commented on PR #1892:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1892#issuecomment-3039005788
(emoji)
alamb commented on PR #16663:
URL: https://github.com/apache/datafusion/pull/16663#issuecomment-3038886263
> > Thanks @melroy12 -- I started the CI tests
> > I suspect this PR will fail at least `clippy` -- you'll perhaps have to
run Clippy and make the changes it suggests (note you can a
alamb merged PR #16674:
URL: https://github.com/apache/datafusion/pull/16674
alamb closed issue #16496: Refactor `StreamJoinMetrics` to reuse
`BaselineMetrics`
URL: https://github.com/apache/datafusion/issues/16496
JigaoLuo commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3039204719
Hi @zhuqi-lucas,
While proofreading the blog, I had one major general question: **What are
the limitations of such an embedded index?**
- Is it limited to just one emb
zhuqi-lucas commented on PR #79:
URL: https://github.com/apache/datafusion-site/pull/79#issuecomment-3039224050
> Hi @zhuqi-lucas, I've gone through the blog twice, and it looks great
overall. I just have one very small nitpick above.
>
> Regarding the content: One suggestion would be
zhuqi-lucas commented on PR #79:
URL: https://github.com/apache/datafusion-site/pull/79#issuecomment-3039222981
> Thanks @zhuqi-lucas -- I will keep looking at this later today
Thank you @alamb !
JigaoLuo commented on PR #79:
URL: https://github.com/apache/datafusion-site/pull/79#issuecomment-3039990579
This also ties into my initial impression that "the Embedded Index is just a
hashset to speed up scans, which adds overhead to Parquet." as mentioned as a
follow-up here:
https://gi
kevinjqliu commented on issue #80:
URL: https://github.com/apache/datafusion-site/issues/80#issuecomment-3040039493
> I only worry that we won't see the comments and therefore won't respond.
That's a good point. I think subscribing to the "Discussions" events on the
GitHub repo will he
UBarney commented on PR #16443:
URL: https://github.com/apache/datafusion/pull/16443#issuecomment-3038952572
> Thanks @UBarney, just some comments
Thanks @jonathanc-n for reviewing. I have addressed all of your comments.
JigaoLuo commented on PR #79:
URL: https://github.com/apache/datafusion-site/pull/79#issuecomment-3039088620
Hi @zhuqi-lucas, I've gone through the blog twice, and it looks great
overall. I just have one very small nitpick above.
Regarding the content: One suggestion would be to inclu
geoffreyclaude commented on code in PR #16685:
URL: https://github.com/apache/datafusion/pull/16685#discussion_r2187358106
##
datafusion/sql/src/relation/mod.rs:
##
@@ -281,3 +796,49 @@ fn optimize_subquery_sort(plan: LogicalPlan) ->
Result>
});
new_plan
}
+
+/// Hel
jatin510 commented on issue #16676:
URL: https://github.com/apache/datafusion/issues/16676#issuecomment-3039286131
take
adriangb commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3039806254
Index suggestion: a tablesample index.
And a general thought: exploring these sorts of indexes could do very cool
stuff for DataFusion in general in terms of pushing us t
simonvandel commented on code in PR #1921:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1921#discussion_r2187644576
##
tests/sqlparser_common.rs:
##
@@ -11125,16 +11125,9 @@ fn parse_trailing_comma() {
);
trailing_commas.verified_stmt(r#"SELECT "from" F
alamb commented on issue #80:
URL: https://github.com/apache/datafusion-site/issues/80#issuecomment-3039788072
Adding comments to the blog would be great -- I only worry that we won't see
the comments and therefore won't respond. Maybe we could make a GitHub issue
or something for each blo
alamb commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3039796047
> Hi [@zhuqi-lucas](https://github.com/zhuqi-lucas),
>
> While proofreading the blog, I had one major general question: **What are
the limitations of such an embedded index?
dependabot[bot] opened a new pull request, #1180:
URL: https://github.com/apache/datafusion-python/pull/1180
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.45.0 to 1.46.1.
Release notes
Sourced from tokio's releases (https://github.com/tokio-rs/tokio/releases).
Tokio
alamb commented on code in PR #16630:
URL: https://github.com/apache/datafusion/pull/16630#discussion_r2187647772
##
datafusion/physical-plan/src/sorts/cursor.rs:
##
@@ -288,6 +288,64 @@ impl CursorArray for StringViewArray {
}
}
+/// Todo use arrow-rs side api after:
<
alamb commented on PR #79:
URL: https://github.com/apache/datafusion-site/pull/79#issuecomment-3039820680
Thanks -- I am going to spend an hour or so taking a pass through this blog
trying to get the formatting to work out
So exciting
JigaoLuo commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3039853245
> > Hi [@zhuqi-lucas](https://github.com/zhuqi-lucas),
> > While proofreading the blog, I had one major general question: **What
are the limitations of such an embedded index
alamb opened a new pull request, #16690:
URL: https://github.com/apache/datafusion/pull/16690
## Which issue does this PR close?
- Related to https://github.com/apache/arrow-rs/issues/7395
## Rationale for this change
There are several non trivial changes in arrow
adriangb commented on code in PR #16686:
URL: https://github.com/apache/datafusion/pull/16686#discussion_r2187232804
##
datafusion/datasource/src/source.rs:
##
@@ -325,6 +328,9 @@ impl ExecutionPlan for DataSourceExec {
new_node.data_source = data_source;
alamb commented on PR #79:
URL: https://github.com/apache/datafusion-site/pull/79#issuecomment-3040245413
I just pushed a commit that reworked the intro a bit and started filling out
the background
https://github.com/user-attachments/assets/efe36816-7fed-44d7-9158-1b2fc19ffb19
alamb commented on PR #79:
URL: https://github.com/apache/datafusion-site/pull/79#issuecomment-3040245879
I need to run now to attend to some family matters. I'll be back tomorrow
kevinjqliu opened a new pull request, #81:
URL: https://github.com/apache/datafusion-site/pull/81
I was trying to make changes to the templates and noticed that there are
some unused ones. This is confusing so I'm cleaning it up.
- Removed `frontpage.html`, this is unused and referenc
kevinjqliu opened a new pull request, #82:
URL: https://github.com/apache/datafusion-site/pull/82
Local setup clones the `apache/infrastructure-actions` repo into the
`infrastructure-actions` local directory.
See
https://github.com/apache/datafusion-site?tab=readme-ov-file#setup-for-doc
yoavcloud opened a new pull request, #1926:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1926
(no comment)
corasaurus-hex opened a new issue, #16688:
URL: https://github.com/apache/datafusion/issues/16688
### Is your feature request related to a problem or challenge?
Datafusion currently supports [registering files in the Arrow IPC file
format as
tables](https://gist.github.com/corasaurus
alamb commented on issue #16486:
URL: https://github.com/apache/datafusion/issues/16486#issuecomment-3038693992
We have begun voting:
https://lists.apache.org/thread/xobyfmpcvtcclcbgo0wz6c74b35mxcdx
alamb commented on PR #16686:
URL: https://github.com/apache/datafusion/pull/16686#issuecomment-3038695163
Thank you @liamzwbao
@xudong963 or @adriangb do you have time to review this PR?
brunal commented on code in PR #16342:
URL: https://github.com/apache/datafusion/pull/16342#discussion_r2187061361
##
datafusion/datasource/src/file_sink_config.rs:
##
@@ -77,13 +79,34 @@ pub trait FileSink: DataSink {
.runtime_env()
.object_store(&conf
ryanschneider opened a new pull request, #1927:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1927
Fixes #1920 by adding `NOT NULL` expression support to DuckDB and SQLite,
and `NOTNULL` support to DuckDB, SQLite, and Postgres.
Since this is my first non-trivial PR pleas
zhuqi-lucas commented on PR #79:
URL: https://github.com/apache/datafusion-site/pull/79#issuecomment-3040695911
> I just pushed a commit that reworked the intro a bit and started filling
out the background
>
> https://private-user-images.githubusercontent.com/490673/462836291-efe36816
zhuqi-lucas commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3040702910
Thank you @alamb @JigaoLuo @adriangb, I agree the current example is just the
start; we can add more advanced examples later!
2010YOUY01 commented on code in PR #16500:
URL: https://github.com/apache/datafusion/pull/16500#discussion_r2187952197
##
datafusion/physical-plan/src/joins/nested_loop_join.rs:
##
@@ -825,7 +825,8 @@ impl NestedLoopJoinStream {
handle_state!(ready!(self.fet
2010YOUY01 commented on PR #16500:
URL: https://github.com/apache/datafusion/pull/16500#issuecomment-3040715483
> Actually, I see the same behavior on latest main. Looks like the
output_rows metric in nested loop join is currently wrong?
Thank you for the catch! Here might also need `
Samyak2 commented on PR #16500:
URL: https://github.com/apache/datafusion/pull/16500#issuecomment-3040935741
> > Actually, I see the same behavior on latest main. Looks like the
output_rows metric in nested loop join is currently wrong?
>
> Thank you for the catch! Here might also nee
iffyio merged PR #1921:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1921
iffyio merged PR #1922:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1922
ryanschneider commented on code in PR #1927:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1927#discussion_r2187875896
##
src/parser/mod.rs:
##
@@ -3571,6 +3572,11 @@ impl<'a> Parser<'a> {
),
regexp,
ryanschneider commented on code in PR #1927:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1927#discussion_r2187875712
##
src/dialect/duckdb.rs:
##
@@ -94,4 +94,12 @@ impl Dialect for DuckDbDialect {
fn supports_order_by_all(&self) -> bool {
true