date:20250315

Re: [PR] Update version to 46.0.1, add CHANGELOG [datafusion]

2025-03-15 Thread via GitHub



xudong963 merged PR #15243:
URL: https://github.com/apache/datafusion/pull/15243


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub



joroKr21 commented on code in PR #15149:
URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996627243


##
datafusion/sqllogictest/test_files/array.slt:
##
@@ -6232,12 +6244,12 @@ select array_intersect(arrow_cast([1, 1, 2, 2, 3, 3], 
'LargeList(Int64)'), null)
 query ?
 select array_intersect(null, [1, 1, 2, 2, 3, 3]);
 
-NULL
+[]

Review Comment:
   Above we have
   ```
   query ?
   select array_intersect([1, 1, 2, 2, 3, 3], null);
   
   []
   ```
   
   So I think it's better that `array_intersect` is commutative. I can make 
both `null` or both empty list but I think it should be the same if we flip the 
arguments. Which one should it be?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub



joroKr21 commented on code in PR #15149:
URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996627020


##
datafusion/sqllogictest/test_files/array.slt:
##
@@ -4408,12 +4422,10 @@ select array_union(arrow_cast([], 'LargeList(Int64)'), 
arrow_cast([], 'LargeList
 query ?
 select array_union([[null]], []);
 
-[[NULL]]
+[[]]
 
-query ?
+query error DataFusion error: Error during planning: Failed to unify argument 
types of array_union:

Review Comment:
   The problem is that the argument types don't make sense. How are we supposed 
to union `LargeList(List(Int64))` and `LargeList(Int64)`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub



joroKr21 commented on code in PR #15149:
URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996628218


##
datafusion/functions-nested/src/extract.rs:
##
@@ -200,6 +199,7 @@ fn array_element_inner(args: &[ArrayRef]) -> 
Result {
 let [array, indexes] = take_function_args("array_element", args)?;
 
 match &array.data_type() {
+Null => Ok(Arc::new(NullArray::new(array.len(,

Review Comment:
   I will write a test that shows the issue. But basically something like 
`array_union(make_array(), ['x', 'y', 'z'])` will be an issue because we can't 
union a list of ints and a list of strings.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub



joroKr21 commented on code in PR #15149:
URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996628327


##
datafusion/functions-nested/src/extract.rs:
##
@@ -200,6 +199,7 @@ fn array_element_inner(args: &[ArrayRef]) -> 
Result {
 let [array, indexes] = take_function_args("array_element", args)?;
 
 match &array.data_type() {
+Null => Ok(Arc::new(NullArray::new(array.len(,

Review Comment:
   Or in this case similar nested expression with array element



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] `FROM` first in `SELECT` statements [datafusion-sqlparser-rs]

2025-03-15 Thread via GitHub



iffyio commented on issue #1400:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/issues/1400#issuecomment-2726264373

   added in https://github.com/apache/datafusion-sqlparser-rs/pull/1713


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Re-enable github discussion [datafusion]

2025-03-15 Thread via GitHub



2010YOUY01 merged PR #15241:
URL: https://github.com/apache/datafusion/pull/15241


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub



joroKr21 commented on code in PR #15149:
URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996627420


##
datafusion/sqllogictest/test_files/array.slt:
##
@@ -6232,12 +6244,12 @@ select array_intersect(arrow_cast([1, 1, 2, 2, 3, 3], 
'LargeList(Int64)'), null)
 query ?
 select array_intersect(null, [1, 1, 2, 2, 3, 3]);
 
-NULL
+[]

Review Comment:
   Also, below we have:
   ```
   query ?
   select array_intersect([], null);
   
   []
   ```



##
datafusion/sqllogictest/test_files/array.slt:
##
@@ -6232,12 +6244,12 @@ select array_intersect(arrow_cast([1, 1, 2, 2, 3, 3], 
'LargeList(Int64)'), null)
 query ?
 select array_intersect(null, [1, 1, 2, 2, 3, 3]);
 
-NULL
+[]

Review Comment:
   Also, below we have:
   ```
   query ?
   select array_intersect([], null);
   
   []
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Use insta for `DataFrame` tests [datafusion]

2025-03-15 Thread via GitHub



alamb merged PR #15165:
URL: https://github.com/apache/datafusion/pull/15165


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] Update version to 46.0.1, add CHANGELOG (#15243) [datafusion]

2025-03-15 Thread via GitHub



xudong963 opened a new pull request, #15244:
URL: https://github.com/apache/datafusion/pull/15244

   ## Which issue does this PR close?
   
   
   
   - part of #https://github.com/apache/datafusion/issues/15151
   
   ## Rationale for this change
   
   
   
   ## What changes are included in this PR?
   
   
   Port change from branch-46
   ## Are these changes tested?
   
   
   
   ## Are there any user-facing changes?
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove commented on code in PR #1443:
URL: https://github.com/apache/datafusion-comet/pull/1443#discussion_r1996054834


##
native/core/src/parquet/parquet_exec.rs:
##
@@ -0,0 +1,141 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::execution::operators::ExecutionError;
+use crate::parquet::parquet_support::SparkParquetOptions;
+use crate::parquet::schema_adapter::SparkSchemaAdapterFactory;
+use arrow_schema::{Field, SchemaRef};
+use datafusion::datasource::listing::PartitionedFile;
+use datafusion::datasource::physical_plan::{FileScanConfig, FileSource, 
ParquetSource};
+use datafusion::datasource::source::DataSourceExec;
+use datafusion::physical_expr::PhysicalExpr;
+use datafusion_comet_spark_expr::EvalMode;
+use datafusion_common::config::TableParquetOptions;
+use datafusion_execution::object_store::ObjectStoreUrl;
+use datafusion_physical_expr::expressions::BinaryExpr;
+use itertools::Itertools;
+use std::sync::Arc;
+
+/// Initializes a DataSourceExec plan with a ParquetSource. This may be used 
by either the
+/// `native_datafusion` scan or the `native_iceberg_compat` scan.
+///
+///   `required_schema`: Schema to be projected by the scan.
+///
+///   `data_schema`: Schema of the underlying data. It is optional and, if 
provided, is used
+/// instead of `required_schema` to initialize the file scan
+///
+///   `partition_schema` and `partition_fields` are optional. If 
`partition_schema` is specified,
+/// then `partition_fields` must also be specified
+///
+///   `object_store_url`: Url to read data from
+///
+///   `file_groups`: A collection of groups of `PartitionedFiles` that are to 
be read by the scan
+///
+///   `projection_vector`: A vector of the indexes in the schema of the fields 
to be projected
+///
+///   `data_filters`: Any predicate that must be applied to the data returned 
by the scan. If
+/// specified, then `data_schema` must also be specified.
+#[allow(clippy::too_many_arguments)]
+pub(crate) fn init_parquet_exec(
+required_schema: SchemaRef,
+data_schema: Option,
+partition_schema: Option,
+partition_fields: Option>,
+object_store_url: ObjectStoreUrl,
+file_groups: Vec>,
+projection_vector: Option>,
+data_filters: Option>>,
+session_timezone: &str,
+) -> Result, ExecutionError> {
+let (table_parquet_options, spark_parquet_options) = 
get_options(session_timezone);
+let mut parquet_source = 
ParquetSource::new(table_parquet_options).with_schema_adapter_factory(
+Arc::new(SparkSchemaAdapterFactory::new(spark_parquet_options)),
+);
+// Create a conjunctive form of the vector because ParquetExecBuilder takes
+// a single expression
+if let Some(data_filters) = data_filters {
+let cnf_data_filters = data_filters.clone().into_iter().reduce(|left, 
right| {
+Arc::new(BinaryExpr::new(
+left,
+datafusion::logical_expr::Operator::And,
+right,
+))
+});
+
+if let (Some(filter), Some(data_schema)) = (cnf_data_filters, 
&data_schema) {
+parquet_source = 
parquet_source.with_predicate(Arc::clone(data_schema), filter);
+}
+}
+let file_scan_config = match (data_schema, projection_vector, 
partition_fields) {
+(Some(data_schema), Some(projection_vector), Some(partition_fields)) 
=> get_file_config(
+data_schema,
+partition_schema,
+file_groups,
+object_store_url,
+Arc::new(parquet_source),
+)
+.with_projection(Some(projection_vector))
+.with_table_partition_cols(partition_fields),
+_ => get_file_config(
+required_schema,
+partition_schema,
+file_groups,
+object_store_url,
+Arc::new(parquet_source),
+),
+};
+
+let scan = Arc::new(DataSourceExec::new(Arc::new(file_scan_config)));
+
+Ok(scan)

Review Comment:
   The variable `scan` isn't really required here.
   
   ```suggestion
   Ok(Arc::new(DataSourceExec::new(Arc::new(file_scan_config
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message,

Re: [PR] Snowflake: Support dollar quoted comment when creating tables, views, and their fields [datafusion-sqlparser-rs]

2025-03-15 Thread via GitHub



iffyio commented on code in PR #1755:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/pull/1755#discussion_r1995819665


##
src/parser/mod.rs:
##
@@ -6926,14 +6927,16 @@ impl<'a> Parser<'a> {
 let comment = if self.parse_keyword(Keyword::COMMENT) {
 let has_eq = self.consume_token(&Token::Eq);
 let next_token = self.next_token();
-match next_token.token {
-Token::SingleQuotedString(str) => Some(if has_eq {
-CommentDef::WithEq(str)
-} else {
-CommentDef::WithoutEq(str)
-}),
+let comment = match next_token.token {
+Token::SingleQuotedString(str) => str,
+Token::DollarQuotedString(str) => str.value,
 _ => self.expected("comment", next_token)?,
-}
+};

Review Comment:
   can we move this part to be reusable? something like
   ```rust
   fn parse_comment_value() -> Result {
 let value = match next_token.token {
   Token::SingleQuotedString(str) => str,
   Token::DollarQuotedString(str) => str.value,
   _ => self.expected("comment", next_token)?,
   };
Ok(value)
   }
   ```
   and then to call the function from both places



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[I] Add all functions to the Expr class so that they're chainable. [datafusion-python]

2025-03-15 Thread via GitHub



deanm opened a new issue, #1064:
URL: https://github.com/apache/datafusion-python/issues/1064

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   Instead of doing
   ```python
   df.select(
   F.abs(
   col("id1")
   )
   .alias("id1")
   )
   ```
   
   It would be nicer to do
   
   ```python
   df.select(
   col("id1")
   .abs()
   .alias("id1")
   )
   ```
   
   **Describe the solution you'd like**
   This is already partly there, for example, `alias` is already in the Expr 
class. It would be a bit tedious but easy to add under `class Expr`, for 
example:
   
   ```python
   def abs(self) -> datafusion.Expr:
   """Return the absolute value of a given number.
   
   Returns:
   
   Expr
   A new expression representing the absolute value of the input 
expression.
   """
   return F.abs(self)
   ```
   
   **Describe alternatives you've considered**
   if it weren't for the type hinter, monkey patching.
   
   **Additional context**
   There will still be functions that don't make sense to chain off of a call 
to `col` such as `when` since it doesn't return an Expr. But, even functions 
that take multiple inputs can have this for instance `col("a").atan2("b")`. 
Additionally, this is completely backwards compatible since I'm not proposing 
eliminating the functions module.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] fix: re-enable GitHub discussions [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove opened a new pull request, #1532:
URL: https://github.com/apache/datafusion-comet/pull/1532

   ## Which issue does this PR close?
   
   
   
   N/A
   
   ## Rationale for this change
   
   
   
   Our GitHub discussions disappeared! Possibly related to an update in ASF 
tooling mentioned in https://infra.apache.org/blog/newsletter_02_25.html ?
   
   ## What changes are included in this PR?
   
   
   
   Enable discussions via asf.yaml
   
   ## How are these changes tested?
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] docs: various improvements to tuning guide [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove commented on code in PR #1525:
URL: https://github.com/apache/datafusion-comet/pull/1525#discussion_r1996138171


##
docs/source/user-guide/tuning.md:
##
@@ -23,12 +23,84 @@ Comet provides some tuning options to help you get the best 
performance from you
 
 ## Memory Tuning
 
-### Unified Memory Management with Off-Heap Memory
+It is necessary to specify how much memory Comet can use in addition to memory 
already allocated to Spark. In some
+cases, it may be possible to reduce the amount of memory allocated to Spark so 
that overall memory allocation is
+the same or lower than the original configuration. In other cases, enabling 
Comet may require allocating more memory
+than before. See the `Determining How Much Memory to Allocate` section for 
more details.
 
-The recommended way to share memory between Spark and Comet is to set 
`spark.memory.offHeap.enabled=true`. This allows
-Comet to share an off-heap memory pool with Spark. The size of the pool is 
specified by `spark.memory.offHeap.size`. For more details about Spark off-heap 
memory mode, please refer to Spark documentation: 
https://spark.apache.org/docs/latest/configuration.html.
+Comet supports Spark's on-heap (the default) and off-heap mode for allocating 
memory. However, we strongly recommend
+using off-heap mode. Comet has some limitations when running in on-heap mode, 
such as requiring more memory overall,
+and requiring shuffle memory to be separately configured.
 
-The type of pool can be specified with `spark.comet.exec.memoryPool`.
+### Configuring Comet Memory in Off-Heap Mode
+
+The recommended way to allocate memory for Comet is to set 
`spark.memory.offHeap.enabled=true`. This allows
+Comet to share an off-heap memory pool with Spark, reducing the overall memory 
overhead. The size of the pool is
+specified by `spark.memory.offHeap.size`. For more details about Spark 
off-heap memory mode, please refer to
+Spark documentation: https://spark.apache.org/docs/latest/configuration.html.
+
+### Configuring Comet Memory in On-Heap Mode
+
+When running in on-heap mode, Comet memory can be allocated by setting 
`spark.comet.memoryOverhead`. If this setting
+is not provided, it will be calculated by multiplying the current Spark 
executor memory by
+`spark.comet.memory.overhead.factor` (default value is `1.0`). This is a 
conservative default that provides Comet
+with the same amount of memory that Spark was originally using.
+
+Shuffle memory must be separately allocated using 
`spark.comet.columnar.shuffle.memorySize`. If this setting is not
+provided, it will be calculated by multiplying `spark.comet.memoryOverhead` by
+`spark.comet.columnar.shuffle.memory.factor` (default value is `1.0`).
+
+### Determining How Much Memory to Allocate
+
+Generally, increasing the amount of memory allocated to Comet will improve 
query performance by reducing the
+amount of time spent spilling to disk, especially for aggregate, join, and 
shuffle operations. Allocating insufficient
+memory can result in out-of-memory errors. This is no different from 
allocating memory in Spark and the amount of
+memory will vary for different workloads, so some experimentation will be 
required.
+
+Here is a real-world example, based on running benchmarks derived from TPC-H.
+
+**TODO: this section is a work-in-progress**
+
+The following table shows performance of Spark compared to Comet in both 
Off-Heap and On-Heap modes. The table shows
+total query time for TPC-H @ 100GB. Smaller is better.
+
+| Total Executor Memory (GB) | Spark | Comet Off-Heap | Comet On-Heap |
+| -- | - | -- | - |
+| 1  | OOM   | OOM| OOM   |
+| 2  | OOM   | OOM| OOM   |
+| 3  | 744   | OOM| OOM   |
+| 4  | 739   | OOM| OOM   |
+| 5  | 681   | 342| OOM   |
+| 6  | 665   || 344   |
+| 7  | 657   || 340   |
+| 8  | 632   | 295| 334   |
+| 9  | 623   ||   |
+| 10 | 622   ||   |
+
+TODO: WIP conclusions:
+
+- Spark can complete the benchmark with as little as 3GB but shows best 
performance at 9-10 GB
+- When Comet is enabled, Spark needs at least 5 GB of memory but provides a 
~2x improvement in performance for that level of memory allocation
+- With Comet enabled, performance with 5 GB is 1.8x faster than Spark with 
9-10 GB
+- TODO run Comet with half the CPUs and show same performance? i.e. 
demonstrate same performance for half the cost
+
+## Advanced Memory Tuning
+
+## Configuring spark.executor.memoryOverhead
+
+In some environments, such as Kubernetes a

[I] Unparse of logical plans with `LEFT ANTI` and `LEFT SEMI` joins generate invalid SQL [datafusion]

2025-03-15 Thread via GitHub



nuno-faria opened a new issue, #15127:
URL: https://github.com/apache/datafusion/issues/15127

   ### Describe the bug
   
   When unparsing a logical plan containing a `LeftAnti Join` or a `LeftSemi 
Join` operator with the `unparser::dialect::PostgreSqlDialect`, the resulting 
unparsed SQL contains `LEFT ANTI JOIN` or `LEFT SEMI JOIN`, respectively, which 
are not supported by Postgres. The same thing is true for other dialects such 
as `unparser::dialect::MySqlDialect` and `unparser::dialect::SqliteDialect`.
   
   This causes issues when attempting to run federated queries in Datafusion 
over systems such as Postgres (namely, in the 
[datafusion-table-providers](https://github.com/datafusion-contrib/datafusion-table-providers)
 repo).
   
   I'm not sure if this is a bug or if the responsibility to generate valid SQL 
in a specific dialect should be delegated downstream. However, since there 
appears to be a previous PR with a similar issue 
(https://github.com/apache/datafusion/pull/10625), I leave this one here just 
in case.
   
   ### To Reproduce
   
   ```rust
   use datafusion::{
   error::Result,
   prelude::SessionContext,
   sql::unparser::{self, Unparser},
   };
   
   #[tokio::main]
   async fn main() -> Result<()> {
   let ctx = SessionContext::new();
   ctx.sql("create table t1 (c int)").await?.collect().await?;
   ctx.sql("create table t2 (c int)").await?.collect().await?;
   
   let df = ctx
   .sql("select * from t1 where c not in (select c from t2)")
   .await?;
   
   let plan = df.into_optimized_plan()?; // optimizing the plan will 
introduct the LeftAnti Join
   println!("{}", plan);
   
   let sql = Unparser::new(&unparser::dialect::PostgreSqlDialect 
{}).plan_to_sql(&plan)?;
   println!("{}", sql);
   
   Ok(())
   }
   ```
   
   Which returns the (invalid in Postgres) query: 
   ```sql
   SELECT * FROM "t1" LEFT ANTI JOIN "t2" AS "__correlated_sq_1" ON "t1"."c" = 
"__correlated_sq_1"."c"
   ```
   
   ### Expected behavior
   
   Generate an equivalent valid query, such as:
   
   ```sql
   SELECT * FROM t1 WHERE c NOT IN (SELECT c FROM t2)
   ```
   
   ### Additional context
   
   ```toml
   datafusion = "45.0.0"
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-15 Thread via GitHub



parthchandra commented on PR #1443:
URL: 
https://github.com/apache/datafusion-comet/pull/1443#issuecomment-2725354473

   @mbutrovich, @comphead I've rebased this on main (after factoring out the 
object store related changes) and force pushed. If you could please take a look 
again.
   @andygrove I ran into conflicts because of the change from `ParquetExec` to 
`DataSourceExec`. Would appreciate it if you could double check that I resolved 
the conflicts correctly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Simplify display format of `AggregateFunctionExpr` [datafusion]

2025-03-15 Thread via GitHub



irenjj commented on code in PR #15253:
URL: https://github.com/apache/datafusion/pull/15253#discussion_r1997046455


##
datafusion/expr/src/expr.rs:
##
@@ -2596,6 +2600,43 @@ impl Display for SchemaDisplay<'_> {
 }
 }
 
+struct SqlDisplay<'a>(&'a Expr);
+impl Display for SqlDisplay<'_> {
+fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
+match self.0 {
+Expr::Column(c) => {
+write!(f, "{}", c.name)
+}
+Expr::Literal(_) => {
+write!(f, "aa")
+}
+Expr::ScalarVariable(..) => {
+write!(f, "bb")
+}
+Expr::OuterReferenceColumn(..) => {
+write!(f, "cc")
+}
+Expr::Placeholder(_) => {
+write!(f, "dd")
+}
+Expr::Wildcard { .. } => {
+write!(f, "ee")
+}
+Expr::AggregateFunction(AggregateFunction { func, params }) => {
+match func.sql_name(params) {
+Ok(name) => {
+write!(f, "{name}")
+}
+Err(e) => {
+write!(f, "got error from schema_name {}", e)
+}
+}
+}
+_ => write!(f, "{}", self.0.schema_name()),
+}
+}
+}
+

Review Comment:
   Just a demo here.
   `AggregateFunctionExpr`'s `name` is constructed at logical expr phase, we 
need a new member `sql_name` for tree explain, and a new constructor 
`SqlDisplay` to handle the inner `Expr` recursively. cc @alamb 👀



##
datafusion/sqllogictest/test_files/explain_tree.slt:
##
@@ -1338,7 +1336,7 @@ physical_plan
 12)-┌─┴─┐
 13)-│   AggregateExec   │
 14)-│   │
-15)-│   aggr: count(Int64(1))   │
+15)-│  aggr: count(aa)  │

Review Comment:
   We can see the final display changed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] chore: Attach Diagnostic to "incompatible type in unary expression" error [datafusion]

2025-03-15 Thread via GitHub



onlyjackfrost commented on PR #15209:
URL: https://github.com/apache/datafusion/pull/15209#issuecomment-2721725989

   @eliaperantoni  could you help review? :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[I] Change in behavior for deep structure columns with the latest sql parser upgrade [datafusion]

2025-03-15 Thread via GitHub



adragomir opened a new issue, #15118:
URL: https://github.com/apache/datafusion/issues/15118

   ### Describe the bug
   
   with SQLParser 0.53, this works: 
   ```
   SELECT * 
   FROM
   (meta_asset_featurization AS asset_meta
   INNER JOIN meta_asset_summary_metrics AS asset_metrics ON (
   asset_meta.struct_field['substruct']['substruct'] = 
asset_metrics.struct_field['substruct']['substruct']
   ))
   
   ```
   
   with SQLParser 0.54, this no longer works, the newly added 
`sql_compound_field_access_to_expr` interprets the `root` (in this case for 
example `asset_meta`) as a column name instead of a table. 
   The solution (workaround) for SQLParser 0.54 is to do 
   ```
   SELECT * 
   FROM
   (meta_asset_featurization AS asset_meta
   INNER JOIN meta_asset_summary_metrics AS asset_metrics ON (
   asset_meta.struct_field.substruct.substruct = 
asset_metrics.struct_field.substruct.substruct
   ))
   ```
   
   Is this intended ? I have tried to see the history, but unsure which 
behavior is correct. Since we do interpret the `[]` index access in the case of 
NOT having to specify the table correctly (so, if I have a single table and I 
specify directly the column `col['substruct']['substruct']` it works), why 
wouldn't this case work ? 
   
   ### To Reproduce
   
   _No response_
   
   ### Expected behavior
   
   Structure field access with either dot acess `.subfield` or subscript 
access`['subfield']` should work in the same way and yield the same result in 
all conditions
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] feat: topk functionality for aggregates should support utf8view and largeutf8 [datafusion]

2025-03-15 Thread via GitHub



alamb commented on PR #15152:
URL: https://github.com/apache/datafusion/pull/15152#issuecomment-2721226874

   FYI @avantgardnerio 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] [EPIC] A collection of tickets for improved WASM support in DataFusion [datafusion]

2025-03-15 Thread via GitHub



savaliyabhargav commented on issue #13815:
URL: https://github.com/apache/datafusion/issues/13815#issuecomment-2724623356

   Dear @alamb  ,
   
   I hope you’re doing well.
   
   I’m very interested in contributing to the WASM support for DataFusion 
project as part of GSoC 2025. Enhancing embeddability and ensuring robust WASM 
integration aligns with my passion for building reliable and maintainable 
software. I’d greatly appreciate it if you could share more details on the 
project’s current challenges and key milestones. Understanding these aspects 
will help me prepare a thoughtful and well-structured proposal.
   
   Thank you for your time and guidance.
   
   Best regards,
bhargav


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[I] [substrait] Build basic test suite to validate produced Substrait plans [datafusion]

2025-03-15 Thread via GitHub



amoeba opened a new issue, #15069:
URL: https://github.com/apache/datafusion/issues/15069

   ### Is your feature request related to a problem or challenge?
   
   In https://github.com/apache/datafusion/issues/12244 we found that 
DataFusion was producing invalid Substrait plans. The `datafusion-substrait` 
crate has quite a bit of tests but none that actually verify plans valid 
according to the https://github.com/substrait-io/substrait-validator.
   
   ### Describe the solution you'd like
   
   Add a test suite to the `datafusion-substrait` crate that uses 
https://github.com/substrait-io/substrait-validator to validate plans. We can 
start small and just add a test that would've failed prior to 
https://github.com/apache/datafusion/pull/15011.
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Upgrade Guide for DataFusion 46 does not include the array signatures change [datafusion]

2025-03-15 Thread via GitHub



jkosh44 commented on issue #15105:
URL: https://github.com/apache/datafusion/issues/15105#issuecomment-2725983472

   > [@jkosh44](https://github.com/jkosh44) Would you be able to add a note 
about array signature changes to 
https://github.com/apache/datafusion/blob/main/docs/source/library-user-guide/upgrading.md
 ?
   
   Yes, I'll do that tonight.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Expose global context [datafusion-python]

2025-03-15 Thread via GitHub



timsaucer commented on issue #1045:
URL: 
https://github.com/apache/datafusion-python/issues/1045#issuecomment-2710856896

   I understand what you mean by "global context is already exposed" but I mean 
that it should be treated in the wrapper functions as well so that users get a 
nice python interface and don't need to jump into the `_internal` package.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] chore: Use Datafusion's existing empty stream [datafusion-comet]

2025-03-15 Thread via GitHub



codecov-commenter commented on PR #1517:
URL: 
https://github.com/apache/datafusion-comet/pull/1517#issuecomment-2719120442

   ## 
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1517?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 Report
   All modified and coverable lines are covered by tests :white_check_mark:
   > Project coverage is 57.84%. Comparing base 
[(`f09f8af`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/f09f8af64c6599255e116a376f4f008f2fd63b43?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 to head 
[(`56806da`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/56806daa9b256ef01f565504a7e2669bdf6abe0a?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   > Report is 74 commits behind head on main.
   
   Additional details and impacted files
   
   
   ```diff
   @@ Coverage Diff  @@
   ##   main#1517  +/-   ##
   
   + Coverage 56.12%   57.84%   +1.71% 
   - Complexity  976  985   +9 
   
 Files   119  122   +3 
 Lines 1174312195 +452 
 Branches   2251 2303  +52 
   
   + Hits   6591 7054 +463 
   + Misses 4012 3958  -54 
   - Partials   1140 1183  +43 
   ```
   
   
   
   [:umbrella: View full report in Codecov by 
Sentry](https://app.codecov.io/gh/apache/datafusion-comet/pull/1517?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
   :loudspeaker: Have feedback on the report? [Share it 
here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
   🚀 New features to boost your workflow: 
   
   - ❄ [Test Analytics](https://docs.codecov.com/docs/test-analytics): Detect 
flaky tests, report on failures, and find test suite problems.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] datafusion-cli: add streaming state struct [datafusion]

2025-03-15 Thread via GitHub



shruti2522 opened a new pull request, #15234:
URL: https://github.com/apache/datafusion/pull/15234

   ## Which issue does this PR close?
   
   
   
   - Closes #14886 .
   
   ## Rationale for this change
   
   
   
   ## What changes are included in this PR?
add `OutputStreamStruct` to avoid dead code in datafusion-cli printing logic
   
   
   
   ## Are these changes tested?
   
   
   
   ## Are there any user-facing changes?
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] fix: compound_field_access doesn't identifier qualifier. [datafusion]

2025-03-15 Thread via GitHub



jonahgao commented on code in PR #15153:
URL: https://github.com/apache/datafusion/pull/15153#discussion_r1995820915


##
datafusion/core/tests/sql/select.rs:
##
@@ -350,3 +350,45 @@ async fn test_version_function() {
 
 assert_eq!(version.value(0), expected_version);
 }
+
+#[tokio::test]
+async fn test_subscript() -> Result<()> {

Review Comment:
   This is more suitable as a sqllogictest test. Maybe we can move it to 
[struct.slt](https://github.com/apache/datafusion/blob/main/datafusion/sqllogictest/test_files/struct.slt)
 or 
[identifiers.slt](https://github.com/apache/datafusion/blob/main/datafusion/sqllogictest/test_files/identifiers.slt)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Renaming Internal Structs [datafusion-python]

2025-03-15 Thread via GitHub



Spaarsh commented on PR #1059:
URL: 
https://github.com/apache/datafusion-python/pull/1059#issuecomment-2717683480

   Now the `test_wrapper_coverage.py` checks for modules in this way:
   Skip checks if it is global_context (No changes made in this part)
   For classes started with `Raw`, it breaks down the class name to extract the 
wrapper class and checks if that class has the given `Raw` class
   
   Later, when the conditional statement of `isinstance` is checked for, the 
`Raw` classes are skipped since they have been checked for in the earlier part 
of the test and, more importantly, unlike other classes that return true for 
`isinstance`, `Raw` classes are only internally exposed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Simpler to see expressions in explain `tree` mode [datafusion]

2025-03-15 Thread via GitHub



alamb commented on code in PR #15163:
URL: https://github.com/apache/datafusion/pull/15163#discussion_r1996232099


##
datafusion/sqllogictest/test_files/explain_tree.slt:
##
@@ -739,43 +736,42 @@ physical_plan
 01)┌───┐
 02)│   ProjectionExec  │
 03)│   │
-04)│rolling_sum:   │
-05)│ sum(t1.v1) ORDER BY [t1.v1│
-06)│ASC NULLS LAST] ROWS   │
-07)│ BETWEEN 1 PRECEDING   │
-08)│  AND CURRENT ROW@1│
-09)│   │
-10)│  v1: v1@0 │
-11)└─┬─┘
-12)┌─┴─┐
-13)│BoundedWindowAggExec   │
-14)│   │
-15)│mode: Sorted   │
-16)│   │
-17)│select_list:   │
-18)│ sum(t1.v1) ORDER BY [t1.v1│
-19)│ASC NULLS LAST] ROWS   │
-20)│ BETWEEN 1 PRECEDING   │
-21)│   AND CURRENT ROW │
-22)└─┬─┘
-23)┌─┴─┐
-24)│  SortExec │
-25)│   │
-26)│v1@0 ASC NULLS LAST│
-27)└─┬─┘
-28)┌─┴─┐
-29)│   ProjectionExec  │
-30)│   │
-31)│v1: value@0│
-32)└─┬─┘
-33)┌─┴─┐
-34)│   LazyMemoryExec  │
-35)│   │
-36)│ batch_generators: │
-37)│ generate_series: start=1, │
-38)│end=1000, batch_size   │
-39)│   =8192   │
-40)└───┘
+04)│ sum(t1.v1) ORDER BY [t1.v1│
+05)│ASC NULLS LAST] ROWS   │
+06)│ BETWEEN 1 PRECEDING   │
+07)│   AND CURRENT ROW │
+08)│   │
+09)│ v1│
+10)└─┬─┘
+11)┌─┴─┐
+12)│BoundedWindowAggExec   │
+13)│   │
+14)│mode: Sorted   │
+15)│   │
+16)│select_list:   │
+17)│ sum(t1.v1) ORDER BY [t1.v1│
+18)│ASC NULLS LAST] ROWS   │
+19)│ BETWEEN 1 PRECEDING   │
+20)│   AND CURRENT ROW │
+21)└─┬─┘
+22)┌─┴─┐
+23)│  SortExec │
+24)│   │
+25)│v1@0 ASC NULLS LAST│

Review Comment:
   Maybe we can file a follow on PR to improve other plans 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] feat: topk functionality for aggregates should support utf8view and largeutf8 [datafusion]

2025-03-15 Thread via GitHub



alamb commented on PR #15152:
URL: https://github.com/apache/datafusion/pull/15152#issuecomment-2725521365

   Thanks again @zhuqi-lucas 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] [branch-46] Fix wasm32 build on version 46 [datafusion]

2025-03-15 Thread via GitHub



alamb closed pull request #15229: [branch-46] Fix wasm32 build on version 46
URL: https://github.com/apache/datafusion/pull/15229


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] [DISCUSS] Release DataFusion `46.0.1` Patch or `46.1.0` minor release (March 2025) [datafusion]

2025-03-15 Thread via GitHub



alamb commented on issue #15151:
URL: https://github.com/apache/datafusion/issues/15151#issuecomment-2724618093

   - Unfortunately https://github.com/apache/datafusion/issues/15114 doesn't 
seem likely to make it into this release. I'll proceed with making backport PRs 
now


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub



jayzhan211 commented on code in PR #15149:
URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996471597


##
datafusion/expr-common/src/signature.rs:
##
@@ -865,6 +867,39 @@ impl Signature {
 volatility,
 }
 }
+
+/// Specialized Signature for ArrayPrepend and similar functions
+pub fn element_and_array(volatility: Volatility) -> Self {
+Signature {
+type_signature: TypeSignature::ArraySignature(
+ArrayFunctionSignature::Array {
+arguments: vec![
+ArrayFunctionArgument::Element,
+ArrayFunctionArgument::Array,
+],
+array_coercion: Some(ListCoercion::FixedSizedListToList),

Review Comment:
   What we need is to provide customizable signature, so if they want List for 
append, they can add `array_coercion: 
Some(ListCoercion::FixedSizedListToList)`, if they want to keep fixed-size-list 
they can add `array_coercion: None`. Both cases should be possible.
   
   For datafusion implementation, we choose to convert to List by default



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-15 Thread via GitHub



mbutrovich commented on code in PR #1443:
URL: https://github.com/apache/datafusion-comet/pull/1443#discussion_r1996167458


##
native/core/src/parquet/mod.rs:
##
@@ -620,12 +619,21 @@ fn get_batch_context<'a>(handle: jlong) -> Result<&'a mut 
BatchContext, CometErr
 }
 }
 
-/*
-#[inline]
-fn get_batch_reader<'a>(handle: jlong) -> Result<&'a mut 
ParquetRecordBatchReader, CometError> {
-Ok(&mut get_batch_context(handle)?.batch_reader.unwrap())
+fn get_file_groups_single_file(
+path: &Path,
+file_size: u64,
+start: i64,
+length: i64,
+) -> Vec> {

Review Comment:
   Outer vector is partitions, inner vector is the files belonging to that 
partition, I believe:
   
   
https://github.com/apache/datafusion/blob/f8828abb46d6f9280f4de146896cd59a740e06ed/datafusion/datasource/src/file_scan_config.rs#L145



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] WIP: User defined sorting [datafusion]

2025-03-15 Thread via GitHub



tobixdev commented on PR #15106:
URL: https://github.com/apache/datafusion/pull/15106#issuecomment-2725048476

   Happy to hear any kind of feedback on that @paleolimbot . So take a look if 
you've time.  Also good to hear that others also have these requirements in 
their projects :). Could you think of a way to implement your use case by 
extending the proposed solution (to get a feeling whether this idea generalizes 
to other use cases)?
   
   As this is quite a fundamental change that can influence many more features 
to come, we would definitely need some input from someone more familiar with 
the project.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] chore: Remove all subdependencies [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove merged PR #1514:
URL: https://github.com/apache/datafusion-comet/pull/1514


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] chore: Minor code cleanup in native scan type-checking [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove opened a new pull request, #1537:
URL: https://github.com/apache/datafusion-comet/pull/1537

   ## Which issue does this PR close?
   
   
   
   N/A
   
   ## Rationale for this change
   
   
   
   Unify code for `native_datafusion` and `native_iceberg_compat` when checking 
for supported types
   
   ## What changes are included in this PR?
   
   
   
   ## How are these changes tested?
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] chore: Minor code cleanup in native scan type-checking [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove commented on code in PR #1537:
URL: https://github.com/apache/datafusion-comet/pull/1537#discussion_r1997045440


##
spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala:
##
@@ -199,15 +199,15 @@ class CometSparkSessionExtensions
 _,
 _,
 _)
-  if CometScanExec.isFileFormatSupported(fileFormat)
+  if COMET_EXEC_ENABLED.get()
+&& CometConf.isExperimentalNativeScan

Review Comment:
I moved this check earlier because it is much cheaper than checking if the 
schema is supported (especially if the schema is deeply nested)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] chore: Attach Diagnostic to "incompatible type in unary expression" error [datafusion]

2025-03-15 Thread via GitHub



onlyjackfrost commented on PR #15209:
URL: https://github.com/apache/datafusion/pull/15209#issuecomment-2725191499

   @eliaperantoni could I raise another PR for the others unary expressions and 
keep this PR for the PLUS unary expression? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Implement tree explain for `LocalLimitExec` [datafusion]

2025-03-15 Thread via GitHub



alamb commented on PR #15232:
URL: https://github.com/apache/datafusion/pull/15232#issuecomment-2725553415

   Thanks again @shruti2522 and @comphead 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Document guidelines for physical operator yielding [datafusion]

2025-03-15 Thread via GitHub



ozankabak commented on code in PR #15030:
URL: https://github.com/apache/datafusion/pull/15030#discussion_r1984828017


##
datafusion/physical-plan/src/execution_plan.rs:
##
@@ -260,13 +260,30 @@ pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync {
 /// used.
 /// Thus, [`spawn`] is disallowed, and instead use [`SpawnedTask`].
 ///
+/// To enable timely cancellation, the [`Stream`] that is returned must not
+/// pin the CPU and must yield back to the tokio runtime regularly. This 
can

Review Comment:
   Right - but that is the only place AFAICT we do this. So it is (and should 
be) very rare - which the docs ought to mention.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] fix: use common implementation of handling object store and hdfs urls for native_datafusion and native_iceberg_compat [datafusion-comet]

2025-03-15 Thread via GitHub



parthchandra opened a new pull request, #1494:
URL: https://github.com/apache/datafusion-comet/pull/1494

   Addresses part of comments in 
https://github.com/apache/datafusion-comet/pull/1443
   
   Replaces the `register_object_store` method that provided hdfs support with 
a unified `prepare_object_store` method.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] docs: various improvements to tuning guide [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove commented on code in PR #1525:
URL: https://github.com/apache/datafusion-comet/pull/1525#discussion_r1996139134


##
docs/source/user-guide/tuning.md:
##
@@ -17,18 +17,96 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Tuning Guide
+# Comet Tuning Guide
 
 Comet provides some tuning options to help you get the best performance from 
your queries.
 
 ## Memory Tuning
 
-### Unified Memory Management with Off-Heap Memory
+It is necessary to specify how much memory Comet can use in addition to memory 
already allocated to Spark. In some
+cases, it may be possible to reduce the amount of memory allocated to Spark so 
that overall memory allocation is
+the same or lower than the original configuration. In other cases, enabling 
Comet may require allocating more memory
+than before. See the [Determining How Much Memory to Allocate] section for 
more details.
 
-The recommended way to share memory between Spark and Comet is to set 
`spark.memory.offHeap.enabled=true`. This allows
-Comet to share an off-heap memory pool with Spark. The size of the pool is 
specified by `spark.memory.offHeap.size`. For more details about Spark off-heap 
memory mode, please refer to Spark documentation: 
https://spark.apache.org/docs/latest/configuration.html.
+[Determining How Much Memory to Allocate]: 
#determining-how-much-memory-to-allocate
 
-The type of pool can be specified with `spark.comet.exec.memoryPool`.
+Comet supports Spark's on-heap (the default) and off-heap mode for allocating 
memory. However, we strongly recommend
+using off-heap mode. Comet has some limitations when running in on-heap mode, 
such as requiring more memory overall,
+and requiring shuffle memory to be separately configured.
+
+### Configuring Comet Memory in Off-Heap Mode
+
+The recommended way to allocate memory for Comet is to set 
`spark.memory.offHeap.enabled=true`. This allows
+Comet to share an off-heap memory pool with Spark, reducing the overall memory 
overhead. The size of the pool is
+specified by `spark.memory.offHeap.size`. For more details about Spark 
off-heap memory mode, please refer to
+Spark documentation: https://spark.apache.org/docs/latest/configuration.html.
+
+### Configuring Comet Memory in On-Heap Mode
+
+When running in on-heap mode, Comet memory can be allocated by setting 
`spark.comet.memoryOverhead`. If this setting
+is not provided, it will be calculated by multiplying the current Spark 
executor memory by
+`spark.comet.memory.overhead.factor` (default value is `0.2`) which may or may 
not result in enough memory for 
+Comet to operate. It is not recommended to rely on this behavior. It is better 
to specify `spark.comet.memoryOverhead`
+explicitly.
+
+Comet supports native shuffle and columnar shuffle (these terms are explained 
in the [shuffle] section below). 
+In on-heap mode, columnar shuffle memory must be separately allocated using 
`spark.comet.columnar.shuffle.memorySize`. 
+If this setting is not provided, it will be calculated by multiplying 
`spark.comet.memoryOverhead` by
+`spark.comet.columnar.shuffle.memory.factor` (default value is `1.0`). If a 
shuffle exceeds this amount of memory 
+then the query will fail.
+
+[shuffle]: #shuffle
+
+### Determining How Much Memory to Allocate
+
+Generally, increasing the amount of memory allocated to Comet will improve 
query performance by reducing the
+amount of time spent spilling to disk, especially for aggregate, join, and 
shuffle operations. Allocating insufficient
+memory can result in out-of-memory errors. This is no different from 
allocating memory in Spark and the amount of
+memory will vary for different workloads, so some experimentation will be 
required.
+
+Here is a real-world example, based on running benchmarks derived from TPC-H, 
running on a single executor against 
+local Parquet files using the 100 GB data set.
+
+Baseline Spark Performance 
+
+- Spark completes the benchmark in 632 seconds with 8 cores and 8 GB RAM
+- With less than 8 GB RAM, performance degrades due to spilling
+- Spark can complete the benchmark with as little as 3 GB of RAM, but with 
worse performance (744 seconds)
+
+Comet Performance
+
+- Comet requires at least 5 GB of RAM in off-heap mode and 6 GB RAM in On-Heap 
mode, but performance at this level 
+  is around 340 seconds, which is significantly faster than Spark with any 
amount of RAM
+- Comet running in off-heap with 8 cores completes the benchmark in 295 
seconds, more than 2x faster than Spark
+- It is worth noting that running Comet with only 4 cores and 4 GB RAM 
completes the benchmark in 520 seconds, 
+  providing better performance than Spark for half the resource
+
+It may be possible to reduce Comet's memory overhead by reducing batch sizes 
or increasing number of partitions.
+
+### SortExec
+
+Comet's SortExec implementation spills to disk when under memory pressure, but 
there are some known issues in the 
+underlying DataFusion Sor

[PR] Implement tree explain for CoalescePartitionsExec [datafusion]

2025-03-15 Thread via GitHub



Shreyaskr1409 opened a new pull request, #15225:
URL: https://github.com/apache/datafusion/pull/15225

   ## Which issue does this PR close?
   
   - Closes #15195 .
   
   ## Rationale for this change
   
   ## What changes are included in this PR?
   
   Changes explain_tree.slt and coalesce_partitions.rs
   
   ## Are these changes tested?
   
   Yes
   
   ## Are there any user-facing changes?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Change mapping of SQL `VARCHAR` from `Utf8` to `Utf8View` [datafusion]

2025-03-15 Thread via GitHub



zhuqi-lucas commented on issue #15096:
URL: https://github.com/apache/datafusion/issues/15096#issuecomment-2727226225

   New subt_task:
   
   - [ ] Support Utf8View datatype for range queries


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] Simplify display format of `AggregateFunctionExpr` [datafusion]

2025-03-15 Thread via GitHub



irenjj opened a new pull request, #15253:
URL: https://github.com/apache/datafusion/pull/15253

   ## Which issue does this PR close?
   
   
   
   - Closes #15252
   
   ## Rationale for this change
   
   
   
   ## What changes are included in this PR?
   
   
   
   ## Are these changes tested?
   
   
   
   ## Are there any user-facing changes?
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Update version to 46.0.1, add CHANGELOG (#15243) [datafusion]

2025-03-15 Thread via GitHub



xudong963 commented on PR #15244:
URL: https://github.com/apache/datafusion/pull/15244#issuecomment-2726776760

   > 🤔 something seems to be wrong here
   
   yeah, missed something, fixed


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] chore: Minor code cleanup in native scan type checking [datafusion-comet]

2025-03-15 Thread via GitHub



codecov-commenter commented on PR #1537:
URL: 
https://github.com/apache/datafusion-comet/pull/1537#issuecomment-2726800741

   ## 
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1537?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 Report
   Attention: Patch coverage is `0%` with `4 lines` in your changes missing 
coverage. Please review.
   > Project coverage is 57.49%. Comparing base 
[(`f09f8af`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/f09f8af64c6599255e116a376f4f008f2fd63b43?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 to head 
[(`4416704`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/4416704138345fbe5aca3a3937ce205107c119d1?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   > Report is 83 commits behind head on main.
   
   | [Files with missing 
lines](https://app.codecov.io/gh/apache/datafusion-comet/pull/1537?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[...org/apache/comet/CometSparkSessionExtensions.scala](https://app.codecov.io/gh/apache/datafusion-comet/pull/1537?src=pr&el=tree&filepath=spark%2Fsrc%2Fmain%2Fscala%2Forg%2Fapache%2Fcomet%2FCometSparkSessionExtensions.scala&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-c3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9jb21ldC9Db21ldFNwYXJrU2Vzc2lvbkV4dGVuc2lvbnMuc2NhbGE=)
 | 0.00% | [0 Missing and 4 partials :warning: 
](https://app.codecov.io/gh/apache/datafusion-comet/pull/1537?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@ Coverage Diff  @@
   ##   main#1537  +/-   ##
   
   + Coverage 56.12%   57.49%   +1.36% 
   + Complexity  976  971   -5 
   
 Files   119  123   +4 
 Lines 1174312169 +426 
 Branches   2251 2282  +31 
   
   + Hits   6591 6996 +405 
   + Misses 4012 4002  -10 
   - Partials   1140 1171  +31 
   ```
   
   
   
   [:umbrella: View full report in Codecov by 
Sentry](https://app.codecov.io/gh/apache/datafusion-comet/pull/1537?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
   :loudspeaker: Have feedback on the report? [Share it 
here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
   🚀 New features to boost your workflow: 
   
   - ❄ [Test Analytics](https://docs.codecov.com/docs/test-analytics): Detect 
flaky tests, report on failures, and find test suite problems.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Publish official Docker images to Docker Hub under Apache account [datafusion-comet]

2025-03-15 Thread via GitHub



comphead commented on issue #1510:
URL: 
https://github.com/apache/datafusion-comet/issues/1510#issuecomment-2726807145

   thanks @andygrove I think it makes sense to me, Apache Spark also includes 
the platform and Comet has the platform variety as well. Although I'm not sure 
if platforms other than Linux are in demand. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] chore: Minor code cleanup in native scan type checking [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove commented on code in PR #1537:
URL: https://github.com/apache/datafusion-comet/pull/1537#discussion_r1997192540


##
spark/src/test/scala/org/apache/comet/parquet/ParquetReadSuite.scala:
##
@@ -82,11 +82,7 @@ abstract class ParquetReadSuite extends CometTestBase {
   }
 
   test("unsupported Spark types") {
-// for native iceberg compat, CometScanExec supports some types that 
native_comet does not.

Review Comment:
   this is no longer true



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set [datafusion-comet]

2025-03-15 Thread via GitHub



viirya commented on issue #1536:
URL: 
https://github.com/apache/datafusion-comet/issues/1536#issuecomment-2726904269

   Hmm, I think it is a mistake. In the commit 
1cca8d6f7bd2dfb8e1996bdd55ebe09d08eb8221, I transformed `CometScanExec` added 
for fully native scan in `CometScanRule` to `CometNativeScanExec` in  
`CometExecRule`.
   
   Maybe at the moment, I thought that this fully native scan is only used with 
native execution.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Publish official Docker images to Docker Hub under Apache account [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove commented on issue #1510:
URL: 
https://github.com/apache/datafusion-comet/issues/1510#issuecomment-2726915643

   The first image is live. There are a number of security warnings that we 
need to look into.
   
   https://hub.docker.com/r/apache/datafusion-comet/tags


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Support Push down expression evaluation in `TableProviders` [datafusion]

2025-03-15 Thread via GitHub



adriangb commented on issue #14993:
URL: https://github.com/apache/datafusion/issues/14993#issuecomment-2704601029

   > But roughly, DataFusion asks the table provider which expressions it can 
push-down, and the node is configured with both a projection expression and a 
filter expression. Exact same mechanism as filter expressions.
   
   @gatesn I'm in agreement with you, what you are proposing makes total sense


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Release DataFusion `46.0.0` [datafusion]

2025-03-15 Thread via GitHub



alamb commented on issue #14123:
URL: https://github.com/apache/datafusion/issues/14123#issuecomment-2703664954

   > Should 
[ScalarUDFImpl::invoke_batch](https://github.com/apache/datafusion/blob/43ecd9b807877946706628633308f73a4645de1f/datafusion/expr/src/udf.rs#L616)
 be marked as deprecated? (There is already a comment saying that in the 
docstring, but the compiler doesn't catch that 😅 ) That way we'd at least get 
some compile-time warning that we're doing something bad, now there was only 
runtime failures.
   
   I think we should mark it deprecated (though we had bad luck in the past 
with the compiler not complaining about implementing a deprecated trait method 
🤔 )
   
   We also tried to document this in the (new for the first time) upgrade guide:
   
https://datafusion.apache.org/library-user-guide/upgrading.html#use-invoke-with-args-instead-of-invoke-and-invoke-batch
   
   I think we may simply want to remove `invoke_with_batch` / other older 
methods ahead of deprecation schedule to avoid the confusion 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Per file filter evaluation [datafusion]

2025-03-15 Thread via GitHub



adriangb commented on code in PR #15057:
URL: https://github.com/apache/datafusion/pull/15057#discussion_r1988141252


##
datafusion/datasource-parquet/src/source.rs:
##
@@ -559,24 +556,8 @@ impl FileSource for ParquetSource {
 .predicate()
 .map(|p| format!(", predicate={p}"))
 .unwrap_or_default();
-let pruning_predicate_string = self
-.pruning_predicate()
-.map(|pre| {
-let mut guarantees = pre
-.literal_guarantees()
-.iter()
-.map(|item| format!("{}", item))
-.collect_vec();
-guarantees.sort();
-format!(
-", pruning_predicate={}, required_guarantees=[{}]",
-pre.predicate_expr(),
-guarantees.join(", ")
-)
-})
-.unwrap_or_default();
 
-write!(f, "{}{}", predicate_string, pruning_predicate_string)

Review Comment:
   This is what's causing tests to fail. Tests assert against the formatting of 
a ParquetSource and accessing it's pruning predicate method. I'm not sure if we 
should rewrite the tests and somehow make the generated predicates accessible, 
bin them, etc.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] feat: `INSERT INTO` support [datafusion-ballista]

2025-03-15 Thread via GitHub



milenkovicm closed pull request #1177: feat: `INSERT INTO` support
URL: https://github.com/apache/datafusion-ballista/pull/1177


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] Minor: More comment to aggregation fuzzer [datafusion]

2025-03-15 Thread via GitHub



2010YOUY01 opened a new pull request, #15048:
URL: https://github.com/apache/datafusion/pull/15048

   ## Which issue does this PR close?
   
   
   
   - Closes #.
   
   ## Rationale for this change
   
   
   I'm thinking about enhancing the sort fuzzer, so I checked our nice 
aggregate fuzzer to find reusable parts. While doing so, I also took the 
opportunity to add more comments to it.
   
   ## What changes are included in this PR?
   More comment to the aggregation fuzzer.
   
   
   ## Are these changes tested?
   NA
   
   
   ## Are there any user-facing changes?
   NO
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-15 Thread via GitHub



alamb commented on code in PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#discussion_r1996808102


##
datafusion/core/tests/dataframe/mod.rs:
##
@@ -1571,14 +1571,18 @@ async fn with_column_join_same_columns() -> Result<()> {
 
 assert_snapshot!(
 df_with_column.logical_plan(),
-@r###"
+@r"
 Projection: t1.c1, t2.c1, Boolean(true) AS new_column
   Limit: skip=0, fetch=1
 Sort: t1.c1 ASC NULLS FIRST
   Inner Join: t1.c1 = t2.c1
-TableScan: t1
-TableScan: t2
-"###
+SubqueryAlias: t1

Review Comment:
   I am concerned about this change -- it seems like a regression to me. 
   
   Maybe it is ok, but I don't understand the difference/ why there are a bunch 
of projections / SubqueryAlias nows
   
   If these are fixed by physical plan time it is fine, but from here it looks 
like something is wrong



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] Migrate user_defined tests to insta [datafusion]

2025-03-15 Thread via GitHub



shruti2522 opened a new pull request, #15255:
URL: https://github.com/apache/datafusion/pull/15255

   ## Which issue does this PR close?
   
   
   
   - Closes #15247 .
   
   ## Rationale for this change
   
   
   
   ## What changes are included in this PR?
   migrate tests in `datafusion/core/tests/user_defined` to insta
   
   
   
   ## Are these changes tested?
   
   
   
   ## Are there any user-facing changes?
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] Reanimate Code Coverage [datafusion]

2025-03-15 Thread via GitHub



blaginin opened a new pull request, #15256:
URL: https://github.com/apache/datafusion/pull/15256

   ## Which issue does this PR close?
   
   Related to 
https://github.com/apache/datafusion/issues/15155#issuecomment-2715406304
   
   ## Rationale for this change
   
   We have quite a lot of code that isn't covered in tests. And it's actually 
easy to miss covering some case when you submit a big PR
   
   ## What changes are included in this PR?
   
   Now, when a new PR introduces untested code, a warning message will be 
posted. Example: 
[here](https://github.com/blaginin/datafusion/pull/4#issuecomment-2726555358).
   
   https://github.com/user-attachments/assets/4ccdbc37-d0d1-4d4c-bd9e-74b275454799";
 />
   
   Coverage data is aggregated from:
   - `linux-test`
   - `linux-test-datafusion-cli`
   - `sqllogictest-postgres`
   - `macos-aarch64`
   - `test-datafusion-pyarrow`
   
   Excluded from coverage:
   - `datafusion-examples` — not technically tests.
   - `verify-benchmark-results` — IMO we shouldn't rely on them for 
functionality tests, those should be covered on the unit test level
   - `doctests` — because they requite nightly rust and I 
[couldn't](https://github.com/blaginin/datafusion/actions/runs/13876294634) 
(yet) make it work in the CI (although will try more)
   
   There are ways to make checks even more annoying (e.g. putting ❌ in CI 
checks). I want to do this later, but first, I want to make sure we're 
measuring coverage properly and others are okay with it. If that’s the case, 
we’ll tighten CI checks in future PRs. 
   
   Absolute coverage example: 
https://app.codecov.io/gh/blaginin/datafusion/pull/4/tree
   
   ## Are there any user-facing changes?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Reanimate Code Coverage [datafusion]

2025-03-15 Thread via GitHub



blaginin commented on PR #15256:
URL: https://github.com/apache/datafusion/pull/15256#issuecomment-2727009428

   To make this work, we'll need to 
[add](https://docs.codecov.com/docs/adding-the-codecov-token) `CODECOV_TOKEN` 
and add the [app](https://github.com/apps/codecov) to the repo (may alreacy be 
there since we used it before) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Improved CI test coverage for rust features [datafusion]

2025-03-15 Thread via GitHub



blaginin commented on issue #15155:
URL: https://github.com/apache/datafusion/issues/15155#issuecomment-2727008592

   WIP PoC is [here](https://github.com/apache/datafusion/pull/15256). The way 
I think this should work is here: 
https://github.com/blaginin/datafusion/pull/4#issuecomment-2726555358


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub



timsaucer commented on PR #1062:
URL: 
https://github.com/apache/datafusion-python/pull/1062#issuecomment-2727021187

   I’m not sure then. I’m away for the weekend and won’t be able to test until 
Monday. I can see then if I can reproduce.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub



Spaarsh commented on PR #1062:
URL: 
https://github.com/apache/datafusion-python/pull/1062#issuecomment-2727025175

   The errors are due to one of the rules I enabled in one of my commits 
`SIM102`. Since I have some 10 other ruff rules to enable anyway, should I 
disable this one for now?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Expose to `GroupsAccumulator` whether all the groups are sorted [datafusion]

2025-03-15 Thread via GitHub



akurmustafa commented on issue #14991:
URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2727036019

   
   > [@akurmustafa](https://github.com/akurmustafa) we can add it to the post 
now. Or we could always make another post "Using `WITH ORDER` with 
DataFusion..."
   > 
   > I would be happy to help write it :)
   
   I can add this to the original post as a new section, will let you know when 
it is ready for review.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub



timsaucer commented on PR #1062:
URL: 
https://github.com/apache/datafusion-python/pull/1062#issuecomment-2727017583

   You may be using a different version of ruff. These do change from time to 
time as new lints get added in.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub



Spaarsh commented on PR #1062:
URL: 
https://github.com/apache/datafusion-python/pull/1062#issuecomment-2727017123

   I don't understand why the ruff tests are failing. My local ruff check shows 
no errors.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub



Spaarsh commented on PR #1062:
URL: 
https://github.com/apache/datafusion-python/pull/1062#issuecomment-2727019630

   The workflow is using `0.9.1` and I `pip install` the same. Still I don't 
see any errors on my local tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove commented on issue #1536:
URL: 
https://github.com/apache/datafusion-comet/issues/1536#issuecomment-2727044778

   Thanks @viirya. Yes, fully native scan is only used with native execution. 
Maybe the design is correct as it is.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Enable `used_underscore_binding` clippy lint [datafusion]

2025-03-15 Thread via GitHub



alamb merged PR #15189:
URL: https://github.com/apache/datafusion/pull/15189


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Expose to `GroupsAccumulator` whether all the groups are sorted [datafusion]

2025-03-15 Thread via GitHub



alamb commented on issue #14991:
URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2726553041

   Here is the example / test case from @asubiotto  in 
https://github.com/apache/datafusion/issues/14991#issuecomment-2720197770
   
   
   In the following plan,  both `Aggregate` should have `ordering_mode=Sorted ` 
   
   So instead of
   ```
   Aggregate: groupBy=[[unnested.generated_id]], 
aggr=[[array_agg(unnested.ar)]] 



 
   
   ```
   It should be
   
   ```
   AggregateExec: mode=FinalPartitioned, gby=[generated_id@0 as generated_id], 
aggr=[sum(unnested.ar)], ordering_mode=Sorted   



  
   ```
   
   Here is the full query
   
   ```sql
   EXPLAIN WITH unnested AS (SELECT
   ROW_NUMBER() OVER () AS generated_id,
   unnest(array[value]) as ar
 FROM range(1,5)) SELECT array_agg(ar) FROM unnested group by generated_id;
   
+---+---+
   | plan_type | plan   




   |
   
+---+---+
   | logical_plan  | Projection: array_agg(unnested.ar) 




   |
   |   |   Aggregate: groupBy=[[unnested.generated_id]], 
aggr=[[array_agg(unnested.ar)]] 



  |
   |   | SubqueryAlias: unnested




   |
   |   |   Projection: generated_id, 
__unnest_placeholder(make_array(tmp_table.value),depth=1) AS 
UNNEST(make_array(tmp_table.value)) AS ar   



 |
   |   | Unnest: 
lists[__unnest_placeholder(make_array(tmp_table.value))|depth=1] structs[]

Re: [PR] chore: revert "Upgrade to Spark 3.5.4 (#1471)" [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove commented on PR #1493:
URL: 
https://github.com/apache/datafusion-comet/pull/1493#issuecomment-2711566564

   > @andygrove I was going to put a rpad fix in 
https://github.com/apache/datafusion-comet/pull/1482/files but if we decide to 
revert, I can separate the PR. Let me know.
   
   I am fine with either approach. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Only unnest source for `EmptyRelation` [datafusion]

2025-03-15 Thread via GitHub



blaginin commented on code in PR #15159:
URL: https://github.com/apache/datafusion/pull/15159#discussion_r1997328648


##
datafusion/sql/src/unparser/plan.rs:
##
@@ -377,8 +377,17 @@ impl Unparser<'_> {
 };
 if self.dialect.unnest_as_table_factor() && 
unnest_input_type.is_some() {
 if let LogicalPlan::Unnest(unnest) = &p.input.as_ref() {
-return self
-.unnest_to_table_factor_sql(unnest, query, select, 
relation);
+if let LogicalPlan::Projection(projection) = 
unnest.input.as_ref()
+{
+if matches!(
+projection.input.as_ref(),
+LogicalPlan::EmptyRelation(_)

Review Comment:
   Thank you, added in 
https://github.com/apache/datafusion/pull/15159/commits/28f53014482399c6f74ac2bfda464de712705663



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Different unnests on `plan_to_sql` are merged [datafusion]

2025-03-15 Thread via GitHub



blaginin commented on issue #15128:
URL: https://github.com/apache/datafusion/issues/15128#issuecomment-2727002296

   I also don't see other cases... They are possible in theory, and I guess by 
the time we get to them, we'll need to refactor the unparser quite thoroughly 🥲


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] build(deps): bump uuid from 1.13.1 to 1.16.0 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] opened a new pull request, #1068:
URL: https://github.com/apache/datafusion-python/pull/1068

   Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.13.1 to 1.16.0.
   
   Release notes
   Sourced from https://github.com/uuid-rs/uuid/releases";>uuid's releases.
   
   v1.16.0
   What's Changed
   
   Mark Uuid::new_v8 const by https://github.com/tguichaoua";>@tguichaoua in https://redirect.github.com/uuid-rs/uuid/pull/815";>uuid-rs/uuid#815
   Prepare for 1.16.0 release by https://github.com/KodrAus";>@KodrAus in https://redirect.github.com/uuid-rs/uuid/pull/817";>uuid-rs/uuid#817
   
   New Contributors
   
   https://github.com/tguichaoua";>@tguichaoua 
made their first contribution in https://redirect.github.com/uuid-rs/uuid/pull/815";>uuid-rs/uuid#815
   
   Full Changelog: https://github.com/uuid-rs/uuid/compare/v1.15.1...v1.16.0";>https://github.com/uuid-rs/uuid/compare/v1.15.1...v1.16.0
   v1.15.1
   What's Changed
   
   Guarantee v7 timestamp will never overflow by https://github.com/KodrAus";>@KodrAus in https://redirect.github.com/uuid-rs/uuid/pull/811";>uuid-rs/uuid#811
   Prepare for 1.15.1 release by https://github.com/KodrAus";>@KodrAus in https://redirect.github.com/uuid-rs/uuid/pull/812";>uuid-rs/uuid#812
   
   Full Changelog: https://github.com/uuid-rs/uuid/compare/v1.15.0...v1.15.1";>https://github.com/uuid-rs/uuid/compare/v1.15.0...v1.15.1
   v1.15.0
   What's Changed
   
   Add a manual Debug implementation for NonNilUUid by https://github.com/rick-de-water";>@rick-de-water in https://redirect.github.com/uuid-rs/uuid/pull/808";>uuid-rs/uuid#808
   Support higher precision, shiftable timestamps in V7 UUIDs by https://github.com/KodrAus";>@KodrAus in https://redirect.github.com/uuid-rs/uuid/pull/809";>uuid-rs/uuid#809
   Prepare for 1.15.0 release by https://github.com/KodrAus";>@KodrAus in https://redirect.github.com/uuid-rs/uuid/pull/810";>uuid-rs/uuid#810
   
   New Contributors
   
   https://github.com/rick-de-water";>@rick-de-water made 
their first contribution in https://redirect.github.com/uuid-rs/uuid/pull/808";>uuid-rs/uuid#808
   
   Full Changelog: https://github.com/uuid-rs/uuid/compare/v1.14.0...v1.15.0";>https://github.com/uuid-rs/uuid/compare/v1.14.0...v1.15.0
   v1.14.0
   What's Changed
   
   Add FromStr impls to the fmt structs by https://github.com/tysen";>@tysen in https://redirect.github.com/uuid-rs/uuid/pull/806";>uuid-rs/uuid#806
   Prepare for 1.14.0 release by https://github.com/KodrAus";>@KodrAus in https://redirect.github.com/uuid-rs/uuid/pull/807";>uuid-rs/uuid#807
   
   New Contributors
   
   https://github.com/tysen";>@tysen made their 
first contribution in https://redirect.github.com/uuid-rs/uuid/pull/806";>uuid-rs/uuid#806
   
   Full Changelog: https://github.com/uuid-rs/uuid/compare/v1.13.2...v1.14.0";>https://github.com/uuid-rs/uuid/compare/v1.13.2...v1.14.0
   v1.13.2
   What's Changed
   
   Add a compile_error when no source of randomness is available on 
wasm32-unknown-unknown by https://github.com/KodrAus";>@KodrAus in https://redirect.github.com/uuid-rs/uuid/pull/804";>uuid-rs/uuid#804
   Prepare for 1.13.2 release by https://github.com/KodrAus";>@KodrAus in https://redirect.github.com/uuid-rs/uuid/pull/805";>uuid-rs/uuid#805
   
   Full Changelog: https://github.com/uuid-rs/uuid/compare/1.13.1...v1.13.2";>https://github.com/uuid-rs/uuid/compare/1.13.1...v1.13.2
   
   
   
   Commits
   
   https://github.com/uuid-rs/uuid/commit/c36beb14d50f835c1f1220117ca51aae64860a3e";>c36beb1
 Merge pull request https://redirect.github.com/uuid-rs/uuid/issues/817";>#817 from 
uuid-rs/cargo/v1.16.0
   https://github.com/uuid-rs/uuid/commit/5338b246b7a8244cab3cfaa85b14fe1d1bcdcd96";>5338b24
 prepare for 1.16.0 release
   https://github.com/uuid-rs/uuid/commit/420f6279aeff48f0e12b0b39af43a5c149963382";>420f627
 Merge pull request https://redirect.github.com/uuid-rs/uuid/issues/815";>#815 from 
tguichaoua/new_v8_const
   https://github.com/uuid-rs/uuid/commit/254258c8c7c7d6c41aaf6f573dc1731549d519b2";>254258c
 mark Uuid::new_v8 const
   https://github.com/uuid-rs/uuid/commit/4e5b88e7af12f06ea526088506752c450dc991e3";>4e5b88e
 Merge pull request https://redirect.github.com/uuid-rs/uuid/issues/812";>#812 from 
uuid-rs/cargo/v1.15.1
   https://github.com/uuid-rs/uuid/commit/7fb64f78c745fe46e209f0f5a50883b711e25f04";>7fb64f7
 prepare for 1.15.1 release
   https://github.com/uuid-rs/uuid/commit/f05b6df98e8d521eecc80dc7923c9b38e2dff634";>f05b6df
 Merge pull request https://redirect.github.com/uuid-rs/uuid/issues/811";>#811 from 
uuid-rs/fix/v7-overflow
   https://github.com/uuid-rs/uuid/commit/c2d313fbbb3157c186f0511c9e2a914d174f258f";>c2d313f
 guarantee v7 timestamp will never overflow
   https://github.com/uuid-rs/uuid/commit/56ba68ff13983f3917263b86e06a81c00ee97a3d";>56ba68f
 Merge pull request https://redirect.github.com/uuid-rs/uuid/issues/810";>#810 from 
uuid-rs/cargo/v1.15.0
   https://github.com/uuid-rs/uuid/commit/26c8a9

[PR] build(deps): bump pyo3-build-config from 0.23.4 to 0.24.0 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] opened a new pull request, #1067:
URL: https://github.com/apache/datafusion-python/pull/1067

   Bumps [pyo3-build-config](https://github.com/pyo3/pyo3) from 0.23.4 to 
0.24.0.
   
   Release notes
   Sourced from https://github.com/pyo3/pyo3/releases";>pyo3-build-config's 
releases.
   
   PyO3 0.24.0
   This release is an incremental improvement of refinements and 
optimizations following the new APIs established in PyO3's last few 
releases.
   Support for jiff datetime conversions have been added, and 
also UUID conversions.
   The FromPyObject derive macro has gained new 
#[pyo3(default = ...)] and #[pyo3(rename_all = ...)] 
options, and the IntoPyObject derive macro has gained a new 
#[pyo3(into_py_with = ...)] option.
   PyO3 will now pass positional arguments to Python functions using the 
"vectorcall" protocol in many cases, which should be an optimization 
over the previous behaviour (of creating a Python tuple of positional 
arguments).
   Many methods on iterators of Python collections have been optimized.
   There are also many other incremental improvements, bug fixes and smaller 
features.
   Thank you to everyone who contributed code, documentation, design ideas, 
bug reports, and feedback. The following contributors' commits are included in 
this release:
   https://github.com/0x676e67";>@0x676e67
   https://github.com/alex";>@alex
   https://github.com/arielb1";>@arielb1
   https://github.com/bschoenmaeckers";>@bschoenmaeckers
   https://github.com/davidhewitt";>@davidhewitt
   https://github.com/dependabot";>@dependabot[bot]
   https://github.com/eltociear";>@eltociear
   https://github.com/Icxolu";>@Icxolu
   https://github.com/IvanIsCoding";>@IvanIsCoding
   https://github.com/JeanArhancet";>@JeanArhancet
   https://github.com/kahojyun";>@kahojyun
   https://github.com/kemingy";>@kemingy
   https://github.com/kylebarron";>@kylebarron
   https://github.com/LifeLex";>@LifeLex
   https://github.com/LilyFoote";>@LilyFoote
   https://github.com/lundybernard";>@lundybernard
   https://github.com/matt-codecov";>@matt-codecov
   https://github.com/mattip";>@mattip
   https://github.com/mejrs";>@mejrs
   https://github.com/messense";>@messense
   https://github.com/msimacek";>@msimacek
   https://github.com/ngoldbaum";>@ngoldbaum
   https://github.com/nicolasavru";>@nicolasavru
   https://github.com/Owen-CH-Leung";>@Owen-CH-Leung
   https://github.com/peterjoel";>@peterjoel
   https://github.com/SilverBzH";>@SilverBzH
   https://github.com/Tpt";>@Tpt
   https://github.com/yoav-orca";>@yoav-orca
   https://github.com/yotamofek";>@yotamofek
   PyO3 0.23.5
   This release is a final set of backports onto the PyO3 0.23 series:
   
   PyPy 3.11 support
   Fixes to #[pyclass(freelist)] on free-threaded Python 
3.13
   Fix to Python::run for a case when 
__builtins__ is not loaded correctly on Python 3.10+
   
   
   
   ... (truncated)
   
   
   Changelog
   Sourced from https://github.com/PyO3/pyo3/blob/main/CHANGELOG.md";>pyo3-build-config's 
changelog.
   
   [0.24.0] - 2025-03-09
   Packaging
   
   Add supported CPython/PyPy versions to cargo package metadata. https://redirect.github.com/PyO3/pyo3/pull/4756";>#4756
   Bump target-lexicon dependency to 0.13. https://redirect.github.com/PyO3/pyo3/pull/4822";>#4822
   Add optional jiff dependency to add conversions for 
jiff datetime types. https://redirect.github.com/PyO3/pyo3/pull/4823";>#4823
   Bump minimum supported inventory version to 0.3.5. https://redirect.github.com/PyO3/pyo3/pull/4954";>#4954
   
   Added
   
   Add PyIterator::send method to allow sending values into a 
python generator. https://redirect.github.com/PyO3/pyo3/pull/4746";>#4746
   Add PyCallArgs trait for passing arguments into the Python 
calling protocol. This enabled using a faster calling convention for certain 
types, improving performance. https://redirect.github.com/PyO3/pyo3/pull/4768";>#4768
   Add #[pyo3(default = ...'] option for 
#[derive(FromPyObject)] to set a default value for extracted 
fields of named structs. https://redirect.github.com/PyO3/pyo3/pull/4829";>#4829
   Add #[pyo3(into_py_with = ...)] option for 
#[derive(IntoPyObject, IntoPyObjectRef)]. https://redirect.github.com/PyO3/pyo3/pull/4850";>#4850
   Add uuid to/from python conversions. https://redirect.github.com/PyO3/pyo3/pull/4864";>#4864
   Add FFI definitions PyThreadState_GetFrame and 
PyFrame_GetBack. https://redirect.github.com/PyO3/pyo3/pull/4866";>#4866
   Optimize last for BoundListIterator, 
BoundTupleIterator and BorrowedTupleIterator. https://redirect.github.com/PyO3/pyo3/pull/4878";>#4878
   Optimize Iterator::count() for PyDict, 
PyList, PyTuple & PySet. https://redirect.github.com/PyO3/pyo3/pull/4878";>#4878
   Optimize nth, nth_back, 
advance_by and advance_back_by for 
BoundTupleIterator https://redirect.github.com/PyO3/pyo3/pull/4897";>#4897
   Add support for types.GenericAlias as 
pyo3::types::PyGenericAlias. https://redirect.

[PR] build(deps): bump tokio from 1.43.0 to 1.44.1 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] opened a new pull request, #1069:
URL: https://github.com/apache/datafusion-python/pull/1069

   Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.43.0 to 1.44.1.
   
   Release notes
   Sourced from https://github.com/tokio-rs/tokio/releases";>tokio's releases.
   
   Tokio v1.44.1
   1.44.1 (March 13th, 2025)
   Fixed
   
   rt: skip defer queue in block_in_place context (https://redirect.github.com/tokio-rs/tokio/issues/7216";>#7216)
   
   https://redirect.github.com/tokio-rs/tokio/issues/7216";>#7216: https://redirect.github.com/tokio-rs/tokio/pull/7216";>tokio-rs/tokio#7216
   Tokio v1.44.0
   1.44.0 (March 7th, 2025)
   This release changes the from_std method on sockets to panic 
if a blocking socket is provided. We determined this change is not a breaking 
change as Tokio is not intended to operate using blocking sockets. Doing so 
results in runtime hangs and should be considered a bug. Accidentally passing a 
blocking socket to Tokio is one of the most common user mistakes. If this 
change causes an issue for you, please comment on https://redirect.github.com/tokio-rs/tokio/issues/7172";>#7172.
   Added
   
   coop: add task::coop module (https://redirect.github.com/tokio-rs/tokio/issues/7116";>#7116)
   process: add Command::get_kill_on_drop() (https://redirect.github.com/tokio-rs/tokio/issues/7086";>#7086)
   sync: add broadcast::Sender::closed (https://redirect.github.com/tokio-rs/tokio/issues/6685";>#6685, https://redirect.github.com/tokio-rs/tokio/issues/7090";>#7090)
   sync: add broadcast::WeakSender (https://redirect.github.com/tokio-rs/tokio/issues/7100";>#7100)
   sync: add oneshot::Receiver::is_empty() (https://redirect.github.com/tokio-rs/tokio/issues/7153";>#7153)
   sync: add oneshot::Receiver::is_terminated() (https://redirect.github.com/tokio-rs/tokio/issues/7152";>#7152)
   
   Fixed
   
   fs: empty reads on File should not start a background read 
(https://redirect.github.com/tokio-rs/tokio/issues/7139";>#7139)
   process: calling start_kill on exited child should not fail 
(https://redirect.github.com/tokio-rs/tokio/issues/7160";>#7160)
   signal: fix CTRL_CLOSE, CTRL_LOGOFF, 
CTRL_SHUTDOWN on windows (https://redirect.github.com/tokio-rs/tokio/issues/7122";>#7122)
   sync: properly handle panic during mpsc drop (https://redirect.github.com/tokio-rs/tokio/issues/7094";>#7094)
   
   Changes
   
   runtime: clean up magic number in registration set (https://redirect.github.com/tokio-rs/tokio/issues/7112";>#7112)
   coop: make coop yield using waker defer strategy (https://redirect.github.com/tokio-rs/tokio/issues/7185";>#7185)
   macros: make select! budget-aware (https://redirect.github.com/tokio-rs/tokio/issues/7164";>#7164)
   net: panic when passing a blocking socket to from_std (https://redirect.github.com/tokio-rs/tokio/issues/7166";>#7166)
   io: clean up buffer casts (https://redirect.github.com/tokio-rs/tokio/issues/7142";>#7142)
   
   Changes to unstable APIs
   
   rt: add before and after task poll callbacks (https://redirect.github.com/tokio-rs/tokio/issues/7120";>#7120)
   tracing: make the task tracing API unstable public (https://redirect.github.com/tokio-rs/tokio/issues/6972";>#6972)
   
   Documented
   
   docs: fix nesting of sections in top-level docs (https://redirect.github.com/tokio-rs/tokio/issues/7159";>#7159)
   fs: rename symlink and hardlink parameter names (https://redirect.github.com/tokio-rs/tokio/issues/7143";>#7143)
   io: swap reader/writer in simplex doc test (https://redirect.github.com/tokio-rs/tokio/issues/7176";>#7176)
   macros: docs about select! alternatives (https://redirect.github.com/tokio-rs/tokio/issues/7110";>#7110)
   net: rename the argument for send_to (https://redirect.github.com/tokio-rs/tokio/issues/7146";>#7146)
   
   
   
   ... (truncated)
   
   
   Commits
   
   https://github.com/tokio-rs/tokio/commit/d413c9c02af8f2b4fea14b769b86484b12f46595";>d413c9c
 chore: prepare Tokio v1.44.1 (https://redirect.github.com/tokio-rs/tokio/issues/7217";>#7217)
   https://github.com/tokio-rs/tokio/commit/addbfb9204be25a8621feb3f20b44a7c1f00edbd";>addbfb9
 rt: skip defer queue in block_in_place context (https://redirect.github.com/tokio-rs/tokio/issues/7216";>#7216)
   https://github.com/tokio-rs/tokio/commit/8182ecf2628d5e80dac52b8ed1ea466dbb0925b9";>8182ecf
 chore: prepare Tokio v1.44.0 (https://redirect.github.com/tokio-rs/tokio/issues/7202";>#7202)
   https://github.com/tokio-rs/tokio/commit/a258bff7018940b438e5de3fb846588454df4e4d";>a258bff
 ci: enable printing in multi thread loom tests (https://redirect.github.com/tokio-rs/tokio/issues/7200";>#7200)
   https://github.com/tokio-rs/tokio/commit/e076d21f679a35ae2697165d46d111285d09e3b4";>e076d21
 process: clarify Child::kill behavior (https://redirect.github.com/tokio-rs/tokio/issues/7162";>#7162)
   https://github.com/tokio-rs/tokio/commit/042433cdccdf0dd33408c1751a80ddd50a077872";>042433c
 net: debug_assert on creating a tokio socket from a blocking on

[PR] build(deps): bump async-trait from 0.1.86 to 0.1.88 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] opened a new pull request, #1070:
URL: https://github.com/apache/datafusion-python/pull/1070

   Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.86 to 
0.1.88.
   
   Release notes
   Sourced from https://github.com/dtolnay/async-trait/releases";>async-trait's 
releases.
   
   0.1.88
   
   Fix lifetime bounding on generic parameters that have cfg (https://redirect.github.com/dtolnay/async-trait/issues/289";>#289)
   
   0.1.87
   
   Documentation improvements
   
   
   
   
   Commits
   
   https://github.com/dtolnay/async-trait/commit/b3a59195c29c5b336490cec1bac23cff8d3e4483";>b3a5919
 Release 0.1.88
   https://github.com/dtolnay/async-trait/commit/a306be84ec998f46acc700e8b24a3b68b77a873a";>a306be8
 Merge pull request https://redirect.github.com/dtolnay/async-trait/issues/289";>#289 from 
dtolnay/cfg
   https://github.com/dtolnay/async-trait/commit/d3059849a4024425f80f0713bc802d8959290d96";>d305984
 Fix lifetime bounding on generic parameters that have cfg
   https://github.com/dtolnay/async-trait/commit/78506f17149e08594c1a120f1df828411772a0b8";>78506f1
 Add regression test for issue 288
   https://github.com/dtolnay/async-trait/commit/a11384eec60634098f66a3d6ac89c23beccdbbc8";>a11384e
 Add issue 283 link in test
   https://github.com/dtolnay/async-trait/commit/32540aadec088a87eeb88f7688ec5b211a0d5167";>32540aa
 Release 0.1.87
   https://github.com/dtolnay/async-trait/commit/137d14caf3fda9b066384e32e883f2c9bddcbf63";>137d14c
 Resolve mem_replace_with_default clippy lint
   https://github.com/dtolnay/async-trait/commit/45fd82a71eca47ddbd0627546fb1ba7489e51712";>45fd82a
 Ignore elidable_lifetime_names pedantic clippy lint
   https://github.com/dtolnay/async-trait/commit/ea2f2a29a29b71f36479ee99d3d51953f80e44d8";>ea2f2a2
 Point standard library links to stable
   https://github.com/dtolnay/async-trait/commit/3b78161de94175ffa72c6d3a5451415b70cace51";>3b78161
 Update ui test suite to nightly-2025-02-12
   See full diff in https://github.com/dtolnay/async-trait/compare/0.1.86...0.1.88";>compare 
view
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=async-trait&package-manager=cargo&previous-version=0.1.86&new-version=0.1.88)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show  ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] build(deps): bump async-trait from 0.1.86 to 0.1.87 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] closed pull request #1046: build(deps): bump async-trait from 
0.1.86 to 0.1.87
URL: https://github.com/apache/datafusion-python/pull/1046


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] build(deps): bump object_store from 0.11.2 to 0.12.0 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] opened a new pull request, #1071:
URL: https://github.com/apache/datafusion-python/pull/1071

   Bumps [object_store](https://github.com/apache/arrow-rs) from 0.11.2 to 
0.12.0.
   
   Changelog
   Sourced from https://github.com/apache/arrow-rs/blob/main/CHANGELOG-old.md";>object_store's
 changelog.
   
   
   Historical Changelog
   https://github.com/apache/arrow-rs/tree/54.2.0";>54.2.0 
(2025-02-12)
   https://github.com/apache/arrow-rs/compare/54.1.0...54.2.0";>Full 
Changelog
   Implemented enhancements:
   
   Casting from Utf8View to Dict(k, Utf8View) https://redirect.github.com/apache/arrow-rs/issues/7114";>#7114
   Support creating map arrays with key metadata https://redirect.github.com/apache/arrow-rs/issues/7100";>#7100 [https://github.com/apache/arrow-rs/labels/arrow";>arrow]
   [parquet] Print Parquet BasicTypeInfo id when present https://redirect.github.com/apache/arrow-rs/issues/7081";>#7081 [https://github.com/apache/arrow-rs/labels/parquet";>parquet]
   Add arrow-ipc benchmarks for the IPC reader and writer https://redirect.github.com/apache/arrow-rs/issues/6968";>#6968 [https://github.com/apache/arrow-rs/labels/arrow";>arrow]
   
   Fixed bugs:
   
   NullBufferBuilder::allocated_size Returns Size in Bits https://redirect.github.com/apache/arrow-rs/issues/7121";>#7121 [https://github.com/apache/arrow-rs/labels/arrow";>arrow]
   [Regression in 54.0.0]. Decimal cast to smaller precision gives invalid 
(off-by-one) result in some cases https://redirect.github.com/apache/arrow-rs/issues/7069";>#7069 [https://github.com/apache/arrow-rs/labels/arrow";>arrow]
   Minor: Fix deprecated note to point to the correct const https://redirect.github.com/apache/arrow-rs/issues/7067";>#7067 [https://github.com/apache/arrow-rs/labels/arrow";>arrow]
   incorrect error message for reading definition levels https://redirect.github.com/apache/arrow-rs/issues/7056";>#7056 [https://github.com/apache/arrow-rs/labels/parquet";>parquet]
   First None in ListArray panics in cast_with_options https://redirect.github.com/apache/arrow-rs/issues/7043";>#7043 [https://github.com/apache/arrow-rs/labels/arrow";>arrow]
   
   Documentation updates:
   
   Minor: Clarify documentation on 
NullBufferBuilder::allocated_size https://redirect.github.com/apache/arrow-rs/pull/7089";>#7089 [https://github.com/apache/arrow-rs/labels/arrow";>arrow] (https://github.com/alamb";>alamb)
   Minor: Update release schedule https://redirect.github.com/apache/arrow-rs/pull/7086";>#7086 (https://github.com/alamb";>alamb)
   Improve ListArray documentation for slices https://redirect.github.com/apache/arrow-rs/pull/7039";>#7039 [https://github.com/apache/arrow-rs/labels/arrow";>arrow] (https://github.com/alamb";>alamb)
   
   Merged pull requests:
   
   fix: NullBufferBuilder::allocated_size should return Size in Bytes https://redirect.github.com/apache/arrow-rs/pull/7122";>#7122 [https://github.com/apache/arrow-rs/labels/arrow";>arrow] (https://github.com/shuozel";>shuozel)
   minor: fix deprecated_note https://redirect.github.com/apache/arrow-rs/pull/7105";>#7105 [https://github.com/apache/arrow-rs/labels/arrow";>arrow] (https://github.com/Chen-Yuan-Lai";>Chen-Yuan-Lai)
   
   
   
   ... (truncated)
   
   
   Commits
   
   See full diff in https://github.com/apache/arrow-rs/commits";>compare view
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=object_store&package-manager=cargo&previous-version=0.11.2&new-version=0.12.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show  ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore

Re: [PR] build(deps): bump async-trait from 0.1.86 to 0.1.87 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] commented on PR #1046:
URL: 
https://github.com/apache/datafusion-python/pull/1046#issuecomment-2726951093

   Superseded by #1070.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] build(deps): bump uuid from 1.13.1 to 1.15.1 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] commented on PR #1039:
URL: 
https://github.com/apache/datafusion-python/pull/1039#issuecomment-2726950949

   Superseded by #1068.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] build(deps): bump tokio from 1.43.0 to 1.44.0 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] commented on PR #1047:
URL: 
https://github.com/apache/datafusion-python/pull/1047#issuecomment-2726951020

   Superseded by #1069.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] build(deps): bump tokio from 1.43.0 to 1.44.0 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] closed pull request #1047: build(deps): bump tokio from 1.43.0 
to 1.44.0
URL: https://github.com/apache/datafusion-python/pull/1047


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] build(deps): bump uuid from 1.13.1 to 1.15.1 [datafusion-python]

2025-03-15 Thread via GitHub



dependabot[bot] closed pull request #1039: build(deps): bump uuid from 1.13.1 
to 1.15.1
URL: https://github.com/apache/datafusion-python/pull/1039


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] [DISCUSSION] physical-plan-common crate ~and Revert the datasource - physical-plan Dependency~ [datafusion]

2025-03-15 Thread via GitHub



berkaysynnada commented on issue #15111:
URL: https://github.com/apache/datafusion/issues/15111#issuecomment-2726959050

   > physical-plan is not possible to import `datasource`, you would end up 
moving everything inside physical-plan.
   
   Why are you thinking so? If you worry about the catalog, when we have 
physical-plan-common, we can easily convert physical-plan dependency of catalog 
crate to physical-plan-common.
   
   ```mermaid
   flowchart TD
   physical-plan["physical-plan"] --> 
physical-plan-common["physical-plan-common"] & datasource["datasource"]
   datasource --> physical-plan-common
   n1["catalog"] --> physical-plan-common
   datasource --> n1
   
   n1@{ shape: rect}
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Order Requirement Analysis [datafusion-site]

2025-03-15 Thread via GitHub



Omega359 commented on code in PR #58:
URL: https://github.com/apache/datafusion-site/pull/58#discussion_r1987631033


##
content/blog/2025-03-05-ordering-analysis.md:
##
@@ -0,0 +1,353 @@
+---
+layout: post
+title: Analysis of Ordering for Better Plans
+date: 2025-03-05
+author: Mustafa Akur, Andrew Lamb
+categories: [tutorial]
+---
+
+
+
+
+
+## Introduction
+In this blog post, we will explore how to determine whether an ordering 
requirement of an operator is satisfied by its input data. This analysis is 
essential for order-based optimizations and is often more complex than one 
might initially think.
+
+Ordering Requirement for an operator refers to the 
condition that input data must be sorted in a certain way for the operator to 
function as intended. If this condition is not met, the operator may not 
perform as expected. It is the job of the planner to make sure that the 
requirements - such as specific ordering, specific distribution, etc. - of all 
operators are satisfied during execution.
+
+
+There are various use cases, where this type of analysis can be useful. 
+### Removing Unnecessary Sorts
+Imagine a user wants to execute the following query:
+```SQL
+SELECT hostname, log_line 
+FROM telemetry ORDER BY time ASC limit 10
+```
+If we don't know anything about the `telemetry` table, we need to sort it by 
`time ASC` and then retrieve the first 10 rows to get the correct result. 
However, if the table is already ordered by `time ASC`, simply retrieving the 
first 10 rows is sufficient. This approach executes much faster and uses less 
memory compared to the first version.
+
+The key is that the query optimizer needs to know the data is already sorted. 
For simple queries that is likely simple, but it gets complicated fast, like 
for example, what if your data is sorted by `[hostname, time ASC]` and your 
query is
+```sql
+SELECT hostname, log_line 
+FROM telemetry WHERE hostname = 'app.example.com' ORDER BY time ASC;
+```
+In this case, the system still doesn't have to do any sorting -- but only if 
it has enough analysis to be able to reason about the sortedness of the stream 
when we know `hostname` has a single value.
+
+### Optimizing Execution Modes Using Ordering Information
+As another use case, some operators can utilize the ordering information to 
change its underlying algorithm to execute more efficiently. Consider the 
following query:
+```SQL
+SELECT COUNT(log_line) 
+FROM telemetry GROUP BY hostname;
+```
+when `telemetry` is sorted by `hostname`, aggregation doesn't need to hash the 
entire data at its input. It can use a much more efficient algorithm for 
grouping the data according to the `hostname` values. Failure to detect the 
ordering can result in choosing the sub-optimal algorithm variant for the 
operator. To see this in practice, check out the 
[source](https://github.com/apache/datafusion/tree/main/datafusion/physical-plan/src/aggregates/order)
 for ordered variant of the `Aggregation` in `Datafusion`.

Review Comment:
   `input is sorted corectly` ... ->  input is sorted correctly



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] feat/improve ruff test coverage [datafusion-python]

2025-03-15 Thread via GitHub



timsaucer commented on PR #1055:
URL: 
https://github.com/apache/datafusion-python/pull/1055#issuecomment-2710849717

   #1056 contains the follow on work to apply additional rules


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Dropping Spark 3.3 support [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove closed issue #646: Dropping Spark 3.3 support
URL: https://github.com/apache/datafusion-comet/issues/646


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Use insta for `DataFrame` tests [datafusion]

2025-03-15 Thread via GitHub



alamb commented on PR #15165:
URL: https://github.com/apache/datafusion/pull/15165#issuecomment-2725669379

   🚀 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] Improve eliminate_outer_join rule [datafusion]

2025-03-15 Thread via GitHub



suibianwanwank opened a new pull request, #15254:
URL: https://github.com/apache/datafusion/pull/15254

   ## Which issue does this PR close?
   
   
   
   - Closes #13232 
   
   ## Rationale for this change
   Inspired by the approach in #13249. This PR explores an alternative that 
does not require simplify and covering more expressions. Additionally, I 
believe `evaluates_to_null` can be useful in more optimization rules
   
   ## What changes are included in this PR?
   1. add `evaluates_to_not_true` and `evaluates_to_null` utility func for 
optimizing expressions.
   2. Improve eliminate_outer_join rule to support more expr.
   
   ## Are these changes tested?
   
   
   
   ## Are there any user-facing changes?
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] chore: Re-enable GitHub discussions [datafusion-comet]

2025-03-15 Thread via GitHub



codecov-commenter commented on PR #1535:
URL: 
https://github.com/apache/datafusion-comet/pull/1535#issuecomment-2726766217

   ## 
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1535?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 Report
   All modified and coverable lines are covered by tests :white_check_mark:
   > Project coverage is 57.48%. Comparing base 
[(`f09f8af`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/f09f8af64c6599255e116a376f4f008f2fd63b43?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 to head 
[(`8c9ed74`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/8c9ed74c75ae6899adadbcc82f520f3e9bda4713?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   > Report is 83 commits behind head on main.
   
   Additional details and impacted files
   
   
   ```diff
   @@ Coverage Diff  @@
   ##   main#1535  +/-   ##
   
   + Coverage 56.12%   57.48%   +1.35% 
   + Complexity  976  971   -5 
   
 Files   119  123   +4 
 Lines 1174312174 +431 
 Branches   2251 2284  +33 
   
   + Hits   6591 6998 +407 
   + Misses 4012 4004   -8 
   - Partials   1140 1172  +32 
   ```
   
   
   
   [:umbrella: View full report in Codecov by 
Sentry](https://app.codecov.io/gh/apache/datafusion-comet/pull/1535?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
   :loudspeaker: Have feedback on the report? [Share it 
here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
   🚀 New features to boost your workflow: 
   
   - ❄ [Test Analytics](https://docs.codecov.com/docs/test-analytics): Detect 
flaky tests, report on failures, and find test suite problems.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Fix wildcard dataframe case [datafusion]

2025-03-15 Thread via GitHub



jayzhan211 commented on code in PR #15230:
URL: https://github.com/apache/datafusion/pull/15230#discussion_r1995599910


##
datafusion/sql/src/select.rs:
##
@@ -741,8 +722,17 @@ impl SqlToRel<'_, S> {
 }
 
 /// Wrap a plan in a projection
-fn project(&self, input: LogicalPlan, expr: Vec) -> 
Result {
-self.validate_schema_satisfies_exprs(input.schema(), &expr)?;
+fn project(&self, input: LogicalPlan, expr: Vec) -> 
Result {
+// convert to Expr for validate_schema_satisfies_exprs
+let exprs = expr
+.iter()
+.filter_map(|e| match e {
+SelectExpr::Expression(expr) => Some(expr.to_owned()),
+_ => None,
+})
+.collect::>();
+self.validate_schema_satisfies_exprs(input.schema(), &exprs)?;

Review Comment:
   maybe we can change the params to SelectExpr?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] feat: Attach `Diagnostic` to more than one column errors in scalar_subquery and in_subquery [datafusion]

2025-03-15 Thread via GitHub



changsun20 commented on PR #15143:
URL: https://github.com/apache/datafusion/pull/15143#issuecomment-2725384326

   Thank you all for your help! Looking forward to contributing more to the 
project soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[I] Add `ctx = SessionContext()` to init [datafusion-python]

2025-03-15 Thread via GitHub



deanm opened a new issue, #1065:
URL: https://github.com/apache/datafusion-python/issues/1065

   That way there'd just be one line
   
   ```python
   from datafusion import ctx, col, lit, functions as F
   ```
   
   
![Image](https://github.com/user-attachments/assets/8047b372-6002-40b4-8ef1-4b9f2ceec6a1)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

[PR] [branch-46] Update ring to v0.17.13 (#15063) [datafusion]

2025-03-15 Thread via GitHub



alamb opened a new pull request, #15228:
URL: https://github.com/apache/datafusion/pull/15228

   - Part of https://github.com/apache/datafusion/issues/15151
   
   Rationale: get a clean CI run on 46
   Changes:
   - Backport https://github.com/apache/datafusion/pull/15063
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] distinct_query_sql benchmark is failing [datafusion]

2025-03-15 Thread via GitHub



getChan commented on issue #15213:
URL: https://github.com/apache/datafusion/issues/15213#issuecomment-2724220964

   take


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] Implement tree explain for InterleaveExec [datafusion]

2025-03-15 Thread via GitHub



alamb merged PR #15219:
URL: https://github.com/apache/datafusion/pull/15219


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-15 Thread via GitHub



andygrove commented on code in PR #1443:
URL: https://github.com/apache/datafusion-comet/pull/1443#discussion_r1996051635


##
native/core/src/parquet/mod.rs:
##
@@ -645,65 +653,42 @@ pub unsafe extern "system" fn 
Java_org_apache_comet_parquet_Native_initRecordBat
 .get_string(&JString::from_raw(file_path))
 .unwrap()
 .into();
-let batch_stream: Option;
-// TODO: (ARROW NATIVE) Use the common global runtime
+
 let runtime = tokio::runtime::Builder::new_multi_thread()
 .enable_all()
 .build()?;
 let session_ctx = SessionContext::new();
+
 let (object_store_url, object_store_path) =
-prepare_object_store(session_ctx.runtime_env(), path.clone())?;
+prepare_object_store(session_ctx.runtime_env(), 
path.clone()).unwrap();

Review Comment:
   This should be `?` and the function should return a `Result`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

1 2 3 >

1 - 100 of 206 matches

Mail list logo