[GitHub] [arrow-datafusion] yjshen commented on a diff in pull request #2132: WIP: Reduce sort memory usage v1

2022-04-03 Thread GitBox
yjshen commented on code in PR #2132: URL: https://github.com/apache/arrow-datafusion/pull/2132#discussion_r841342087 ## datafusion/core/src/physical_plan/sorts/sort.rs: ## @@ -105,13 +107,21 @@ impl ExternalSorter { } } -async fn insert_batch(&self, input: R

[GitHub] [arrow] sanjibansg commented on a diff in pull request #12755: ARROW-16014: [C++] Create more benchmarks for measuring expression evaluation overhead

2022-04-03 Thread GitBox
sanjibansg commented on code in PR #12755: URL: https://github.com/apache/arrow/pull/12755#discussion_r841298581 ## cpp/src/arrow/compute/exec/expression_execution_benchmark.cc: ## @@ -0,0 +1,56 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

[GitHub] [arrow] sanjibansg commented on a diff in pull request #12755: ARROW-16014: [C++] Create more benchmarks for measuring expression evaluation overhead

2022-04-03 Thread GitBox
sanjibansg commented on code in PR #12755: URL: https://github.com/apache/arrow/pull/12755#discussion_r841298550 ## cpp/src/arrow/compute/exec/expression_execution_benchmark.cc: ## @@ -0,0 +1,56 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

[GitHub] [arrow] sanjibansg commented on a diff in pull request #12755: ARROW-16014: [C++] Create more benchmarks for measuring expression evaluation overhead

2022-04-03 Thread GitBox
sanjibansg commented on code in PR #12755: URL: https://github.com/apache/arrow/pull/12755#discussion_r841298493 ## cpp/src/arrow/compute/exec/expression_execution_benchmark.cc: ## @@ -0,0 +1,56 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

[GitHub] [arrow] sanjibansg commented on a diff in pull request #12755: ARROW-16014: [C++] Create more benchmarks for measuring expression evaluation overhead

2022-04-03 Thread GitBox
sanjibansg commented on code in PR #12755: URL: https://github.com/apache/arrow/pull/12755#discussion_r841298365 ## cpp/src/arrow/compute/exec/expression_execution_benchmark.cc: ## @@ -0,0 +1,56 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

[GitHub] [arrow-datafusion] renato2099 commented on a diff in pull request #2142: implement 'StringConcat' operator to support sql like "select 'aa' || 'b' "

2022-04-03 Thread GitBox
renato2099 commented on code in PR #2142: URL: https://github.com/apache/arrow-datafusion/pull/2142#discussion_r841282224 ## datafusion/core/tests/sql/expr.rs: ## @@ -280,6 +280,47 @@ async fn query_scalar_minus_array() -> Result<()> { Ok(()) } +#[tokio::test] +async fn

[GitHub] [arrow-datafusion] renato2099 commented on a diff in pull request #1510: Add factorial function

2022-04-03 Thread GitBox
renato2099 commented on code in PR #1510: URL: https://github.com/apache/arrow-datafusion/pull/1510#discussion_r841278856 ## datafusion/Cargo.toml: ## @@ -77,6 +77,7 @@ rand = "0.8" avro-rs = { version = "0.13", features = ["snappy"], optional = true } num-traits = { version =

[GitHub] [arrow-datafusion] renato2099 commented on a diff in pull request #1510: Add factorial function

2022-04-03 Thread GitBox
renato2099 commented on code in PR #1510: URL: https://github.com/apache/arrow-datafusion/pull/1510#discussion_r841278823 ## datafusion/src/physical_plan/functions.rs: ## @@ -546,6 +550,7 @@ pub fn return_type( BuiltinScalarFunction::Lpad => utf8_to_str_type(&input_expr

Re: [C++] Replacing xsimd with compiler autovectorization

2022-04-03 Thread Sasha Krassovsky
> It would be a very significant contributor, as the inconsistency can manifest
> under the form of up to 8-fold differences in performance (or perhaps more).

This is on a micro benchmark. For a user workload, the kernel will account for maybe 20% of the runtime, so even if the kernel gets 10x f
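The reasoning in the preview above reads like an Amdahl's-law estimate, although the message is cut off before its conclusion. As a rough sketch of that arithmetic only, using the 20% runtime share and the 10x kernel factor mentioned in the preview (the function name `overall_speedup` is made up for illustration and does not come from the thread):

```python
def overall_speedup(kernel_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's-law estimate of whole-workload speedup.

    kernel_fraction: share of total runtime spent in the kernel (0..1).
    kernel_speedup:  factor by which the kernel alone gets faster (> 0).
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    # The untouched part of the runtime stays the same; only the kernel
    # part shrinks by the given factor.
    new_runtime = (1.0 - kernel_fraction) + kernel_fraction / kernel_speedup
    return 1.0 / new_runtime


if __name__ == "__main__":
    # Figures taken from the preview: kernel is ~20% of runtime, 10x faster.
    print(f"{overall_speedup(0.2, 10.0):.2f}x")
```

Under these assumed figures the whole workload gains roughly 1.2x, and no kernel speedup can push it past 1.25x while the kernel stays at 20% of the runtime.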

Re: [C++] Replacing xsimd with compiler autovectorization

2022-04-03 Thread Antoine Pitrou
On 01/04/2022 at 08:43, Sasha Krassovsky wrote:
> I agree that a potential inconsistent experience is a problem, but I disagree that SIMD would be the root of the problem, or even be a significant contributor to it.

It would be a very significant contributor, as the inconsistency can manifes

Re: [VOTE][RUST] Release Apache Arrow Rust 11.1.0 RC1

2022-04-03 Thread Andy Grove
+1 (binding)

Verified on Mac M1

On Sat, Apr 2, 2022 at 12:25 PM Chao Sun wrote:
> +1 (non-binding) thanks Andrew!
>
> On Sat, Apr 2, 2022 at 11:16 AM QP Hou wrote:
> >
> > +1 (binding), thanks Andrew!
> >
> > On Fri, Apr 1, 2022 at 8:26 AM Andrew Lamb wrote:
> > >
> > > Hi,
> > >
> > > I would