Re: Question about thread local data in `QueryContext`

2023-03-09 Thread Ruoxi Sun
Got it, that makes sense. Thanks for the answer, really appreciate it! *Rossi Sun* On Fri, Mar 10, 2023 at 14:38, Sasha Krassovsky wrote: > Hi Rossi, > It is supposed to be used by every node that needs a temporary array. It > is not used because we haven’t performed the refactor. > > Sasha > > > 9 March 20

Re: Question about thread local data in `QueryContext`

2023-03-09 Thread Sasha Krassovsky
Hi Rossi, It is supposed to be used by every node that needs a temporary array. It is not used because we haven’t performed the refactor. Sasha > On 9 March 2023, at 21:57, Ruoxi Sun wrote: > > Hi Sasha, thanks for the kind reply. Yeah, that makes sense for using > thread local data to r

Re: Question about thread local data in `QueryContext`

2023-03-09 Thread Ruoxi Sun
Hi Sasha, thanks for the kind reply. Yeah, it makes sense to use thread local data to reduce the vector allocation/deallocation overhead. However, I'm still wondering whether this thread local data has to be in QueryContext. Specifically, there is thread local state

Re: [ADBC][Rust] Proposal for Rust ADBC API

2023-03-09 Thread Will Jones
I've been thinking about the process here, and I'd like to propose an alternate path. As I understand it, currently the process is: 1. Approve the Rust API as a stable API 2. "Release" Rust API as part of a new version of the ADBC format 3. Release Rust libraries in tandem with other ADBC librarie

Re: [EXTERNAL] Re: Field class in Java vs C#

2023-03-09 Thread Will Jones
Hi David Coe, As David Li pointed out, ADBC implementations can either be based purely within a language (C#-specific drivers that can only be used by C# programs) or use C API drivers written in other languages (C, C++, Go). For the latter, we won't be able to implement this until we finish imple

Re: [DISCUSS] Acero roadmap / philosophy

2023-03-09 Thread Aldrin
Thanks for sharing! There are a variety of things that I didn't know about (such as ExecBatchBuilder) and it's interesting to hear about the performance challenges. How much would future Substrait work involve integration with Acero? I'm curious how much more support of Substrait is seen as valuab

RE: [EXTERNAL] Re: Field class in Java vs C#

2023-03-09 Thread David Coe
Yes, ok, I see the pattern now. Thank you. -Original Message- From: David Li Sent: Thursday, March 9, 2023 4:30 PM To: dev@arrow.apache.org Subject: Re: [EXTERNAL] Re: Field class in Java vs C#

Re: [EXTERNAL] Re: Field class in Java vs C#

2023-03-09 Thread David Li
I believe it would be something like (pseudocode since the last time I touched C♯ was, 2009?) List TABLE_SCHEMA = new[]{ ..., new Field("table_columns", new ListType(new StructType(COLUMN_SCHEMA)), ..., }; i.e. COLUMN_SCHEMA gets passed as the fields of a StructType itself instead of the

RE: [EXTERNAL] Re: Field class in Java vs C#

2023-03-09 Thread David Coe
I am investigating whether ADBC can be a replacement for ODBC in certain scenarios and help with more efficient copying. For example, in https://github.com/apache/arrow-adbc/blob/923e0408fe5a32cc6501b997fafa8316ace25fe0/java/core/src/main/java/org/apache/arrow/adbc/core/StandardSchemas.java#L116

Re: [RESULT][VOTE] Release Apache Arrow nanoarrow 0.1.0 - RC1

2023-03-09 Thread Dewey Dunnington
Absolutely! By the next time this happens I hope to be better at this :-) The post-release tasks are all complete! [x] Closed GitHub milestone [x] Added release to Apache Reporter System (Thanks David!) [x] Uploaded artifacts to Subversion (Thanks David!) [x] Created GitHub release [x] Submit R pac

Re: Field class in Java vs C#

2023-03-09 Thread David Li
I'd be very interested if I can help in any way with porting ADBC to more languages, and learning more about use cases/what functionality is useful (e.g. are you looking to have a full driver/client ecosystem in C♯, or are you interested in being able to leverage drivers written in C/C++/Go?) >

Re: [DISCUSS] Acero roadmap / philosophy

2023-03-09 Thread Antoine Pitrou
Just a reminder for those following other implementations of Arrow, that Acero is the compute/execution engine subsystem baked into Arrow C++. Regards Antoine. On 09/03/2023 at 21:20, Weston Pace wrote: We are getting closer to another release. I am thinking about what to work on in th

[DISCUSS] Acero roadmap / philosophy

2023-03-09 Thread Weston Pace
We are getting closer to another release. I am thinking about what to work on in the next release. I think it is a good time to have a discussion about Acero in general. This is possibly also of interest to those working on pyarrow or r-arrow as these libraries rely on Acero for various function

Field class in Java vs C#

2023-03-09 Thread David Coe
I am interested in the difference between how a Field is structured in Java (with children) and in C# (no children) and why that's the case. I am looking to port apache/arrow-adbc to C# but the concept of children is making it a l

Re: Timestamp unit in Substrait and Arrow

2023-03-09 Thread Weston Pace
The Substrait decision for microseconds was made because, at the time, the goal was to keep the type system simple and universal, and there were systems that didn't support ns (e.g. Iceberg, postgres, duckdb, velox). A few options (off the top of my head): 1. Attempt to get a nanoseconds timesta

Re: Question about thread local data in `QueryContext`

2023-03-09 Thread Sasha Krassovsky
Hi Rossi, When profiling Acero we noticed that there was a lot of overhead regarding memory allocation, specifically in the creation/destruction of std::vector. This thread local data in QueryContext was put there as a preparation to refactor other nodes to use TempVectorStack when they need a t

Timestamp unit in Substrait and Arrow

2023-03-09 Thread Li Jin
Hi, I recently came across some limitations in expressing timestamp type with Substrait in the Acero Substrait consumer and am curious to hear what people's thoughts are. The particular issue that I have is when specifying timestamp type in Substrait, the unit is "microseconds" and there is no wa

Question about thread local data in `QueryContext`

2023-03-09 Thread Ruoxi Sun
Hi folks, I see that the member `tld_` in class `QueryContext` is used by `BloomFilterPushdownContext