Re: Help drafting Apache Arrow 2022-07 board report

2022-07-11 Thread Sutou Kouhei
Submitted: https://whimsy.apache.org/board/agenda/2022-07-20/Arrow Thank you all! In <20220711.125944.837133348283645422@clear-code.com> "Re: Help drafting Apache Arrow 2022-07 board report" on Mon, 11 Jul 2022 12:59:44 +0900 (JST), Sutou Kouhei wrote: > Hi, > > I'll submit this in 24

Re: Substrait vs GraphQL

2022-07-11 Thread Weston Pace
This might be an interesting topic for the Substrait community. You can find ways to contact them at [1]. I don't know GraphQL well enough, but from what I do know, it seems like a GraphQL -> Substrait converter would be useful, at the very least. [1] https://substrait.io/community/ On Mon, Jul 1

Re: cpp Memory Pool Clarification

2022-07-11 Thread Weston Pace
I suppose it depends on your goal. My earlier feedback was that doing a true scan is often detrimental for benchmarking since I/O time can often dominate the results. Also, to get the best scan results, you often spend a lot of time micromanaging the file format / compression / file layout / etc.

[RESULT] [VOTE][RUST] Release Apache Arrow Rust 18.0.0 RC1

2022-07-11 Thread Andrew Lamb
The release is approved with 9 +1 votes (3 binding). Thank you all so much for helping to verify the release. The release is available here: https://dist.apache.org/repos/dist/release/arrow/arrow-rs-18.0.0 It has also been uploaded to crates.io: https://crates.io/crates/arrow https://crates.io

Re: cpp Memory Pool Clarification

2022-07-11 Thread Li Jin
> TableSourceNode wouldn't need to allocate since it runs against memory that's already been allocated. Is the memory "that is already allocated" tracked in any allocators? For an end to end benchmark of "scan - join - write" I think it would make sense to include all arrow memory allocation (if that

Re: cpp Memory Pool Clarification

2022-07-11 Thread Weston Pace
> Is there anything else I'd need to change? Maybe try something like this: https://github.com/westonpace/arrow/commit/15ac0d051136c585cda63297e48f17557808d898 > Beyond that, we should also expect to see some allocations from > TableSourceNode going through the logging memory pool, even if AsOfJ

Substrait vs GraphQL

2022-07-11 Thread Lee, David
https://graphql.org/learn/schema/#object-types-and-fields I just watched the Data Thread Substrait webinar and I'm wondering if it would be easier to use / extend the GraphQL language standard for substrait plans? The GraphQL schema is purely logical and the GraphQL query standard supports joi

RE: cpp Memory Pool Clarification

2022-07-11 Thread Ivan Chau
Yeah this behavior is certainly a bit strange then. The only alteration I am making is changing the way we create the Execution Context in the benchmark file. Something like:
```
auto logging_pool = LoggingMemoryPool(default_memory_pool());
ExecContext ctx(&logging_pool, ...);
```
Is there any

Re: cpp Memory Pool Clarification

2022-07-11 Thread Weston Pace
Are you changing the default memory pool to a LoggingMemoryPool? Where are you doing this? For a benchmark I think you would need to change the implementation in the benchmark file itself. Similarly, is AsofJoinNode using the default memory pool or the memory pool of the exec plan? It should be
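
The question above turns on which pool each allocation is actually routed through. As a rough illustration only, the following self-contained C++ sketch mimics the wrapping pattern with toy classes. `ToyPool`, `ToySystemPool`, `ToyLoggingPool`, `toy_default_pool`, and `CountLoggedAllocations` are hypothetical names invented for this example and are not Arrow's API. The sketch assumes a logging pool that simply forwards to the pool it wraps, so an allocation made directly against the process-wide pool never reaches the logger.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <iostream>

// Hypothetical stand-in for a memory pool interface (NOT Arrow's MemoryPool).
class ToyPool {
 public:
  virtual ~ToyPool() = default;
  virtual void* Allocate(std::size_t size) = 0;
  virtual void Free(void* ptr, std::size_t size) = 0;
  virtual std::int64_t bytes_allocated() const = 0;
};

// Toy "system" pool: allocates with malloc and tracks outstanding bytes.
class ToySystemPool : public ToyPool {
 public:
  void* Allocate(std::size_t size) override {
    void* ptr = std::malloc(size);
    if (ptr != nullptr) bytes_ += static_cast<std::int64_t>(size);
    return ptr;
  }
  void Free(void* ptr, std::size_t size) override {
    if (ptr == nullptr) return;
    std::free(ptr);
    bytes_ -= static_cast<std::int64_t>(size);
  }
  std::int64_t bytes_allocated() const override { return bytes_; }

 private:
  std::int64_t bytes_ = 0;
};

// Hypothetical process-wide pool, playing the role of a default pool.
ToyPool* toy_default_pool() {
  static ToySystemPool pool;
  return &pool;
}

// Toy logging wrapper: logs each call, then forwards to the wrapped pool.
class ToyLoggingPool : public ToyPool {
 public:
  explicit ToyLoggingPool(ToyPool* wrapped) : wrapped_(wrapped) {}

  void* Allocate(std::size_t size) override {
    ++num_logged_allocations_;
    std::cout << "Allocate: size = " << size << "\n";
    return wrapped_->Allocate(size);
  }
  void Free(void* ptr, std::size_t size) override {
    std::cout << "Free: size = " << size << "\n";
    wrapped_->Free(ptr, size);
  }
  std::int64_t bytes_allocated() const override {
    return wrapped_->bytes_allocated();
  }
  int num_logged_allocations() const { return num_logged_allocations_; }

 private:
  ToyPool* wrapped_;
  int num_logged_allocations_ = 0;
};

// Makes one allocation through the wrapper and one directly against the
// default pool, then returns how many allocations the wrapper observed.
int CountLoggedAllocations() {
  ToyLoggingPool logging_pool(toy_default_pool());
  void* routed = logging_pool.Allocate(64);          // goes through the wrapper
  void* bypass = toy_default_pool()->Allocate(128);  // skips the wrapper
  logging_pool.Free(routed, 64);
  toy_default_pool()->Free(bypass, 128);
  return logging_pool.num_logged_allocations();
}
```

In this toy setup, both allocations are backed by the same underlying pool, but only the one made through the wrapper is logged. By analogy, a node that allocates from the default pool directly, rather than from the pool handed to the execution context, would not appear in the log.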

cpp Memory Pool Clarification

2022-07-11 Thread Ivan Chau
Hi all, I've been doing some testing with LoggingMemoryPool to benchmark our AsOfJoin implementation. Our underlying memory pool for the LoggingMemoryPool is the default_memory_pool (this is process-wide).

Re: [Rust] Enable GitHub discussions for Rust projects?

2022-07-11 Thread Andrew Lamb
Thank you Andy -- I hope this experiment works out well. On Sat, Jul 9, 2022 at 12:10 PM Andy Grove wrote: > We now have GitHub discussions enabled. Let's see how it works out. > > https://github.com/apache/arrow-rs/discussions/2036 > https://github.com/apache/arrow-datafusion/discussions/2861 >

Re: [VOTE][RUST] Release Apache Arrow Rust 18.0.0 RC1

2022-07-11 Thread Martin Grigorov
+1 (non-binding) Tested on Ubuntu 20.04.4 x86_64 and openEuler 20.03 aarch64. Regards, Martin On Fri, Jul 8, 2022 at 9:55 PM Andrew Lamb wrote: > Hi, > > I would like to propose a release of Apache Arrow Rust Implementation, > version 18.0.0. > > This release candidate is based on commit: > 33