Re: [DISCUSS] Use conbench.ursa.dev for arrow-rs and arrow-datafusion

2021-09-12 Thread Diana Clarke
Thanks for those great instructions, QP! I've spiked adding the arrow-rs and arrow-datafusion benchmark results to the Arrow Conbench server. https://github.com/ursacomputing/benchmarks/pull/79/files And I've published one example run for arrow-datafusion: - Example run: https://conbench.u

Re: [VOTE][RUST] Release Apache Arrow Rust 5.4.0 RC1

2021-09-12 Thread Andy Grove
+1 (binding) Checked signature and ran release verification script. On Fri, Sep 10, 2021 at 10:54 AM Andrew Lamb wrote: > Hi, > > I would like to propose a release of Apache Arrow Rust Implementation, > version 5.4.0. > > NOTE: There is one known issue [5] that causes one of the CI tests to fai

arrow_iterator.cpython-38-darwin.so - the developer cannot be verified

2021-09-12 Thread Jason Withrow
Greetings, Apologies in advance if this is the wrong forum to raise this issue. I would be happy to file a bug in Jira if more appropriate. I am experiencing issues accessing files over NFS from Big Sur with pyarrow 5.0.0. I am running on an ARM chip, in case that matters. Pyarrow 4.0.1 does wo

Re: [DISCUSS] Use conbench.ursa.dev for arrow-rs and arrow-datafusion

2021-09-12 Thread QP Hou
Thank you Diana for the quick turnaround! The trial run looks great. You are right that `sqrt_20_12, sqrt_20_9, sqrt_22_12, and sqrt_22_14` are just the same type of test with different parameters, so it makes sense to batch them. We can name these benchmarks however we want to make it easier for
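
As a rough illustration of the batching idea in the message above, the sketch below groups the four benchmark names under one base name and keeps the numeric suffixes as parameters. The helper `batch_by_name` is hypothetical and is not part of Conbench or the arrow-rs benchmarks; the source does not say what the two numbers mean, so they are kept as opaque integers.

```python
from collections import defaultdict


def batch_by_name(names):
    """Group benchmark names of the form <base>_<p1>_<p2> by their base name.

    Hypothetical helper for illustration only; the meaning of the numeric
    suffixes is not given in the thread, so they are stored as plain ints.
    """
    groups = defaultdict(list)
    for name in names:
        base, *params = name.split("_")
        groups[base].append(tuple(int(p) for p in params))
    return dict(groups)


# The four names quoted in the message collapse into one batch.
batched = batch_by_name(["sqrt_20_12", "sqrt_20_9", "sqrt_22_12", "sqrt_22_14"])
print(batched)
```

With this grouping, a single "sqrt" entry carries four parameter pairs instead of four unrelated benchmark names.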

[DISCUSS] Leap seconds/days and daylight saving for Duration types

2021-09-12 Thread QP Hou
Hi, I would like to draw some attention to a format PR aiming to clarify leap seconds, leap days and daylight saving handling semantics for duration types: https://github.com/apache/arrow/pull/11138. This came out of the effort [1] trying to implement Partial and Total order for duration type DAY

Re: [DataFusion] Question about async/await?

2021-09-12 Thread QP Hou
Hi Renjie, If by datafusion benchmarks, you are referring to the code in the datafusion/benches folder, then those benchmarks are executed with the tokio runtime. You are correct that one should schedule compute-bound tasks into a separate task managed by a dedicated thread to avoid blocking the asyn
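
The general pattern described in this reply, handing compute-bound work to a dedicated thread so the async runtime keeps serving other tasks, can be sketched as follows. This is a Python asyncio analogue, not DataFusion or tokio code; in tokio the comparable call is `tokio::task::spawn_blocking`. The functions `compute_bound` and `heartbeat` are made-up stand-ins for illustration.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def compute_bound(n: int) -> int:
    """Stand-in for CPU-heavy work (hypothetical; not from the thread)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


async def heartbeat(ticks: list) -> None:
    """A small async task that should keep running while the work is offloaded."""
    for i in range(3):
        ticks.append(i)
        await asyncio.sleep(0.01)


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    # A dedicated single-thread pool owns the compute-bound call, so the
    # event loop thread is never blocked by it.
    with ThreadPoolExecutor(max_workers=1) as pool:
        work = loop.run_in_executor(pool, compute_bound, 200_000)
        beat = asyncio.create_task(heartbeat(ticks))
        result = await work
        await beat
    return result, ticks


result, ticks = asyncio.run(main())
print(result, ticks)
```

Note that in CPython the global interpreter lock limits true parallelism for pure-Python loops, so this sketch shows the scheduling structure (the event loop stays responsive) rather than a speedup.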