Re: Please Review: Application for a Media Type

2021-04-22 Thread Sutou Kouhei
Hi, I feel that '.stream' is too generic. How about '.arrows'? JSON Lines uses 'l' suffix for extension: '.jsonl' https://jsonlines.org/#conventions Thanks, -- kou In "Re: Please Review: Application for a Media Type" on Thu, 22 Apr 2021 06:44:51 +0200, Jorge Cardoso Leitão wrote: > Tha

Re: Pyarrow RecordBatchStreamWriter and dictionaries

2021-04-22 Thread Radu Teodorescu
Hi, I am seeing a similar problem when serializing tables with lists of dictionary-encoded elements: each resulting chunk points to the first chunk’s original dictionary. Is this a known issue/limitation? If not, I can follow up with a repro. Thank you, Radu > On Sep 28, 2020, at 1:26 PM, Wes

Re: [Go] Flight client app metadata access

2021-04-22 Thread Paul Whalen
Matt, I just created the JIRA: https://issues.apache.org/jira/browse/ARROW-12517 . I'd be happy to pick it up as well but since it sounds like you're already in the code at the moment and it's a small change, I'll leave it to you. Thanks for your work on the Go library already! For anyone else

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-22 Thread Sutou Kouhei
This is a verification script problem: https://github.com/apache/arrow/pull/10135 It should not run Gandiva related Ruby tests with ARROW_GANDIVA=0. In <55363261-b056-440b-dbfe-222f8f13f...@python.org> "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Thu, 22 Apr 2021 13:44:54 +0200, Antoi

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-22 Thread Mauricio Vargas
+1 I've been using the development version for assignments and I had 0 problems On Thu, Apr 22, 2021 at 4:12 PM Ian Cook wrote: > +1 (non-binding) > > Verified C++ source on Windows 10 with only known failures[1] that > seem to be caused by using Visual Studio 2019. > > [1] https://issues.apac

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-22 Thread Ian Cook
+1 (non-binding) Verified C++ source on Windows 10 with only known failures[1] that seem to be caused by using Visual Studio 2019. [1] https://issues.apache.org/jira/browse/ARROW-11675 On Wed, Apr 21, 2021 at 5:30 PM Krisztián Szűcs wrote: > > Hi, > > I would like to propose the following rele

Re: RE: [Go] expose ability to write arrow.Table to JSON

2021-04-22 Thread Agam Brahma
Thanks for the clarifications, much appreciated. Looking closer, I realize `arrjson` is anyway separating out the values, which isn't what I'd want to ship a table from one service to another. What's a good way to embed the table as a byte stream that can be "read back out" the other end? I se

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-22 Thread Jonathan Keane
+1 (non-binding) Verified wheels, sources, and binaries on macOS 11.2 using the verification script (except for Java Integration, GLib, and Ruby). Like Antoine, I ran into the same issue with Ruby. I also installed Arrow and the R package locally and ran some ad hoc tests using some of our benchmarks

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-22 Thread Krisztián Szűcs
+1 (binding) Verified the source release, binaries and wheels on macOS Big Sur. The automated verification scripts have passed as well [1]. [1]: https://github.com/apache/arrow/pull/10126 On Thu, Apr 22, 2021 at 3:56 PM David Li wrote: > > +1 (non-binding) > > Verified wheels, sources, and ap

RE: [Go] Flight client app metadata access

2021-04-22 Thread Matthew Topol
You're absolutely correct and not missing anything; this is definitely an opportunity to make the Flight record reader a bit more useful. I like the idea of the getLatestMetadata that you mentioned exists on the Java side. Can you file a JIRA issue for this? Given that I'm doing a lot of updates to

RE: [Go] expose ability to write arrow.Table to JSON

2021-04-22 Thread Matthew Topol
Micah is correct: the arrjson package is used for internal integration testing and uses the JSON format specific to that testing, which is unlikely to be what users want when converting Arrow to JSON. There is currently no recommended way to serialize an instance of arrow.Tabl

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-22 Thread David Li
+1 (non-binding) Verified wheels, sources, and apt binaries on Ubuntu 18.04. Best, David On 2021/04/21 21:30:33, Krisztián Szűcs wrote: > Hi, > > I would like to propose the following release candidate (RC3) of Apache > Arrow version 4.0.0. This is a release consisting of 719 > resolved JIRA

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-22 Thread Antoine Pitrou
I tried to verify the source release on Ubuntu 20.04 with ARROW_GANDIVA=0 TEST_JAVA=0 TEST_INTEGRATION=0 TEST_CSHARP=0. It succeeded until the Ruby bindings: + bundle exec ruby test/run-test.rb Traceback (most recent call last): 9: from test/run-test.rb:48:in `' 8: from test/ru

[NIGHTLY] Arrow Build Report for Job nightly-2021-04-22-0

2021-04-22 Thread Crossbow
Arrow Build Report for Job nightly-2021-04-22-0 All tasks: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-22-0 Failed Tasks: - conda-linux-gcc-py36-arm64: URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-22-0-drone-conda-linux-g

RE: [C++] Indeterminate poor performance of random number generator

2021-04-22 Thread Yibo Cai
Yes, this soft-float math (in libm.so) makes the Arm binary extremely slow. -Original Message- From: Antoine Pitrou Sent: Thursday, April 22, 2021 17:20 To: dev@arrow.apache.org Subject: Re: [C++] Indeterminate poor performance of random number generator Le 22/04/2021 à 03:38, Yibo Cai a é

Re: [C++] Indeterminate poor performance of random number generator

2021-04-22 Thread Yibo Cai
On 4/22/21 9:38 AM, Yibo Cai wrote: On 4/21/21 6:07 PM, Antoine Pitrou wrote: On 21/04/2021 at 11:41, Yibo Cai wrote: On 4/21/21 5:17 PM, Antoine Pitrou wrote: On 21/04/2021 at 11:14, Yibo Cai wrote: When running benchmarks on Arm64 servers, I find some benchmarks are extremely slow w

Re: [C++] Indeterminate poor performance of random number generator

2021-04-22 Thread Antoine Pitrou
On 22/04/2021 at 03:38, Yibo Cai wrote: Both use the same libstdc++. But std::bernoulli_distribution is inlined, so they are indeed different for clang and gcc. https://godbolt.org/z/aT84x5Yec Looks like a pure compiler thing. It looks like clang generates calls to logl() and __divtf3() (soft-fl

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-22 Thread Sutou Kouhei
+1 I hit only known Python problems. I ran the following on Debian GNU/Linux sid: * LANG=C \ TZ=UTC \ ARROW_CMAKE_OPTIONS="-DBoost_NO_BOOST_CMAKE=ON" \ CUDA_TOOLKIT_ROOT=/usr \ dev/release/verify-release-candidate.sh source 4.0.0 3 * dev/release/verify-release-candid

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-22 Thread Sutou Kouhei
Hi, I've re-uploaded binaries. Thanks, -- kou In "Re: [VOTE] Release Apache Arrow 4.0.0 - RC3" on Thu, 22 Apr 2021 03:11:09 -1000, Weston Pace wrote: > I'm getting a failure during the download files check... > > Traceback (most recent call last): > File "/home/centos/arrow/dev/releas