Re: 2.0.0 release timeline: October 9

2020-10-08 Thread Krisztián Szűcs
On Fri, Oct 9, 2020 at 2:49 AM Neal Richardson wrote:
>
> Hi folks,
> We're almost there.
> https://cwiki.apache.org/confluence/display/ARROW/Arrow+2.0.0+Release shows
> 11 open and 17 in progress issues for 2.0. No blockers, but there are still
> two nightly build failures ticketed (ARROW-10175 (

Re: [C++] Support of Decimal16, Decimal32 and Decimal64

2020-10-08 Thread Micah Kornfield
Hi Dmitry, Thanks for volunteering to contribute. Note that we are already in the process of implementing Decimal256 support, which is currently on a separate branch [1], but I'm hoping to have a PR sometime early next week. If we are proposing adding support for these lower bit-widths, I think we

Re: Flight gRPC version and disabling server verification in C++ [was Re: 2.0.0 release timeline: October 9]

2020-10-08 Thread Micah Kornfield
>
> Hence I'll merge it unless there's objections tomorrow afternoon (13 EST).
I left a couple of minor comments.
On Thu, Oct 8, 2020 at 6:42 PM David Li wrote:
> Wes has taken a look and so have I now, and I think this should be OK.
> I know Krisztián preferred to defer it before, but now it

Re: Flight gRPC version and disabling server verification in C++ [was Re: 2.0.0 release timeline: October 9]

2020-10-08 Thread David Li
Wes has taken a look and so have I now, and I think this should be OK. I know Krisztián preferred to defer it before, but now it does not bump the gRPC version everywhere to pass the tests. Hence I'll merge it unless there's objections tomorrow afternoon (13 EST).
David
On 10/8/20, Antoine Pitro

Re: 2.0.0 release timeline: October 9

2020-10-08 Thread Neal Richardson
Hi folks, We're almost there. https://cwiki.apache.org/confluence/display/ARROW/Arrow+2.0.0+Release shows 11 open and 17 in progress issues for 2.0. No blockers, but there are still two nightly build failures ticketed (ARROW-10175 (pyarrow/hdfs), ARROW-10177 (gandiva xenial)) and perhaps others sho

Re: [C++] Support of Decimal16, Decimal32 and Decimal64

2020-10-08 Thread Wes McKinney
Based on what I've seen in other systems (like Apache Kudu) that support at least 32/64-bit decimal, representing them with a single integer value is probably the best thing (in terms of computing performance and consistency with other implementations). I added you as a contributor on Jira so you can
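As an illustration of the "single integer value" representation mentioned above, here is a minimal Python sketch. It is not Arrow's or Kudu's implementation; the `SmallDecimal` class, its fields, and its methods are hypothetical names chosen for this example. The idea it shows is that a narrow decimal can be stored as one signed integer (the unscaled value) alongside a scale, so that addition at equal scale reduces to plain integer addition with a range check.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class SmallDecimal:
    """Hypothetical fixed-width decimal: one signed integer plus a scale."""

    unscaled: int        # e.g. 12345 represents 123.45 at scale 2
    scale: int           # number of digits after the decimal point
    bit_width: int = 32  # assumed storage width (16, 32 or 64)

    def __post_init__(self):
        # Reject values that would not fit in a signed integer of this width.
        limit = 1 << (self.bit_width - 1)
        if not -limit <= self.unscaled < limit:
            raise OverflowError(
                f"{self.unscaled} does not fit in {self.bit_width} bits"
            )

    @classmethod
    def from_string(cls, text, scale, bit_width=32):
        # Shift the decimal point right by `scale` digits and require an
        # exact integer result, so no precision is silently dropped.
        scaled = Decimal(text).scaleb(scale)
        if scaled != scaled.to_integral_value():
            raise ValueError(f"{text!r} needs more than {scale} fractional digits")
        return cls(int(scaled), scale, bit_width)

    def __add__(self, other):
        # With equal scales, decimal addition is just integer addition.
        if (self.scale, self.bit_width) != (other.scale, other.bit_width):
            raise ValueError("operands must share scale and bit width")
        return SmallDecimal(self.unscaled + other.unscaled, self.scale, self.bit_width)

    def to_decimal(self):
        # Shift the decimal point back to recover the logical value.
        return Decimal(self.unscaled).scaleb(-self.scale)


a = SmallDecimal.from_string("123.45", scale=2)
b = SmallDecimal.from_string("0.55", scale=2)
total = a + b
print(total.unscaled, total.to_decimal())
```

Because the arithmetic happens on a single machine-sized integer, a columnar layout can keep the values in a contiguous integer buffer and record the scale once in the type, which is the performance argument the message alludes to.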

[C++] Support of Decimal16, Decimal32 and Decimal64

2020-10-08 Thread Chigarev, Dmitry
Hi everyone, I would like to work on this JIRA ticket: https://issues.apache.org/jira/browse/ARROW-9404 ([C++] Add support for Decimal16, Decimal32 and Decimal64). This will be my first experience contributing to Arrow, so I want to ask for advice on which approach I should use. As far as I know, cur

Re: Dictionary key access in python/generally

2020-10-08 Thread Benjamin MacDonald Schmidt
Thank you both. I hadn't read the IPC documentation closely enough to understand that it supported metadata at the message level. It seems like the best approach in my case is then probably to flush the dataset to separate files as a large number of IPC message batches, and send the schema and the

Re: Flight gRPC version and disabling server verification in C++ [was Re: 2.0.0 release timeline: October 9]

2020-10-08 Thread Antoine Pitrou
I'll get to review the PR on Monday. It may be too late for the release.
Regards
Antoine.
On 08/10/2020 at 18:43, James Duong wrote:
> Hi,
>
> I've edited my PR now so that:
> 1. The CMakefiles so that we can detect which namespace
> TlsCredentialsOptions are in, if any.
> 2. Conditionall

Re: [FlightRPC] Add a "Flight SQL" extension on top of FlightRPC

2020-10-08 Thread Ryan Nicholson
Thanks Wes, Apologies for the delay. The next steps which we were going to pursue once the authentication redesign direction was solid included a session concept. The initial proposal we would put forth would likely include these concepts via use of headers. I agree that with sessions and database

Flight gRPC version and disabling server verification in C++ [was Re: 2.0.0 release timeline: October 9]

2020-10-08 Thread James Duong
Hi,
I've edited my PR now so that:
1. The CMakefiles detect which namespace TlsCredentialsOptions is in, if any.
2. Conditionally compile the C++ Flight client to use the namespace to implement disabling server verification, or compile out the implementation and throw an error at r

Re: [Python] Dictionary Arrays with duplicate values jumbling on round-trip to parquet

2020-10-08 Thread Wes McKinney
I haven't looked closely, but it looks like a bug. Can someone open a JIRA issue and copy the reproducible example?
On Thu, Oct 8, 2020 at 10:57 AM Jadczak, Matt wrote:
>
> I am unsure if this behaviour is intended (and duplicate values should be
> forbidden), but it seems to me that the reason t

Re: [Python] Dictionary Arrays with duplicate values jumbling on round-trip to parquet

2020-10-08 Thread Jadczak, Matt
I am unsure if this behaviour is intended (and duplicate values should be forbidden), but it seems to me that the reason this is happening is that when re-encoding an Arrow dictionary as a Parquet one, the function at https://github.com/apache/arrow/blob/4bbb74713c6883e8523eeeb5ac80a1e1f8521674/

[Python] Dictionary Arrays with duplicate values jumbling on round-trip to parquet

2020-10-08 Thread Al Taylor
Hi, I've found the following odd behaviour when round-tripping data via parquet using pyarrow, when the data contains dictionary arrays with duplicate values.
```python
import pyarrow as pa
import pyarrow.parquet as pq

my_table = pa.Table.from_batches(
    [
        pa.Recor
```

[Rust]: Exposed API

2020-10-08 Thread vertexclique vertexclique
Hi; Let me start with my aim and how things have evolved in my mind. Through extensive usage of the Arrow API, I've realized that we are doing many unnecessary allocations and rebuilds for simple things like offset changes. (At least that's what I am doing.) That said, it is tough to make the tra

[NIGHTLY] Arrow Build Report for Job nightly-2020-10-08-0

2020-10-08 Thread Crossbow
Arrow Build Report for Job nightly-2020-10-08-0
All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-10-08-0
Failed Tasks:
- conda-linux-gcc-py36-aarch64:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-10-08-0-drone-conda-linux-gcc-py3