Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Krisztián Szűcs
>> > While it's definitely an improvement over the previously ignored >> >> >> > timezones, I'm not sure that it won't cause unexpected regressions in >> >> >> > the users' codebases. >> >> >> > I'm still trying to

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Micah Kornfield
rstand the issue and its > compatibility > >> >> > implications, but my intuition tells me that we should apply the > >> >> > reversion instead and properly handle the datetime value > conversions > >> >> > in an upcoming minor release. > >> >> > > >> >

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Krisztián Szűcs
e issue and its compatibility >> >> > implications, but my intuition tells me that we should apply the >> >> > reversion instead and properly handle the datetime value conversions >> >> > in an upcoming minor release. >> >> > >>

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Micah Kornfield
er way we should move this conversation to the pull request [1], > >> > because the code snippets pasted here are hardly readable. > >> > > >> > [1]: https://github.com/apache/arrow/pull/7805 > >> > > >> > On Mon, Jul 20, 2020 at 9:40 AM Su

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Krisztián Szűcs
lease. >> > >> > Either way we should move this conversation to the pull request [1], >> > because the code snippets pasted here are hardly readable. >> > >> > [1]: https://github.com/apache/arrow/pull/7805 >> > >> > On Mon, Jul 20, 202

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Micah Kornfield
> On Mon, Jul 20, 2020 at 9:40 AM Sutou Kouhei wrote: > >> > >> Done: https://github.com/apache/arrow/pull/7805#issuecomment-660855376 > >> > >> We can use ...-3.8-... not ...-3.7-... because we don't have > >> ...-3.7-... task in > >> http

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Neal Richardson
onversation to the pull request [1], > > > because the code snippets pasted here are hardly readable. > > > > > > [1]: https://github.com/apache/arrow/pull/7805 > > > > > > On Mon, Jul 20, 2020 at 9:40 AM Sutou Kouhei > wrote: > > >> > >

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Krisztián Szűcs
https://github.com/apache/arrow/pull/7805#issuecomment-660855376 > >> > >> We can use ...-3.8-... not ...-3.7-... because we don't have > >> ...-3.7-... task in > >> https://github.com/apache/arrow/blob/master/dev/tasks/tasks.yml. > >> > >> In

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Antoine Pitrou
ub.com/apache/arrow/pull/7805#issuecomment-660855376 >> >> We can use ...-3.8-... not ...-3.7-... because we don't have >> ...-3.7-... task in >> https://github.com/apache/arrow/blob/master/dev/tasks/tasks.yml. >> >> In >> "Re: [VOTE] Release A

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Krisztián Szűcs
row/blob/master/dev/tasks/tasks.yml. > > In > "Re: [VOTE] Release Apache Arrow 1.0.0 - RC1" on Mon, 20 Jul 2020 00:14:00 > -0700, > Micah Kornfield wrote: > > > FYI, I'm not sure if it is a permissions issue or I've done something wrong > >

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Sutou Kouhei
Done: https://github.com/apache/arrow/pull/7805#issuecomment-660855376 We can use ...-3.8-... not ...-3.7-... because we don't have ...-3.7-... task in https://github.com/apache/arrow/blob/master/dev/tasks/tasks.yml. In "Re: [VOTE] Release Apache Arrow 1.0.0 - RC1" on Mon, 20

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Micah Kornfield
FYI, I'm not sure if it is a permissions issue or I've done something wrong but github-actions does not seem to be responding to "@github-actions crossbow submit test-conda-python-3.7-spark-master" when I enter it. If someone could kick off the spark integration

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-20 Thread Micah Kornfield
Thanks Bryan. I cherry-picked your change onto my change [1], which now honors timezone-aware datetime objects on ingestion. I've kicked off the Spark integration tests. If this change doesn't work, I think the correct course of action is to provide an environment variable in Python to turn back t

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Bryan Cutler
I'd rather not see ARROW-9223 reverted, if possible. I will put up my hacked patch to Spark for this so we can test against it if needed, and could share my branch if anyone else wants to test it locally. On Sun, Jul 19, 2020 at 7:35 PM Micah Kornfield wrote: > I'll spend some time tonight on it

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Micah Kornfield
I'll spend some time tonight on it, and if I can't get the round trip working I'll handle reverting. On Sunday, July 19, 2020, Wes McKinney wrote: > On Sun, Jul 19, 2020 at 7:33 PM Neal Richardson > wrote: > > > > It sounds like you may have identified a pyarrow bug, which sounds not > > good, though

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
On Sun, Jul 19, 2020 at 7:33 PM Neal Richardson wrote: > > It sounds like you may have identified a pyarrow bug, which sounds not > good, though I don't know enough about the broader context to know whether > this is (1) worse than 0.17 or (2) release blocking. I defer to y'all who > know better.

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Neal Richardson
It sounds like you may have identified a pyarrow bug, which sounds not good, though I don't know enough about the broader context to know whether this is (1) worse than 0.17 or (2) release blocking. I defer to y'all who know better. If there are quirks in how Spark handles timezone-naive timestamp

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
Honestly I think reverting is the best option. This change evidently needs more time to "season" and perhaps this is motivation to enhance test coverage in a number of places. On Sun, Jul 19, 2020 at 7:11 PM Wes McKinney wrote: > > I am OK with any solution that doesn't delay the production of th

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
I am OK with any solution that doesn't delay the production of the next RC by more than 24 hours On Sun, Jul 19, 2020 at 7:08 PM Micah Kornfield wrote: > > If I read the example right it looks like constructing from python types > isn't keeping timezones into in place? I can try make a patch tha

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Micah Kornfield
If I read the example right, it looks like constructing from Python types isn't keeping timezone info in place? I can try to make a patch that fixes that tonight, or would the preference be to revert my patch? (Note that I think another, earlier bug was fixed in my PR as well.) -Micah On Sunday, Ju
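The suspected failure mode (time zone information lost when building values from Python datetime objects) can be sketched with the standard library alone. This is only an illustration of how such a loss shifts the stored value; it is not pyarrow's actual conversion code. The helper names and the fixed UTC-08:00 offset standing in for America/Los_Angeles are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Assumption: a fixed UTC-08:00 offset stands in for America/Los_Angeles.
PST = timezone(timedelta(hours=-8))


def epoch_us_honoring_tz(dt: datetime) -> int:
    """Microseconds since the epoch, respecting tzinfo when it is present."""
    if dt.tzinfo is None:
        # Naive input: take the wall-clock fields at face value.
        dt = dt.replace(tzinfo=timezone.utc)
    return (dt - EPOCH_UTC) // timedelta(microseconds=1)


def epoch_us_dropping_tz(dt: datetime) -> int:
    """Lossy variant: discard tzinfo and read the wall-clock fields as UTC."""
    naive = dt.replace(tzinfo=None)
    return (naive - datetime(1970, 1, 1)) // timedelta(microseconds=1)


# The tz-aware value from the session quoted in this thread.
aware = datetime(1969, 12, 31, 16, 0, tzinfo=PST)
honored = epoch_us_honoring_tz(aware)
dropped = epoch_us_dropping_tz(aware)
print(honored, dropped)
```

If the offset is honored, the value round-trips to the original stored integer; if it is dropped, the result is off by exactly the UTC offset (eight hours here).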

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
I put up a PR to revert ARROW-9223. If someone cannot resolve the problem another way, then I recommend applying the reversion and cutting RC2: https://github.com/apache/arrow/pull/7802 To state the obvious, we must verify that this resolves the Spark problem also. On Sun, Jul 19, 2020 at 6:55 PM We

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
I think I see the problem now:
In [40]: parr
Out[40]:
0    {'f0': 1969-12-31 16:00:00-08:00}
1    {'f0': 1969-12-31 16:00:00.01-08:00}
2    {'f0': 1969-12-31 16:00:00.02-08:00}
dtype: object
In [41]: parr[0]['f0']
Out[41]: datetime.datetime(1969, 12, 31, 16, 0, tzinfo=)
In [42]: p

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
Ah I forgot that this is a "feature" of nanosecond timestamps
In [21]: arr = pa.array([0, 1, 2], type=pa.timestamp('us', 'America/Los_Angeles'))
In [22]: struct_arr = pa.StructArray.from_arrays([arr], names=['f0'])
In [23]: struct_arr.to_pandas()
Out[23]:
0    {'f0': 1969-12-31 16:00:00-0

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
There seems to be other broken StructArray stuff
In [14]: arr = pa.array([0, 1, 2], type=pa.timestamp('ns', 'America/Los_Angeles'))
In [15]: struct_arr = pa.StructArray.from_arrays([arr], names=['f0'])
In [16]: struct_arr
Out[16]:
-- is_valid: all not null
-- child 0 type: timestamp[ns, tz=Amer

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
Well, the problem is that time zones are really finicky comparing Spark (which uses a localtime interpretation of timestamps without time zone) and Arrow (which has naive timestamps -- a concept similar but different from the SQL concept TIMESTAMP WITHOUT TIME ZONE -- and tz-aware timestamps). So s
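The distinction drawn above between naive and tz-aware timestamps can be sketched with the Python standard library. This is a minimal illustration, not Arrow or Spark code: the function name is hypothetical, and a fixed UTC-08:00 offset is assumed in place of America/Los_Angeles so the example does not depend on a tz database.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-08:00 offset stands in for America/Los_Angeles.
PST = timezone(timedelta(hours=-8), "PST")


def describe_epoch(value_us: int) -> dict:
    """Render a microsecond offset from the epoch in three ways."""
    instant = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(
        microseconds=value_us
    )
    return {
        # The absolute instant, expressed in UTC.
        "utc": instant.isoformat(),
        # tz-aware view: the same instant shown in the zone's wall-clock time.
        "tz_aware_local": instant.astimezone(PST).isoformat(),
        # Naive view: wall-clock fields only, with no zone attached.
        "naive_wall_clock": instant.replace(tzinfo=None).isoformat(),
    }


result = describe_epoch(0)
for name, text in result.items():
    print(f"{name}: {text}")
```

A system that reads the naive fields as local time, and one that leaves them zone-less, will disagree about which instant the same stored integer refers to.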

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Neal Richardson
If it’s a display problem, should it block the release? Sent from my iPhone > On Jul 19, 2020, at 3:57 PM, Wes McKinney wrote: > > I opened https://issues.apache.org/jira/browse/ARROW-9525 about the > display problem. My guess is that there are other problems lurking > here > >> On Sun, Jul 1

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
I opened https://issues.apache.org/jira/browse/ARROW-9525 about the display problem. My guess is that there are other problems lurking here On Sun, Jul 19, 2020 at 5:54 PM Wes McKinney wrote: > > hi Bryan, > > This is a display bug > > In [6]: arr = pa.array([0, 1, 2], type=pa.timestamp('ns', > '

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
hi Bryan, This is a display bug
In [6]: arr = pa.array([0, 1, 2], type=pa.timestamp('ns', 'America/Los_Angeles'))
In [7]: arr.view('int64')
Out[7]: [ 0, 1, 2 ]
In [8]: arr
Out[8]: [ 1970-01-01 00:00:00.0, 1970-01-01 00:00:00.1, 1970-01-01 00:00:00.2 ]
In [

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
Well this is not good and pretty disappointing given that we had nearly a month to sort through the implications of Micah’s patch. We should try to resolve this ASAP On Sun, Jul 19, 2020 at 5:10 PM Bryan Cutler wrote: > +0 (non-binding) > > I ran verification script for binaries and then source,

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Bryan Cutler
+0 (non-binding) I ran the verification script for binaries and then source, as below, and both look good:
ARROW_TMPDIR=/tmp/arrow-test TEST_DEFAULT=0 TEST_SOURCE=1 TEST_CPP=1 TEST_PYTHON=1 TEST_JAVA=1 TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1 dev/release/verify-release-candidate.sh source 1.0.0 1

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-19 Thread Wes McKinney
+1 (binding) I ran the release verification (source and binary) on Ubuntu 18.04 and Windows with MSVC. I experienced a symbol loading issue on macOS [1] but I suspect it's something environment-specific on the machine. Since we have viable macOS packages I'm not concerned about this and can inves

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-18 Thread Micah Kornfield
+1 (binding) Ran wheel and binary tests on Ubuntu 19.04. On Fri, Jul 17, 2020 at 2:25 PM Neal Richardson wrote: > +1 (binding) > > In addition to the usual verification on > https://github.com/apache/arrow/pull/7787, I've successfully staged the R > binary artifacts on Windows ( > https://github

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-17 Thread Neal Richardson
+1 (binding) In addition to the usual verification on https://github.com/apache/arrow/pull/7787, I've successfully staged the R binary artifacts on Windows ( https://github.com/r-windows/rtools-packages/pull/126), macOS ( https://github.com/autobrew/homebrew-core/pull/12), and Linux ( https://gith

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-17 Thread Wes McKinney
I see the JS failures as well. I think it is a failure localized to newer Node versions since our JavaScript CI works fine. I don't think it should block the release given the lack of development activity in JavaScript [1] -- if any JS devs are concerned about publishing an artifact then we can ski

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-17 Thread Ryan Murray
I've tested Java and it looks good. However, the verify script keeps on bailing with protobuf-related errors: 'cpp/build/orc_ep-prefix/src/orc_ep-build/c++/src/orc_proto.pb.cc' and friends can't find protobuf definitions. A bit odd, as cmake can see protobuf headers and builds directly off master work

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-17 Thread Antoine Pitrou
+1 (binding). I tested on Ubuntu 18.04. * Wheels verification went fine. * Source verification went fine with CUDA enabled and TEST_INTEGRATION_JS=0 TEST_JS=0. I didn't test the binaries. Regards Antoine. On 17/07/2020 at 03:41, Krisztián Szűcs wrote: > Hi, > > I would like to propose t

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-17 Thread Krisztián Szűcs
t.github.com/kou/2118eb1e5046483479f712614343931b#file-wheels-log Agree with Antoine, that this is rather a test issue and we shouldn't block on this. > > Thanks, > -- > kou > > In > "[VOTE] Release Apache Arrow 1.0.0 - RC1" on Fri, 17 Jul 2020 03:41:56 &

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-17 Thread Krisztián Szűcs
+1 (binding) Locally verified the source release, binaries and wheels on macOS 10.15.5. Everything has passed. Note: I had to use an older version of NodeJS v12.18.2 because the JS tests were failing with NodeJS 14.5.0. Also ran the crossbow verification jobs, and everything seems to work fine. S

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-17 Thread Antoine Pitrou
On 17/07/2020 at 10:32, Sutou Kouhei wrote: > > * Python 3.8 wheel tests failed. 3.5, 3.6 and 3.7 > passed. It seems that linking with -larrow and -larrow_python for > Cython failed. > > > /tmp/arrow-1.0.0.NlcPX/test-miniconda/envs/_verify_wheel-3.8/compiler_compat/ld: > ca

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-17 Thread Sutou Kouhei
.8/compiler_compat/ld: cannot find -larrow_python I'm not sure whether this is critical or not. Details: https://gist.github.com/kou/2118eb1e5046483479f712614343931b#file-wheels-log Thanks, -- kou In "[VOTE] Release Apache Arrow 1.0.0 - RC1" on Fri, 17 Jul 2020 03

Re: [VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-16 Thread Andy Grove
+1 (binding) based on testing the Rust implementation only. On Thu, Jul 16, 2020 at 7:42 PM Krisztián Szűcs wrote: > Hi, > > I would like to propose the second release candidate (RC1) of Apache > Arrow version 1.0.0. > This is a major release consisting of 826 resolved JIRA issues[1]. > > The ve

[VOTE] Release Apache Arrow 1.0.0 - RC1

2020-07-16 Thread Krisztián Szűcs
Hi, I would like to propose the second release candidate (RC1) of Apache Arrow version 1.0.0. This is a major release consisting of 826 resolved JIRA issues[1]. The verification of the first release candidate (RC0) has failed [0], and the packaging scripts were unable to produce two wheels. Compa