Done: https://github.com/apache/arrow/pull/7805#issuecomment-660855376
We can use ...-3.8-... instead of ...-3.7-... because we don't have a ...-3.7-... task in https://github.com/apache/arrow/blob/master/dev/tasks/tasks.yml.

In <cak7z5t8hqcsd3meg42cuzkscpjr3zndsvrjmm8vied0gzto...@mail.gmail.com> "Re: [VOTE] Release Apache Arrow 1.0.0 - RC1" on Mon, 20 Jul 2020 00:14:00 -0700, Micah Kornfield <emkornfi...@gmail.com> wrote:

> FYI, I'm not sure if it is a permissions issue or I've done something wrong, but github-actions does not seem to be responding to "@github-actions <https://github.com/github-actions> crossbow submit test-conda-python-3.7-spark-master" when I enter it. If someone could kick off the Spark integration test I would be grateful.
>
> On Mon, Jul 20, 2020 at 12:09 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
>> Thanks Bryan. I cherry-picked your change onto my change [1], which now honors timezone-aware datetime objects on ingestion. I've kicked off the Spark integration tests.
>>
>> If this change doesn't work, I think the correct course of action is to provide an environment variable in Python to turn back to the old behavior (ignoring timezones on conversion). I think honoring timezone information where possible is a strict improvement, but I agree we should give users an option to not break if they wish to upgrade to the latest version. I need to get some sleep, but I will have another PR posted tomorrow evening if the current one doesn't unblock the release.
>>
>> [1] https://github.com/apache/arrow/pull/7805
>>
>> On Sun, Jul 19, 2020 at 10:50 PM Bryan Cutler <cutl...@gmail.com> wrote:
>>
>>> I'd rather not see ARROW-9223 reverted, if possible. I will put up my hacked patch to Spark for this so we can test against it if needed, and could share my branch if anyone else wants to test it locally.
>>>
>>> On Sun, Jul 19, 2020 at 7:35 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>
>>> > I'll spend some time tonight on it, and if I can't get the round trip working I'll handle reverting.
>>> >
>>> > On Sunday, July 19, 2020, Wes McKinney <wesmck...@gmail.com> wrote:
>>> >
>>> > > On Sun, Jul 19, 2020 at 7:33 PM Neal Richardson <neal.p.richard...@gmail.com> wrote:
>>> > > >
>>> > > > It sounds like you may have identified a pyarrow bug, which sounds not good, though I don't know enough about the broader context to know whether this is (1) worse than 0.17 or (2) release blocking. I defer to y'all who know better.
>>> > > >
>>> > > > If there are quirks in how Spark handles timezone-naive timestamps, shouldn't the fix/workaround go in pyspark, not pyarrow? For what it's worth, I dealt with similar Spark timezone issues in R recently: https://github.com/sparklyr/sparklyr/issues/2439 I handled it (in sparklyr, not the arrow R package) by always setting a timezone when sending data to Spark. Not ideal, but it made the numbers "right".
>>> > >
>>> > > Since people are running this code in production we need to be careful about disrupting them. Unfortunately I'm at the limit of how much time I can spend on this, but releasing with ARROW-9223 as is (without being partially or fully reverted) makes me deeply uncomfortable. So I hope the matter can be resolved.
>>> > >
>>> > > > Neal
>>> > > >
>>> > > > On Sun, Jul 19, 2020 at 5:13 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>> > > >
>>> > > > > Honestly I think reverting is the best option.
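[As a minimal sketch of the ingestion question discussed above -- my own illustration, not code from the PR -- the difference is between letting pyarrow infer a type for timezone-aware datetime values and requesting a tz-aware timestamp type explicitly. Exactly what the inferred type is depends on the pyarrow version, which is what the patch under discussion changes.]

```python
import datetime
import pyarrow as pa

# Timezone-aware Python datetime values (UTC used here for simplicity).
values = [datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)]

# Let pyarrow infer the type from the values vs. request a tz-aware type explicitly.
inferred = pa.array(values)
explicit = pa.array(values, type=pa.timestamp("us", tz="UTC"))

print(inferred.type)   # whether the zone survives inference is the behavior being debated
print(explicit.type)   # timestamp[us, tz=UTC]
```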
This change >>> evidently >>> > > > > needs more time to "season" and perhaps this is motivation to >>> enhance >>> > > > > test coverage in a number of places. >>> > > > > >>> > > > > On Sun, Jul 19, 2020 at 7:11 PM Wes McKinney <wesmck...@gmail.com >>> > >>> > > wrote: >>> > > > > > >>> > > > > > I am OK with any solution that doesn't delay the production of >>> the >>> > > > > > next RC by more than 24 hours >>> > > > > > >>> > > > > > On Sun, Jul 19, 2020 at 7:08 PM Micah Kornfield < >>> > > emkornfi...@gmail.com> >>> > > > > wrote: >>> > > > > > > >>> > > > > > > If I read the example right it looks like constructing from >>> > python >>> > > > > types >>> > > > > > > isn't keeping timezones into in place? I can try make a patch >>> > that >>> > > > > fixes >>> > > > > > > that tonight or would the preference be to revert my patch >>> (note >>> > I >>> > > > > think >>> > > > > > > another bug with a prior bug was fixed in my PR as well) >>> > > > > > > >>> > > > > > > -Micah >>> > > > > > > >>> > > > > > > On Sunday, July 19, 2020, Wes McKinney <wesmck...@gmail.com> >>> > > wrote: >>> > > > > > > >>> > > > > > > > I think I see the problem now: >>> > > > > > > > >>> > > > > > > > In [40]: parr >>> > > > > > > > Out[40]: >>> > > > > > > > 0 {'f0': 1969-12-31 16:00:00-08:00} >>> > > > > > > > 1 {'f0': 1969-12-31 16:00:00.000001-08:00} >>> > > > > > > > 2 {'f0': 1969-12-31 16:00:00.000002-08:00} >>> > > > > > > > dtype: object >>> > > > > > > > >>> > > > > > > > In [41]: parr[0]['f0'] >>> > > > > > > > Out[41]: datetime.datetime(1969, 12, 31, 16, 0, >>> > tzinfo=<DstTzInfo >>> > > > > > > > 'America/Los_Angeles' PST-1 day, 16:00:00 STD>) >>> > > > > > > > >>> > > > > > > > In [42]: pa.array(parr) >>> > > > > > > > Out[42]: >>> > > > > > > > <pyarrow.lib.StructArray object at 0x7f0893706a60> >>> > > > > > > > -- is_valid: all not null >>> > > > > > > > -- child 0 type: timestamp[us] >>> > > > > > > > [ >>> > > > > > > > 1969-12-31 16:00:00.000000, >>> > > > > > > > 1969-12-31 16:00:00.000001, >>> > > > > > > > 1969-12-31 16:00:00.000002 >>> > > > > > > > ] >>> > > > > > > > >>> > > > > > > > In [43]: pa.array(parr).field(0).type >>> > > > > > > > Out[43]: TimestampType(timestamp[us]) >>> > > > > > > > >>> > > > > > > > On 0.17.1 >>> > > > > > > > >>> > > > > > > > In [8]: arr = pa.array([0, 1, 2], type=pa.timestamp('us', >>> > > > > > > > 'America/Los_Angeles')) >>> > > > > > > > >>> > > > > > > > In [9]: arr >>> > > > > > > > Out[9]: >>> > > > > > > > <pyarrow.lib.TimestampArray object at 0x7f9dede69d00> >>> > > > > > > > [ >>> > > > > > > > 1970-01-01 00:00:00.000000, >>> > > > > > > > 1970-01-01 00:00:00.000001, >>> > > > > > > > 1970-01-01 00:00:00.000002 >>> > > > > > > > ] >>> > > > > > > > >>> > > > > > > > In [10]: struct_arr = pa.StructArray.from_arrays([arr], >>> > > names=['f0']) >>> > > > > > > > >>> > > > > > > > In [11]: struct_arr >>> > > > > > > > Out[11]: >>> > > > > > > > <pyarrow.lib.StructArray object at 0x7f9ded0016e0> >>> > > > > > > > -- is_valid: all not null >>> > > > > > > > -- child 0 type: timestamp[us, tz=America/Los_Angeles] >>> > > > > > > > [ >>> > > > > > > > 1970-01-01 00:00:00.000000, >>> > > > > > > > 1970-01-01 00:00:00.000001, >>> > > > > > > > 1970-01-01 00:00:00.000002 >>> > > > > > > > ] >>> > > > > > > > >>> > > > > > > > In [12]: struct_arr.to_pandas() >>> > > > > > > > Out[12]: >>> > > > > > > > 0 {'f0': 1970-01-01 00:00:00} >>> > > > > > > > 1 {'f0': 1970-01-01 00:00:00.000001} >>> > > > > > > > 2 {'f0': 1970-01-01 00:00:00.000002} >>> > > > > 
> > > dtype: object >>> > > > > > > > >>> > > > > > > > In [13]: pa.array(struct_arr.to_pandas()) >>> > > > > > > > Out[13]: >>> > > > > > > > <pyarrow.lib.StructArray object at 0x7f9ded003210> >>> > > > > > > > -- is_valid: all not null >>> > > > > > > > -- child 0 type: timestamp[us] >>> > > > > > > > [ >>> > > > > > > > 1970-01-01 00:00:00.000000, >>> > > > > > > > 1970-01-01 00:00:00.000001, >>> > > > > > > > 1970-01-01 00:00:00.000002 >>> > > > > > > > ] >>> > > > > > > > >>> > > > > > > > In [14]: pa.array(struct_arr.to_pandas()).type >>> > > > > > > > Out[14]: StructType(struct<f0: timestamp[us]>) >>> > > > > > > > >>> > > > > > > > So while the time zone is getting stripped in both cases, >>> the >>> > > failure >>> > > > > > > > to round trip is a problem. If we are going to attach the >>> time >>> > > zone >>> > > > > in >>> > > > > > > > to_pandas() then we need to respect it when going the other >>> > way. >>> > > > > > > > >>> > > > > > > > This looks like a regression to me and so I'm inclined to >>> > revise >>> > > my >>> > > > > > > > vote on the release to -0/-1 >>> > > > > > > > >>> > > > > > > > On Sun, Jul 19, 2020 at 6:46 PM Wes McKinney < >>> > > wesmck...@gmail.com> >>> > > > > wrote: >>> > > > > > > > > >>> > > > > > > > > Ah I forgot that this is a "feature" of nanosecond >>> timestamps >>> > > > > > > > > >>> > > > > > > > > In [21]: arr = pa.array([0, 1, 2], type=pa.timestamp('us', >>> > > > > > > > > 'America/Los_Angeles')) >>> > > > > > > > > >>> > > > > > > > > In [22]: struct_arr = pa.StructArray.from_arrays([arr], >>> > > > > names=['f0']) >>> > > > > > > > > >>> > > > > > > > > In [23]: struct_arr.to_pandas() >>> > > > > > > > > Out[23]: >>> > > > > > > > > 0 {'f0': 1969-12-31 16:00:00-08:00} >>> > > > > > > > > 1 {'f0': 1969-12-31 16:00:00.000001-08:00} >>> > > > > > > > > 2 {'f0': 1969-12-31 16:00:00.000002-08:00} >>> > > > > > > > > dtype: object >>> > > > > > > > > >>> > > > > > > > > So this is working as intended, such as it is >>> > > > > > > > > >>> > > > > > > > > On Sun, Jul 19, 2020 at 6:40 PM Wes McKinney < >>> > > wesmck...@gmail.com> >>> > > > > > > > wrote: >>> > > > > > > > > > >>> > > > > > > > > > There seems to be other broken StructArray stuff >>> > > > > > > > > > >>> > > > > > > > > > In [14]: arr = pa.array([0, 1, 2], >>> type=pa.timestamp('ns', >>> > > > > > > > > > 'America/Los_Angeles')) >>> > > > > > > > > > >>> > > > > > > > > > In [15]: struct_arr = pa.StructArray.from_arrays([arr], >>> > > > > names=['f0']) >>> > > > > > > > > > >>> > > > > > > > > > In [16]: struct_arr >>> > > > > > > > > > Out[16]: >>> > > > > > > > > > <pyarrow.lib.StructArray object at 0x7f089370f590> >>> > > > > > > > > > -- is_valid: all not null >>> > > > > > > > > > -- child 0 type: timestamp[ns, tz=America/Los_Angeles] >>> > > > > > > > > > [ >>> > > > > > > > > > 1970-01-01 00:00:00.000000000, >>> > > > > > > > > > 1970-01-01 00:00:00.000000001, >>> > > > > > > > > > 1970-01-01 00:00:00.000000002 >>> > > > > > > > > > ] >>> > > > > > > > > > >>> > > > > > > > > > In [17]: struct_arr.to_pandas() >>> > > > > > > > > > Out[17]: >>> > > > > > > > > > 0 {'f0': 0} >>> > > > > > > > > > 1 {'f0': 1} >>> > > > > > > > > > 2 {'f0': 2} >>> > > > > > > > > > dtype: object >>> > > > > > > > > > >>> > > > > > > > > > All in all it appears that this part of the project >>> needs >>> > > some >>> > > > > TLC >>> > > > > > > > > > >>> > > > > > > > > > On Sun, Jul 19, 2020 at 6:16 PM Wes McKinney < >>> > > > > wesmck...@gmail.com> >>> > > > > > > > wrote: >>> > > > > 
> > > > > > >>> > > > > > > > > > > Well, the problem is that time zones are really >>> finicky >>> > > > > comparing >>> > > > > > > > > > > Spark (which uses a localtime interpretation of >>> > timestamps >>> > > > > without >>> > > > > > > > > > > time zone) and Arrow (which has naive timestamps -- a >>> > > concept >>> > > > > similar >>> > > > > > > > > > > but different from the SQL concept TIMESTAMP WITHOUT >>> TIME >>> > > ZONE >>> > > > > -- and >>> > > > > > > > > > > tz-aware timestamps). So somewhere there is a time >>> zone >>> > > being >>> > > > > > > > stripped >>> > > > > > > > > > > or applied/localized which may result in the >>> transferred >>> > > data >>> > > > > to/from >>> > > > > > > > > > > Spark being shifted by the time zone offset. I think >>> it's >>> > > > > important >>> > > > > > > > > > > that we determine what the problem is -- if it's a >>> > problem >>> > > > > that has >>> > > > > > > > to >>> > > > > > > > > > > be fixed in Arrow (and it's not clear to me that it >>> is) >>> > > it's >>> > > > > worth >>> > > > > > > > > > > spending some time to understand what's going on to >>> avoid >>> > > the >>> > > > > > > > > > > possibility of patch release on account of this. >>> > > > > > > > > > > >>> > > > > > > > > > > On Sun, Jul 19, 2020 at 6:12 PM Neal Richardson >>> > > > > > > > > > > <neal.p.richard...@gmail.com> wrote: >>> > > > > > > > > > > > >>> > > > > > > > > > > > If it’s a display problem, should it block the >>> release? >>> > > > > > > > > > > > >>> > > > > > > > > > > > Sent from my iPhone >>> > > > > > > > > > > > >>> > > > > > > > > > > > > On Jul 19, 2020, at 3:57 PM, Wes McKinney < >>> > > > > wesmck...@gmail.com> >>> > > > > > > > wrote: >>> > > > > > > > > > > > > >>> > > > > > > > > > > > > I opened https://issues.apache.org/ >>> > > jira/browse/ARROW-9525 >>> > > > > > > > about the >>> > > > > > > > > > > > > display problem. 
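[To make the "stripped or applied/localized" failure mode described above concrete, here is a small pandas sketch -- my own illustration, not from the thread -- of how dropping or re-attaching a time zone shifts the apparent instant by the UTC offset.]

```python
import pandas as pd

# One instant, expressed in UTC.
utc_instant = pd.Timestamp("1970-01-01 00:00:00", tz="UTC")

# The same instant read in America/Los_Angeles is 16:00 the previous day.
local = utc_instant.tz_convert("America/Los_Angeles")
print(local)                     # 1969-12-31 16:00:00-08:00

# Stripping the zone keeps the local wall-clock reading ("naive" timestamp)...
naive = local.tz_localize(None)
print(naive)                     # 1969-12-31 16:00:00

# ...so re-localizing that naive value as UTC points at a different instant,
# shifted by the time zone offset -- the kind of shift discussed above.
print(naive.tz_localize("UTC"))  # 1969-12-31 16:00:00+00:00
```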
My guess is that there are other >>> > > problems >>> > > > > > > > lurking >>> > > > > > > > > > > > > here >>> > > > > > > > > > > > > >>> > > > > > > > > > > > >> On Sun, Jul 19, 2020 at 5:54 PM Wes McKinney < >>> > > > > > > > wesmck...@gmail.com> wrote: >>> > > > > > > > > > > > >> >>> > > > > > > > > > > > >> hi Bryan, >>> > > > > > > > > > > > >> >>> > > > > > > > > > > > >> This is a display bug >>> > > > > > > > > > > > >> >>> > > > > > > > > > > > >> In [6]: arr = pa.array([0, 1, 2], >>> > > type=pa.timestamp('ns', >>> > > > > > > > > > > > >> 'America/Los_Angeles')) >>> > > > > > > > > > > > >> >>> > > > > > > > > > > > >> In [7]: arr.view('int64') >>> > > > > > > > > > > > >> Out[7]: >>> > > > > > > > > > > > >> <pyarrow.lib.Int64Array object at 0x7fd1b8aaef30> >>> > > > > > > > > > > > >> [ >>> > > > > > > > > > > > >> 0, >>> > > > > > > > > > > > >> 1, >>> > > > > > > > > > > > >> 2 >>> > > > > > > > > > > > >> ] >>> > > > > > > > > > > > >> >>> > > > > > > > > > > > >> In [8]: arr >>> > > > > > > > > > > > >> Out[8]: >>> > > > > > > > > > > > >> <pyarrow.lib.TimestampArray object at >>> > 0x7fd1b8aae6e0> >>> > > > > > > > > > > > >> [ >>> > > > > > > > > > > > >> 1970-01-01 00:00:00.000000000, >>> > > > > > > > > > > > >> 1970-01-01 00:00:00.000000001, >>> > > > > > > > > > > > >> 1970-01-01 00:00:00.000000002 >>> > > > > > > > > > > > >> ] >>> > > > > > > > > > > > >> >>> > > > > > > > > > > > >> In [9]: arr.to_pandas() >>> > > > > > > > > > > > >> Out[9]: >>> > > > > > > > > > > > >> 0 1969-12-31 16:00:00-08:00 >>> > > > > > > > > > > > >> 1 1969-12-31 16:00:00.000000001-08:00 >>> > > > > > > > > > > > >> 2 1969-12-31 16:00:00.000000002-08:00 >>> > > > > > > > > > > > >> dtype: datetime64[ns, America/Los_Angeles] >>> > > > > > > > > > > > >> >>> > > > > > > > > > > > >> the repr of TimestampArray doesn't take into >>> account >>> > > the >>> > > > > > > > timezone >>> > > > > > > > > > > > >> >>> > > > > > > > > > > > >> In [10]: arr[0] >>> > > > > > > > > > > > >> Out[10]: <pyarrow.TimestampScalar: >>> > > Timestamp('1969-12-31 >>> > > > > > > > > > > > >> 16:00:00-0800', tz='America/Los_Angeles')> >>> > > > > > > > > > > > >> >>> > > > > > > > > > > > >> So if it's incorrect, the problem is happening >>> > > somewhere >>> > > > > before >>> > > > > > > > or >>> > > > > > > > > > > > >> while the StructArray is being created. If I had >>> to >>> > > guess >>> > > > > it's >>> > > > > > > > caused >>> > > > > > > > > > > > >> by the tzinfo of the datetime.datetime values not >>> > > being >>> > > > > handled >>> > > > > > > > in the >>> > > > > > > > > > > > >> way that they were before >>> > > > > > > > > > > > >> >>> > > > > > > > > > > > >>> On Sun, Jul 19, 2020 at 5:19 PM Wes McKinney < >>> > > > > > > > wesmck...@gmail.com> wrote: >>> > > > > > > > > > > > >>> >>> > > > > > > > > > > > >>> Well this is not good and pretty disappointing >>> > given >>> > > > > that we >>> > > > > > > > had nearly a month to sort through the implications of >>> Micah’s >>> > > > > patch. 
We >>> > > > > > > > should try to resolve this ASAP >>> > > > > > > > > > > > >>> >>> > > > > > > > > > > > >>> On Sun, Jul 19, 2020 at 5:10 PM Bryan Cutler < >>> > > > > > > > cutl...@gmail.com> wrote: >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> +0 (non-binding) >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> I ran verification script for binaries and then >>> > > source, >>> > > > > as >>> > > > > > > > below, and both >>> > > > > > > > > > > > >>>> look good >>> > > > > > > > > > > > >>>> ARROW_TMPDIR=/tmp/arrow-test TEST_DEFAULT=0 >>> > > > > TEST_SOURCE=1 >>> > > > > > > > TEST_CPP=1 >>> > > > > > > > > > > > >>>> TEST_PYTHON=1 TEST_JAVA=1 >>> TEST_INTEGRATION_CPP=1 >>> > > > > > > > TEST_INTEGRATION_JAVA=1 >>> > > > > > > > > > > > >>>> dev/release/verify-release-candidate.sh source >>> > > 1.0.0 1 >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> I tried to patch Spark locally to verify the >>> > recent >>> > > > > change in >>> > > > > > > > nested >>> > > > > > > > > > > > >>>> timestamps and was not able to get things >>> working >>> > > quite >>> > > > > > > > right, but I'm not >>> > > > > > > > > > > > >>>> sure if the problem is in Spark, Arrow or my >>> > patch - >>> > > > > hence my >>> > > > > > > > vote of +0. >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> Here is what I'm seeing >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> ``` >>> > > > > > > > > > > > >>>> (Input as datetime) >>> > > > > > > > > > > > >>>> datetime.datetime(2018, 3, 10, 0, 0) >>> > > > > > > > > > > > >>>> datetime.datetime(2018, 3, 15, 0, 0) >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> (Struct Array) >>> > > > > > > > > > > > >>>> -- is_valid: all not null >>> > > > > > > > > > > > >>>> -- child 0 type: timestamp[us, >>> > > tz=America/Los_Angeles] >>> > > > > > > > > > > > >>>> [ >>> > > > > > > > > > > > >>>> 2018-03-10 00:00:00.000000, >>> > > > > > > > > > > > >>>> 2018-03-10 00:00:00.000000 >>> > > > > > > > > > > > >>>> ] >>> > > > > > > > > > > > >>>> -- child 1 type: timestamp[us, >>> > > tz=America/Los_Angeles] >>> > > > > > > > > > > > >>>> [ >>> > > > > > > > > > > > >>>> 2018-03-15 00:00:00.000000, >>> > > > > > > > > > > > >>>> 2018-03-15 00:00:00.000000 >>> > > > > > > > > > > > >>>> ] >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> (Flattened Arrays) >>> > > > > > > > > > > > >>>> types [TimestampType(timestamp[us, >>> > > > > tz=America/Los_Angeles]), >>> > > > > > > > > > > > >>>> TimestampType(timestamp[us, >>> > > tz=America/Los_Angeles])] >>> > > > > > > > > > > > >>>> [<pyarrow.lib.TimestampArray object at >>> > > 0x7ffbbd88f520> >>> > > > > > > > > > > > >>>> [ >>> > > > > > > > > > > > >>>> 2018-03-10 00:00:00.000000, >>> > > > > > > > > > > > >>>> 2018-03-10 00:00:00.000000 >>> > > > > > > > > > > > >>>> ], <pyarrow.lib.TimestampArray object at >>> > > 0x7ffba958be50> >>> > > > > > > > > > > > >>>> [ >>> > > > > > > > > > > > >>>> 2018-03-15 00:00:00.000000, >>> > > > > > > > > > > > >>>> 2018-03-15 00:00:00.000000 >>> > > > > > > > > > > > >>>> ]] >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> (Pandas Conversion) >>> > > > > > > > > > > > >>>> [ >>> > > > > > > > > > > > >>>> 0 2018-03-09 16:00:00-08:00 >>> > > > > > > > > > > > >>>> 1 2018-03-09 16:00:00-08:00 >>> > > > > > > > > > > > >>>> dtype: datetime64[ns, America/Los_Angeles], >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> 0 2018-03-14 17:00:00-07:00 >>> > > > > > > > > > > > >>>> 1 
2018-03-14 17:00:00-07:00 >>> > > > > > > > > > > > >>>> dtype: datetime64[ns, America/Los_Angeles]] >>> > > > > > > > > > > > >>>> ``` >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> Based on output of existing a correct timestamp >>> > > udf, it >>> > > > > looks >>> > > > > > > > like the >>> > > > > > > > > > > > >>>> pyarrow Struct Array values are wrong and >>> that's >>> > > carried >>> > > > > > > > through the >>> > > > > > > > > > > > >>>> flattened arrays, causing the Pandas values to >>> > have >>> > > a >>> > > > > > > > negative offset. >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> Here is output from a working udf with >>> timestamp, >>> > > the >>> > > > > pyarrow >>> > > > > > > > Array >>> > > > > > > > > > > > >>>> displays in UTC time, I believe. >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> ``` >>> > > > > > > > > > > > >>>> (Timestamp Array) >>> > > > > > > > > > > > >>>> type timestamp[us, tz=America/Los_Angeles] >>> > > > > > > > > > > > >>>> [ >>> > > > > > > > > > > > >>>> [ >>> > > > > > > > > > > > >>>> 1969-01-01 09:01:01.000000 >>> > > > > > > > > > > > >>>> ] >>> > > > > > > > > > > > >>>> ] >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> (Pandas Conversion) >>> > > > > > > > > > > > >>>> 0 1969-01-01 01:01:01-08:00 >>> > > > > > > > > > > > >>>> Name: _0, dtype: datetime64[ns, >>> > America/Los_Angeles] >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> (Timezone Localized) >>> > > > > > > > > > > > >>>> 0 1969-01-01 01:01:01 >>> > > > > > > > > > > > >>>> Name: _0, dtype: datetime64[ns] >>> > > > > > > > > > > > >>>> ``` >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> I'll have to dig in further at another time and >>> > > debug >>> > > > > where >>> > > > > > > > the values go >>> > > > > > > > > > > > >>>> wrong. >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>> On Sat, Jul 18, 2020 at 9:51 PM Micah >>> Kornfield < >>> > > > > > > > emkornfi...@gmail.com> >>> > > > > > > > > > > > >>>> wrote: >>> > > > > > > > > > > > >>>> >>> > > > > > > > > > > > >>>>> +1 (binding) >>> > > > > > > > > > > > >>>>> >>> > > > > > > > > > > > >>>>> Ran wheel and binary tests on ubuntu 19.04 >>> > > > > > > > > > > > >>>>> >>> > > > > > > > > > > > >>>>> On Fri, Jul 17, 2020 at 2:25 PM Neal >>> Richardson < >>> > > > > > > > > > > > >>>>> neal.p.richard...@gmail.com> >>> > > > > > > > > > > > >>>>> wrote: >>> > > > > > > > > > > > >>>>> >>> > > > > > > > > > > > >>>>>> +1 (binding) >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>>> In addition to the usual verification on >>> > > > > > > > > > > > >>>>>> https://github.com/apache/arrow/pull/7787, >>> I've >>> > > > > > > > successfully staged the >>> > > > > > > > > > > > >>>>> R >>> > > > > > > > > > > > >>>>>> binary artifacts on Windows ( >>> > > > > > > > > > > > >>>>>> https://github.com/r-windows/ >>> > > rtools-packages/pull/126 >>> > > > > ), >>> > > > > > > > macOS ( >>> > > > > > > > > > > > >>>>>> >>> > https://github.com/autobrew/homebrew-core/pull/12 >>> > > ), >>> > > > > and >>> > > > > > > > Linux ( >>> > > > > > > > > > > > >>>>>> >>> > > > > https://github.com/ursa-labs/arrow-r-nightly/actions/runs/ >>> > > > > > > > 172977277) >>> > > > > > > > > > > > >>>>> using >>> > > > > > > > > > > > >>>>>> the release candidate. >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>>> And I agree with the judgment about skipping >>> a >>> > JS >>> > > > > release >>> > > > > > > > artifact. 
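[The "(Timezone Localized)" step in the working example quoted above corresponds to converting a tz-aware pandas Series to local wall-clock time and then dropping the zone. A minimal sketch of that step, using the values from the quoted output -- my own illustration, not Spark's actual code:]

```python
import pandas as pd

# Start from a tz-aware series holding the UTC instant 1969-01-01 09:01:01.
s = pd.Series(pd.to_datetime(["1969-01-01 09:01:01"], utc=True))

# Convert to America/Los_Angeles wall-clock time, then drop the zone, leaving
# naive local timestamps (datetime64[ns]) as in the quoted "(Timezone Localized)" output.
localized = s.dt.tz_convert("America/Los_Angeles").dt.tz_localize(None)
print(localized)  # 0   1969-01-01 01:01:01    dtype: datetime64[ns]
```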
Looks >>> > > > > > > > > > > > >>>>>> like there hasn't been a code change since >>> > > October so >>> > > > > > > > there's no point. >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>>> Neal >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>>> On Fri, Jul 17, 2020 at 10:37 AM Wes >>> McKinney < >>> > > > > > > > wesmck...@gmail.com> >>> > > > > > > > > > > > >>>>> wrote: >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>>>> I see the JS failures as well. I think it >>> is a >>> > > > > failure >>> > > > > > > > localized to >>> > > > > > > > > > > > >>>>>>> newer Node versions since our JavaScript CI >>> > works >>> > > > > fine. I >>> > > > > > > > don't think >>> > > > > > > > > > > > >>>>>>> it should block the release given the lack >>> of >>> > > > > development >>> > > > > > > > activity in >>> > > > > > > > > > > > >>>>>>> JavaScript [1] -- if any JS devs are >>> concerned >>> > > about >>> > > > > > > > publishing an >>> > > > > > > > > > > > >>>>>>> artifact then we can skip pushing it to NPM >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>>> @Ryan it seems it may be something >>> environment >>> > > > > related on >>> > > > > > > > your >>> > > > > > > > > > > > >>>>>>> machine, I'm on Ubuntu 18.04 and have not >>> seen >>> > > this. >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>>> On >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>>>> * Python 3.8 wheel's tests are failed. >>> 3.5, >>> > 3.6 >>> > > > > and 3.7 >>> > > > > > > > > > > > >>>>>>>> are passed. It seems that -larrow and >>> > > > > -larrow_python >>> > > > > > > > for >>> > > > > > > > > > > > >>>>>>>> Cython are failed. >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>>> I suspect this is related to >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>> https://github.com/apache/arrow/commit/ >>> > > > > > > > 120c21f4bf66d2901b3a353a1f67bac3c3355924#diff- >>> > > > > > > > 0f69784b44040448d17d0e4e8a641fe8 >>> > > > > > > > > > > > >>>>>>> , >>> > > > > > > > > > > > >>>>>>> but I don't think it's a blocking issue >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>>> [1]: >>> > > > > https://github.com/apache/arrow/commits/master/js >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>>> On Fri, Jul 17, 2020 at 9:42 AM Ryan Murray >>> < >>> > > > > > > > rym...@dremio.com> wrote: >>> > > > > > > > > > > > >>>>>>>> >>> > > > > > > > > > > > >>>>>>>> I've tested Java and it looks good. However >>> > the >>> > > > > verify >>> > > > > > > > script keeps >>> > > > > > > > > > > > >>>>> on >>> > > > > > > > > > > > >>>>>>>> bailing with protobuf related errors: >>> > > > > > > > > > > > >>>>>>>> >>> > > > > 'cpp/build/orc_ep-prefix/src/orc_ep-build/c++/src/orc_ >>> > > > > > > > proto.pb.cc' >>> > > > > > > > > > > > >>>>> and >>> > > > > > > > > > > > >>>>>>>> friends cant find protobuf definitions. A >>> bit >>> > > odd as >>> > > > > > > > cmake can see >>> > > > > > > > > > > > >>>>>>> protobuf >>> > > > > > > > > > > > >>>>>>>> headers and builds directly off master work >>> > just >>> > > > > fine. >>> > > > > > > > Has anyone >>> > > > > > > > > > > > >>>>> else >>> > > > > > > > > > > > >>>>>>>> experienced this? 
I am on ubutnu 18.04 >>> > > > > > > > > > > > >>>>>>>> >>> > > > > > > > > > > > >>>>>>>> On Fri, Jul 17, 2020 at 10:49 AM Antoine >>> > Pitrou >>> > > < >>> > > > > > > > anto...@python.org> >>> > > > > > > > > > > > >>>>>>> wrote: >>> > > > > > > > > > > > >>>>>>>> >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>> +1 (binding). I tested on Ubuntu 18.04. >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>> * Wheels verification went fine. >>> > > > > > > > > > > > >>>>>>>>> * Source verification went fine with CUDA >>> > > enabled >>> > > > > and >>> > > > > > > > > > > > >>>>>>>>> TEST_INTEGRATION_JS=0 TEST_JS=0. >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>> I didn't test the binaries. >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>> Regards >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>> Antoine. >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>> Le 17/07/2020 à 03:41, Krisztián Szűcs a >>> > écrit >>> > > : >>> > > > > > > > > > > > >>>>>>>>>> Hi, >>> > > > > > > > > > > > >>>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>>> I would like to propose the second >>> release >>> > > > > candidate >>> > > > > > > > (RC1) of >>> > > > > > > > > > > > >>>>>> Apache >>> > > > > > > > > > > > >>>>>>>>>> Arrow version 1.0.0. >>> > > > > > > > > > > > >>>>>>>>>> This is a major release consisting of 826 >>> > > > > resolved JIRA >>> > > > > > > > > > > > >>>>> issues[1]. >>> > > > > > > > > > > > >>>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>>> The verification of the first release >>> > > candidate >>> > > > > (RC0) >>> > > > > > > > has failed >>> > > > > > > > > > > > >>>>>>> [0], and >>> > > > > > > > > > > > >>>>>>>>>> the packaging scripts were unable to >>> produce >>> > > two >>> > > > > > > > wheels. Compared >>> > > > > > > > > > > > >>>>>>>>>> to RC0 this release candidate includes >>> > > additional >>> > > > > > > > patches for the >>> > > > > > > > > > > > >>>>>>>>>> following bugs: ARROW-9506, ARROW-9504, >>> > > > > ARROW-9497, >>> > > > > > > > > > > > >>>>>>>>>> ARROW-9500, ARROW-9499. >>> > > > > > > > > > > > >>>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>>> This release candidate is based on >>> commit: >>> > > > > > > > > > > > >>>>>>>>>> bc0649541859095ee77d03a7b891ea8d6e2fd641 >>> [2] >>> > > > > > > > > > > > >>>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>>> The source release rc1 is hosted at [3]. >>> > > > > > > > > > > > >>>>>>>>>> The binary artifacts are hosted at >>> > > [4][5][6][7]. >>> > > > > > > > > > > > >>>>>>>>>> The changelog is located at [8]. >>> > > > > > > > > > > > >>>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>>> Please download, verify checksums and >>> > > signatures, >>> > > > > run >>> > > > > > > > the unit >>> > > > > > > > > > > > >>>>>> tests, >>> > > > > > > > > > > > >>>>>>>>>> and vote on the release. See [9] for how >>> to >>> > > > > validate a >>> > > > > > > > release >>> > > > > > > > > > > > >>>>>>> candidate. >>> > > > > > > > > > > > >>>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>>> The vote will be open for at least 72 >>> hours. >>> > > > > > > > > > > > >>>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>>> [ ] +1 Release this as Apache Arrow 1.0.0 >>> > > > > > > > > > > > >>>>>>>>>> [ ] +0 >>> > > > > > > > > > > > >>>>>>>>>> [ ] -1 Do not release this as Apache >>> Arrow >>> > > 1.0.0 >>> > > > > > > > because... 
>>> > > > > > > > > > > > >>>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>>> [0]: >>> > > > > > > > > > > > >>>>>>> >>> > > > > https://github.com/apache/arrow/pull/7778#issuecomment- >>> > > > > > > > 659065370 >>> > > > > > > > > > > > >>>>>>>>>> [1]: >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>> https://issues.apache.org/ >>> > > jira/issues/?jql=project%20% >>> > > > > > > > 3D%20ARROW%20AND%20status%20in%20%28Resolved%2C% >>> > > 20Closed%29%20AND% >>> > > > > > > > 20fixVersion%20%3D%201.0.0 >>> > > > > > > > > > > > >>>>>>>>>> [2]: >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>> https://github.com/apache/arrow/tree/ >>> > > > > > > > bc0649541859095ee77d03a7b891ea8d6e2fd641 >>> > > > > > > > > > > > >>>>>>>>>> [3]: >>> > > > > > > > > > > > >>>>>>> https://dist.apache.org/repos/ >>> > > > > > > > dist/dev/arrow/apache-arrow-1.0.0-rc1 >>> > > > > > > > > > > > >>>>>>>>>> [4]: https://bintray.com/apache/ >>> > > > > > > > arrow/centos-rc/1.0.0-rc1 >>> > > > > > > > > > > > >>>>>>>>>> [5]: https://bintray.com/apache/ >>> > > > > > > > arrow/debian-rc/1.0.0-rc1 >>> > > > > > > > > > > > >>>>>>>>>> [6]: https://bintray.com/apache/ >>> > > > > > > > arrow/python-rc/1.0.0-rc1 >>> > > > > > > > > > > > >>>>>>>>>> [7]: https://bintray.com/apache/ >>> > > > > > > > arrow/ubuntu-rc/1.0.0-rc1 >>> > > > > > > > > > > > >>>>>>>>>> [8]: >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>> https://github.com/apache/arrow/blob/ >>> > > > > > > > bc0649541859095ee77d03a7b891ea8d6e2fd641/CHANGELOG.md >>> > > > > > > > > > > > >>>>>>>>>> [9]: >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>> https://cwiki.apache.org/ >>> > > confluence/display/ARROW/How+ >>> > > > > > > > to+Verify+Release+Candidates >>> > > > > > > > > > > > >>>>>>>>>> >>> > > > > > > > > > > > >>>>>>>>> >>> > > > > > > > > > > > >>>>>>> >>> > > > > > > > > > > > >>>>>> >>> > > > > > > > > > > > >>>>> >>> > > > > > > > >>> > > > > >>> > > >>> > >>> >>