Just to summarize my understanding: 1. We will live with the rollback of the CL. 2. A new RC is being cut with this rollback.
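For anyone who hasn't followed the linked PRs closely, this is roughly the behavior in question. It is a minimal sketch against the current RC as I understand the reports; the keyword comes from ARROW-5359, and whether the tzinfo survives the conversion is exactly what is in dispute:

```
import pyarrow as pa

# A tz-aware timestamp column, microsecond precision like the Spark case.
arr = pa.array([0, 1, 2], type=pa.timestamp('us', 'America/Los_Angeles'))
table = pa.table({'f0': arr})

# timestamp_as_object=True is the new ARROW-5359 keyword: it returns
# datetime.datetime objects instead of pandas Timestamps (useful when
# values would not fit in pandas' datetime64[ns] range).
df = table.to_pandas(timestamp_as_object=True)

# If the reports are right, on the current RC this prints None, i.e. the
# America/Los_Angeles tzinfo has been dropped from the returned objects.
print(df['f0'][0].tzinfo)
```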
I think this plan is OK. I'm not going to rush the proper fix or the flags in the current PR that tries to fix it, but I would like to open another PR that disables `to_pandas(timestamp_as_object=True)`. Before I put in the effort to do this, I'd like to gauge whether people feel it is worth cutting a new RC over. (A condensed version of the round-trip problem Wes describes below is appended at the end of this mail.) On Mon, Jul 20, 2020 at 2:56 PM Krisztián Szűcs <szucs.kriszt...@gmail.com> wrote: > On Mon, Jul 20, 2020 at 11:00 PM Micah Kornfield <emkornfi...@gmail.com> > wrote: > >> > >> If yes then `timestamp_as_object` keyword arguments seems like a new > >> feature, so strictly speaking it's not a regression compared to the > >> previous release. > > > > Yes, I don't think we should be releasing new features that are know to > be half baked and based on discussions elsewhere will likely need a > backward compatibility mode just in case users come to rely on the flawed > implementation. > > Ehh, I just read your response and I already cut RC2 including ARROW-5359 > [1]. > I'm afraid I won't be able to cut another RC today, so I'll finish this > one. > > [1]: > https://github.com/apache/arrow/commit/11ee468dcd32196d49332b3b7001ca33d959eafd > > > > > I think we should remove or cause the flag to error for the 1.0 release > at least, so we aren't digging ourselves further into a hole. > > > > On Mon, Jul 20, 2020 at 12:41 PM Krisztián Szűcs < > szucs.kriszt...@gmail.com> wrote: > >> > >> The conversations in the pull requests are pretty broad so I'm just > >> guessing, but do you refer that `to_pandas(timestamp_as_object=True)` > >> drops the timezone information? > >> If yes then `timestamp_as_object` keyword arguments seems like a new > >> feature, so strictly speaking it's not a regression compared to the > >> previous release. > >> > >> I agree that we shouldn't leave known bugs (I don't like it either), > >> but I'm afraid proper timezone support will require more effort. Like > >> currently we also strip timezone information when converting from > >> datetime.time(..., tzinfo) objects, or the missing timezone support in > >> the temporal casts. > >> > >> On Mon, Jul 20, 2020 at 7:36 PM Micah Kornfield <emkornfi...@gmail.com> > wrote: > >> > > >> > I just wanted to clarify. doing a full rollback of the patch means > that https://issues.apache.org/jira/browse/ARROW-5359 would get released > out of the gate with a bug in it. > >> > > >> > On Mon, Jul 20, 2020 at 7:48 AM Antoine Pitrou <anto...@python.org> > wrote: > >> >> > >> >> > >> >> If the release condition is for the regression to be fixed in less > than > >> >> 24 hours (less than 12 hours now?), I think we should simply revert > the > >> >> original PR and work on a fix more leisurely for 1.1.0 (or even > 1.0.1). > >> >> > >> >> Unless it really causes havoc for Spark users, in which case a > >> >> circumvention should be found. > >> >> > >> >> Regards > >> >> > >> >> Antoine. > >> >> > >> >> > >> >> Le 20/07/2020 à 16:46, Krisztián Szűcs a écrit : > >> >> > If I understand correctly we used to just store the timestamp and > the > >> >> > timezone if an explicit arrow type was passed during the > python->arrow > >> >> > conversion, but the timestamp values were not changed in any way. > >> >> > Micah's current patch changes the python->arrow conversion > behavior to > >> >> > normalize all values to utc timestamps. > >> >> > > >> >> > While it's definitely an improvement over the previously ignored > >> >> > timezones, I'm not sure that it won't cause unexpected regressions > in > >> >> > the users' codebases. 
> >> >> > I'm still trying to better understand the issue and its > compatibility > >> >> > implications, but my intuition tells me that we should apply the > >> >> > reversion instead and properly handle the datetime value > conversions > >> >> > in an upcoming minor release. > >> >> > > >> >> > Either way we should move this conversation to the pull request > [1], > >> >> > because the code snippets pasted here are hardly readable. > >> >> > > >> >> > [1]: https://github.com/apache/arrow/pull/7805 > >> >> > > >> >> > On Mon, Jul 20, 2020 at 9:40 AM Sutou Kouhei <k...@clear-code.com> > wrote: > >> >> >> > >> >> >> Done: > https://github.com/apache/arrow/pull/7805#issuecomment-660855376 > >> >> >> > >> >> >> We can use ...-3.8-... not ...-3.7-... because we don't have > >> >> >> ...-3.7-... task in > >> >> >> https://github.com/apache/arrow/blob/master/dev/tasks/tasks.yml. > >> >> >> > >> >> >> In < > cak7z5t8hqcsd3meg42cuzkscpjr3zndsvrjmm8vied0gzto...@mail.gmail.com> > >> >> >> "Re: [VOTE] Release Apache Arrow 1.0.0 - RC1" on Mon, 20 Jul > 2020 00:14:00 -0700, > >> >> >> Micah Kornfield <emkornfi...@gmail.com> wrote: > >> >> >> > >> >> >>> FYI, I'm not sure if it is a permissions issue or I've done > something wrong > >> >> >>> but github-actions does not seem to be responding to > "@github-actions > >> >> >>> <https://github.com/github-actions> crossbow submit > >> >> >>> test-conda-python-3.7-spark-master" when I enter it. If someone > could kick > >> >> >>> off the spark integration test I would be grateful. > >> >> >>> > >> >> >>> On Mon, Jul 20, 2020 at 12:09 AM Micah Kornfield < > emkornfi...@gmail.com> > >> >> >>> wrote: > >> >> >>> > >> >> >>>> Thanks Bryan. I cherry-picked your change onto my change [1] > which now > >> >> >>>> honors timezone aware datetime objects on ingestion. I've > kicked off the > >> >> >>>> spark integration tests. > >> >> >>>> > >> >> >>>> If this change doesn't work I think the correct course of > action is to > >> >> >>>> provide an environment variable in python to turn back to the > old behavior > >> >> >>>> (ignoring timezones on conversion). I think honoring timezone > information > >> >> >>>> where possible is a strict improvement but I agree we should > give users an > >> >> >>>> option to not break if they wish to upgrade to the latest > version. I need > >> >> >>>> to get some sleep but I will have another PR posted tomorrow > evening if the > >> >> >>>> current one doesn't unblock the release. > >> >> >>>> > >> >> >>>> [1] https://github.com/apache/arrow/pull/7805 > >> >> >>>> > >> >> >>>> On Sun, Jul 19, 2020 at 10:50 PM Bryan Cutler < > cutl...@gmail.com> wrote: > >> >> >>>> > >> >> >>>>> I'd rather not see ARROW-9223 reverted, if possible. I will > put up my > >> >> >>>>> hacked patch to Spark for this so we can test against it if > needed, and > >> >> >>>>> could share my branch if anyone else wants to test it locally. 
> >> >> >>>>> > >> >> >>>>> On Sun, Jul 19, 2020 at 7:35 PM Micah Kornfield < > emkornfi...@gmail.com> > >> >> >>>>> wrote: > >> >> >>>>> > >> >> >>>>>> I'll spend some time tonight on it and if I can't get round > trip working > >> >> >>>>>> I'll handle reverting > >> >> >>>>>> > >> >> >>>>>> On Sunday, July 19, 2020, Wes McKinney <wesmck...@gmail.com> > wrote: > >> >> >>>>>> > >> >> >>>>>>> On Sun, Jul 19, 2020 at 7:33 PM Neal Richardson > >> >> >>>>>>> <neal.p.richard...@gmail.com> wrote: > >> >> >>>>>>>> > >> >> >>>>>>>> It sounds like you may have identified a pyarrow bug, which > sounds > >> >> >>>>> not > >> >> >>>>>>>> good, though I don't know enough about the broader context > to know > >> >> >>>>>>> whether > >> >> >>>>>>>> this is (1) worse than 0.17 or (2) release blocking. I > defer to > >> >> >>>>> y'all > >> >> >>>>>> who > >> >> >>>>>>>> know better. > >> >> >>>>>>>> > >> >> >>>>>>>> If there are quirks in how Spark handles timezone-naive > timestamps, > >> >> >>>>>>>> shouldn't the fix/workaround go in pyspark, not pyarrow? > For what > >> >> >>>>> it's > >> >> >>>>>>>> worth, I dealt with similar Spark timezone issues in R > recently: > >> >> >>>>>>>> https://github.com/sparklyr/sparklyr/issues/2439 I handled > with it > >> >> >>>>> (in > >> >> >>>>>>>> sparklyr, not the arrow R package) by always setting a > timezone when > >> >> >>>>>>>> sending data to Spark. Not ideal but it made the numbers > "right". > >> >> >>>>>>> > >> >> >>>>>>> Since people are running this code in production we need to > be careful > >> >> >>>>>>> about disrupting them. Unfortunately I'm at the limit of how > much time > >> >> >>>>>>> I can spend on this, but releasing with ARROW-9223 as is > (without > >> >> >>>>>>> being partially or fully reverted) makes me deeply > uncomfortable. So I > >> >> >>>>>>> hope the matter can be resolved. > >> >> >>>>>>> > >> >> >>>>>>>> Neal > >> >> >>>>>>>> > >> >> >>>>>>>> > >> >> >>>>>>>> On Sun, Jul 19, 2020 at 5:13 PM Wes McKinney < > wesmck...@gmail.com> > >> >> >>>>>>> wrote: > >> >> >>>>>>>> > >> >> >>>>>>>>> Honestly I think reverting is the best option. This change > >> >> >>>>> evidently > >> >> >>>>>>>>> needs more time to "season" and perhaps this is motivation > to > >> >> >>>>> enhance > >> >> >>>>>>>>> test coverage in a number of places. > >> >> >>>>>>>>> > >> >> >>>>>>>>> On Sun, Jul 19, 2020 at 7:11 PM Wes McKinney < > wesmck...@gmail.com > >> >> >>>>>> > >> >> >>>>>>> wrote: > >> >> >>>>>>>>>> > >> >> >>>>>>>>>> I am OK with any solution that doesn't delay the > production of > >> >> >>>>> the > >> >> >>>>>>>>>> next RC by more than 24 hours > >> >> >>>>>>>>>> > >> >> >>>>>>>>>> On Sun, Jul 19, 2020 at 7:08 PM Micah Kornfield < > >> >> >>>>>>> emkornfi...@gmail.com> > >> >> >>>>>>>>> wrote: > >> >> >>>>>>>>>>> > >> >> >>>>>>>>>>> If I read the example right it looks like constructing > from > >> >> >>>>>> python > >> >> >>>>>>>>> types > >> >> >>>>>>>>>>> isn't keeping timezones into in place? 
I can try make a > patch > >> >> >>>>>> that > >> >> >>>>>>>>> fixes > >> >> >>>>>>>>>>> that tonight or would the preference be to revert my > patch > >> >> >>>>> (note > >> >> >>>>>> I > >> >> >>>>>>>>> think > >> >> >>>>>>>>>>> another bug with a prior bug was fixed in my PR as well) > >> >> >>>>>>>>>>> > >> >> >>>>>>>>>>> -Micah > >> >> >>>>>>>>>>> > >> >> >>>>>>>>>>> On Sunday, July 19, 2020, Wes McKinney < > wesmck...@gmail.com> > >> >> >>>>>>> wrote: > >> >> >>>>>>>>>>> > >> >> >>>>>>>>>>>> I think I see the problem now: > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [40]: parr > >> >> >>>>>>>>>>>> Out[40]: > >> >> >>>>>>>>>>>> 0 {'f0': 1969-12-31 16:00:00-08:00} > >> >> >>>>>>>>>>>> 1 {'f0': 1969-12-31 16:00:00.000001-08:00} > >> >> >>>>>>>>>>>> 2 {'f0': 1969-12-31 16:00:00.000002-08:00} > >> >> >>>>>>>>>>>> dtype: object > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [41]: parr[0]['f0'] > >> >> >>>>>>>>>>>> Out[41]: datetime.datetime(1969, 12, 31, 16, 0, > >> >> >>>>>> tzinfo=<DstTzInfo > >> >> >>>>>>>>>>>> 'America/Los_Angeles' PST-1 day, 16:00:00 STD>) > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [42]: pa.array(parr) > >> >> >>>>>>>>>>>> Out[42]: > >> >> >>>>>>>>>>>> <pyarrow.lib.StructArray object at 0x7f0893706a60> > >> >> >>>>>>>>>>>> -- is_valid: all not null > >> >> >>>>>>>>>>>> -- child 0 type: timestamp[us] > >> >> >>>>>>>>>>>> [ > >> >> >>>>>>>>>>>> 1969-12-31 16:00:00.000000, > >> >> >>>>>>>>>>>> 1969-12-31 16:00:00.000001, > >> >> >>>>>>>>>>>> 1969-12-31 16:00:00.000002 > >> >> >>>>>>>>>>>> ] > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [43]: pa.array(parr).field(0).type > >> >> >>>>>>>>>>>> Out[43]: TimestampType(timestamp[us]) > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> On 0.17.1 > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [8]: arr = pa.array([0, 1, 2], > type=pa.timestamp('us', > >> >> >>>>>>>>>>>> 'America/Los_Angeles')) > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [9]: arr > >> >> >>>>>>>>>>>> Out[9]: > >> >> >>>>>>>>>>>> <pyarrow.lib.TimestampArray object at 0x7f9dede69d00> > >> >> >>>>>>>>>>>> [ > >> >> >>>>>>>>>>>> 1970-01-01 00:00:00.000000, > >> >> >>>>>>>>>>>> 1970-01-01 00:00:00.000001, > >> >> >>>>>>>>>>>> 1970-01-01 00:00:00.000002 > >> >> >>>>>>>>>>>> ] > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [10]: struct_arr = pa.StructArray.from_arrays([arr], > >> >> >>>>>>> names=['f0']) > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [11]: struct_arr > >> >> >>>>>>>>>>>> Out[11]: > >> >> >>>>>>>>>>>> <pyarrow.lib.StructArray object at 0x7f9ded0016e0> > >> >> >>>>>>>>>>>> -- is_valid: all not null > >> >> >>>>>>>>>>>> -- child 0 type: timestamp[us, tz=America/Los_Angeles] > >> >> >>>>>>>>>>>> [ > >> >> >>>>>>>>>>>> 1970-01-01 00:00:00.000000, > >> >> >>>>>>>>>>>> 1970-01-01 00:00:00.000001, > >> >> >>>>>>>>>>>> 1970-01-01 00:00:00.000002 > >> >> >>>>>>>>>>>> ] > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [12]: struct_arr.to_pandas() > >> >> >>>>>>>>>>>> Out[12]: > >> >> >>>>>>>>>>>> 0 {'f0': 1970-01-01 00:00:00} > >> >> >>>>>>>>>>>> 1 {'f0': 1970-01-01 00:00:00.000001} > >> >> >>>>>>>>>>>> 2 {'f0': 1970-01-01 00:00:00.000002} > >> >> >>>>>>>>>>>> dtype: object > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [13]: pa.array(struct_arr.to_pandas()) > >> >> >>>>>>>>>>>> Out[13]: > >> >> >>>>>>>>>>>> <pyarrow.lib.StructArray object at 0x7f9ded003210> > >> >> >>>>>>>>>>>> -- is_valid: all not null > >> >> >>>>>>>>>>>> -- child 0 type: timestamp[us] > >> >> >>>>>>>>>>>> [ > >> >> >>>>>>>>>>>> 1970-01-01 00:00:00.000000, > >> >> >>>>>>>>>>>> 1970-01-01 00:00:00.000001, > >> >> 
>>>>>>>>>>>> 1970-01-01 00:00:00.000002 > >> >> >>>>>>>>>>>> ] > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> In [14]: pa.array(struct_arr.to_pandas()).type > >> >> >>>>>>>>>>>> Out[14]: StructType(struct<f0: timestamp[us]>) > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> So while the time zone is getting stripped in both > cases, > >> >> >>>>> the > >> >> >>>>>>> failure > >> >> >>>>>>>>>>>> to round trip is a problem. If we are going to attach > the > >> >> >>>>> time > >> >> >>>>>>> zone > >> >> >>>>>>>>> in > >> >> >>>>>>>>>>>> to_pandas() then we need to respect it when going the > other > >> >> >>>>>> way. > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> This looks like a regression to me and so I'm inclined > to > >> >> >>>>>> revise > >> >> >>>>>>> my > >> >> >>>>>>>>>>>> vote on the release to -0/-1 > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>>>>> On Sun, Jul 19, 2020 at 6:46 PM Wes McKinney < > >> >> >>>>>>> wesmck...@gmail.com> > >> >> >>>>>>>>> wrote: > >> >> >>>>>>>>>>>>> > >> >> >>>>>>>>>>>>> Ah I forgot that this is a "feature" of nanosecond > >> >> >>>>> timestamps > >> >> >>>>>>>>>>>>> > >> >> >>>>>>>>>>>>> In [21]: arr = pa.array([0, 1, 2], > type=pa.timestamp('us', > >> >> >>>>>>>>>>>>> 'America/Los_Angeles')) > >> >> >>>>>>>>>>>>> > >> >> >>>>>>>>>>>>> In [22]: struct_arr = pa.StructArray.from_arrays([arr], > >> >> >>>>>>>>> names=['f0']) > >> >> >>>>>>>>>>>>> > >> >> >>>>>>>>>>>>> In [23]: struct_arr.to_pandas() > >> >> >>>>>>>>>>>>> Out[23]: > >> >> >>>>>>>>>>>>> 0 {'f0': 1969-12-31 16:00:00-08:00} > >> >> >>>>>>>>>>>>> 1 {'f0': 1969-12-31 16:00:00.000001-08:00} > >> >> >>>>>>>>>>>>> 2 {'f0': 1969-12-31 16:00:00.000002-08:00} > >> >> >>>>>>>>>>>>> dtype: object > >> >> >>>>>>>>>>>>> > >> >> >>>>>>>>>>>>> So this is working as intended, such as it is > >> >> >>>>>>>>>>>>> > >> >> >>>>>>>>>>>>> On Sun, Jul 19, 2020 at 6:40 PM Wes McKinney < > >> >> >>>>>>> wesmck...@gmail.com> > >> >> >>>>>>>>>>>> wrote: > >> >> >>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>> There seems to be other broken StructArray stuff > >> >> >>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>> In [14]: arr = pa.array([0, 1, 2], > >> >> >>>>> type=pa.timestamp('ns', > >> >> >>>>>>>>>>>>>> 'America/Los_Angeles')) > >> >> >>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>> In [15]: struct_arr = > pa.StructArray.from_arrays([arr], > >> >> >>>>>>>>> names=['f0']) > >> >> >>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>> In [16]: struct_arr > >> >> >>>>>>>>>>>>>> Out[16]: > >> >> >>>>>>>>>>>>>> <pyarrow.lib.StructArray object at 0x7f089370f590> > >> >> >>>>>>>>>>>>>> -- is_valid: all not null > >> >> >>>>>>>>>>>>>> -- child 0 type: timestamp[ns, tz=America/Los_Angeles] > >> >> >>>>>>>>>>>>>> [ > >> >> >>>>>>>>>>>>>> 1970-01-01 00:00:00.000000000, > >> >> >>>>>>>>>>>>>> 1970-01-01 00:00:00.000000001, > >> >> >>>>>>>>>>>>>> 1970-01-01 00:00:00.000000002 > >> >> >>>>>>>>>>>>>> ] > >> >> >>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>> In [17]: struct_arr.to_pandas() > >> >> >>>>>>>>>>>>>> Out[17]: > >> >> >>>>>>>>>>>>>> 0 {'f0': 0} > >> >> >>>>>>>>>>>>>> 1 {'f0': 1} > >> >> >>>>>>>>>>>>>> 2 {'f0': 2} > >> >> >>>>>>>>>>>>>> dtype: object > >> >> >>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>> All in all it appears that this part of the project > >> >> >>>>> needs > >> >> >>>>>>> some > >> >> >>>>>>>>> TLC > >> >> >>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>> On Sun, Jul 19, 2020 at 6:16 PM Wes McKinney < > >> >> >>>>>>>>> wesmck...@gmail.com> > >> >> >>>>>>>>>>>> wrote: > >> >> >>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>> Well, the problem is that time zones are really > >> >> >>>>> finicky > >> >> >>>>>>>>> 
comparing > >> >> >>>>>>>>>>>>>>> Spark (which uses a localtime interpretation of > >> >> >>>>>> timestamps > >> >> >>>>>>>>> without > >> >> >>>>>>>>>>>>>>> time zone) and Arrow (which has naive timestamps -- a > >> >> >>>>>>> concept > >> >> >>>>>>>>> similar > >> >> >>>>>>>>>>>>>>> but different from the SQL concept TIMESTAMP WITHOUT > >> >> >>>>> TIME > >> >> >>>>>>> ZONE > >> >> >>>>>>>>> -- and > >> >> >>>>>>>>>>>>>>> tz-aware timestamps). So somewhere there is a time > >> >> >>>>> zone > >> >> >>>>>>> being > >> >> >>>>>>>>>>>> stripped > >> >> >>>>>>>>>>>>>>> or applied/localized which may result in the > >> >> >>>>> transferred > >> >> >>>>>>> data > >> >> >>>>>>>>> to/from > >> >> >>>>>>>>>>>>>>> Spark being shifted by the time zone offset. I think > >> >> >>>>> it's > >> >> >>>>>>>>> important > >> >> >>>>>>>>>>>>>>> that we determine what the problem is -- if it's a > >> >> >>>>>> problem > >> >> >>>>>>>>> that has > >> >> >>>>>>>>>>>> to > >> >> >>>>>>>>>>>>>>> be fixed in Arrow (and it's not clear to me that it > >> >> >>>>> is) > >> >> >>>>>>> it's > >> >> >>>>>>>>> worth > >> >> >>>>>>>>>>>>>>> spending some time to understand what's going on to > >> >> >>>>> avoid > >> >> >>>>>>> the > >> >> >>>>>>>>>>>>>>> possibility of patch release on account of this. > >> >> >>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>> On Sun, Jul 19, 2020 at 6:12 PM Neal Richardson > >> >> >>>>>>>>>>>>>>> <neal.p.richard...@gmail.com> wrote: > >> >> >>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>> If it’s a display problem, should it block the > >> >> >>>>> release? > >> >> >>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>> Sent from my iPhone > >> >> >>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>> On Jul 19, 2020, at 3:57 PM, Wes McKinney < > >> >> >>>>>>>>> wesmck...@gmail.com> > >> >> >>>>>>>>>>>> wrote: > >> >> >>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>> I opened https://issues.apache.org/ > >> >> >>>>>>> jira/browse/ARROW-9525 > >> >> >>>>>>>>>>>> about the > >> >> >>>>>>>>>>>>>>>>> display problem. 
My guess is that there are other > >> >> >>>>>>> problems > >> >> >>>>>>>>>>>> lurking > >> >> >>>>>>>>>>>>>>>>> here > >> >> >>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>> On Sun, Jul 19, 2020 at 5:54 PM Wes McKinney < > >> >> >>>>>>>>>>>> wesmck...@gmail.com> wrote: > >> >> >>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>> hi Bryan, > >> >> >>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>> This is a display bug > >> >> >>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>> In [6]: arr = pa.array([0, 1, 2], > >> >> >>>>>>> type=pa.timestamp('ns', > >> >> >>>>>>>>>>>>>>>>>> 'America/Los_Angeles')) > >> >> >>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>> In [7]: arr.view('int64') > >> >> >>>>>>>>>>>>>>>>>> Out[7]: > >> >> >>>>>>>>>>>>>>>>>> <pyarrow.lib.Int64Array object at 0x7fd1b8aaef30> > >> >> >>>>>>>>>>>>>>>>>> [ > >> >> >>>>>>>>>>>>>>>>>> 0, > >> >> >>>>>>>>>>>>>>>>>> 1, > >> >> >>>>>>>>>>>>>>>>>> 2 > >> >> >>>>>>>>>>>>>>>>>> ] > >> >> >>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>> In [8]: arr > >> >> >>>>>>>>>>>>>>>>>> Out[8]: > >> >> >>>>>>>>>>>>>>>>>> <pyarrow.lib.TimestampArray object at > >> >> >>>>>> 0x7fd1b8aae6e0> > >> >> >>>>>>>>>>>>>>>>>> [ > >> >> >>>>>>>>>>>>>>>>>> 1970-01-01 00:00:00.000000000, > >> >> >>>>>>>>>>>>>>>>>> 1970-01-01 00:00:00.000000001, > >> >> >>>>>>>>>>>>>>>>>> 1970-01-01 00:00:00.000000002 > >> >> >>>>>>>>>>>>>>>>>> ] > >> >> >>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>> In [9]: arr.to_pandas() > >> >> >>>>>>>>>>>>>>>>>> Out[9]: > >> >> >>>>>>>>>>>>>>>>>> 0 1969-12-31 16:00:00-08:00 > >> >> >>>>>>>>>>>>>>>>>> 1 1969-12-31 16:00:00.000000001-08:00 > >> >> >>>>>>>>>>>>>>>>>> 2 1969-12-31 16:00:00.000000002-08:00 > >> >> >>>>>>>>>>>>>>>>>> dtype: datetime64[ns, America/Los_Angeles] > >> >> >>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>> the repr of TimestampArray doesn't take into > >> >> >>>>> account > >> >> >>>>>>> the > >> >> >>>>>>>>>>>> timezone > >> >> >>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>> In [10]: arr[0] > >> >> >>>>>>>>>>>>>>>>>> Out[10]: <pyarrow.TimestampScalar: > >> >> >>>>>>> Timestamp('1969-12-31 > >> >> >>>>>>>>>>>>>>>>>> 16:00:00-0800', tz='America/Los_Angeles')> > >> >> >>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>> So if it's incorrect, the problem is happening > >> >> >>>>>>> somewhere > >> >> >>>>>>>>> before > >> >> >>>>>>>>>>>> or > >> >> >>>>>>>>>>>>>>>>>> while the StructArray is being created. If I had > >> >> >>>>> to > >> >> >>>>>>> guess > >> >> >>>>>>>>> it's > >> >> >>>>>>>>>>>> caused > >> >> >>>>>>>>>>>>>>>>>> by the tzinfo of the datetime.datetime values not > >> >> >>>>>>> being > >> >> >>>>>>>>> handled > >> >> >>>>>>>>>>>> in the > >> >> >>>>>>>>>>>>>>>>>> way that they were before > >> >> >>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>> On Sun, Jul 19, 2020 at 5:19 PM Wes McKinney < > >> >> >>>>>>>>>>>> wesmck...@gmail.com> wrote: > >> >> >>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>> Well this is not good and pretty disappointing > >> >> >>>>>> given > >> >> >>>>>>>>> that we > >> >> >>>>>>>>>>>> had nearly a month to sort through the implications of > >> >> >>>>> Micah’s > >> >> >>>>>>>>> patch. 
We > >> >> >>>>>>>>>>>> should try to resolve this ASAP > >> >> >>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>> On Sun, Jul 19, 2020 at 5:10 PM Bryan Cutler < > >> >> >>>>>>>>>>>> cutl...@gmail.com> wrote: > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> +0 (non-binding) > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> I ran verification script for binaries and then > >> >> >>>>>>> source, > >> >> >>>>>>>>> as > >> >> >>>>>>>>>>>> below, and both > >> >> >>>>>>>>>>>>>>>>>>>> look good > >> >> >>>>>>>>>>>>>>>>>>>> ARROW_TMPDIR=/tmp/arrow-test TEST_DEFAULT=0 > >> >> >>>>>>>>> TEST_SOURCE=1 > >> >> >>>>>>>>>>>> TEST_CPP=1 > >> >> >>>>>>>>>>>>>>>>>>>> TEST_PYTHON=1 TEST_JAVA=1 > >> >> >>>>> TEST_INTEGRATION_CPP=1 > >> >> >>>>>>>>>>>> TEST_INTEGRATION_JAVA=1 > >> >> >>>>>>>>>>>>>>>>>>>> dev/release/verify-release-candidate.sh source > >> >> >>>>>>> 1.0.0 1 > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> I tried to patch Spark locally to verify the > >> >> >>>>>> recent > >> >> >>>>>>>>> change in > >> >> >>>>>>>>>>>> nested > >> >> >>>>>>>>>>>>>>>>>>>> timestamps and was not able to get things > >> >> >>>>> working > >> >> >>>>>>> quite > >> >> >>>>>>>>>>>> right, but I'm not > >> >> >>>>>>>>>>>>>>>>>>>> sure if the problem is in Spark, Arrow or my > >> >> >>>>>> patch - > >> >> >>>>>>>>> hence my > >> >> >>>>>>>>>>>> vote of +0. > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> Here is what I'm seeing > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> ``` > >> >> >>>>>>>>>>>>>>>>>>>> (Input as datetime) > >> >> >>>>>>>>>>>>>>>>>>>> datetime.datetime(2018, 3, 10, 0, 0) > >> >> >>>>>>>>>>>>>>>>>>>> datetime.datetime(2018, 3, 15, 0, 0) > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> (Struct Array) > >> >> >>>>>>>>>>>>>>>>>>>> -- is_valid: all not null > >> >> >>>>>>>>>>>>>>>>>>>> -- child 0 type: timestamp[us, > >> >> >>>>>>> tz=America/Los_Angeles] > >> >> >>>>>>>>>>>>>>>>>>>> [ > >> >> >>>>>>>>>>>>>>>>>>>> 2018-03-10 00:00:00.000000, > >> >> >>>>>>>>>>>>>>>>>>>> 2018-03-10 00:00:00.000000 > >> >> >>>>>>>>>>>>>>>>>>>> ] > >> >> >>>>>>>>>>>>>>>>>>>> -- child 1 type: timestamp[us, > >> >> >>>>>>> tz=America/Los_Angeles] > >> >> >>>>>>>>>>>>>>>>>>>> [ > >> >> >>>>>>>>>>>>>>>>>>>> 2018-03-15 00:00:00.000000, > >> >> >>>>>>>>>>>>>>>>>>>> 2018-03-15 00:00:00.000000 > >> >> >>>>>>>>>>>>>>>>>>>> ] > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> (Flattened Arrays) > >> >> >>>>>>>>>>>>>>>>>>>> types [TimestampType(timestamp[us, > >> >> >>>>>>>>> tz=America/Los_Angeles]), > >> >> >>>>>>>>>>>>>>>>>>>> TimestampType(timestamp[us, > >> >> >>>>>>> tz=America/Los_Angeles])] > >> >> >>>>>>>>>>>>>>>>>>>> [<pyarrow.lib.TimestampArray object at > >> >> >>>>>>> 0x7ffbbd88f520> > >> >> >>>>>>>>>>>>>>>>>>>> [ > >> >> >>>>>>>>>>>>>>>>>>>> 2018-03-10 00:00:00.000000, > >> >> >>>>>>>>>>>>>>>>>>>> 2018-03-10 00:00:00.000000 > >> >> >>>>>>>>>>>>>>>>>>>> ], <pyarrow.lib.TimestampArray object at > >> >> >>>>>>> 0x7ffba958be50> > >> >> >>>>>>>>>>>>>>>>>>>> [ > >> >> >>>>>>>>>>>>>>>>>>>> 2018-03-15 00:00:00.000000, > >> >> >>>>>>>>>>>>>>>>>>>> 2018-03-15 00:00:00.000000 > >> >> >>>>>>>>>>>>>>>>>>>> ]] > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> (Pandas Conversion) > >> >> >>>>>>>>>>>>>>>>>>>> [ > >> >> >>>>>>>>>>>>>>>>>>>> 0 2018-03-09 16:00:00-08:00 > >> >> >>>>>>>>>>>>>>>>>>>> 1 2018-03-09 16:00:00-08:00 > >> >> >>>>>>>>>>>>>>>>>>>> dtype: datetime64[ns, America/Los_Angeles], > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> 0 2018-03-14 
17:00:00-07:00 > >> >> >>>>>>>>>>>>>>>>>>>> 1 2018-03-14 17:00:00-07:00 > >> >> >>>>>>>>>>>>>>>>>>>> dtype: datetime64[ns, America/Los_Angeles]] > >> >> >>>>>>>>>>>>>>>>>>>> ``` > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> Based on output of existing a correct timestamp > >> >> >>>>>>> udf, it > >> >> >>>>>>>>> looks > >> >> >>>>>>>>>>>> like the > >> >> >>>>>>>>>>>>>>>>>>>> pyarrow Struct Array values are wrong and > >> >> >>>>> that's > >> >> >>>>>>> carried > >> >> >>>>>>>>>>>> through the > >> >> >>>>>>>>>>>>>>>>>>>> flattened arrays, causing the Pandas values to > >> >> >>>>>> have > >> >> >>>>>>> a > >> >> >>>>>>>>>>>> negative offset. > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> Here is output from a working udf with > >> >> >>>>> timestamp, > >> >> >>>>>>> the > >> >> >>>>>>>>> pyarrow > >> >> >>>>>>>>>>>> Array > >> >> >>>>>>>>>>>>>>>>>>>> displays in UTC time, I believe. > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> ``` > >> >> >>>>>>>>>>>>>>>>>>>> (Timestamp Array) > >> >> >>>>>>>>>>>>>>>>>>>> type timestamp[us, tz=America/Los_Angeles] > >> >> >>>>>>>>>>>>>>>>>>>> [ > >> >> >>>>>>>>>>>>>>>>>>>> [ > >> >> >>>>>>>>>>>>>>>>>>>> 1969-01-01 09:01:01.000000 > >> >> >>>>>>>>>>>>>>>>>>>> ] > >> >> >>>>>>>>>>>>>>>>>>>> ] > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> (Pandas Conversion) > >> >> >>>>>>>>>>>>>>>>>>>> 0 1969-01-01 01:01:01-08:00 > >> >> >>>>>>>>>>>>>>>>>>>> Name: _0, dtype: datetime64[ns, > >> >> >>>>>> America/Los_Angeles] > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> (Timezone Localized) > >> >> >>>>>>>>>>>>>>>>>>>> 0 1969-01-01 01:01:01 > >> >> >>>>>>>>>>>>>>>>>>>> Name: _0, dtype: datetime64[ns] > >> >> >>>>>>>>>>>>>>>>>>>> ``` > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> I'll have to dig in further at another time and > >> >> >>>>>>> debug > >> >> >>>>>>>>> where > >> >> >>>>>>>>>>>> the values go > >> >> >>>>>>>>>>>>>>>>>>>> wrong. > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>> On Sat, Jul 18, 2020 at 9:51 PM Micah > >> >> >>>>> Kornfield < > >> >> >>>>>>>>>>>> emkornfi...@gmail.com> > >> >> >>>>>>>>>>>>>>>>>>>> wrote: > >> >> >>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>> +1 (binding) > >> >> >>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>> Ran wheel and binary tests on ubuntu 19.04 > >> >> >>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>> On Fri, Jul 17, 2020 at 2:25 PM Neal > >> >> >>>>> Richardson < > >> >> >>>>>>>>>>>>>>>>>>>>> neal.p.richard...@gmail.com> > >> >> >>>>>>>>>>>>>>>>>>>>> wrote: > >> >> >>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> +1 (binding) > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> In addition to the usual verification on > >> >> >>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/arrow/pull/7787, > >> >> >>>>> I've > >> >> >>>>>>>>>>>> successfully staged the > >> >> >>>>>>>>>>>>>>>>>>>>> R > >> >> >>>>>>>>>>>>>>>>>>>>>> binary artifacts on Windows ( > >> >> >>>>>>>>>>>>>>>>>>>>>> https://github.com/r-windows/ > >> >> >>>>>>> rtools-packages/pull/126 > >> >> >>>>>>>>> ), > >> >> >>>>>>>>>>>> macOS ( > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>> https://github.com/autobrew/homebrew-core/pull/12 > >> >> >>>>>>> ), > >> >> >>>>>>>>> and > >> >> >>>>>>>>>>>> Linux ( > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>> https://github.com/ursa-labs/arrow-r-nightly/actions/runs/ > >> >> >>>>>>>>>>>> 172977277) > >> >> >>>>>>>>>>>>>>>>>>>>> using > >> >> >>>>>>>>>>>>>>>>>>>>>> the release candidate. 
> >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> And I agree with the judgment about skipping > >> >> >>>>> a > >> >> >>>>>> JS > >> >> >>>>>>>>> release > >> >> >>>>>>>>>>>> artifact. Looks > >> >> >>>>>>>>>>>>>>>>>>>>>> like there hasn't been a code change since > >> >> >>>>>>> October so > >> >> >>>>>>>>>>>> there's no point. > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> Neal > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> On Fri, Jul 17, 2020 at 10:37 AM Wes > >> >> >>>>> McKinney < > >> >> >>>>>>>>>>>> wesmck...@gmail.com> > >> >> >>>>>>>>>>>>>>>>>>>>> wrote: > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> I see the JS failures as well. I think it > >> >> >>>>> is a > >> >> >>>>>>>>> failure > >> >> >>>>>>>>>>>> localized to > >> >> >>>>>>>>>>>>>>>>>>>>>>> newer Node versions since our JavaScript CI > >> >> >>>>>> works > >> >> >>>>>>>>> fine. I > >> >> >>>>>>>>>>>> don't think > >> >> >>>>>>>>>>>>>>>>>>>>>>> it should block the release given the lack > >> >> >>>>> of > >> >> >>>>>>>>> development > >> >> >>>>>>>>>>>> activity in > >> >> >>>>>>>>>>>>>>>>>>>>>>> JavaScript [1] -- if any JS devs are > >> >> >>>>> concerned > >> >> >>>>>>> about > >> >> >>>>>>>>>>>> publishing an > >> >> >>>>>>>>>>>>>>>>>>>>>>> artifact then we can skip pushing it to NPM > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> @Ryan it seems it may be something > >> >> >>>>> environment > >> >> >>>>>>>>> related on > >> >> >>>>>>>>>>>> your > >> >> >>>>>>>>>>>>>>>>>>>>>>> machine, I'm on Ubuntu 18.04 and have not > >> >> >>>>> seen > >> >> >>>>>>> this. > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> On > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>> * Python 3.8 wheel's tests are failed. > >> >> >>>>> 3.5, > >> >> >>>>>> 3.6 > >> >> >>>>>>>>> and 3.7 > >> >> >>>>>>>>>>>>>>>>>>>>>>>> are passed. It seems that -larrow and > >> >> >>>>>>>>> -larrow_python > >> >> >>>>>>>>>>>> for > >> >> >>>>>>>>>>>>>>>>>>>>>>>> Cython are failed. > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> I suspect this is related to > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>> https://github.com/apache/arrow/commit/ > >> >> >>>>>>>>>>>> 120c21f4bf66d2901b3a353a1f67bac3c3355924#diff- > >> >> >>>>>>>>>>>> 0f69784b44040448d17d0e4e8a641fe8 > >> >> >>>>>>>>>>>>>>>>>>>>>>> , > >> >> >>>>>>>>>>>>>>>>>>>>>>> but I don't think it's a blocking issue > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> [1]: > >> >> >>>>>>>>> https://github.com/apache/arrow/commits/master/js > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> On Fri, Jul 17, 2020 at 9:42 AM Ryan Murray > >> >> >>>>> < > >> >> >>>>>>>>>>>> rym...@dremio.com> wrote: > >> >> >>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>> I've tested Java and it looks good. However > >> >> >>>>>> the > >> >> >>>>>>>>> verify > >> >> >>>>>>>>>>>> script keeps > >> >> >>>>>>>>>>>>>>>>>>>>> on > >> >> >>>>>>>>>>>>>>>>>>>>>>>> bailing with protobuf related errors: > >> >> >>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>> 'cpp/build/orc_ep-prefix/src/orc_ep-build/c++/src/orc_ > >> >> >>>>>>>>>>>> proto.pb.cc' > >> >> >>>>>>>>>>>>>>>>>>>>> and > >> >> >>>>>>>>>>>>>>>>>>>>>>>> friends cant find protobuf definitions. 
A > >> >> >>>>> bit > >> >> >>>>>>> odd as > >> >> >>>>>>>>>>>> cmake can see > >> >> >>>>>>>>>>>>>>>>>>>>>>> protobuf > >> >> >>>>>>>>>>>>>>>>>>>>>>>> headers and builds directly off master work > >> >> >>>>>> just > >> >> >>>>>>>>> fine. > >> >> >>>>>>>>>>>> Has anyone > >> >> >>>>>>>>>>>>>>>>>>>>> else > >> >> >>>>>>>>>>>>>>>>>>>>>>>> experienced this? I am on ubutnu 18.04 > >> >> >>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jul 17, 2020 at 10:49 AM Antoine > >> >> >>>>>> Pitrou > >> >> >>>>>>> < > >> >> >>>>>>>>>>>> anto...@python.org> > >> >> >>>>>>>>>>>>>>>>>>>>>>> wrote: > >> >> >>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> +1 (binding). I tested on Ubuntu 18.04. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> * Wheels verification went fine. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> * Source verification went fine with CUDA > >> >> >>>>>>> enabled > >> >> >>>>>>>>> and > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> TEST_INTEGRATION_JS=0 TEST_JS=0. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> I didn't test the binaries. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> Regards > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> Antoine. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> Le 17/07/2020 à 03:41, Krisztián Szűcs a > >> >> >>>>>> écrit > >> >> >>>>>>> : > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> Hi, > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> I would like to propose the second > >> >> >>>>> release > >> >> >>>>>>>>> candidate > >> >> >>>>>>>>>>>> (RC1) of > >> >> >>>>>>>>>>>>>>>>>>>>>> Apache > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> Arrow version 1.0.0. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> This is a major release consisting of 826 > >> >> >>>>>>>>> resolved JIRA > >> >> >>>>>>>>>>>>>>>>>>>>> issues[1]. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> The verification of the first release > >> >> >>>>>>> candidate > >> >> >>>>>>>>> (RC0) > >> >> >>>>>>>>>>>> has failed > >> >> >>>>>>>>>>>>>>>>>>>>>>> [0], and > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> the packaging scripts were unable to > >> >> >>>>> produce > >> >> >>>>>>> two > >> >> >>>>>>>>>>>> wheels. Compared > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> to RC0 this release candidate includes > >> >> >>>>>>> additional > >> >> >>>>>>>>>>>> patches for the > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> following bugs: ARROW-9506, ARROW-9504, > >> >> >>>>>>>>> ARROW-9497, > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> ARROW-9500, ARROW-9499. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> This release candidate is based on > >> >> >>>>> commit: > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> bc0649541859095ee77d03a7b891ea8d6e2fd641 > >> >> >>>>> [2] > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> The source release rc1 is hosted at [3]. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> The binary artifacts are hosted at > >> >> >>>>>>> [4][5][6][7]. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> The changelog is located at [8]. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> Please download, verify checksums and > >> >> >>>>>>> signatures, > >> >> >>>>>>>>> run > >> >> >>>>>>>>>>>> the unit > >> >> >>>>>>>>>>>>>>>>>>>>>> tests, > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> and vote on the release. See [9] for how > >> >> >>>>> to > >> >> >>>>>>>>> validate a > >> >> >>>>>>>>>>>> release > >> >> >>>>>>>>>>>>>>>>>>>>>>> candidate. 
> >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> The vote will be open for at least 72 > >> >> >>>>> hours. > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [ ] +1 Release this as Apache Arrow 1.0.0 > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [ ] +0 > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [ ] -1 Do not release this as Apache > >> >> >>>>> Arrow > >> >> >>>>>>> 1.0.0 > >> >> >>>>>>>>>>>> because... > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [0]: > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>> https://github.com/apache/arrow/pull/7778#issuecomment- > >> >> >>>>>>>>>>>> 659065370 > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [1]: > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/ > >> >> >>>>>>> jira/issues/?jql=project%20% > >> >> >>>>>>>>>>>> 3D%20ARROW%20AND%20status%20in%20%28Resolved%2C% > >> >> >>>>>>> 20Closed%29%20AND% > >> >> >>>>>>>>>>>> 20fixVersion%20%3D%201.0.0 > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [2]: > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>> https://github.com/apache/arrow/tree/ > >> >> >>>>>>>>>>>> bc0649541859095ee77d03a7b891ea8d6e2fd641 > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [3]: > >> >> >>>>>>>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/ > >> >> >>>>>>>>>>>> dist/dev/arrow/apache-arrow-1.0.0-rc1 > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [4]: https://bintray.com/apache/ > >> >> >>>>>>>>>>>> arrow/centos-rc/1.0.0-rc1 > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [5]: https://bintray.com/apache/ > >> >> >>>>>>>>>>>> arrow/debian-rc/1.0.0-rc1 > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [6]: https://bintray.com/apache/ > >> >> >>>>>>>>>>>> arrow/python-rc/1.0.0-rc1 > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [7]: https://bintray.com/apache/ > >> >> >>>>>>>>>>>> arrow/ubuntu-rc/1.0.0-rc1 > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [8]: > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>> https://github.com/apache/arrow/blob/ > >> >> >>>>>>>>>>>> bc0649541859095ee77d03a7b891ea8d6e2fd641/CHANGELOG.md > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> [9]: > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/ > >> >> >>>>>>> confluence/display/ARROW/How+ > >> >> >>>>>>>>>>>> to+Verify+Release+Candidates > >> >> >>>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>>>>>>>>>>> > >> >> >>>>>>>>>>>> > >> >> >>>>>>>>> > >> >> >>>>>>> > >> >> >>>>>> > >> >> >>>>> > >> >> >>>> >