Bryan, was there an actual change in PyArrow that dropped Python 3.4 support? If not, I think we may be able to raise the minimal Arrow version separately. If there was, upgrading Jenkins's Python from 3.4 to 3.5 looks inevitable.
On Fri, Mar 29, 2019 at 1:39 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:

> That’s not necessarily bad. I don’t know if we have plan to ever release
> any new 2.2.x, 2.3.x at this point and we can message this “supported
> version” of python change for any new 2.4 release.
>
> Besides we could still support python 3.4 - it’s just more complicated to
> test manually without Jenkins coverage.
>
>
> ------------------------------
> *From:* shane knapp <skn...@berkeley.edu>
> *Sent:* Tuesday, March 26, 2019 12:11 PM
> *To:* Bryan Cutler
> *Cc:* dev
> *Subject:* Re: Upgrading minimal PyArrow version to 0.12.x [SPARK-27276]
>
> i'm pretty certain that i've got a solid python 3.5 conda environment
> ready to be deployed, but this isn't a minor change to the build system and
> there might be some bugs to iron out.
>
> another problem is that the current python 3.4 environment is hard-coded
> into both the build scripts on jenkins (all over the place) and in the
> codebase (thankfully in only one spot):
> export PATH=/home/anaconda/envs/py3k/bin:$PATH
>
> this means that every branch (master, 2.x, etc) will test against whatever
> version of python lives in that conda environment. if we upgrade to 3.5,
> all branches will test against this version. changing the build and test
> infra to support testing against 2.7, 3.4 or 3.5 based on branch is
> definitely non-trivial...
>
> thoughts?
>
>
> On Tue, Mar 26, 2019 at 11:39 AM Bryan Cutler <cutl...@gmail.com> wrote:
>
>> Thanks Hyukjin. The plan is to get this done for 3.0 only. Here is a
>> link to the JIRA https://issues.apache.org/jira/browse/SPARK-27276.
>> Shane is also correct in that newer versions of pyarrow have stopped
>> support for Python 3.4, so we should probably have Jenkins test against 2.7
>> and 3.5.
>>
>> On Mon, Mar 25, 2019 at 9:44 PM Reynold Xin <r...@databricks.com> wrote:
>>
>>> +1 on doing this in 3.0.
>>>
>>> On Mon, Mar 25, 2019 at 9:31 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>>>
>>>> I’m +1 if 3.0
>>>>
>>>>
>>>> ------------------------------
>>>> *From:* Sean Owen <sro...@gmail.com>
>>>> *Sent:* Monday, March 25, 2019 6:48 PM
>>>> *To:* Hyukjin Kwon
>>>> *Cc:* dev; Bryan Cutler; Takuya UESHIN; shane knapp
>>>> *Subject:* Re: Upgrading minimal PyArrow version to 0.12.x [SPARK-27276]
>>>>
>>>> I don't know a lot about Arrow here, but seems reasonable. Is this for
>>>> Spark 3.0 or for 2.x? Certainly, requiring the latest for Spark 3
>>>> seems right.
>>>>
>>>> On Mon, Mar 25, 2019 at 8:17 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>>> >
>>>> > Hi all,
>>>> >
>>>> > We really need to upgrade the minimal version soon. It's actually
>>>> > slowing down the PySpark dev, for instance, by the overhead that sometimes
>>>> > we need currently to test all multiple matrix of Arrow and Pandas. Also, it
>>>> > currently requires to add some weird hacks or ugly codes. Some bugs exist
>>>> > in lower versions, and some features are not supported in low PyArrow, for
>>>> > instance.
>>>> >
>>>> > Per (Apache Arrow + Spark committer FWIW) Bryan's recommendation
>>>> > and my opinion as well, we should better increase the minimal version to
>>>> > 0.12.x. (Also, note that Pandas <> Arrow is an experimental feature).
>>>> >
>>>> > So, I and Bryan will proceed this roughly in few days if there isn't
>>>> > objections assuming we're fine with increasing it to 0.12.x. Please let me
>>>> > know if there are some concerns.
>>>> >
>>>> > For clarification, this requires some jobs in Jenkins to upgrade the
>>>> > minimal version of PyArrow (I cc'ed Shane as well).
>>>> >
>>>> > PS: I roughly heard that Shane's busy for some work stuff .. but it's
>>>> > kind of important in my perspective.
>>>> >
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>
>>>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu