Thanks Dongjoon. That makes much more sense now!

On Fri, Jul 3, 2020 at 12:11 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Thank you, Hyukjin.
>
> According to the Python community, Python 3.5 also reaches EOL on 2020-09-13
> (only two months left).
>
> - https://www.python.org/downloads/
>
> So, targeting only live Python versions in Apache Spark 3.1.0 (December 2020)
> looks reasonable to me.
>
> For old Python versions, Apache Spark 2.4 LTS is still available, and
> Apache Spark 3.0.x will also work.
>
> Bests,
> Dongjoon.
>
>
> On Wed, Jul 1, 2020 at 10:50 PM Yuanjian Li <xyliyuanj...@gmail.com>
> wrote:
>
>> +1, especially Python 2
>>
>> On Thu, Jul 2, 2020 at 10:20 AM Holden Karau <hol...@pigscanfly.ca> wrote:
>>
>>> I’m ok with us dropping Python 2, 3.4, and 3.5 from Spark 3.1 onward. It
>>> will be exciting to get to use more recent Python features. The most recent
>>> Ubuntu LTS ships with 3.7, and while the previous LTS ships with 3.5, if
>>> folks really can’t upgrade there’s conda.
>>>
>>> Is there anyone with a large Python 3.5 fleet who can’t use conda?
>>>
>>> On Wed, Jul 1, 2020 at 7:15 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>>
>>>> Yeah, sure. They will be dropped from Spark 3.1 onwards. I don't think we
>>>> should make such changes in maintenance releases.
>>>>
>>>> On Thu, Jul 2, 2020 at 11:13 AM Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>
>>>>> To be clear, the plan is to drop them from Spark 3.1 onwards, yes?
>>>>>
>>>>> On Wed, Jul 1, 2020 at 7:11 PM Hyukjin Kwon <gurwls...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I would like to discuss dropping the deprecated Python versions 2, 3.4,
>>>>>> and 3.5; see https://github.com/apache/spark/pull/28957. I assume
>>>>>> people support it in general, but I am writing this to make sure
>>>>>> everybody is happy.
>>>>>>
>>>>>> Fokko did a very good investigation of it, see
>>>>>> https://github.com/apache/spark/pull/28957#issuecomment-652022449.
>>>>>> Judging from the statistics, I think we're pretty safe to drop them.
>>>>>> Also note that dropping Python 2 was already declared at
>>>>>> https://python3statement.org/
>>>>>>
>>>>>> Roughly speaking, there are several main advantages to dropping them:
>>>>>>   1. It removes a bunch of hacks we added, around 700 lines in PySpark.
>>>>>>   2. PyPy2 has a critical bug that causes a flaky test,
>>>>>> https://issues.apache.org/jira/browse/SPARK-28358, based on my testing
>>>>>> and investigation.
>>>>>>   3. Users can use Python type hints with Pandas UDFs without
>>>>>> thinking about the Python version.
>>>>>>   4. Users can leverage a single, latest cloudpickle,
>>>>>> https://github.com/apache/spark/pull/28950. With Python 3.8+ it can
>>>>>> also leverage C pickle.
>>>>>>   5. ...
>>>>>>
>>>>>> So it benefits both users and devs. WDYT guys?
>>>>>>
>>>>>>
>>>>>> --
>>>>> Twitter: https://twitter.com/holdenkarau
>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>> https://amzn.to/2MaRAG9
>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>
>>
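
For reference, the version cut discussed in this thread (dropping Python 2, 3.4, and 3.5 from Spark 3.1 onwards) could be sketched as a simple interpreter check. This is only an illustration, not the code from the pull request: the names MIN_SUPPORTED, is_supported, and require_supported are hypothetical, and the 3.6 floor is an assumption inferred from the list of dropped versions rather than something stated in the thread.

```python
import sys

# Assumption: dropping Python 2, 3.4 and 3.5 leaves 3.6 as the oldest
# supported version; the thread does not state the minimum explicitly.
MIN_SUPPORTED = (3, 6)


def is_supported(version=None):
    """Return True if a (major, minor, ...) version is at least MIN_SUPPORTED."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); tuple comparison is lexicographic.
    return tuple(version[:2]) >= MIN_SUPPORTED


def require_supported(version=None):
    """Raise RuntimeError for interpreters older than MIN_SUPPORTED."""
    if version is None:
        version = sys.version_info
    if not is_supported(version):
        found = ".".join(str(part) for part in version[:2])
        wanted = ".".join(str(part) for part in MIN_SUPPORTED)
        raise RuntimeError(
            "Python %s is not supported; use Python %s or newer" % (found, wanted)
        )
```

Calling require_supported() with no argument checks the running interpreter; passing a tuple such as (3, 5) makes the check easy to exercise in tests.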
