Deprecated -- certainly, and sooner rather than later. I don't have a good sense of the overhead of continuing to support Python 2; is it large enough to consider dropping it entirely in Spark 3.0?
On Wed, May 29, 2019 at 11:47 PM Xiangrui Meng <men...@gmail.com> wrote:
>
> Hi all,
>
> I want to revive this old thread since no action was taken so far. If we plan
> to mark Python 2 as deprecated in Spark 3.0, we should do it as early as
> possible and let users know ahead. PySpark depends on Python, numpy, pandas,
> and pyarrow, all of which are sunsetting Python 2 support by 2020/01/01 per
> https://python3statement.org/. At that time we cannot really support Python 2
> because the dependent libraries do not plan to make new releases, even for
> security reasons. So I suggest the following:
>
> 1. Update the Spark website and state that Python 2 is deprecated in Spark 3.0
> and its support will be removed in a release after 2020/01/01.
> 2. Make a formal announcement to dev@ and users@.
> 3. Add the Apache Spark project to the https://python3statement.org/ timeline.
> 4. Update PySpark to check the Python version and print a deprecation warning
> if version < 3.
>
> Any thoughts and suggestions?
>
> Best,
> Xiangrui
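For item 4 above, a minimal sketch of such a version check might look like the following. The helper name and the exact warning text are hypothetical, not what PySpark actually shipped:

```python
import sys
import warnings

def warn_if_python2(version_info=sys.version_info):
    """Hypothetical helper illustrating step 4 of the proposal.

    Emits a DeprecationWarning when running under Python 2 and
    returns True if a warning was issued, False otherwise.
    """
    if version_info < (3,):
        warnings.warn(
            "Python 2 is deprecated in Spark 3.0; support will be removed "
            "in a release after 2020/01/01. Please use Python 3.",
            DeprecationWarning,
        )
        return True
    return False
```

A check like this could run once at `pyspark` import time, so every Python 2 user sees the warning well before support is actually removed.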