I totally agree with Sean, just a small correction:
Java 7 and Python 2.6 have already been deprecated since Spark 2.0 (after a
lengthy discussion), so there is no need to discuss whether they should
be deprecated in 2.1:
  http://spark.apache.org/releases/spark-release-2-0-0.html#deprecations
The discussion is whether Scala 2.10 should also be marked as deprecated
(no one is objecting to that) and, more importantly, when to actually move
from deprecation to dropping support for any combination of JDK /
Scala / Hadoop / Python.

Ofir Manor

Co-Founder & CTO | Equalum

Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io

On Fri, Oct 28, 2016 at 12:13 AM, Sean Owen <so...@cloudera.com> wrote:

> The burden may be a little more apparent when dealing with the day-to-day
> merging and fixing of breaks. The upside is maybe the more compelling
> argument though. For example, lambda-fying all the Java code, supporting
> java.time, and taking advantage of some newer Hadoop/YARN APIs is a
> moderate win for users too, and there's also a cost to not doing that.
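>
> For concreteness, a rough sketch (illustrative names, not code from the repo)
> of what lambda-fying one Java transformation looks like once Java 8 can be
> assumed:
>
>   import java.util.Arrays;
>   import org.apache.spark.SparkConf;
>   import org.apache.spark.api.java.JavaRDD;
>   import org.apache.spark.api.java.JavaSparkContext;
>   import org.apache.spark.api.java.function.Function;
>
>   public class LambdaExample {
>     public static void main(String[] args) {
>       SparkConf conf = new SparkConf().setAppName("lambda-example").setMaster("local[*]");
>       JavaSparkContext sc = new JavaSparkContext(conf);
>       JavaRDD<Integer> numbers = sc.parallelize(Arrays.asList(1, 2, 3, 4));
>
>       // Java 7 style: anonymous inner class implementing Spark's Function interface
>       JavaRDD<Integer> doubledOld = numbers.map(new Function<Integer, Integer>() {
>         @Override
>         public Integer call(Integer x) { return x * 2; }
>       });
>
>       // Java 8 style: the same transformation as a lambda
>       JavaRDD<Integer> doubledNew = numbers.map(x -> x * 2);
>
>       System.out.println(doubledNew.collect());
>       sc.stop();
>     }
>   }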
>
> I must say I don't see a risk of fragmentation as nearly the problem it's
> made out to be here. We are, after all, here discussing _beginning_ to
> remove support _in 6 months_, for long since non-current versions of
> things. An org's decision to not, say, use Java 8 is a decision to not use
> the new version of lots of things. It's not clear this is a constituency
> that is either large or one to reasonably serve indefinitely.
>
> In the end, the Scala issue may be decisive. Supporting 2.10 - 2.12
> simultaneously is a bridge too far, and if 2.12 requires Java 8, that's a
> good reason for Spark to require Java 8. And Steve suggests that means a
> minimum of Hadoop 2.6 too. (I still profess ignorance of the Python part of
> the issue.)
>
> Put another way, I am not sure what the criteria are, if not the above.
>
> I support deprecating all of these things, at the least, in 2.1.0.
> Although it's a separate question, I believe it's going to be necessary to
> remove support in ~6 months in 2.2.0.
>
>
> On Thu, Oct 27, 2016 at 4:36 PM Matei Zaharia <matei.zaha...@gmail.com>
> wrote:
>
>> Just to comment on this, I'm generally against removing these types of
>> things unless they create a substantial burden on project contributors. It
>> doesn't sound like Python 2.6 and Java 7 do that yet -- Scala 2.10 might,
>> but then of course we need to wait for 2.12 to be out and stable.
>>
>> In general, this type of stuff only hurts users, and doesn't have a huge
>> impact on Spark contributors' productivity (sure, it's a bit unpleasant,
>> but that's life). If we break compatibility this way too quickly, we
>> fragment the user community, and then either people have a crappy
>> experience with Spark because their corporate IT doesn't yet have an
>> environment that can run the latest version, or worse, they create more
>> maintenance burden for us because they ask for more patches to be
>> backported to old Spark versions (1.6.x, 2.0.x, etc). Python in particular
>> is pretty fundamental to many Linux distros.
>>
>> In the future, rather than just looking at when some software came out,
>> it may be good to have some criteria for when to drop support for
>> something. For example, if there are really nice libraries in Python 2.7 or
>> Java 8 that we're missing out on, that may be a good reason. The
>> maintenance burden for multiple Scala versions is definitely painful but I
>> also think we should always support the latest two Scala releases.
>>
>> Matei
>>
>> On Oct 27, 2016, at 12:15 PM, Reynold Xin <r...@databricks.com> wrote:
>>
>> I created a JIRA ticket to track this:
>> https://issues.apache.org/jira/browse/SPARK-18138
>>
>>
>>
>> On Thu, Oct 27, 2016 at 10:19 AM, Steve Loughran <ste...@hortonworks.com>
>> wrote:
>>
>>
>> On 27 Oct 2016, at 10:03, Sean Owen <so...@cloudera.com> wrote:
>>
>> Seems OK by me.
>> How about Hadoop < 2.6 and Python 2.6? Those seem more removable. I'd like
>> to add those to the list of things that will begin to be unsupported 6 months
>> from now.
>>
>>
>> If you go to Java 8 only, then Hadoop 2.6+ is mandatory.
>>
>>
>> On Wed, Oct 26, 2016 at 8:49 PM Koert Kuipers <ko...@tresata.com> wrote:
>>
>> that sounds good to me
>>
>> On Wed, Oct 26, 2016 at 2:26 PM, Reynold Xin <r...@databricks.com> wrote:
>>
>> We can do the following concrete proposal:
>>
>> 1. Plan to remove support for Java 7 / Scala 2.10 in Spark 2.2.0 (Mar/Apr
>> 2017).
>>
>> 2. In the Spark 2.1.0 release, aggressively and explicitly announce the
>> deprecation of Java 7 / Scala 2.10 support.
>>
>> (a) It should appear in the release notes and in any documentation that
>> mentions how to build Spark,
>>
>> (b) and a warning should be shown every time SparkContext is started
>> using Scala 2.10 or Java 7.
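>>
>> As a rough illustration only (not actual Spark code, just the shape of such a
>> check; the exact wording and placement would be settled in the PR):
>>
>>   // Hypothetical sketch of a startup check that prints a deprecation warning
>>   // when running on Java 7; a similar check could cover Scala 2.10.
>>   public final class DeprecationCheck {
>>     public static void warnIfDeprecatedJava() {
>>       String javaSpecVersion = System.getProperty("java.specification.version");
>>       if ("1.7".equals(javaSpecVersion)) {
>>         System.err.println("WARN: Support for Java 7 is deprecated as of Spark "
>>             + "2.1.0 and will be removed in a future release.");
>>       }
>>     }
>>   }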
>>
