Reynold,

Considering the performance improvements you mentioned in your original e-mail, and also considering that a few other big data projects have already abandoned JDK 7 or are in the process of doing so, I think it would benefit Spark if we go with JDK 8 only.
Are there users that will be less aggressive? Yes, but those would most likely stay on more stable releases like 1.6.x.

On Sun, Apr 3, 2016 at 10:28 PM, Reynold Xin <r...@databricks.com> wrote:

> Since my original email, I've talked to a lot more users and looked at
> what various environments support. It is true that a lot of enterprises,
> and even some technology companies, are still using Java 7. One thing is
> that up until this date, users still can't install OpenJDK 8 on Ubuntu by
> default. I see that as an indication that it is too early to drop Java 7.
>
> Looking at the timeline, the JDK releases a major new version roughly every
> 3 years. We dropped Java 6 support one year ago, so from a timeline point
> of view we would be very aggressive here if we were to drop Java 7 support
> in Spark 2.0.
>
> Note that not dropping Java 7 support now doesn't mean we have to support
> Java 7 throughout Spark 2.x. We dropped Java 6 support in Spark 1.5, even
> though Spark 1.0 started with Java 6.
>
> In terms of testing, Josh has actually improved our test infra so now we
> would run the Java 8 tests: https://github.com/apache/spark/pull/12073
>
> On Thu, Mar 24, 2016 at 8:51 PM, Liwei Lin <lwl...@gmail.com> wrote:
>
>> The arguments are really convincing; the new Dataset API as well as the
>> performance improvements are exciting, so I'm personally +1 on moving
>> onto Java 8.
>>
>> However, I'm afraid Tencent is one of "the organizations stuck with
>> Java 7" -- our IT Infra division wouldn't upgrade to Java 7 until Java 8
>> was out, and wouldn't upgrade to Java 8 until Java 9 is out.
>>
>> So:
>> (non-binding) +1 on dropping Scala 2.10 support
>> (non-binding) -1 on dropping Java 7 support
>>   * as long as we figure out a practical way to run Spark with
>>     JDK 8 on JDK 7 clusters, this -1 would then definitely be +1
>>
>> Thanks!
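[Editor's note: the Java 8 attraction discussed throughout this thread is easiest to see in code. The sketch below is not Spark's API -- it uses only the plain JDK's `java.util.function` and streams, and the class and method names are invented for illustration -- but it shows the boilerplate gap between the anonymous-class style that Java 7 forces on callers and the lambda style Java 8 enables.]

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class LambdaDemo {
    // Java 7 style: an anonymous inner class. Pre-lambda Java APIs
    // (including Spark's Java API of this era) require this shape.
    // Note: java.util.function.Function itself is a Java 8 interface;
    // it is used here only to keep both variants comparable.
    static final Function<Integer, Integer> SQUARE_JAVA7_STYLE =
        new Function<Integer, Integer>() {
            @Override
            public Integer apply(Integer x) {
                return x * x;
            }
        };

    // Java 8 style: the same function as a one-line lambda.
    static final Function<Integer, Integer> SQUARE_JAVA8_STYLE = x -> x * x;

    // Apply the lambda over a collection with the Java 8 Stream API.
    static List<Integer> squares(List<Integer> nums) {
        return nums.stream()
                   .map(SQUARE_JAVA8_STYLE)
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(squares(Arrays.asList(1, 2, 3))); // prints [1, 4, 9]
    }
}
```

Both variants are behaviorally identical; the difference is purely the syntax available to user code, which is why a Java-8-only Spark can offer a leaner API surface.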
>> On Fri, Mar 25, 2016 at 10:28 AM, Koert Kuipers <ko...@tresata.com> wrote:
>>
>>> I think that logic is reasonable, but then the same should also apply
>>> to Scala 2.10, which is also unmaintained/unsupported at this point
>>> (basically has been since March 2015, except for one hotfix due to a
>>> license incompatibility).
>>>
>>> Who wants to support Scala 2.10 three years after they did the last
>>> maintenance release?
>>>
>>> On Thu, Mar 24, 2016 at 9:59 PM, Mridul Muralidharan <mri...@gmail.com> wrote:
>>>
>>>> Removing compatibility (with the JDK, etc.) can be done with a major
>>>> release. Given that 7 was EOLed a while back and is now unsupported,
>>>> we have to decide if we drop support for it in 2.0 or in 3.0 (2+ years
>>>> from now).
>>>>
>>>> Given the functionality and performance benefits of going to JDK 8,
>>>> future enhancements relevant in the 2.x timeframe (Scala, dependencies)
>>>> which require it, and simplicity w.r.t. code, testing, and support, it
>>>> looks like a good checkpoint to drop JDK 7 support.
>>>>
>>>> As already mentioned in the thread, existing YARN clusters are
>>>> unaffected if they want to continue running JDK 7 and yet use Spark 2
>>>> (install JDK 8 on all nodes and use it via JAVA_HOME, or in the worst
>>>> case distribute JDK 8 as an archive -- suboptimal).
>>>> I am unsure about Mesos (standalone might be an easier upgrade, I guess?).
>>>>
>>>> The proposal is for the 1.6.x line to continue to be supported with
>>>> critical fixes; newer features will require 2.x and hence JDK 8.
>>>>
>>>> Regards,
>>>> Mridul
>>>>
>>>> On Thursday, March 24, 2016, Marcelo Vanzin <van...@cloudera.com> wrote:
>>>>
>>>>> On Thu, Mar 24, 2016 at 4:50 PM, Reynold Xin <r...@databricks.com> wrote:
>>>>> > If you want to go down that route, you should also ask somebody who
>>>>> > has had experience managing a large organization's applications and
>>>>> > has tried to update the Scala version.
>>>>>
>>>>> I understand both sides.
>>>>> But if you look at what I've been asking since the beginning, it's
>>>>> all about the costs and benefits of dropping support for Java 1.7.
>>>>>
>>>>> The biggest argument in your original e-mail is about testing. And the
>>>>> testing cost is much bigger for supporting Scala 2.10 than it is for
>>>>> supporting Java 1.7. If you read one of my earlier replies, it should
>>>>> even be possible to do everything in a single job -- compile for
>>>>> Java 7 and still be able to test things on 1.8, including lambdas,
>>>>> which seem to be the main thing you were worried about.
>>>>>
>>>>> > On Thu, Mar 24, 2016 at 4:48 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>>>> >>
>>>>> >> On Thu, Mar 24, 2016 at 4:46 PM, Reynold Xin <r...@databricks.com> wrote:
>>>>> >> > Actually it's *way* harder to upgrade Scala from 2.10 to 2.11
>>>>> >> > than to upgrade the JVM runtime from 7 to 8, because Scala 2.10
>>>>> >> > and 2.11 are not binary compatible, whereas JVM 7 and 8 are
>>>>> >> > binary compatible except in certain esoteric cases.
>>>>> >>
>>>>> >> True, but ask anyone who manages a large cluster how long it would
>>>>> >> take them to upgrade the JDK across their cluster and validate all
>>>>> >> their applications and everything... binary compatibility is a tiny
>>>>> >> drop in that bucket.
>>>>> >>
>>>>> >> --
>>>>> >> Marcelo
>>>>>
>>>>> --
>>>>> Marcelo

--
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/