+1

Agree, dropping support for Java 7 is long overdue, and 2.0 would be a logical release to do this in.

Regards,
Mridul

On Thu, Mar 24, 2016 at 12:27 AM, Reynold Xin <r...@databricks.com> wrote:
> About a year ago we decided to drop Java 6 support in Spark 1.5. I am
> wondering if we should also just drop Java 7 support in Spark 2.0 (i.e.
> Spark 2.0 would require Java 8 to run).
>
> Oracle ended public updates for JDK 7 one year ago (Apr 2015), and
> removed public downloads for JDK 7 in July 2015. In the past I've actually
> been against dropping Java 7, but today I ran into an issue with the new
> Dataset API not working well with Java 8 lambdas, and that changed my
> opinion on this.
>
> I've been thinking more about this issue today and also talked with a lot
> of people offline to gather feedback, and I actually think the pros
> outweigh the cons, for the following reasons (in some rough order of
> importance):
>
> 1. It is complicated to test how well Spark APIs work for Java lambdas if
> we support Java 7. Jenkins machines need to have both Java 7 and Java 8
> installed, and we must run through a set of test suites in Java 7, and
> then the lambda tests in Java 8. This complicates build
> environments/scripts, and makes them less robust. Without good testing
> infrastructure, I have no confidence in building good APIs for Java 8.
>
> 2. Dataset/DataFrame performance will be between 1x and 10x slower in
> Java 7. The primary APIs we want users to use in Spark 2.x are
> Dataset/DataFrame, and this impacts pretty much everything from machine
> learning to structured streaming. We have made great progress in their
> performance through extensive use of code generation. (In many dimensions
> Spark 2.0 with DataFrames/Datasets looks more like a compiler than a
> MapReduce or query engine.) These optimizations don't work well in Java 7
> due to broken code cache flushing. This problem has been fixed by Oracle
> in Java 8. In addition, Java 8 comes with better support for Unsafe and
> SIMD.
>
> 3. Scala 2.12 will come out soon, and we will want to add support for
> that. Scala 2.12 only works on Java 8. If we do support Java 7, we'd have
> a fairly complicated compatibility matrix and testing infrastructure.
>
> 4. There are libraries that I've looked into in the past that support only
> Java 8. This is more common in high performance libraries such as Aeron (a
> messaging library). Having to support Java 7 means we are not able to use
> these. It is not that big of a deal right now, but will become
> increasingly difficult as we optimize performance.
>
>
> The downside of not supporting Java 7 is also obvious. Some organizations
> are stuck with Java 7, and they wouldn't be able to use Spark 2.0 without
> upgrading Java.