+1, yes Java 7 has been end of life for a year now, 2.0 is a good time to
upgrade to Java 8

On Thu, Mar 24, 2016 at 12:42 AM, Raymond Honderdors <
raymond.honderd...@sizmek.com> wrote:

> Very good points
>
>
>
> Going to support java 8 looks like a good direction
>
> 2.0 would be a good release to start with that
>
>
>
> *Raymond Honderdors *
>
> *Team Lead Analytics BI*
>
> *Business Intelligence Developer *
>
> *raymond.honderd...@sizmek.com <raymond.honderd...@sizmek.com> *
>
> *T +972.7325.3569*
>
> *Herzliya*
>
>
>
> *From:* Reynold Xin [mailto:r...@databricks.com]
> *Sent:* Thursday, March 24, 2016 9:37 AM
> *To:* dev@spark.apache.org
> *Subject:* Re: [discuss] ending support for Java 7 in Spark 2.0
>
>
>
> One other benefit that I didn't mention is that we'd be able to use Java
> 8's Optional class to replace our built-in Optional.
>
>
>
>
>
> On Thu, Mar 24, 2016 at 12:27 AM, Reynold Xin <r...@databricks.com> wrote:
>
> About a year ago we decided to drop Java 6 support in Spark 1.5. I am
> wondering if we should also just drop Java 7 support in Spark 2.0 (i.e.
> Spark 2.0 would require Java 8 to run).
>
>
>
> Oracle ended public updates for JDK 7 in one year ago (Apr 2015), and
> removed public downloads for JDK 7 in July 2015. In the past I've actually
> been against dropping Java 8, but today I ran into an issue with the new
> Dataset API not working well with Java 8 lambdas, and that changed my
> opinion on this.
>
>
>
> I've been thinking more about this issue today and also talked with a lot
> people offline to gather feedback, and I actually think the pros outweighs
> the cons, for the following reasons (in some rough order of importance):
>
>
>
> 1. It is complicated to test how well Spark APIs work for Java lambdas if
> we support Java 7. Jenkins machines need to have both Java 7 and Java 8
> installed and we must run through a set of test suites in 7, and then the
> lambda tests in Java 8. This complicates build environments/scripts, and
> makes them less robust. Without good testing infrastructure, I have no
> confidence in building good APIs for Java 8.
>
>
>
> 2. Dataset/DataFrame performance will be between 1x to 10x slower in Java
> 7. The primary APIs we want users to use in Spark 2.x are
> Dataset/DataFrame, and this impacts pretty much everything from machine
> learning to structured streaming. We have made great progress in their
> performance through extensive use of code generation. (In many dimensions
> Spark 2.0 with DataFrames/Datasets looks more like a compiler than a
> MapReduce or query engine.) These optimizations don't work well in Java 7
> due to broken code cache flushing. This problem has been fixed by Oracle in
> Java 8. In addition, Java 8 comes with better support for Unsafe and SIMD.
>
>
>
> 3. Scala 2.12 will come out soon, and we will want to add support for
> that. Scala 2.12 only works on Java 8. If we do support Java 7, we'd have a
> fairly complicated compatibility matrix and testing infrastructure.
>
>
>
> 4. There are libraries that I've looked into in the past that support only
> Java 8. This is more common in high performance libraries such as Aeron (a
> messaging library). Having to support Java 7 means we are not able to use
> these. It is not that big of a deal right now, but will become increasingly
> more difficult as we optimize performance.
>
>
>
>
>
> The downside of not supporting Java 7 is also obvious. Some organizations
> are stuck with Java 7, and they wouldn't be able to use Spark 2.0 without
> upgrading Java.
>
>
>
>
>
>
>



-- 
Ram Sriharsha
Architect, Spark and Data Science
Hortonworks, 2550 Great America Way, 2nd Floor
Santa Clara, CA 95054
Ph: 408-510-8635
email: har...@apache.org

[image: https://www.linkedin.com/in/harsha340]
<https://www.linkedin.com/in/harsha340> <https://twitter.com/halfabrane>
<https://github.com/harsha2010/>

Reply via email to