In order to facilitate community testing of Spark 1.6.0, I'm excited to
announce the availability of an early preview of the release. This is not a
release candidate, so there is no voting involved. However, it'd be awesome
if community members can start testing with this preview package and report
any problems they encounter.

This preview package contains all the commits to branch-1.6
<https://github.com/apache/spark/tree/branch-1.6> up to commit
308381420f51b6da1007ea09a02d740613a226e0
<https://github.com/apache/spark/tree/v1.6.0-preview2>.

The staging maven repository for this preview build can be found here:
https://repository.apache.org/content/repositories/orgapachespark-1162
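
For anyone building against the staged artifacts with sbt, a resolver entry
along these lines should be enough to point a build at the repository above.
This is only a sketch: the resolver name is made up, and the exact artifact
version published there is not stated in this email, so check the repository
listing before adding a dependency.

```scala
// build.sbt fragment (sketch). The resolver name is arbitrary; the URL is
// the staging repository listed above.
resolvers += "Spark 1.6.0 preview staging" at
  "https://repository.apache.org/content/repositories/orgapachespark-1162"

// The dependency line is intentionally left out: the version string of the
// staged artifacts is an unknown here and should be read off the repository.
```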

Binaries for this preview build can be found here:
http://people.apache.org/~pwendell/spark-releases/spark-v1.6.0-preview2-bin/

A build of the docs can also be found here:
http://people.apache.org/~pwendell/spark-releases/spark-v1.6.0-preview2-docs/

The full change log for this release can be found on JIRA
<https://issues.apache.org/jira/browse/SPARK-11908?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%201.6.0>.

*== How can you help? ==*

If you are a Spark user, you can help us test this release by taking a
Spark workload, running it on this preview release, and reporting any
regressions.

*== Major Features ==*

When testing, we'd appreciate it if users could focus on areas that have
changed in this release.  Some notable new features include:

SPARK-11787 <https://issues.apache.org/jira/browse/SPARK-11787> *Parquet
Performance* - Improve Parquet scan performance when using flat schemas.
SPARK-10810 <https://issues.apache.org/jira/browse/SPARK-10810> *Session
Management* - Multiple users of the Thrift (JDBC/ODBC) server now have
isolated sessions, including their own default database (i.e. USE mydb), even
on shared clusters.
SPARK-9999  <https://issues.apache.org/jira/browse/SPARK-9999> *Dataset API* -
A new, experimental type-safe API (similar to RDDs) that performs many
operations directly on serialized binary data and uses code generation
(i.e. Project Tungsten).
SPARK-10000 <https://issues.apache.org/jira/browse/SPARK-10000> *Unified
Memory Management* - Shared memory for execution and caching instead of
exclusive division of the regions.
SPARK-10978 <https://issues.apache.org/jira/browse/SPARK-10978> *Datasource
API Avoid Double Filter* - When implementing a datasource with filter
pushdown, developers can now tell Spark SQL to avoid double evaluating a
pushed-down filter.
SPARK-2629  <https://issues.apache.org/jira/browse/SPARK-2629> *New
improved state management* - trackStateByKey, a DStream transformation for
stateful stream processing that supersedes updateStateByKey in functionality
and performance.
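
As a starting point for exercising the Dataset API item above, here is a
minimal local smoke test. It is a sketch, not an official test: the object
name and app name are made up, and it assumes the preview exposes toDS()
through sqlContext.implicits._ as the docs build describes. Since the API is
experimental, method names may still change before the final release.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical smoke test; object and app names are illustrative only.
object DatasetSmokeTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("dataset-smoke-test")
      .setMaster("local[2]") // run locally with two threads
    val sc = new SparkContext(conf)
    try {
      val sqlContext = new SQLContext(sc)
      // Assumed to bring toDS() and the built-in encoders into scope.
      import sqlContext.implicits._

      // Build a typed Dataset[Int] and apply a typed transformation.
      val ds = Seq(1, 2, 3).toDS()
      val doubled = ds.map(_ * 2).collect()

      // Fail loudly if the result is not what plain Scala would produce.
      require(doubled.sameElements(Array(2, 4, 6)),
        s"unexpected result: ${doubled.mkString(",")}")
    } finally {
      sc.stop()
    }
  }
}
```

If this fails on the preview but the equivalent RDD code works, that is
exactly the kind of regression worth reporting.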

Happy testing!

Michael
