This is a continuation of the previous thread, `Apache Spark 3.2 Expectation`, 
in order to give you updates.

- https://lists.apache.org/thread.html/r61897da071729913bf586ddd769311ce8b5b068e7156c352b51f7a33%40%3Cdev.spark.apache.org%3E

First of all, the current (as-is) schedule is here:

- https://spark.apache.org/versioning-policy.html

  July 1st   Code freeze. Release branch cut.
  Mid July   QA period. Focus on bug fixes, tests, stability and docs.
             Generally, no new features merged.
  August     Release candidates (RC), voting, etc. until final release passes

Second, Gengliang Wang volunteered as the release manager and has already 
started working on the release. Thank you! He shared the ongoing issues, and I 
want to piggyback the following items onto his list.


# Languages

- Scala 2.13 Support: Although SPARK-25075 is almost done and we have a Scala 
2.13 Jenkins job on the master branch, we do not support Scala 2.13.6. We 
should document this if Scala 2.13.7 does not arrive in time.
  Please see https://github.com/scala/scala/pull/9641 (Milestone Scala 2.13.7).

- SparkR CRAN publishing: Apache SparkR 3.1.2 is in CRAN as of today, but we 
get policy violation warnings for the cache directory. The fix deadline is 
2021-06-28. If SparkR is removed from CRAN again, we need to resubmit it with 
Apache Spark 3.2.0 after fixing the issue.
  https://cran.r-project.org/web/packages/SparkR/index.html


# Dependencies

- Apache Hadoop 3.3.1 became the default Hadoop profile for Apache Spark 3.2 
via SPARK-29250 today. We are observing big improvements in S3 use cases. 
Please try it and share your experience.

- Apache Hive 2.3.9 recently became the built-in Hive library, with more HMS 
compatibility fixes. We need to re-evaluate the previous HMS incompatibility 
reports.

- K8s 1.21 was released May 12th. K8s Client 5.4.1 supports it in Apache Spark 
3.2. In addition, public cloud vendors are starting to support K8s 1.20. Please 
note that the upgrade from K8s Client 4.x to 5.x is a breaking K8s API change.

- SPARK-33913 upgraded the Apache Kafka Client dependency to 2.8.0, and the 
Kafka community is considering deprecating Scala 2.12 support in Apache Kafka 
3.0.

- SPARK-34542 upgraded the Apache Parquet dependency to 1.12.0. However, we 
need SPARK-34859 to fix the column index issue before the release. In addition, 
Apache Parquet encryption is added as a developer API. A custom KMS client 
should be implemented to use it.

- SPARK-35489 upgraded the Apache ORC dependency to 1.6.8. We still need 
ORC-804 for a better masking feature.

- SPARK-34651 improved ZStandard support with ZStandard 1.4.9, and we are 
currently evaluating the newly arrived ZStandard 1.5.0. JDK11 performance is 
still under investigation. In addition, SPARK-35181 (Use zstd for 
spark.io.compression.codec by default) is still in progress separately.
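
For anyone who wants to try zstd before SPARK-35181 lands, here is a minimal 
sketch that builds a spark-submit command with the codec set explicitly. The 
helper name `spark_submit_args` and the application file `my_job.py` are 
made up for illustration; only the `spark.io.compression.codec` key and its 
`zstd` value come from Spark itself.

```python
import shlex

# Explicit opt-in to zstd; "zstd" is an accepted short name for this setting.
ZSTD_OVERRIDES = {
    "spark.io.compression.codec": "zstd",
}


def spark_submit_args(app, overrides):
    """Return a spark-submit argument list with one --conf per override.

    Hypothetical helper for illustration; it only assembles strings and
    does not launch anything.
    """
    args = ["spark-submit"]
    for key, value in sorted(overrides.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # my_job.py is a placeholder application file.
    print(shlex.join(spark_submit_args("my_job.py", ZSTD_OVERRIDES)))
```

The same key can also go into spark-defaults.conf if you prefer to test it 
cluster-wide rather than per job.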


# Newly arrived items

- SPARK-35779 Dynamic filtering for Data Source V2

- SPARK-35781 Support Spark on Apple Silicon on macOS natively
