This is a continuation of the previous thread, `Apache Spark 3.2 Expectation`, in order to give you updates.
- https://lists.apache.org/thread.html/r61897da071729913bf586ddd769311ce8b5b068e7156c352b51f7a33%40%3Cdev.spark.apache.org%3E

First of all, the AS-IS schedule is here.
- https://spark.apache.org/versioning-policy.html

  July 1st   Code freeze. Release branch cut.
  Mid July   QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged.
  August     Release candidates (RC), voting, etc. until final release passes.

Second, Gengliang Wang volunteered as release manager and has started working on the release. Thank you! He shared the ongoing issues, and I want to piggyback the following items onto his list.

# Languages

- Scala 2.13 Support: Although SPARK-25075 is almost done and we have a Scala 2.13 Jenkins job on the master branch, we do not support Scala 2.13.6. We should document this if Scala 2.13.7 does not arrive in time. Please see https://github.com/scala/scala/pull/9641 (Milestone Scala 2.13.7).
- SparkR CRAN publishing: Apache SparkR 3.1.2 is in CRAN as of today, but we get policy violation warnings for the cache directory. The fix deadline is 2021-06-28. If SparkR is removed from CRAN again, we need to retry with Apache Spark 3.2.0 after making a fix. https://cran.r-project.org/web/packages/SparkR/index.html

# Dependencies

- Apache Hadoop 3.3.1 becomes the default Hadoop profile for Apache Spark 3.2 via SPARK-29250 today. We are observing big improvements in S3 use cases. Please try it and share your experience.
- Apache Hive 2.3.9 becomes the built-in Hive library, with more recent HMS compatibility fixes. We need to re-evaluate the previous HMS incompatibility reports.
- K8s 1.21 is released May 12th. K8s Client 5.4.1 supports it in Apache Spark 3.2. In addition, public cloud vendors have started to support K8s 1.20. Please note that the move from K8s Client 4.x to 5.x is a breaking K8s API change.
- SPARK-33913 upgraded the Apache Kafka Client dependency to 2.8.0, and the Kafka community is considering deprecating Scala 2.12 support in Apache Kafka 3.0.
- SPARK-34542 upgraded the Apache Parquet dependency to 1.12.0. However, we need SPARK-34859 to fix a column index issue before the release. In addition, Apache Parquet encryption is added as a developer API. A custom KMS client should be implemented.
- SPARK-35489 upgraded the Apache ORC dependency to 1.6.8. We still need ORC-804 for a better masking feature.
- SPARK-34651 improved ZStandard support with ZStandard 1.4.9, and we are currently evaluating the newly arrived ZStandard 1.5.0. JDK11 performance is currently under investigation. In addition, SPARK-35181 (Use zstd for spark.io.compression.codec by default) is still in progress separately.

# Newly arrived items

- SPARK-35779 Dynamic filtering for Data Source V2
- SPARK-35781 Support Spark on Apple Silicon on macOS natively