It’s time for our quarterly ASF board report on Apache Spark this Wednesday. Here’s a draft; feel free to suggest changes.
====================

Description:

Apache Spark is a fast and general purpose engine for large-scale data processing. It offers high-level APIs in Java, Scala, Python, R and SQL as well as a rich set of libraries including stream processing, machine learning, and graph analytics.

Issues for the board:

- None

Project status:

- We made two patch releases: Spark 3.5.1 on February 28, 2024, and Spark 3.4.3 on April 18, 2024.
- The votes on "SPIP: Structured Logging Framework for Apache Spark" and "Pure Python Package in PyPI (Spark Connect)" have passed.
- The votes for two behavior changes have passed: "SPARK-44444: Use ANSI SQL mode by default" and "SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false".
- The community decided that the upcoming Spark 4.0 release will drop support for Python 3.8.
- We started a discussion on how to define behavior changes, which is critical for version upgrades and user experience.
- We've opened a dedicated repository for the Spark Kubernetes Operator at https://github.com/apache/spark-kubernetes-operator. Based on a vote result, we added a new version in the Apache Spark JIRA for versioning of the Spark operator.

Trademarks:

- No changes since the last report.

Latest releases:

- Spark 3.4.3 was released on April 18, 2024
- Spark 3.5.1 was released on February 28, 2024
- Spark 3.3.4 was released on December 16, 2023

Committers and PMC:

- The latest committer was added on Oct 2nd, 2023 (Jiaan Geng).
- The latest PMC members were added on Oct 2nd, 2023 (Yuanjian Li and Yikun Jiang).

====================