Hi everyone,

Here's my draft report for July. Feel free to comment and suggest updates that I've missed. Thanks!
rb

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (2 months ago)
There are currently 9 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is 1:1.

Community changes, past quarter:
- No new PMC members (project graduated recently).
- No new committers were added.

## Project Activity:
In July, the community held one sync meeting to discuss general topics, and one specifically to discuss how to include both groups that have been working on integration with Hive.

To address the question on the last board report, the community sync meetings are video conferences that anyone in the community is welcome to attend. The discussion is documented and summarized for anyone who can't attend. We have found these to be a good way to exchange context and ideas more quickly, but we recognize that this isn't the best way for some people to participate, so we don't consider these meetings a forum for making decisions or voting. If we come to a tentative conclusion on a topic, it is still open for further discussion on the dev list. The idea for this comes from the Parquet community, which has been doing this for several years.

Development activity:
* Spark vectorized reads for flat schemas were merged and benchmarked
* The Spark 3 integration branch was merged into master
* Name mapping for Parquet files without IDs was committed
* An action to compact data files was added
* Support was added for managing and adding delete files in table metadata
* Spark components were refactored so they can be reused for Flink
* Several PRs for Flink support have been committed and more are open
* CI tests for JDK 11 have been added

The community also plans to release 0.9.0 with Spark 3 support soon.

## Community Health:
Most community metrics have again increased in the last month, although dev list traffic is a bit lower. More importantly, the community has made further progress on several large areas with different groups leading the efforts, like Hive support, Spark 3 support, and Flink support.

--
Ryan Blue
Software Engineer
Netflix