Hi everyone,

Here's my draft report for July. Feel free to comment and suggest updates
that I've missed. Thanks!

rb

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (2 months ago)
There are currently 9 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is 1:1.

Community changes, past quarter:
- No new PMC members (project graduated recently).
- No new committers were added.

## Project Activity:
In July, the community held one sync meeting to discuss general topics, and
one specifically to discuss how to include both groups that have been
working on integration with Hive.

To address the question on the last board report, the community sync
meetings are video conferences that anyone in the community is welcome to
attend. The discussion is documented and summarized for anyone who can't
attend. We have found these to be a good way to exchange context and ideas
more quickly, but recognize that this isn't the best way for some people to
participate, so we don't consider these a forum for making decisions or
voting. If we come to a tentative conclusion on a topic, it is still open
for further discussion on the dev list. The idea for this comes from the
Parquet community, which has been doing this for several years.

Development activity:
* Spark vectorized reads for flat schemas were merged and benchmarked
* The Spark 3 integration branch was merged into master
* Name mapping for Parquet files without IDs was committed
* An action to compact data files was added
* Support was added for managing and adding delete files in table metadata
* Refactoring to support reusing Spark components for Flink
* Several PRs for Flink support have been committed and more are open
* CI tests for JDK 11 have been added

The community also plans to release 0.9.0 with Spark 3 support soon.

## Community Health:
Most community metrics have again increased in the last month, although dev
list traffic is a bit lower. More importantly, the community has made
further progress on several large areas with different groups leading the
efforts, like Hive support, Spark 3 support, and Flink support.

-- 
Ryan Blue
Software Engineer
Netflix
