Re: [DISCUSS] October board report

Steven Wu Wed, 12 Oct 2022 13:05:35 -0700

Two Iceberg talks at Flink Forward San Francisco 2022:

   - Batch Processing at Scale with Flink & Iceberg (Andreas Hailu)
   - Tame the small files problem and optimize data layout for streaming
   ingestion to Iceberg (Steven Wu, Gang Ye)



On Wed, Oct 12, 2022 at 12:28 PM Ryan Blue <[email protected]> wrote:

> Awesome, thanks Szehon!
>
> I'll definitely include these. Any other talks that we should highlight?
>
> On Wed, Oct 12, 2022 at 11:18 AM Szehon Ho <[email protected]>
> wrote:
>
>> Hi Ryan,
>>
>> Do you mention Iceberg-related talks in the board report? There were
>> four Iceberg talks at ApacheCon 2022 (somehow the event schedule is
>> visible only to participants, not sure why):
>>
>>
>>    - Accelerate Data Lakehouse deployment with Apache Iceberg in
>>    Cloudera Data Platform  (Attila Turoczy, Bill Zhang)
>>    - Apache Iceberg's REST Catalog - A Gateway to Enriching Data Access
>>    via the Simplicity of an HTTP Service (Sam Redai)
>>    - Iceberg's Best Secret: Exploring Metadata Tables (Szehon Ho)
>>    - Integrated Audits: Streamlined Data Observability with Apache
>>    Iceberg (Sam Redai)
>>
>> If not, feel free to ignore.
>> Thanks,
>> Szehon
>>
>>
>>
>> On Wed, Oct 12, 2022 at 9:36 AM Ryan Blue <[email protected]> wrote:
>>
>>> Hi everyone,
>>>
>>> Here’s the board report I just posted. If you have anything to add,
>>> please reply to let me know!
>>>
>>> Description:
>>>
>>> Apache Iceberg is a table format for huge analytic datasets that is
>>> designed for high performance and ease of use.
>>>
>>> Issues:
>>>
>>> There are no issues requiring board attention.
>>>
>>> Membership Data:
>>>
>>> Apache Iceberg was founded 2020-05-19 (2 years ago)
>>> There are currently 22 committers and 12 PMC members in this project.
>>> The Committer-to-PMC ratio is roughly 2:1.
>>>
>>> Community changes, past quarter:
>>>
>>>    - No new PMC members. Last addition was Jack Ye on 2021-11-14.
>>>    - Fokko Driesprong was added as committer on 2022-08-21
>>>    - Steven Wu was added as committer on 2022-10-07
>>>    - Yufei Gu was added as committer on 2022-08-25
>>>
>>> Project Activity:
>>>
>>> The community had 2 releases in the 0.14.x line and an initial Python
>>> release, 0.1.0. In addition, the vote for a 1.0.0 release is currently
>>> passing.
>>>
>>> The Python release is the result of significant community effort and
>>> includes a new CLI utility (pyiceberg), support for Hive and REST
>>> catalogs, and the ability to read table metadata. The next goal is a
>>> 0.2.0 release that can handle query planning to enable reads in Python
>>> and Python-based engines.
>>>
>>> The 1.0.0 JVM release adds API guarantees to the API module, but is
>>> closely based on 0.14.1 to make transitioning to a new major version
>>> simple.
>>>
>>> Next, the community is preparing a 1.1.0 release with significant new
>>> updates:
>>>
>>>    - The ability to read and write table branches
>>>    - Scan metrics reporting
>>>    - Support for Spark FunctionCatalog
>>>    - FLIP-27 reader support in Flink SQL
>>>    - Z-order support when rewriting or compacting data files
>>>    - Support for Puffin stats in table metadata
>>>
>>> Community Health:
>>>
>>> The community continues to be healthy in terms of commits. The number of
>>> unique contributors decreased slightly, which indicates the community
>>> should ensure pull requests from contributors are getting enough
>>> attention.
>>>
>>> The increase in issues closed is due to setting up a stale issues bot to
>>> help keep issues fresh and relevant. The community also added issue
>>> templates to make bug reports and feature requests better and clearer.
>>>
>>> --
>>> Ryan Blue
>>>
>>
>
> --
> Ryan Blue
> Tabular
>
