Iceberg 0.11.0 Release Plan

Ye, Jack Wed, 13 Jan 2021 20:16:12 -0800

Hi everyone,

This is Jack Ye from AWS, and I will be the release manager for Iceberg 0.11.0. 
The purpose of this email is to start the preparation of this release.


Overview

We have talked with the groups working on different areas of the codebase, 
and the current plan is to cut the initial release branch by the end of 
01/20/2021 PST.
Starting now, we will focus code reviews on the priorities defined in the 
next two sections.
Any help with code reviews would be greatly appreciated!

If you think your PR is required or good to have (and close to being done) in 
this release train, please reply to this email thread so that we can evaluate 
the content and situation.

The information will also be tracked at 
https://github.com/apache/iceberg/milestone/12.
If you have any questions about the release, feel free to contact me by 
email or Slack.

Required Pull Requests

Here is a list of PRs that are currently considered required but not yet 
merged as of 1/13/2021:

Core:

  1.  Fix date and timestamp transforms 
(https://github.com/apache/iceberg/pull/1981)
  2.  Handle NaN as min/max stats in evaluators 
(https://github.com/apache/iceberg/pull/2069)
  3.  Update record_count behavior, include in manifest reader 
(https://github.com/apache/iceberg/pull/1820)

Hive:

  1.  Support case insensitive in hive query 
(https://github.com/apache/iceberg/pull/2053)
  2.  Fix join issues when CBO is enabled 
(https://github.com/apache/iceberg/pull/2052)

Flink:

  1.  Support streaming reader (https://github.com/apache/iceberg/pull/1793)
  2.  Support filter pushdown (https://github.com/apache/iceberg/pull/1893)
  3.  Add rewrite file operator after iceberg committer 
(https://github.com/apache/iceberg/pull/1669)
  4.  Support sink when Flink checkpointing is disabled 
(https://github.com/apache/iceberg/pull/1515)

Nessie:

  1.  Fix property for custom catalog in Flink 
(https://github.com/apache/iceberg/pull/2031)
  2.  Add timestamp to table definition in Nessie catalog 
(https://github.com/apache/iceberg/pull/1825)

Docs:

  1.  Fix bug in AWS doc that HTTP client package is not included in bundle 
(https://github.com/apache/iceberg/pull/2072)
  2.  Adds initial Documentation for Iceberg Stored Procedures 
(https://github.com/apache/iceberg/pull/2067)

Good-to-have Pull Requests

Here is a list of PRs that people consider good to have and that could be 
merged in the current release train as of 1/13/2021:

Core:

  1.  Allow binary truncation length to be zero to handle evaluators that 
encounter empty string values (https://github.com/apache/iceberg/pull/2081)
  2.  Add contains_nan to field_summary 
(https://github.com/apache/iceberg/pull/1872)
  3.  Implement NaN counts in ORC 
(https://github.com/apache/iceberg/pull/1790)

Hive:

  1.  Fix Deserializer to use source deserializer instead of the Iceberg ones 
(https://github.com/apache/iceberg/pull/2078)
  2.  Implementation for INSERT INTO Iceberg backed Hive tables using the new 
HiveIcebergRecordWriter (https://github.com/apache/iceberg/pull/2038)
  3.  Allow auto conversion of Hive types when the CREATE TABLE statement 
contains a not supported type (https://github.com/apache/iceberg/pull/2054)
  4.  Add ObjectInspector implementations for UUID, Fixed and Time type 
(https://github.com/apache/iceberg/pull/2077)

Merge & Update:

  1.  Spark MERGE INTO Support (copy-on-write implementation) 
(https://github.com/apache/iceberg/pull/1947)
  2.  Add the cardinality check to detect ambiguous target row for MERGE INTO 
(https://github.com/apache/iceberg/pull/2021)
  3.  Implement logic to group and sort rows before writing rows for MERGE 
INTO. (https://github.com/apache/iceberg/pull/2022)

Thank you,
Jack Ye
