+1 on feature flag in this case.

I encourage everyone interested to take a look at the PR itself [1]. There are 
a few open things to address, though.

- Anton

[1] - https://github.com/apache/incubator-iceberg/pull/675 
<https://github.com/apache/incubator-iceberg/pull/675>

> On 14 Jan 2020, at 09:27, Gautam <gautamkows...@gmail.com> wrote:
> 
> A feature flag sounds good to me with associated regression tests to pair 
> along with each feature.
> 
> Re: Snapshot Id Inheritance,  would be good to update the spec with the 
> change in metadata guarantees.
> 
> -Gautam.
> 
> On Mon, Jan 13, 2020 at 11:28 AM Ryan Blue <rb...@netflix.com.invalid> wrote:
> Hi everyone,
> 
> Anton has a PR almost ready to merge that implements snapshot ID inheritance, 
> similar to how we plan to inherit sequence IDs in metadata. That allows 
> people to create manifests that are missing data that will be assigned at 
> commit time (snapshot ID) or that may change if a commit is retried (sequence 
> number). The inherited information is stored as a field of ManifestFile that 
> is stored in the ManifestList.
> 
> This change makes the snapshot ID optional for each data file in a manifest, 
> so that a null snapshot ID indicates that it should be inherited from the 
> manifest metadata. This is a breaking change because older readers consider 
> this field required. A change that can break older readers is not allowed 
> because we guarantee forward compatibility within a format version.
> 
> There are some options for how we handle this. First, we could bump the 
> format version and break compatibility, but there are cases when it is 
> possible to read tables that use appended manifests. For example, tables that 
> don't use appended manifests, or tables that rewrite those manifests quickly 
> will be compatible with old readers. That's why I think we should consider a 
> second option: adding a feature flag that ensures that manifests will not be 
> written with missing snapshot IDs unless the table has the compatibility flag 
> set. Then tables are opted into breaking changes within a format version and 
> we have a way to release format features before the version where they become 
> standard; format v2 will mark the snapshot ID optional and have requirements 
> for inheritance.
> 
> What do people think about this strategy for managing breaking changes? I 
> like the idea of getting the changes out early behind feature flags, where 
> possible, but it would be great to hear whether other people see problems 
> with this approach.
> 
> 
> rb
> 
> -- 
> Ryan Blue
> Software Engineer
> Netflix

Reply via email to