+1 to having a feature flag mechanism that defaults to ‘compatible’ mode.

I would suggest a robust test suite around this new feature, with a focus on 
not breaking existing behavior when in ‘compatible’ mode.

Xabriel J Collazo Mojica  |  Sr Software Engineer  |  Adobe

From: Ryan Blue <rb...@netflix.com.INVALID>
Reply-To: "dev@iceberg.apache.org" <dev@iceberg.apache.org>, 
"rb...@netflix.com" <rb...@netflix.com>
Date: Monday, January 13, 2020 at 11:28 AM
To: Iceberg Dev List <dev@iceberg.apache.org>
Subject: [DISCUSS] Forward compatibility and snapshot ID inheritance

Hi everyone,

Anton has a PR almost ready to merge that implements snapshot ID inheritance, 
similar to how we plan to inherit sequence numbers in metadata. That allows people 
to create manifests that are missing data that will be assigned at commit time 
(snapshot ID) or that may change if a commit is retried (sequence number). The 
inherited information is stored as a field of ManifestFile that is stored in 
the ManifestList.

This change makes the snapshot ID optional for each data file in a manifest, so 
that a null snapshot ID indicates that it should be inherited from the manifest 
metadata. This is a breaking change because older readers consider this field 
required. A change that can break older readers is not allowed because we 
guarantee forward compatibility within a format version.
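
To make the inheritance rule concrete, here is a minimal sketch in Python. It is 
not Iceberg's implementation: DataFileEntry, ManifestFileMeta, and 
resolve_snapshot_ids are hypothetical names invented for illustration. It only 
shows the rule described above, where a null snapshot ID on a data file entry 
is filled in from the manifest metadata at read time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataFileEntry:
    # Hypothetical stand-in for a data file entry in a manifest.
    path: str
    # None means "inherit the snapshot ID from the manifest metadata".
    snapshot_id: Optional[int] = None


@dataclass
class ManifestFileMeta:
    # Hypothetical stand-in for the ManifestFile record kept in the manifest list.
    path: str
    # Assigned at commit time, so it is always known to the reader.
    snapshot_id: int


def resolve_snapshot_ids(
    manifest: ManifestFileMeta, entries: List[DataFileEntry]
) -> List[DataFileEntry]:
    """Return copies of the entries with missing snapshot IDs inherited."""
    resolved = []
    for entry in entries:
        if entry.snapshot_id is None:
            # Missing ID: take it from the manifest metadata.
            resolved.append(DataFileEntry(entry.path, manifest.snapshot_id))
        else:
            # Explicit ID: keep what the writer recorded.
            resolved.append(DataFileEntry(entry.path, entry.snapshot_id))
    return resolved
```

An older reader that treats the field as required has no equivalent of the 
first branch, which is why a null value breaks it.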

There are some options for how we handle this. First, we could bump the format 
version and break compatibility, but there are cases where older readers can 
still read tables written with this change. For example, tables that don't use 
appended manifests, or tables that rewrite those manifests quickly, will be 
compatible with old readers. That's why I think we should consider a second 
option: adding a feature flag that ensures that manifests will not be written 
with missing snapshot IDs unless the table has the compatibility flag set. Then 
tables are opted into breaking changes within a format version and we have a 
way to release format features before the version where they become standard; 
format v2 will mark the snapshot ID optional and have requirements for 
inheritance.
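
As a rough sketch of the second option, the writer-side check could look like 
the Python below. The property name and both functions are assumptions made up 
for this example, not actual Iceberg table properties or APIs. The point is 
only that the flag is off by default, so a writer refuses to emit a missing 
snapshot ID unless the table has opted in.

```python
from typing import Dict, Optional

# Hypothetical property name, invented for this sketch.
ALLOW_INHERITANCE_PROP = "example.snapshot-id-inheritance.enabled"


def inheritance_enabled(table_properties: Dict[str, str]) -> bool:
    """The flag defaults to off, which keeps older readers working."""
    return table_properties.get(ALLOW_INHERITANCE_PROP, "false").lower() == "true"


def snapshot_id_to_write(
    table_properties: Dict[str, str], snapshot_id: Optional[int]
) -> Optional[int]:
    """Return the snapshot ID to store for a data file entry in a manifest."""
    if snapshot_id is not None:
        # An explicit ID is always safe for old and new readers.
        return snapshot_id
    if inheritance_enabled(table_properties):
        # The table opted in, so readers are expected to inherit the ID.
        return None
    raise ValueError(
        "snapshot ID is required unless the table enables snapshot ID inheritance"
    )
```

With this shape, a table with no properties set behaves exactly as it does 
today, and only tables that set the flag can produce manifests that older 
readers cannot handle.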

What do people think about this strategy for managing breaking changes? I like 
the idea of getting the changes out early behind feature flags, where possible, 
but it would be great to hear whether other people see problems with this 
approach.


rb

--
Ryan Blue
Software Engineer
Netflix
