Hi everyone,

As a part of the ongoing 0.13.0 release, we are starting to formally support multiple engine versions for Spark, Flink and Hive. I think it is worth defining a formal process for adding a new supported version, maintaining existing versions and deprecating old versions. We briefly touched on this topic when doing the refactoring, but I think now is a good time to formalize it and make it a part of the Iceberg public documentation.

As a starter for brainstorming, here is the process I have in mind:
Each engine version has the following lifecycle states:

1. *Beta*: support for an engine version has been added, but is still in the experimental stage. Maybe the engine version itself is still in preview (e.g. Spark 3.0.0-preview), or the support does not yet have full feature compatibility compared to older versions. This state allows us to release support for an engine version without waiting for feature parity, shortening the release time.

2. *Maintained*: an engine version is being actively maintained by the community. Users can expect feature parity for most features across all the maintained versions. If a feature has to leverage new engine functionality that older versions don't have, then feature parity is not required. For code contributors:
   - New features should always be prioritized first in the latest version (the latest version could be a maintained or beta version).
   - For features that could be backported, the contributor is encouraged to either perform the backports in separate PRs, or at least create issues to track the backports.
   - If the change is small enough, like a few lines, updating all versions at once is good enough. Otherwise, using a separate PR for each version is recommended.

3. *Deprecating*: an engine version is no longer actively maintained. People who are still interested in the version can backport any necessary feature or bug fix from newer versions, but the community will not spend effort on achieving feature parity. We recommend that users move to a newer version, and we expect contributions to the specific version to diminish over time, until eventually no changes are added to it. At that time we can move the version to end-of-life.

4. *End-of-life*: a vote can be initiated to fully remove a deprecating version from the Iceberg repo, marking its end of life. I am not sure if we should remove all the code, but I think it would help push people forward and keep the repository healthy.
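To make the proposed states easier to discuss, here is a rough sketch of the lifecycle as a small state machine. This is only an illustration, not anything that exists in the Iceberg repo: the names `EngineVersionState`, `ALLOWED_TRANSITIONS`, `can_transition`, `requires_vote` and `feature_parity_expected` are all made up for this example, and the strictly forward-only ordering is an assumption (the proposal does not say, for instance, whether a beta version could be deprecated directly).

```python
from enum import Enum


class EngineVersionState(Enum):
    """Hypothetical names for the four lifecycle states in the proposal."""
    BETA = "beta"
    MAINTAINED = "maintained"
    DEPRECATING = "deprecating"
    END_OF_LIFE = "end-of-life"


# Assumption: a version only moves forward, one state at a time.
# The proposal itself does not explicitly rule out other transitions.
ALLOWED_TRANSITIONS = {
    EngineVersionState.BETA: {EngineVersionState.MAINTAINED},
    EngineVersionState.MAINTAINED: {EngineVersionState.DEPRECATING},
    EngineVersionState.DEPRECATING: {EngineVersionState.END_OF_LIFE},
    EngineVersionState.END_OF_LIFE: set(),
}


def can_transition(current, target):
    """Return True if moving from `current` to `target` fits the sketch above."""
    return target in ALLOWED_TRANSITIONS[current]


def requires_vote(target):
    """Only the move to end-of-life is described as needing a vote."""
    return target is EngineVersionState.END_OF_LIFE


def feature_parity_expected(state):
    """Feature parity (for most features) is only expected of maintained versions."""
    return state is EngineVersionState.MAINTAINED
```

For example, `can_transition(EngineVersionState.BETA, EngineVersionState.MAINTAINED)` holds, while moving a maintained version back to beta does not. If we decide that other transitions should be allowed, only the `ALLOWED_TRANSITIONS` table would need to change.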
With the lifecycle states described above, we will add one doc section under each engine to describe the current engine version support status. A PR will be needed to perform any state transition, and that could serve as the place to discuss whether the transition is appropriate.

Any thoughts about the process?

Best,
Jack Ye