Hello,
over the past week I worked on putting the final things into place to
enable the first release of an externalized Elasticsearch connector.
Then it dawned on me that there are a few rather important things we
_haven't really decided yet_.
Let's fix that.
Note that while this discussion is motivated by the upcoming
Elasticsearch release, it should very much apply to _all_ externalized
connectors.
# Open questions
a) How many connector variants are supported per Flink version?
b) What is the version scheme?
c) What does the branch structure look like?
d) What is the purpose of the main branch?
None of these were really discussed in the past.
# Background
a)
One of the goals of the externalization is to allow us to evolve
connectors more rapidly.
We don't want to be limited by Flink's release cycle, both when it comes
to providing new features and when it comes to making breaking changes.
This implies that we may end up having several _variants_ of a connector
that are compatible with a given Flink version (e.g., 1.0.0, 2.0.0).
b)
With a) in mind we need to integrate the concept of supported Flink
versions _somehow_ into the version scheme.
c)
We need a branch structure that actually supports a).
d)
This is not obvious; depending on how you implement c) the main branch
becomes pointless. See my proposal for examples.
# My proposal
a)
Only the last 2 versions of a connector are supported per Flink version,
with only the latest version receiving new features.
E.g., assuming connector versions 1, 2 and 3 exist for Flink version X:
1 is not supported; 2 gets patches; 3 gets features and patches.
Note: This only provides value if we have significantly more minor than
major releases. We should maybe impose a limit of 2 major releases per
Flink version.
This is a compromise: it keeps the number of supported versions low
(max=4) without immediately forcing users to buy into breaking changes
in the connector.
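
To make the rule concrete, here is a minimal sketch of it. The function
name and the status strings are made up for illustration, and I'm assuming
"version" here means the connector's major version:

```python
def support_status(connector_majors):
    """Map each connector major version (for ONE Flink version) to its support level.

    Hypothetical helper: the latest version gets features and patches,
    the one before it gets patches only, everything older is unsupported.
    """
    ordered = sorted(connector_majors)
    status = {}
    for index, version in enumerate(ordered):
        rank = len(ordered) - index  # 1 = latest, 2 = second latest, ...
        if rank == 1:
            status[version] = "features and patches"
        elif rank == 2:
            status[version] = "patches"
        else:
            status[version] = "not supported"
    return status


print(support_status([1, 2, 3]))
```

For the example above this marks 1 as unsupported, 2 as patch-only, and 3
as the version that still receives features.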
b)
Let's clarify our requirements.
1) We need to be able to create different variants for a given Flink
version, in case some API got changed or a useful feature was added.
2) We want to be able to create a new variant for an existing Flink
version, to speed up the evolution of the connector.
In my eyes this leads to the obvious version scheme:
<major.minor.patch connector version>-<major.minor supported Flink version>
If a particular connector is compatible with multiple Flink versions
we'll just duplicate the maven deployment with different Flink version
suffixes.
This is perfectly fine and in practice we did that with the Scala
suffixes already.
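
As a rough sketch of what that scheme means in practice (the names
`ConnectorVersion`, `parse_version` and `maven_versions` are invented for
this example and are not part of any existing tooling):

```python
import re
from typing import NamedTuple


class ConnectorVersion(NamedTuple):
    connector: str  # major.minor.patch connector version
    flink: str      # major.minor supported Flink version


# <major.minor.patch connector version>-<major.minor supported Flink version>
_PATTERN = re.compile(r"(\d+\.\d+\.\d+)-(\d+\.\d+)")


def parse_version(version: str) -> ConnectorVersion:
    """Split a full version string into its connector and Flink parts."""
    match = _PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"not a valid connector version: {version!r}")
    return ConnectorVersion(match.group(1), match.group(2))


def maven_versions(connector_version: str, flink_versions: list[str]) -> list[str]:
    """One deployed version per compatible Flink version (the duplicated deployment)."""
    return [f"{connector_version}-{flink}" for flink in flink_versions]


print(parse_version("1.0.0-1.15"))
print(maven_versions("1.0.0", ["1.15", "1.16"]))
```

A connector 1.0.0 that works with Flink 1.15 and 1.16 would then be
deployed twice, once per suffix.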
c)
Branches are named v<major.minor connector version>-<major.minor Flink version>
E.g., v1.0-1.15; v1.1-1.15
It must be obvious for us which branches use which Flink/connector
version, because we must ensure that all variants of a connector across
all Flink versions are similar, as should all connectors for a
particular Flink version.
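
A small sketch of how a branch name would be derived from a released
version, assuming the patch number is simply dropped (`branch_name` is a
made-up helper, not existing tooling):

```python
def branch_name(connector_version: str, flink_version: str) -> str:
    """Build the branch name from a connector version and a Flink major.minor version."""
    # Keep only major.minor of the connector version; patch releases share a branch.
    major, minor = connector_version.split(".")[:2]
    return f"v{major}.{minor}-{flink_version}"


print(branch_name("1.0.0", "1.15"))
print(branch_name("1.1.2", "1.15"))
```

Under that assumption all patch releases of 1.1.x for Flink 1.15 would
live on the same branch.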
d)
The main branch will not serve any purpose for development. Let me
explain:
A change that lands in the main branch would either
1) add support for a new Flink version to an existing connector variant
(e.g., 1.0-1.15 exists, now we add 1.0-1.16), but this is better
implemented by forking the 1.0-1.15 branch
2) add a new connector variant (e.g., 1.0-1.15 exists, now we add
2.0-1.15), but this is again better implemented by forking the 1.0-1.15
branch.
In short I can't think of a single case where the main branch provides
value.
As such I'd relegate it to a sort of landing / documentation page.