Hello,

Over the past week I worked on putting the final things into place to enable the first release of an externalized Elasticsearch connector.

Then it dawned on me that there are a few things we _haven't really decided yet_, which are rather important though.

Let's fix that.

Note that while this discussion is motivated by the upcoming Elasticsearch release, it should very much apply to _all_ externalized connectors.


   # Open questions

a) How many connector variants are supported per Flink version?
b) What is the version scheme?
c) What does the branch structure look like?
d) What is the purpose of the main branch?

None of these were really discussed in the past.


   # Background

a)
One of the goals of the externalization is to allow us to evolve connectors more rapidly. We don't want to be limited by Flink's release cycle, both when it comes to providing new features and when it comes to making breaking changes. This implies that we may end up having several _variants_ of a connector that are compatible with a given Flink version (e.g., 1.0.0, 2.0.0).

b)
With a) in mind we need to integrate the concept of supported Flink versions _somehow_ into the version scheme.

c)
We need a branch structure that actually supports a).

d)
This is not obvious; depending on how you implement c) the main branch becomes pointless. See my proposal for examples.


   # My proposal

a)

Only the last 2 versions of a connector are supported per Flink version, with only the latest version receiving new features.

E.g., assuming connector versions 1, 2 and 3 exist for Flink version X:
1 is not supported; 2 gets patches; 3 gets features and patches.

Note: This only provides value if we have significantly more minor than major releases. We should maybe impose a limit of 2 major releases per Flink version.

This is a compromise: it keeps the number of supported versions low (max=4) without immediately forcing users to buy into breaking changes in the connector.
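
To make the rule concrete, here is a minimal sketch of how the support status could be derived for the connector versions available for one Flink version. The function name `support_status` and the status labels are made up for illustration; nothing like this exists in any tooling today.

```python
def support_status(connector_versions):
    """Classify connector versions that target the same Flink version.

    Hypothetical helper: the newest version gets features and patches,
    the second newest gets patches only, everything older is unsupported.
    Versions are plain integers here to keep the sketch short.
    """
    ordered = sorted(connector_versions, reverse=True)
    status = {}
    for rank, version in enumerate(ordered):
        if rank == 0:
            status[version] = "features and patches"
        elif rank == 1:
            status[version] = "patches"
        else:
            status[version] = "unsupported"
    return status


# The example from above: versions 1, 2 and 3 exist for Flink version X.
print(support_status([1, 2, 3]))
```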

b)

Let's clarify our requirements.

1) We need to be able to create a different variant for a given Flink version, in case some API was changed or a useful feature was added.
2) We want to be able to create a new variant for an existing Flink version, to speed up the evolution of the connector.

In my eyes this leads to the obvious version scheme:
<major.minor.patch connector version>-<major.minor supported Flink version>

If a particular connector is compatible with multiple Flink versions we'll just duplicate the Maven deployment with different Flink version suffixes. This is perfectly fine, and in practice we already did that with the Scala suffixes.
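
As a rough illustration of the scheme, the following sketch parses such a version string and lists the versions that would be deployed for a connector compatible with several Flink versions. All names (`ConnectorVersion`, `parse_version`, `versions_for`) are assumptions made for this example, not existing code.

```python
import re
from typing import NamedTuple


class ConnectorVersion(NamedTuple):
    connector: tuple  # (major, minor, patch) of the connector
    flink: tuple      # (major, minor) of the supported Flink version


# <major.minor.patch connector version>-<major.minor supported Flink version>
_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(\d+)\.(\d+)$")


def parse_version(text):
    """Split a version such as '1.0.0-1.15' into its two parts."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a <connector>-<flink> version: {text!r}")
    major, minor, patch, flink_major, flink_minor = map(int, match.groups())
    return ConnectorVersion((major, minor, patch), (flink_major, flink_minor))


def versions_for(connector_version, flink_versions):
    """One deployed version per supported Flink version (same connector code)."""
    return [f"{connector_version}-{flink}" for flink in flink_versions]


print(parse_version("1.0.0-1.15"))
print(versions_for("1.0.0", ["1.15", "1.16"]))
```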

c)

vmajor.minor-flink-major.flink-minor

E.g., v1.0-1.15; v1.1-1.15

It must be obvious for us which branches use which Flink/connector version, because we must ensure that all variants of a connector across all Flink versions are similar, as should all connectors for a particular Flink version.
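
For illustration, a branch name following this pattern could be derived from a connector version and a Flink version like this. The helper `branch_name` is hypothetical and only meant to show how the two schemes relate (the patch component is dropped for branches).

```python
def branch_name(connector_version, flink_version):
    """Build 'v<major>.<minor>-<flink-major>.<flink-minor>' from two versions.

    Hypothetical helper: any patch component is ignored, since a branch
    covers all patch releases of one connector minor version.
    """
    major, minor = connector_version.split(".")[:2]
    flink_major, flink_minor = flink_version.split(".")[:2]
    return f"v{major}.{minor}-{flink_major}.{flink_minor}"


print(branch_name("1.0.0", "1.15"))
print(branch_name("1.1.0", "1.15"))
```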

d)

The main branch will not serve any purpose for development. Let me explain:

A change that lands in the main branch would either

1) add support for a new Flink version to an existing connector variant (e.g., 1.0-1.15 exists, now we add 1.0-1.16), but this is better implemented by forking the 1.0-1.15 branch, or
2) add a new connector variant (e.g., 1.0-1.15 exists, now we add 2.0-1.15), but this is again better implemented by forking the 1.0-1.15 branch.

In short I can't think of a single case where the main branch provides value.
As such I'd relegate it to a sort of landing / documentation page.

