Thanks for starting this discussion. I am working on the early stages of
the new DynamoDB connector and have been pondering the same thing.

a) Makes sense. On the flip side, how many Flink versions will each
connector support? Flink itself currently supports its 2 latest versions,
so it makes sense for connectors to follow the same rule.

For example, if the latest connector version is 2.0.0, we would only publish
2.0.0-1.15 and 2.0.0-1.14.
Then, if Flink 1.16 is out by the time we want to ship connector 2.1.0, we
would publish 2.1.0-1.16 and 2.1.0-1.15.
That leaves the case where a new Flink version is released (1.16, for
example) after connector 2.0.0 has already been published. If there are no
connector changes, we could either add 2.0.0-1.16 (resulting in 2.0.0-1.16,
2.0.0-1.15 and 2.0.0-1.14, the last of which is no longer supported) or
require a version bump to 2.1.0-1.16. I would prefer adding 2.0.0-1.16 if
there are no changes to the code and this is technically possible. If the
connector code never changes, we would end up with 2.0.0-1.14, 2.0.0-1.15
... 2.0.0-n.m.
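
To make the rule concrete, here is a rough sketch of how the set of
published artifacts could be derived. It is only an illustration of the
"latest 2 Flink versions" idea; the function name and the `supported`
parameter are made up for this mail and are not part of any existing
build tooling.

```python
def published_versions(connector_version, flink_versions, supported=2):
    """Return the artifact versions for the newest `supported` Flink releases.

    Hypothetical helper: assumes Flink versions are given as "major.minor".
    """
    # Sort numerically so that 1.9 < 1.14 (a plain string sort gets this wrong).
    ordered = sorted(
        flink_versions,
        key=lambda v: tuple(int(part) for part in v.split(".")),
    )
    newest_first = ordered[::-1][:supported]
    return [f"{connector_version}-{flink}" for flink in newest_first]


print(published_versions("2.0.0", ["1.14", "1.15"]))
# ['2.0.0-1.15', '2.0.0-1.14']
print(published_versions("2.1.0", ["1.14", "1.15", "1.16"]))
# ['2.1.0-1.16', '2.1.0-1.15']
```

Under this sketch, adding 2.0.0-1.16 without a connector change is just a
matter of calling the function again with the extended Flink version list.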

b) I like this suggestion, even though the Flink dependencies are usually
"provided" and therefore the builds are identical. It gives users a clear
indicator of what is supported, and allows us to target tests against
different Flink versions consistently.
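
As an illustration of the "clear indicator" point, a minimal sketch of how
tooling could split such a version string into its two parts. It assumes
the <major.minor.patch>-<major.minor> format proposed in the quoted mail;
`ArtifactVersion` and `parse_artifact_version` are hypothetical names, not
existing Flink utilities.

```python
import re
from typing import NamedTuple


class ArtifactVersion(NamedTuple):
    connector: str  # e.g. "2.0.0"
    flink: str      # e.g. "1.15"


# Assumed format: <major.minor.patch connector>-<major.minor Flink>
_PATTERN = re.compile(r"(\d+\.\d+\.\d+)-(\d+\.\d+)")


def parse_artifact_version(version):
    """Split "2.0.0-1.15" into its connector and Flink components."""
    match = _PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"not a <connector>-<flink> version: {version!r}")
    return ArtifactVersion(connector=match.group(1), flink=match.group(2))


parsed = parse_artifact_version("2.0.0-1.15")
print(parsed.connector, parsed.flink)
# 2.0.0 1.15
```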

c) Instead of having a branch per Flink version, can we have multiple build
profiles, as we do for the Scala version? Having 1.0-1.15 and 1.0-1.16
branches would likely mean duplicated code and an increased maintenance
burden (having to merge PRs into multiple branches). If the connector code
is not compatible with both Flink versions, we can bump the connector
version at that point. I would propose following the Flink branching
strategy (e.g. "release-1.0") unless that does not work.

d) If we remove the Flink qualifier from the branch as discussed above,
then main can be the next major version, like in Flink.

-- 

Another point to discuss is the parent pom. The Elasticsearch pom [1]
references the Apache parent and duplicates a lot of config from Flink.
In the DynamoDB pom [2] I have referenced the flink-connectors parent
instead, and inherited the majority of the config. I would be in favour of
referencing a common connector parent to inherit the default configs and
reduce copy and paste. What do you think?

[1] https://github.com/apache/flink-connector-elasticsearch/blob/main/pom.xml
[2] https://github.com/apache/flink-connector-dynamodb/blob/main/pom.xml

On Thu, Sep 15, 2022 at 4:07 PM Chesnay Schepler <ches...@apache.org> wrote:

> Hello,
>
> over the past week I worked on putting the final things into place to
> enable the first release of an externalized elasticsearch connector.
>
> Then it dawned on me that there are a few things we _haven't really
> decided yet_, which are rather important though.
>
> Let's fix that.
>
> Note that while this discussion is motivated by the upcoming
> Elasticsearch release, it should very much apply to _all_ externalized
> connectors.
>
>
>     # Open questions
>
> a) How many connector variants are supported per Flink version?
> b) What is the version scheme?
> c) What does the branch structure look like?
> d) What is the purpose of the main branch?
>
> None of these were really discussed in the past.
>
>
>     # Background
>
> a)
> One of the goals of the externalization is to allow us to more rapidly
> evolve connectors.
> We don't want to be limited by Flink's release cycle, both when it comes
> to providing new features and when it comes to making breaking
> changes.
> This implies that we may end up having several _variants_ of a connector
> that are compatible with a given Flink version (e.g., 1.0.0, 2.0.0).
>
> b)
> With a) in mind we need to integrate the concept of supported Flink
> versions _somehow_ into the version scheme.
>
> c)
> We need a branch structure that actually supports a).
>
> d)
> This is not obvious; depending on how you implement c) the main branch
> becomes pointless. See my proposal for examples.
>
>
>     # My proposal
>
> a)
>
> Only the last 2 versions of a connector are supported per Flink version,
> with only the latest version receiving new features.
>
> E.g., assuming connector versions 1, 2 and 3 exist for Flink version X:
> 1 is not supported; 2 gets patches; 3 gets features and patches.
>
> Note: This only provides value if we have significantly more minor than
> major releases. We should maybe impose a limit of 2 major releases per
> Flink version.
>
> This is a compromise that keeps the number of supported versions low
> (max=4) without immediately forcing users to buy into breaking changes
> in the connector.
>
> b)
>
> Let's clarify our requirements.
>
> 1) We need to be able to create different variants for a given Flink
> version, in case some API got changed or a useful feature was added.
> 2) We want to be able to create a new variant for an existing Flink
> version, to speed up the evolution of the connector.
>
> In my eyes this leads to the obvious version scheme:
> <major.minor.patch connector version>-<major.minor supported Flink version>
>
> If a particular connector is compatible with multiple Flink versions
> we'll just duplicate the maven deployment with different Flink version
> suffixes.
> This is perfectly fine and in practice we did that with the Scala
> suffixes already.
>
> c)
>
> vmajor.minor-flink-major.flink-minor
>
> E.g., v1.0-1.15; v1.1-1.15
>
> It must be obvious for us which branches use which Flink/connector
> version, because we must ensure that all variants of a connector across
> all Flink versions are similar, as should all connectors for a
> particular Flink version.
>
> d)
>
> The main branch will not have any purpose for the development. Let me
> explain:
>
> A change that lands in the main branch would either
>
> 1) add support for a new Flink version to an existing connector variant
> (e.g., 1.0-1.15 exists, now we add 1.0-1.16), but this is better
> implemented by forking the 1.0-1.15 branch
> 2) add a new connector variant (e.g., 1.0-1.15 exists, now we add
> 2.0-1.15), but this is again better implemented by forking the 1.0-1.15
> branch.
>
> In short I can't think of a single case where the main branch provides
> value.
> As such I'd relegate it to a sort of landing / documentation page.
>
>
>
