Hello Danny,
Thanks for starting the discussion.
-1 for the mono-repo, and +/-1 for dropping the Flink version.
I have mixed feelings about dropping the Flink version. Usually, large
production migrations happen along with Flink version upgrades, and
users naturally also want to update to connectors compatible with that
Flink version, which is a burden on the community.
Maybe this is another point we should address?
I agree with Sergey's point about having CI builds against SNAPSHOT
versions, which would make updating the versions easier. We could start
by updating the builds to include SNAPSHOT versions where they are
missing.
Another suggestion would be to have dedicated owners (PMC/committers)
for each set of connectors who are responsible for these regular update
tasks, together with volunteers. Maybe this should be decided, similarly
to release managers, before each planned release.
Best,
Muhammet
On 2024-06-10 16:25, Danny Cranmer wrote:
Hello Flink community,
It has been over 2 years [1] since we started externalizing the Flink
connectors to dedicated repositories from the main Flink code base. The
past discussions can be found here [2]. The community decided to
externalize the connectors to primarily 1/ improve stability and speed
of the CI, and 2/ decouple version and release lifecycle to allow the
projects to evolve independently. The outcome of this has resulted in
each connector requiring a dedicated release per Flink minor version,
which is a burden on the community. Flink 1.19.0 was released on
2024-03-18 [3], the first supported connector followed roughly 2.5
months later on 2024-06-06 [4] (MongoDB). There are still 5 connectors
that do not support Flink 1.19 [5].
Two decisions contribute to the high lag between releases. 1/ creating
one repository per connector instead of a single flink-connector
mono-repo and 2/ coupling the Flink version to the connector version
[6]. A single connector repository would reduce the number of connector
releases from N to 1, but would couple the connector CI and reduce
release flexibility. Decoupling the connector versions from Flink would
eliminate the need to release each connector for each new Flink minor
version, but we would need a new compatibility mechanism.
I propose that from the next release of each connector we drop the
coupling to the Flink version. For example, instead of 3.4.0-1.20
(<connector>-<flink>) we would release 3.4.0 (<connector>). We can model
a compatibility matrix within the Flink docs to help users pick the
correct versions. This would mean we would usually not need to release a
new connector version per Flink version, assuming there are no breaking
changes. Worst case, if breaking changes impact all connectors we would
still need to release all connectors. However, for Flink 1.17 and 1.18
there were only a handful of issues (breaking changes), mostly
impacting tests. We could decide to align this with Flink 2.0, however I
see no compelling reason to do so. This was discussed previously [2] as
a long term goal once the connector APIs are stable. But I think the
current compatibility rules support this change now.
I would prefer to not create a connector mono-repo. Separate repos give
each connector more flexibility to evolve independently, and removing
unnecessary releases will significantly reduce the release effort.
I would like to hear opinions and ideas from the community. In
particular, are there any other issues you have observed that we should
consider addressing?
Thanks,
Danny.
[1] https://github.com/apache/flink-connector-elasticsearch/commit/3ca2e625e3149e8864a4ad478773ab4a82720241
[2] https://lists.apache.org/thread/8k1xonqt7hn0xldbky1cxfx3fzh6sj7h
[3] https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/
[4] https://flink.apache.org/downloads/#apache-flink-connectors-1
[5] https://issues.apache.org/jira/browse/FLINK-35131
[6] https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development#ExternalizedConnectordevelopment-Examples