Hi everyone,
as you may know, a first minimal version of FLIP-24 [1] for the upcoming
Flink SQL Client has been merged to master. We also merged support for
discovering and configuring table sources without a single line of code,
using string-based properties [2] and Java service provider discovery.
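For illustration, here is a minimal sketch of how such service provider
discovery works in plain Java. The factory interface, method, and
property keys below are made up for this example and are not the actual
Flink API:

import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical factory interface for illustration; the real Flink
// interface and property keys differ.
interface TableSourceFactory {
    // Whether this factory can handle the given string-based properties,
    // e.g. "connector.type" -> "kafka".
    boolean matches(Map<String, String> properties);
}

public class FactoryDiscovery {
    public static TableSourceFactory find(Map<String, String> properties) {
        // ServiceLoader reads META-INF/services entries from the classpath,
        // so adding a connector JAR is enough to make its factory visible.
        for (TableSourceFactory f : ServiceLoader.load(TableSourceFactory.class)) {
            if (f.matches(properties)) {
                return f;
            }
        }
        throw new IllegalArgumentException("No factory found for: " + properties);
    }
}

With this mechanism, a connector JAR only needs to ship the factory
implementation and a matching META-INF/services entry.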
We are now facing the question of how to manage dependencies in this
new environment. It differs from how regular Flink projects are created
(by setting up a new Maven project and building a jar or fat jar).
Ideally, a user should be able to select from a set of prepared
connectors, catalogs, and formats. E.g., if a Kafka connector and the
Avro format are needed, all that should be required is to move a
"flink-kafka.jar" and a "flink-avro.jar" into a "sql_lib" directory that
is shipped to the Flink cluster together with the SQL query.
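To make the idea concrete, here is a rough sketch of how the client
could pick up such JARs at startup. The directory handling and wiring
are my assumptions, not the merged implementation:

import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class SqlLibLoader {
    // Collects all JARs from the (hypothetical) sql_lib directory and
    // exposes them via a child class loader, so that ServiceLoader-based
    // discovery also sees factories shipped in those JARs.
    public static ClassLoader forDirectory(File sqlLibDir) throws Exception {
        List<URL> urls = new ArrayList<>();
        File[] jars = sqlLibDir.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars != null) {
            for (File jar : jars) {
                urls.add(jar.toURI().toURL());
            }
        }
        return new URLClassLoader(urls.toArray(new URL[0]),
                SqlLibLoader.class.getClassLoader());
    }
}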
The question is: how do we want to offer those JAR files in the future?
We see two options:
1) We prepare Maven build profiles for all offered modules and provide
a shell script for building fat jars. A script call could look like
"./sql-client-dependency.sh kafka 0.10"; it would automatically download
what is needed and place the JAR file in the library folder. This
approach would keep our development effort low, but it would require
Maven to be present and the builds to pass in different environments
(e.g., Windows).
2) We build fat jars for these modules with every Flink release and
host them somewhere (e.g., on Apache infrastructure, but not on Maven
Central). This would make it very easy to add a dependency by
downloading the prepared JAR files. However, it would require building
and hosting large fat jars for every connector (and version) with every
major and minor Flink release, and the size of such a repository might
grow quickly.
What do you think? Do you see other options to make adding dependencies
as easy as possible?
Regards,
Timo
[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client
[2] https://issues.apache.org/jira/browse/FLINK-8240