Hi everyone,

as you may know, a first minimal version of FLIP-24 [1], the upcoming Flink SQL Client, has been merged to master. We also merged support for discovering and configuring table sources without a single line of code, using string-based properties [2] and Java service provider discovery.

We are now facing the issue of how to manage dependencies in this new environment. It is different from how regular Flink projects are created (by setting up a new Maven project and building a JAR or fat JAR). Ideally, a user should be able to select from a set of prepared connectors, catalogs, and formats. E.g., if a Kafka connector and the Avro format are needed, all that should be required is to move a "flink-kafka.jar" and a "flink-avro.jar" into a "sql_lib" directory that is shipped to a Flink cluster together with the SQL query.
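To make the intended workflow concrete, here is a minimal sketch. The "sql_lib" directory and jar names are taken from the example above; the jars here are empty placeholders, and the client invocation is only indicated in a comment since no such flag exists yet:

```shell
# Sketch of the proposed user workflow (nothing here is an existing
# Flink layout; sql_lib and the jar names come from the proposal).
mkdir -p sql_lib

# Placeholders standing in for the prepared connector/format fat jars:
touch flink-kafka.jar flink-avro.jar
cp flink-kafka.jar flink-avro.jar sql_lib/

# The SQL Client would then ship everything in sql_lib to the cluster
# together with the SQL query.
ls sql_lib
```

The point of the design is that this is the *entire* user-facing dependency step: no pom.xml, no build.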

The question is: how do we want to offer those JAR files in the future? We see two options:

1) We prepare Maven build profiles for all offered modules and provide a shell script for building fat JARs. A script call could look like "./sql-client-dependency.sh kafka 0.10". It would automatically download what is needed and place the JAR file in the library folder. This approach would keep our development effort low, but it would require Maven to be present and the builds to pass in different environments (e.g., Windows).
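For option 1, the script could be little more than mapping the two arguments to a Maven profile and invoking the build. A rough sketch, where the profile naming scheme and the build/copy commands are pure assumptions (none of these profiles or modules exist yet):

```shell
# Hypothetical core of sql-client-dependency.sh (option 1).
# The "sql-<connector>-<version>" profile naming is an assumption.
profile_for() {
  local connector="$1" version="$2"
  # e.g. "kafka 0.10" -> "sql-kafka-0.10"
  echo "sql-${connector}-${version}"
}

PROFILE="$(profile_for kafka 0.10)"
echo "would run: mvn -P ${PROFILE} clean package"
echo "would copy the resulting fat jar into sql_lib/"
```

The actual script would then run the mvn invocation and copy the shaded artifact into the library folder, which is exactly the part that depends on Maven being installed and the build passing on the user's machine.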

2) We build fat JARs for these modules with every Flink release and host them somewhere (e.g., on Apache infrastructure, but not Maven Central). This would make it very easy to add a dependency by downloading the prepared JAR files. However, it would require building and hosting large fat JARs for every connector (and version) with every Flink major and minor release. The size of such a repository might grow quickly.

What do you think? Do you see other options for making it as easy as possible to add dependencies?


Regards,

Timo


[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client

[2] https://issues.apache.org/jira/browse/FLINK-8240
