My first intuition would be to go for approach #2, for the following reasons:

  - I expect that in the long run, the scripts will not be that simple to
maintain. We have seen that with all our shell scripts so far: they start
simple and then grow with special cases for this and that setup.

  - Not all users have Maven. Automatically downloading and configuring
Maven could be an option, but that would make the scripts even trickier.

  - Download-and-drop-in is probably still easier for users to understand
than the syntax of a script and its parameters.

  - I think it may actually be even simpler for us to maintain, because
all it does is add a profile or build target to each connector that also
creates the fat jar (see the sketch after this list).

  - Storage space is no longer really a problem. In the worst case, we
host the fat jars in an S3 bucket.
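
To make that last point concrete, here is a minimal sketch of what such a
profile could look like in a connector's pom.xml. The profile id, the
final name, and the exact shade configuration are assumptions for
illustration, not the actual Flink build setup:

  <profile>
    <id>sql-jars</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-shade-plugin</artifactId>
          <executions>
            <execution>
              <!-- bundle the connector and its dependencies into a single
                   fat jar that can be dropped into the sql_lib directory -->
              <phase>package</phase>
              <goals>
                <goal>shade</goal>
              </goals>
              <configuration>
                <finalName>flink-kafka</finalName>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>

Building the drop-in jar for a release would then just be a matter of
running "mvn clean package -Psql-jars" in the connector module.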



On Mon, Feb 26, 2018 at 7:33 PM, Timo Walther <twal...@apache.org> wrote:

> Hi everyone,
>
> as you may know, a first minimal version of FLIP-24 [1] for the upcoming
> Flink SQL Client has been merged into master. We also merged
> possibilities to discover and configure table sources without a single line
> of code using string-based properties [2] and Java service provider
> discovery.
>
> We are now facing the issue of how to manage dependencies in this new
> environment. It is different from how regular Flink projects are created
> (by setting up a new Maven project and building a jar or fat jar). Ideally,
> a user should be able to select from a set of prepared connectors,
> catalogs, and formats. E.g., if a Kafka connector and the Avro format are
> needed, all that should be required is to move a "flink-kafka.jar" and
> "flink-avro.jar" into the "sql_lib" directory that is shipped to a Flink
> cluster together with the SQL query.
>
> The question is how do we want to offer those JAR files in the future? We
> see two options:
>
> 1) We prepare Maven build profiles for all offered modules and provide a
> shell script for building fat jars. A script call could look like
> "./sql-client-dependency.sh kafka 0.10". It would automatically download
> what is needed and place the JAR file in the library folder. This approach
> would keep our development effort low but would require Maven to be present
> and builds to pass in different environments (e.g. Windows).
>
> 2) We build fat jars for these modules with every Flink release that can
> be hosted somewhere (e.g. Apache infrastructure, but not Maven central).
> This would make it very easy to add a dependency by downloading the
> prepared JAR files. However, it would require building and hosting large fat
> jars for every connector (and version) with every Flink major and minor
> release. The size of such a repository might grow quickly.
>
> What do you think? Do you see other options to make adding dependencies as
> easy as possible?
>
>
> Regards,
>
> Timo
>
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client
>
> [2] https://issues.apache.org/jira/browse/FLINK-8240
>
>
