Hi Timo, thanks for your efforts. Personally, I think the second option would be better, and here are my thoughts.
(1) The SQL Client is designed to offer users a convenient way to work with data in Flink, and the second option is clearly the easier one to use.

(2) A script would manage the dependencies automatically, but with less flexibility: as soon as it cannot meet a particular need, users have to modify it themselves.

(3) I wonder whether we could package all these built-in connectors and formats into a single JAR. With such an all-in-one solution, users would not need to think about dependencies at all.

Best,
Xingcan

> On 27 Feb 2018, at 6:38 PM, Stephan Ewen <se...@apache.org> wrote:
>
> My first intuition would be to go for approach #2, for the following reasons:
>
> - I expect that in the long run, the scripts will not be that simple to maintain. We saw that with all shell scripts thus far: they start simple, and then grow with many special cases for this and that setup.
>
> - Not all users have Maven. Automatically downloading and configuring Maven could be an option, but that makes the scripts yet more tricky.
>
> - Download-and-drop-in is probably still easier for users to understand than the syntax of a script with its parameters.
>
> - I think it may actually be even simpler for us to maintain, because all it does is add a profile or build target to each connector to also create the fat JAR.
>
> - Storage space is no longer really a problem. Worst case, we host the fat JARs in an S3 bucket.
>
>
> On Mon, Feb 26, 2018 at 7:33 PM, Timo Walther <twal...@apache.org> wrote:
>
>> Hi everyone,
>>
>> as you may know, a first minimum version of FLIP-24 [1] for the upcoming Flink SQL Client has been merged to master. We also merged support for discovering and configuring table sources without a single line of code, using string-based properties [2] and Java service provider discovery.
>>
>> We are now facing the issue of how to manage dependencies in this new environment. It is different from how regular Flink projects are created (by setting up a new Maven project and building a JAR or fat JAR). Ideally, a user should be able to select from a set of prepared connectors, catalogs, and formats. E.g., if a Kafka connector and the Avro format are needed, all that should be required is to move a "flink-kafka.jar" and a "flink-avro.jar" into the "sql_lib" directory that is shipped to a Flink cluster together with the SQL query.
>>
>> The question is how we want to offer those JAR files in the future. We see two options:
>>
>> 1) We prepare Maven build profiles for all offered modules and provide a shell script for building fat JARs. A script call could look like "./sql-client-dependency.sh kafka 0.10". It would automatically download what is needed and place the JAR file in the library folder. This approach would keep our development effort low, but it requires Maven to be present and builds to pass on different environments (e.g. Windows).
>>
>> 2) We build fat JARs for these modules with every Flink release and host them somewhere (e.g. on Apache infrastructure, but not on Maven Central). This would make it very easy to add a dependency by downloading the prepared JAR files. However, it would require building and hosting large fat JARs for every connector (and version) with every Flink major and minor release. The size of such a repository might grow quickly.
>>
>> What do you think? Do you see other options to make adding dependencies as easy as possible?
>>
>> Regards,
>>
>> Timo
>>
>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client
>>
>> [2] https://issues.apache.org/jira/browse/FLINK-8240
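
PS: to make the comparison a bit more concrete, here is a rough sketch of what the option 1 script could look like, assuming it runs from a Flink source checkout. The module path, the profile name ("sql-jars"), and the "-fat.jar" naming below are pure assumptions on my part, not a committed layout; a real script would also need a mapping from connector/version to the actual Maven module.

  #!/usr/bin/env bash
  # Sketch only: module path, profile name, and jar naming are assumptions.
  set -eu

  CONNECTOR="$1"    # e.g. "kafka"
  VERSION="$2"      # e.g. "0.10"
  SQL_LIB="./sql_lib"

  MODULE="flink-connectors/flink-connector-${CONNECTOR}-${VERSION}"

  # Build only the requested connector module (and the modules it needs),
  # activating a hypothetical profile that also assembles the fat jar.
  mvn clean package -pl "${MODULE}" -am -DskipTests -Psql-jars

  mkdir -p "${SQL_LIB}"
  cp "${MODULE}"/target/*-fat.jar "${SQL_LIB}/"

  echo "Placed connector fat jar into ${SQL_LIB}"

Even in this minimal form, the pain points Stephan mentions already show up: Maven must be installed, the build must pass on the user's machine, and every special case grows the script.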
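By contrast, the download-and-drop-in workflow of option 2 for the Kafka + Avro example from Timo's mail would boil down to two downloads. The URL below is made up; where the fat JARs would actually be hosted (Apache infrastructure, an S3 bucket) is exactly the open question of this thread:

  # Hypothetical host and naming scheme, for illustration only.
  wget -P sql_lib/ https://example.org/flink-sql-jars/1.5.0/flink-connector-kafka-0.10-fat.jar
  wget -P sql_lib/ https://example.org/flink-sql-jars/1.5.0/flink-avro-fat.jar

No Maven and no local build step, which is why I lean towards this option.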