Big +1. This will improve the user experience (especially for new Flink users). We have answered so many questions about "class not found".
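For context, a rough sketch of the kind of first job that triggers those questions, loosely following the 1.10-era PyFlink Table API; the topic, field, and server names are placeholders and the connector properties are only illustrative. Without the Kafka connector and flink-json jars on the classpath, registering or using this table fails with a class-loading / table-factory error instead of running:

```python
# Rough sketch of a minimal local PyFlink job (1.10-era Table API).
# Topic, field, and server names are placeholders.
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.table import StreamTableEnvironment

env = StreamExecutionEnvironment.get_execution_environment()
t_env = StreamTableEnvironment.create(env)

# Registering a Kafka/JSON table needs the Kafka SQL connector and
# flink-json on the classpath; this is where users hit the
# class-loading / "no suitable table factory" errors.
t_env.sql_update("""
    CREATE TABLE clicks (
        user_id STRING,
        url STRING
    ) WITH (
        'connector.type' = 'kafka',
        'connector.version' = 'universal',
        'connector.topic' = 'clicks',
        'connector.properties.bootstrap.servers' = 'localhost:9092',
        'format.type' = 'json'
    )
""")
```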
Best,
Godfrey

Dian Fu <dian0511...@gmail.com> wrote on Wed, Apr 15, 2020 at 4:30 PM:

> +1 to this proposal.
>
> Missing connector jars is also a big problem for PyFlink users. Currently,
> after a Python user has installed PyFlink using `pip`, he has to manually
> copy the connector fat jars to the PyFlink installation directory for the
> connectors to be usable if he wants to run jobs locally. This process is
> very confusing for users and affects the experience a lot.
>
> Regards,
> Dian
>
> On Apr 15, 2020, at 3:51 PM, Jark Wu <imj...@gmail.com> wrote:
> >
> > +1 to the proposal. I also found the "download additional jar" step is
> > really verbose when I prepare webinars.
> >
> > At least, I think flink-csv and flink-json should be in the distribution;
> > they are quite small and don't have other dependencies.
> >
> > Best,
> > Jark
> >
> > On Wed, 15 Apr 2020 at 15:44, Jeff Zhang <zjf...@gmail.com> wrote:
> >
> >> Hi Aljoscha,
> >>
> >> Big +1 for the fat Flink distribution. Where do you plan to put these
> >> connectors? opt or lib?
> >>
> >> Aljoscha Krettek <aljos...@apache.org> wrote on Wed, Apr 15, 2020 at 3:30 PM:
> >>
> >>> Hi Everyone,
> >>>
> >>> I'd like to discuss releasing a more full-featured Flink distribution.
> >>> The motivation is that there is friction for SQL/Table API users who
> >>> want to use Table connectors that are not in the current Flink
> >>> distribution. For these users the workflow is currently roughly:
> >>>
> >>> - download Flink dist
> >>> - configure csv/Kafka/json connectors per configuration
> >>> - run SQL client or program
> >>> - decrypt the error message and research a solution
> >>> - download additional connector jars
> >>> - program works correctly
> >>>
> >>> I realize that this can be made to work, but if every SQL user has this
> >>> as their first experience, that doesn't seem good to me.
> >>>
> >>> My proposal is to provide two versions of the Flink distribution in the
> >>> future: "fat" and "slim" (names to be discussed):
> >>>
> >>> - slim would be even trimmer than today's distribution
> >>> - fat would contain a lot of convenience connectors (yet to be
> >>>   determined which ones)
> >>>
> >>> And yes, I realize that there are already more dimensions of Flink
> >>> releases (Scala version and Java version).
> >>>
> >>> For background, our current Flink dist has these in the opt directory:
> >>>
> >>> - flink-azure-fs-hadoop-1.10.0.jar
> >>> - flink-cep-scala_2.12-1.10.0.jar
> >>> - flink-cep_2.12-1.10.0.jar
> >>> - flink-gelly-scala_2.12-1.10.0.jar
> >>> - flink-gelly_2.12-1.10.0.jar
> >>> - flink-metrics-datadog-1.10.0.jar
> >>> - flink-metrics-graphite-1.10.0.jar
> >>> - flink-metrics-influxdb-1.10.0.jar
> >>> - flink-metrics-prometheus-1.10.0.jar
> >>> - flink-metrics-slf4j-1.10.0.jar
> >>> - flink-metrics-statsd-1.10.0.jar
> >>> - flink-oss-fs-hadoop-1.10.0.jar
> >>> - flink-python_2.12-1.10.0.jar
> >>> - flink-queryable-state-runtime_2.12-1.10.0.jar
> >>> - flink-s3-fs-hadoop-1.10.0.jar
> >>> - flink-s3-fs-presto-1.10.0.jar
> >>> - flink-shaded-netty-tcnative-dynamic-2.0.25.Final-9.0.jar
> >>> - flink-sql-client_2.12-1.10.0.jar
> >>> - flink-state-processor-api_2.12-1.10.0.jar
> >>> - flink-swift-fs-hadoop-1.10.0.jar
> >>>
> >>> The current Flink dist is 267M. If we removed everything from opt we
> >>> would go down to 126M. I would recommend this, because the large
> >>> majority of the files in opt are probably unused.
> >>>
> >>> What do you think?
> >>>
> >>> Best,
> >>> Aljoscha
> >>
> >> --
> >> Best Regards
> >>
> >> Jeff Zhang
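A rough sketch of the manual workaround Dian describes above, assuming a pip-installed PyFlink that keeps its bundled jars in a lib/ directory inside the package; the connector jar path is only a placeholder:

```python
# Rough sketch of the manual workaround for pip-installed PyFlink:
# copy the connector fat jar into the lib/ directory of the installed
# pyflink package so that locally executed jobs can find it.
# The connector jar path below is only a placeholder.
import os
import shutil

import pyflink

pyflink_lib = os.path.join(os.path.dirname(os.path.abspath(pyflink.__file__)), "lib")
print("PyFlink lib directory:", pyflink_lib)
shutil.copy("/path/to/flink-sql-connector-kafka_2.11-1.10.0.jar", pyflink_lib)
```

With a fatter default distribution (or at least flink-csv and flink-json bundled), this extra step would not be needed for local jobs.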