Hi,
Thanks all for your feedback.
I created a JIRA issue for bundling the format jars in lib/ [1], FYI.
[1] https://issues.apache.org/jira/browse/FLINK-18173
Best,
Jingsong Lee
On Fri, Jun 5, 2020 at 3:59 PM Rui Li wrote:
+1 to add the lightweight formats to lib
On Fri, Jun 5, 2020 at 3:28 PM Leonard Xu wrote:
+1 for Jingsong’s proposal to put flink-csv, flink-json and flink-avro
under the lib/ directory.
I have heard many SQL users (mostly newbies) complain about the out-of-box
experience on the mailing list.
Best,
Leonard Xu
On Jun 5, 2020, at 14:39, Benchao Li wrote:
+1 to include them for sql-client by default;
+0 to put them into lib and expose them to all kinds of jobs, including DataStream.
On Fri, Jun 5, 2020 at 2:31 PM, Danny Chan wrote:
+1; at the least, we should keep an out-of-the-box SQL CLI. It’s a very poor
experience for SQL users to have to add such required format jars.
Best,
Danny Chan
On Jun 5, 2020 at 11:14 AM +0800, Jingsong Li wrote:
+1 to add these 3 formats into the dist, under the lib/ directory.
This is a step worth trying toward better usability for SQL users.
They don't have *any* dependencies and are very small, so I think it's safe to
add them.
Best,
Jark
On Fri, 5 Jun 2020 at 11:14, Jingsong Li wrote:
Hi all,
Considering that 1.11 will be released soon, what about my previous
proposal? Put flink-csv, flink-json and flink-avro under lib.
These three formats are very small, have no third-party dependencies, and
are widely used by table users.
Best,
Jingsong Lee
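To make the pain concrete, the check users effectively have to do by hand today can be sketched as a small script. This is a hypothetical helper, not part of Flink; only the three format jar name prefixes come from the proposal above.

```python
# Hypothetical helper (not part of Flink): check whether the three format
# jars under discussion are present in a distribution's lib/ directory,
# which is the check users effectively do by hand today.
import glob
import os

FORMAT_PREFIXES = ("flink-csv", "flink-json", "flink-avro")

def missing_format_jars(lib_dir, prefixes=FORMAT_PREFIXES):
    """Return the format prefixes that have no matching jar in lib_dir."""
    jar_names = [os.path.basename(p)
                 for p in glob.glob(os.path.join(lib_dir, "*.jar"))]
    return [p for p in prefixes
            if not any(name.startswith(p) for name in jar_names)]
```

With the proposal applied, such a check would come back empty on a fresh distribution.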
On Tue, May 12, 2020 at 4:19 PM
Thanks for your discussion.
Sorry to start discussing another thing:
The biggest problem I see is the variety of issues caused by users missing
format dependencies.
As Aljoscha said, these three formats are very small, have no third-party
dependencies, and are widely used by table users.
Actual
One downside would be that we're shipping more stuff when running on
YARN, for example, since the entire plugins directory is shipped by default.
On 17/04/2020 16:38, Stephan Ewen wrote:
Great discussion!
I'm also in favor of a single distribution that is optimized for the
initial user experience.
Most advanced users understand how to customize a distribution and many are
probably already building their own. A forcing function for custom builds
is the need to patch the official r
The SQL client is one of the use cases. There are also use cases
like submitting a SQL job to a cluster and then hitting a missing connector
or format jar error. And in that case,
it's actually more difficult for users to understand and fix. For example,
a user submits a
SQL job to a running cluster with
Hi all,
Thanks Aljoscha for bringing this up, and thanks all for the wonderful
discussion.
In general, I think improving the user experience is a good idea, and it
seems that we
all agree on that.
Regarding how to achieve this,
I think Aljoscha has brought a good solution, which we have a
For SQL we could leave them in opt/. The SQL client shell script already
does discovery for some jars in opt, for example the main SQL client jar
is not in lib but it's loaded from opt/. We could do the same for the
connector/format jars.
@Timo or @Jark could you confirm whether this would work?
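The discovery described above can be sketched roughly as follows. This is an illustrative Python model of the idea, not the actual sql-client.sh logic, and the jar name patterns are assumptions.

```python
# Illustrative model (not the actual sql-client.sh logic) of opt/ jar
# discovery: scan a directory for known jar patterns and build the extra
# classpath the client would be started with.
import glob
import os

def discover_jars(opt_dir, patterns=("flink-sql-client*.jar",
                                     "flink-csv*.jar",
                                     "flink-json*.jar",
                                     "flink-avro*.jar")):
    """Collect matching jars from opt_dir into a classpath string."""
    jars = []
    for pattern in patterns:
        jars.extend(sorted(glob.glob(os.path.join(opt_dir, pattern))))
    return os.pathsep.join(jars)
```

The same loop extended with connector/format patterns is all the "do the same for the connector/format jars" suggestion would amount to.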
Are you suggesting to add the SQL dependencies to opt/ or lib/?
I thought the argument against opt/ was that it would not be much different
from downloading the additional dependencies.
Moving it to lib/ would, in my opinion, justify a separate release because of
potential dependency conflicts for
Thanks Till for summarizing!
Another alternative is also to stick to one distribution but remove one
of the very heavy filesystem connectors and add all the mentioned SQL
connectors/formats, which will keep the size of the distribution the
same, or a bit smaller.
Best,
Aljoscha
On 04.05.20
Thanks everyone for this lively discussion and all your thoughts.
Let me try to summarise the current state of the discussion and then let's
see how we can move it forward.
To begin with, I think everyone agrees that we want to improve Flink's user
experience. In particular, we want to improve th
It would be good if we could nail down what a slim/fat distribution
would look like, as there are various ideas floating around in this thread.
Like, what is a "slim" distribution? Are we just emptying /opt? Removing
everything larger than 1mb? Are we throwing out the Table API from /lib
for a
This would likely solve the issues surrounding the SQL client, so I
would go along with that.
On 17/04/2020 12:16, Aljoscha Krettek wrote:
I see no reason why we shouldn't put reporters into the plugins
directory by default; I was already planning to do this for the JMX
reporter (FLINK-16970) and intend to do it for all remaining reporters.
I'm not sure about filesystems though; is there a clear 1:1 mapping of
scheme <-> filesystem?
+1 for the "slim" and "fat" solution. One comment about the fat one: I think
we need to
put all needed jars into /lib (or /plugins). Putting jars into /opt and
relying on users to move
them from /opt to /lib doesn't really improve the out-of-the-box experience.
Best,
Kurt
On Fri, Apr 24, 2020 at 8:28 PM Aljo
re (1): I don't know about that; probably the people who did the
metrics reporter plugin support had some thoughts about it.
re (2): I agree, that's why I initially suggested splitting it into
"slim" and "fat": our current "medium fat" selection of jars in
Flink dist does not serve an
@Aljoscha I think that is an interesting line of thinking. The swift-fs may
be rarely enough used to move it to an optional download.
I would still add two more thoughts:
(1) Now that we have plugins support, is there a reason to have a metrics
reporter or file system in /opt instead of /plugins?
I think having such tools and/or tailor-made distributions can be nice
but I also think the discussion is missing the main point: The initial
observation/motivation is that apparently a lot of users (Kurt and I
talked about this) on the chinese DingTalk support groups, and other
support channel
A similar issue exists for the docker files.
I also heard the same feedback from various users, for example why we don't
simply include all FS connectors in the images by default.
I actually like the idea of having a slim and a fat/convenience docker file.
- If you build a clean production imag
Hi,
I like the idea of a web tool to assemble a fat distribution, and
https://code.quarkus.io/ looks very nice.
All users need to do is select what they need (I think this step
can't be omitted anyway).
We can also provide a default fat distribution on the web which by default
selects some
As a reference for a nice first experience I had, take a look at
https://code.quarkus.io/
You reach this page after you click "Start Coding" at the project homepage.
Rafi
On Thu, Apr 16, 2020 at 6:53 PM Kurt Young wrote:
I'm not saying pre-bundling some jars will make this problem go away, and
you're right that it only hides the problem for
some users. But what if this solution can hide the problem for 90% of users?
Wouldn't that be good enough for us to try?
Regarding whether users following instructions would really be such a
The problem with having a distribution with "popular" stuff is that it
doesn't really /solve/ a problem; it just hides it for users who fall
into these particular use cases.
Move out of them and you once again run into the exact same problems outlined.
This is exactly why I like the tooling approach
I'm not so sure about the web tool solution though. The concern I have with
this approach is that the final generated
distribution is kind of non-deterministic. We might generate too many
different combinations when users try to
package different types of connectors, formats, and maybe even Hadoop
releases
I think what Chesnay and Dawid proposed would be the ideal solution.
Ideally, we would also have a nice web tool for the website which generates
the corresponding distribution for download.
To get things started we could begin with only supporting
downloading/creating the "fat" version with the sc
Hi all,
Few points from my side:
1. I like the idea of simplifying the experience for first-time users.
As for production use cases I share Jark's opinion that in this case I
would expect users to combine their distribution manually. I think in
such scenarios it is important to understand interco
I want to reinforce my opinion from earlier: This is about improving the
situation both for first-time users and for experienced users that want
to use a Flink dist in production. The current Flink dist is too "thin"
for first-time SQL users and it is too "fat" for production users, that
is whe
Hi all,
Regarding slim and fat distributions, I think different kinds of jobs may
prefer different types of distributions:
For DataStream jobs, I think we may not like a fat distribution containing
connectors, because users would always need to depend on the connector in
user code; it is easy to include
Hi,
I am thinking about both "improve the first experience" and "improve the
production experience".
I'm thinking about what the common usage of Flink is:
streaming jobs use Kafka? Batch jobs use Hive?
Hive 1.2.1 dependencies can be compatible with most Hive server
versions. So Spark and Presto have built-in H
Hi,
I think we should first reach a consensus on "what problem do we want to
solve?"
(1) improve the first experience? or (2) improve the production experience?
As far as I can see from the above discussion, I think what we want to
solve is the "first experience".
And I think the slim jar is still the
I don't see a lot of value in having multiple distributions.
The simple reality is that no fat distribution we could provide would
satisfy all use-cases, so why even try.
If users commonly run into issues for certain jars, then maybe those
should be added to the current distribution.
Personal
Regarding the specific solution, I'm not sure about the "fat" and "slim"
solution though. I get the idea
that we can make the slim one even more lightweight than the current
distribution, but what about the "fat"
one? Do you mean that we would package all connectors and formats into
this? I'm not su
Big +1.
I like "fat" and "slim".
For csv and json, like Jark said, they are quite small and don't have other
dependencies. They are important to the Kafka connector, and important
to the upcoming file system connector too.
So can we move them into both "fat" and "slim"? They're so important, and
they're so
Big +1.
This will improve the user experience (especially for new Flink users).
We answered so many questions about "class not found".
Best,
Godfrey
On Wed, Apr 15, 2020 at 4:30 PM, Dian Fu wrote:
+1 to this proposal.
Missing connector jars is also a big problem for PyFlink users. Currently,
after a Python user has installed PyFlink using `pip`, they have to manually copy
the connector fat jars into the PyFlink installation directory for the connectors
to be used if they want to run jobs local
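The manual copy step described above starts with finding the installation directory. The sketch below is a hedged illustration: the `pyflink` package name and its `lib` subdirectory are the assumptions, and the lookup is demonstrated with a stdlib package so the snippet runs anywhere.

```python
# Hedged sketch: locate a pip-installed package's directory so connector
# jars can be copied into its bundled "lib" folder. A PyFlink user would
# call package_lib_dir("pyflink"); that layout is only assumed here.
import importlib.util
import os

def package_lib_dir(package_name, subdir="lib"):
    """Return <site-packages>/<package>/<subdir> for an installed package."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(package_name)
    return os.path.join(os.path.dirname(spec.origin), subdir)
```

Bundling the format jars would spare PyFlink users exactly this lookup-and-copy dance for the common formats.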
+1 to the proposal. I also found the "download additional jars" step
really verbose when I prepare webinars.
At least, I think flink-csv and flink-json should be in the distribution;
they are quite small and don't have other dependencies.
Best,
Jark
On Wed, 15 Apr 2020 at 15:44, Jeff Zhang w
Hi Aljoscha,
Big +1 for the fat Flink distribution. Where do you plan to put these
connectors? opt or lib?
On Wed, Apr 15, 2020 at 3:30 PM, Aljoscha Krettek wrote:
Big +1 from my side.
From my experience, missing connector & format jars is the TOP 1 problem
that
SQL users will probably run into. Similar questions are raised in Flink's
DingTalk group
almost every 1 or 2 days, and I have personally answered dozens of such
questions.
Sometimes it's still not enough
Hi Everyone,
I'd like to discuss releasing a more full-featured Flink
distribution. The motivation is that there is friction for SQL/Table API
users who want to use Table connectors that are not in the
current Flink distribution. For these users the workflow is currently
roughly