Re: [DISCUSS] Graceful Shutdown Handling by UDFs.
Hi Klou,

+1 for this proposal. I am missing any mention of "cancel" in the design document, though. In my understanding we are not planning to deprecate "cancel" completely (just cancel-with-savepoint, which is superseded by "stop"). In any case we should consider it in the design document. It seems to me that "cancel" should be considered an ungraceful shutdown, so that the job could be restarted from the last (retained) checkpoint (as right now).

Cheers,
Konstantin

On Thu, Jul 4, 2019 at 3:21 PM Kostas Kloudas wrote:
> Hi all,
>
> In many cases, UDFs (User Defined Functions) need to be able to perform
> application-specific actions when they stop in an orderly manner.
> Currently, Flink's UDFs, and more specifically the RichFunction which
> exposes lifecycle-related hooks, only have a close() method which is
> called in any case of job termination. This includes any form of orderly
> termination (STOP or End-Of-Stream) and termination due to an error.
>
> The FLIP in [1] and the design document in [2] propose the addition of an
> interface that will allow UDFs that implement it to perform
> application-specific logic in the case of graceful termination. These
> cases include DRAIN and SUSPEND for streaming jobs (see FLIP-34), but
> also reaching the End-Of-Stream for jobs with finite sources.
>
> Let's have a lively discussion to solve this issue that has been around
> for quite some time.
>
> Cheers,
> Kostas
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-46%3A+Graceful+Shutdown+Handling+by+UDFs
> [2] https://docs.google.com/document/d/1SXfhmeiJfWqi2ITYgCgAoSDUv5PNq1T8Zu01nR5Ebog/edit?usp=sharing

--
Konstantin Knauf | Solutions Architect
+49 160 91394525

Planned Absences: 10.08.2019 - 31.08.2019, 05.09. - 06.09.2019

--
Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
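To make the discussed distinction concrete, here is a minimal, self-contained sketch of the kind of opt-in hook FLIP-46 proposes. All names (`Udf`, `GracefulShutdownAware`, `shutdownGracefully`, `FileFlushingSink`) are illustrative, not Flink's actual API: the point is only that `close()` fires on every termination, while the graceful hook fires only on orderly shutdown (DRAIN/SUSPEND/End-Of-Stream), never on failure or cancel.

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownSketch {

    // Stand-in for Flink's RichFunction close() lifecycle hook.
    interface Udf {
        void close();
    }

    // Hypothetical opt-in interface in the spirit of FLIP-46.
    interface GracefulShutdownAware {
        void shutdownGracefully();
    }

    static class FileFlushingSink implements Udf, GracefulShutdownAware {
        final List<String> events = new ArrayList<>();

        @Override
        public void shutdownGracefully() {
            // Application-specific logic, e.g. committing pending output files.
            events.add("commit-pending-output");
        }

        @Override
        public void close() {
            // Always called, on graceful shutdown AND on failure/cancel.
            events.add("release-resources");
        }
    }

    // Simplified "runtime": the graceful hook only fires on orderly termination.
    static List<String> terminate(FileFlushingSink udf, boolean graceful) {
        if (graceful) {
            udf.shutdownGracefully();
        }
        udf.close();
        return udf.events;
    }

    public static void main(String[] args) {
        System.out.println(terminate(new FileFlushingSink(), true));  // [commit-pending-output, release-resources]
        System.out.println(terminate(new FileFlushingSink(), false)); // [release-resources]
    }
}
```

Under this model, "cancel" (the ungraceful case raised above) would take the `graceful = false` path: only `close()` runs, and the job restarts from the last retained checkpoint.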
[ANNOUNCE] Weekly Community Update 2019/27
the submitted job. Right now it looks like a fix is targeted for 1.9.1 and 1.8.2 [14]

[11] https://issues.apache.org/jira/browse/FLINK-13063
[12] https://issues.apache.org/jira/browse/FLINK-12889
[13] https://issues.apache.org/jira/browse/FLINK-13059
[14] https://issues.apache.org/jira/browse/FLINK-12122

Events, Blog Posts, Misc

* *Flink Forward Europe* early-bird ends on the 15th of July. [15]
* Upcoming Meetups
  * On 18th of July *Christos Hadjinikolis* is speaking at the "Big Data LDN Meetup" on "How real-time data processing is used for application in customer experience?" [16]

[15] https://berlin-2019.flink-forward.org/
[16] https://www.meetup.com/big-data-ldn/events/262638878/

Cheers,
Konstantin (@snntrable)
Re: [DISCUSS] Flink project bylaws
Hi all,

thanks a lot for driving this, Becket. I have two remarks regarding the "Actions" section:

* In addition to a simple "Code Change" we could also add a row for "Code Change requiring a FLIP" with a reference to the FLIP process page. A FLIP will have/does have different rules for approvals, etc.
* For "Code Change" the draft currently requires "one +1 from a committer who has not authored the patch followed by a Lazy approval (not counting the vote of the contributor), moving to lazy majority if a -1 is received". In my understanding this means that a committer always needs a review and +1 from another committer. As far as I know this is currently not always the case (often a committer authors, a contributor reviews & +1s).

I think it is worth thinking about how we can make it easy to follow the bylaws, e.g. by having more Flink-specific Jira workflows and ticket types + corresponding permissions. While this is certainly "Step 2", I believe we really need to make it as easy & transparent as possible, otherwise they will be unintentionally broken.

Cheers and thanks,
Konstantin

On Thu, Jul 11, 2019 at 9:10 AM Becket Qin wrote:
> Hi all,
>
> As it was raised in the FLIP process discussion thread [1], currently
> Flink does not have official bylaws to govern the operation of the
> project. Such bylaws are critical for the community to coordinate and
> contribute together. It is also the basis of other processes such as
> FLIP.
>
> I have drafted a Flink bylaws page and would like to start a discussion
> thread on this.
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
>
> The bylaws will affect everyone in the community. It'll be great to hear
> your thoughts.
> Thanks,
>
> Jiangjie (Becket) Qin
>
> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-META-FLIP-Sticking-or-not-to-a-strict-FLIP-voting-process-td29978.html#none
[DISCUSS] Create "flink-playgrounds" repository
Hi everyone,

in the course of implementing FLIP-42 [1] we are currently reworking the Getting Started section of our documentation. As part of this, we are adding docker-compose-based playgrounds to get started with Flink operations and Flink SQL quickly.

To reduce as much friction as possible for new users, we would like to maintain the required configuration files (docker-compose.yaml, flink-conf.yaml) in a separate new repository, `apache/flink-playgrounds`. You can find more details and a brief discussion on this in the corresponding Jira ticket [2].

What do you think?

I am not sure what kind of approval is required for such a change. So, my suggestion would be: if we have a lazy majority to create the repository within the next 24 hours, we proceed. Please let me know if this requires a more formal approval.

Best and thanks,
Konstantin

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-42%3A+Rework+Flink+Documentation
[2] https://issues.apache.org/jira/browse/FLINK-12749
Re: [DISCUSS] Create "flink-playgrounds" repository
Hi Stephan,

putting it under "flink-quickstarts" alone would not help. The user would still need to check out the whole `apache/flink` repository, which is a bit overwhelming. The Java/Scala quickstarts use Maven archetypes. Is this what you are suggesting? I think this would be an option, but it seems strange to manage a pure Docker setup (eventually maybe only one file) in a Maven project.

Best,
Konstantin

On Thu, Jul 11, 2019 at 3:52 PM Stephan Ewen wrote:
> Hi all!
>
> I am fine with a separate repository.
>
> Quick question, though: Have you considered putting the setup not under
> "docs" but under "flink-quickstart" or so?
> Would that be equally cumbersome for users?
>
> Best,
> Stephan
>
> On Thu, Jul 11, 2019 at 12:19 PM Fabian Hueske wrote:
>
> > Hi,
> >
> > I think Quickstart should be as lightweight as possible and follow
> > common practices.
> > A Git repository for a few configuration files might feel like
> > overkill, but IMO it makes sense because this ensures users can get
> > started with 3 commands:
> >
> > $ git clone .../flink-playground
> > $ cd flink-playground
> > $ docker-compose up -d
> >
> > So +1 to create a repository.
> >
> > Thanks, Fabian
> >
> > Am Do., 11. Juli 2019 um 12:07 Uhr schrieb Robert Metzger <
> > rmetz...@apache.org>:
> >
> > > +1 to create a repo.
Re: [DISCUSS] Create "flink-playgrounds" repository
Hi Xuefu,

thanks for having a look at this. I am sure this playground setup will need to be maintained and will go through revisions, too. So, we would still need to keep the content of the archive in some repository + the additional piece of automation to update the archive when the documentation is built. To me this seems to be more overhead than a repository.

Best,
Konstantin

On Thu, Jul 11, 2019 at 9:00 PM Xuefu Z wrote:
> The idea seems interesting, but I'm wondering if we have considered
> publishing a .tz file hosted somewhere on the Flink site with a link in
> the doc. This might avoid the "overkill" of introducing a repo, which is
> mainly used for version control in development cycles. On the other
> hand, a docker setup, once published, will seldom (if ever) go thru
> revisions.
>
> Thanks,
> Xuefu
Re: [DISCUSS] Create "flink-playgrounds" repository
Hi everyone,

thanks everyone for your remarks and questions! We have three +1s, so I think we can proceed with this.

@Robert: Could you create the request to INFRA?

Thanks,
Konstantin

On Fri, Jul 12, 2019 at 10:16 AM Stephan Ewen wrote:
> I am fine with a separate repository, was just raising the other option
> as a question.
>
> +1 to go ahead
Re: [DISCUSS] Create "flink-playgrounds" repository
Thanks Robert. I will prepare the first PR early next week.

On Fri, Jul 12, 2019 at 11:58 AM Robert Metzger wrote:
> The repo creation was faster than expected:
> https://github.com/apache/flink-playgrounds (it's not even listed here
> yet: https://gitbox.apache.org/repos/asf)
>
> On Fri, Jul 12, 2019 at 11:55 AM Robert Metzger wrote:
>
> > I will request the repo now, so that you can continue working on the
> > documentation (thanks for that again :) )
> >
> > I actually like Xuefu's idea of making an archive available.
> > The good thing is that we can get this from any GitHub hosted
> > repository. For example for Flink, this link lets you download an
> > archive of Flink's latest master:
> > https://github.com/apache/flink/archive/master.zip
> > We would not need to set up any additional automation for this.
Re: [DISCUSS] Create "flink-playgrounds" repository
Hi Chesnay,

thanks for joining the discussion. For clarification: the repository will only contain a docker-compose.yaml and a few configuration files. In terms of Flink images, the plan is to use `library/flink:` [1].

Best,
Konstantin

[1] https://github.com/docker-flink/docker-flink

On Fri, Jul 12, 2019 at 1:25 PM Chesnay Schepler wrote:
> The last time this came up was about our download page which contained
> snapshot links, with a big warning that these are for dev purposes, and
> we had to take that down. Back then the conclusion was that snapshot
> artifacts must only be linked on pages intended for developers, and must
> not be visible on any page that one would direct users to.
>
> So I'm not quite convinced that this would fly.
>
> Given that we're intending to offer files that assemble docker images (I
> guess?) I personally believe that these should go through a formal vote
> process; for licensing alone we have to check that users aren't being
> given dependencies with surprising restrictions.
>
> On a side note, any extra link is kinda unnecessary since you can get a
> zip like that through the GitHub UI. (go to repo main page -> Clone or
> download -> Download Zip)
>
> On 12/07/2019 12:21, Robert Metzger wrote:
> > That's a good point. We should point readers in the documentation to
> > the repository first, and then write "for convenience, you can also
> > download a snapshot of the repository here" AND put a disclaimer on
> > the page that this archive is not an official product released by the
> > Flink PMC.
> >
> > Since this is not on the official download page, and clearly in the
> > context of a "playground" or "demonstration", people will not assume a
> > proper release.
> >
> > Do you think that is okay, or should we reach out to somebody at the
> > foundation?
> >
> > On Fri, Jul 12, 2019 at 12:09 PM Chesnay Schepler <ches...@apache.org>
> > wrote:
> >
> > Wouldn't this qualify for releasing snapshot artifacts to users?
> > (Which, you know, shouldn't be done?)
[ANNOUNCE] Weekly Community Update 2019/28
Dear community,

happy to share this week's community update with Apache Flink 1.9, bylaws for Apache Flink, Savepoints vs Checkpoints, Flink on ARM, and more. As always, please feel free to add additional updates and news to this thread!

Flink Development
===

* [releases] The release branch for *Apache Flink 1.9.0* has been cut last Thursday and we are moving on to release testing. [1]
* [development process] Following the discussion on our FLIP process, Becket has initiated a discussion [2] on writing up *bylaws for Apache Flink*, a set of rules governing the core processes of the community. The plan to add bylaws was very well received and the discussion quickly moved to the first draft [3] of these bylaws.
* [development process] The migration of our CI infrastructure [4] to a non-ASF Travis account [5] has been implemented very quickly by Chesnay. Updates to the *CI bot* are currently discussed in a new thread [6].
* [development process] Robert opened a PR [7] to add the *code style and quality guide* to the contribution guide on the Apache Flink website. If you have not checked out the guide yet, now is a good time ;)
* [docs] In the course of FLIP-42 Konstantin added a first version of a *glossary* [8] to the Flink documentation defining frequently used terms like Operator, Task, Partition, and so on. This can help us to use a common terminology in our documentation as well as on the mailing lists.
* [state management] Kostas has started a FLIP(-47) [9] on how to think about *savepoints* vs *checkpoints* in the future. In a nutshell, the FLIP proposes to drop the distinction between savepoints & checkpoints: both would simply be snapshots. Each snapshot can have different properties depending on its origin (system/user) and format (incremental, universal, ...). The topic has also been addressed by Yu Li in FLIP-45 [10] recently.
* [build] Xiyuan Wang has started a discussion about officially supporting *Apache Flink on ARM*-based systems. It looks like there are no big blockers on the technical side and the discussion currently focuses on integrating an ARM-based build into our CI infrastructure. [11]

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Flink-1-9-release-branch-has-been-created-tp30500.html
[2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-project-bylaws-tp30409.html
[3] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
[4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-solve-unstable-build-capacity-problem-on-TravisCI-tp29881.html
[5] https://travis-ci.com/flink-ci
[6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/CiBot-Update-td30536.html
[7] https://github.com/apache/flink-web/pull/224
[8] https://ci.apache.org/projects/flink/flink-docs-master/concepts/glossary.html
[9] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/FLIP-47-Savepoints-vs-Checkpoints-tp30324.html
[10] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-45-Reinforce-Job-Stop-Semantic-tp30161.html
[11] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-ARM-support-for-Flink-td30298.html

Notable Bugs
===

Quite a lot of bugs are being opened right now, but these are basically all related to release testing for Flink 1.9 or test instabilities - hence not affecting released versions of Apache Flink.

* [FLINK-11654] [1.7.2] [1.8.1] I recently stumbled across this ticket about the (exactly-once) FlinkKafkaProducer, which was already created back in February. When running multiple instances of the same Flink program with an exactly-once FlinkKafkaProducer, the transactional ids used by these producers can clash (both jobs use the same ids) and the jobs crash frequently. The resolution is currently under discussion. [12]

[12] https://issues.apache.org/jira/browse/FLINK-11654

Events, Blog Posts, Misc

* *Flink Forward Europe* early-bird ends on the 15th of July. [13]
* Upcoming Meetups
  * On 18th of July *Christos Hadjinikolis* is speaking at the "Big Data LDN Meetup" on "How real-time data processing is used for application in customer experience?" [14]
* Rong Rong is now an Apache Flink Committer. Congratulations! [15]

[13] https://berlin-2019.flink-forward.org/
[14] https://www.meetup.com/big-data-ldn/events/262638878/
[15] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Rong-Rong-becomes-a-Flink-committer-td30451.html#a30474

Cheers and have a nice evening,
Konstantin (@snntrable)
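The FLINK-11654 clash mentioned above can be illustrated with a small, self-contained sketch. This is not Flink code: the id-derivation scheme below (`operator name + subtask index`, plus an optional per-job prefix) is a simplified model of the problem description, showing why two identical jobs generate the same transactional ids unless something job-specific is mixed in.

```java
import java.util.HashSet;
import java.util.Set;

public class TxnIdSketch {

    // Hypothetical id-derivation scheme, modeled after the bug report: the id
    // depends only on job-internal names, so identical jobs derive identical ids.
    static Set<String> deriveIds(String jobPrefix, String operatorName, int parallelism) {
        Set<String> ids = new HashSet<>();
        for (int subtask = 0; subtask < parallelism; subtask++) {
            ids.add(jobPrefix + operatorName + "-" + subtask);
        }
        return ids;
    }

    // Two producers clash if their transactional-id sets overlap at all.
    static boolean clash(Set<String> a, Set<String> b) {
        Set<String> overlap = new HashSet<>(a);
        overlap.retainAll(b);
        return !overlap.isEmpty();
    }

    public static void main(String[] args) {
        // The same program started twice, nothing job-specific in the ids:
        // both instances abort each other's Kafka transactions.
        System.out.println(clash(
                deriveIds("", "kafka-sink", 4),
                deriveIds("", "kafka-sink", 4))); // true

        // With distinct per-job prefixes, the id spaces are disjoint.
        System.out.println(clash(
                deriveIds("jobA-", "kafka-sink", 4),
                deriveIds("jobB-", "kafka-sink", 4))); // false
    }
}
```

Whatever shape the eventual fix takes in the ticket, the sketch shows the invariant it has to restore: transactional-id sets of concurrently running jobs must be disjoint.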
Re: Add relative path support in Savepoint Connector
Hi Jiayi,

I think this is not an issue with the State Processor API specifically, but with savepoints in general. The _metadata file of a savepoint uses absolute path references. There is a pretty old Jira ticket which already mentions this limitation [1]. Stefan (cc) might know more about any ongoing development in that direction and might have an idea about the effort of making savepoints relocatable.

Best,
Konstantin

[1] https://issues.apache.org/jira/browse/FLINK-5763

On Wed, Jul 17, 2019 at 8:35 AM bupt_ljy wrote:
> Hi again,
> Anyone has any opinion on this topic?
>
> Best Regards,
> Jiayi Liao
>
> Original Message
> Sender: bupt_ljy <bupt_...@163.com>
> Recipient: dev...@flink.apache.org
> Cc: Tzu-Li (Gordon) Tai <taitzuli...@apache.org>
> Date: Tuesday, Jul 16, 2019 15:24
> Subject: Add relative path support in Savepoint Connector
>
> Hi all,
> Firstly I appreciate Gordon and Seth's effort on this feature, which
> is really helpful to our production use. Like you mentioned in
> FLINK-12047, one of the production uses is that we use the existing
> state to derive new state. However, since the state handle is using the
> absolute path to get the input stream, we need to directly operate the
> state in the production environment, which is not an anxiety-reducing
> situation, at least for me.
> So I wonder if we can add relative path support in this module,
> because the files are persisted in a directory after we take a
> savepoint, which makes it achievable. I'm not sure whether my scenario
> is a common case or not, but I think I can give my contributions if you
> all are okay about this.
>
> Best Regards,
> Jiayi Liao
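The absolute-vs-relative distinction discussed in this thread can be sketched in a few lines. The class and method names below are illustrative, not the actual savepoint format: the point is that an absolute reference stored in `_metadata` breaks as soon as the savepoint directory is copied elsewhere, while a reference resolved relative to the `_metadata` location would make the savepoint relocatable.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class SavepointPathSketch {

    // Absolute reference as stored today: the location of _metadata is ignored,
    // so moving the savepoint directory leaves this pointing at the old place.
    static Path resolveAbsolute(Path metadataDir, String storedReference) {
        return Paths.get(storedReference);
    }

    // Relative reference: resolved against the directory containing _metadata,
    // so the reference follows the savepoint wherever it is copied.
    static Path resolveRelative(Path metadataDir, String storedReference) {
        return metadataDir.resolve(storedReference).normalize();
    }

    public static void main(String[] args) {
        // The savepoint was taken at /old/... and later copied to /new/...
        Path movedSavepoint = Paths.get("/new/location/savepoint-abc");

        // Absolute: still points at the old, now-invalid location.
        System.out.println(resolveAbsolute(movedSavepoint,
                "/old/location/savepoint-abc/op-state-1"));

        // Relative: resolves inside the moved savepoint directory.
        System.out.println(resolveRelative(movedSavepoint, "op-state-1"));
    }
}
```

This is, of course, only the path-resolution half of the problem; FLINK-5763 tracks what it would take to make the actual savepoint format relocatable.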
[ANNOUNCE] Weekly Community Update 2019/29
Dear community, happy to share this week's community update with news about a Flink Ecosystem website, new Jira permissions, Apache Flink 1.9.0 and a few more topics. As always, please feel free to add additional updates and news to this thread! Flink Development === [ecosystem] Daryl and Robert have pushed the first version of a *Flink ecoystem website* [1] to a staging environment for public testing. The website will contain a catalog of community maintained Apache Flink packages (e.g., connectors, metrics connectors, tooling, ...), which will be an easy way to share one's work with the wider community. Currently, Daryl and Robert are looking for feedback on the corresponding thread [2] or github project [3]. [development process] Following our new contributions guidelines [4], the* Jira permissions *have been changed so that only committers can assign users to tickets. The discussion in the original thread has moved to the topic of how to deal with the existing set of about 600 contributors in our Jira project. [5] [development process] It seems the discussion around initial *bylaws for Apache Flink* has pretty much converged already. Expecting a vote to start soonish unless new points are raised [6]. [releases] Gordon has prepared a RC0 for *Apache Flink 1.9.0* [7]. There will not be a vote on this release candidate. It is merely a reference point for ongoing release testing. To follow the release testing efforts checkout this Kanban board [8] containing all blockers for 1.9.0. [client] Zili Chen started a discussion on dropping the hardly used *o.a.f.api.common.Program* interface to facilitate future changes in the client code base. Users of o.a.f.api.common.Program please join the discussion. [9] [distributed coordination] Lamber-Ken asks for opinions about an ongoing Jira ticket [10] in the context of the *leader selection*. 
He proposes that the JobManager should not lose leadership if the connection to Zookeeper is temporarily suspended, but only once the connection is lost. [11] [1] https://flink-ecosystem-demo.flink-resources.org/ [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Create-a-Flink-ecosystem-website-tp27519.html [3] https://github.com/sorahn/flink-ecosystem/issues [4] https://flink.apache.org/contributing/contribute-code.html [5] https://issues.apache.org/jira/browse/INFRA-18644 [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-A-more-restrictive-JIRA-workflow-td27344.html [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-project-bylaws-tp30409.html [8] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/PREVIEW-Apache-Flink-1-9-0-release-candidate-0-tp30583.html [9] https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=328&projectKey=FLINK&selectedIssue=FLINK-13249 [10] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Drop-stale-class-Program-tp30744.html [11] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISSCUSS-Tolerate-temporarily-suspended-ZooKeeper-connections-tp30781.html Notable Bugs === Same as last week: a very active Jira due to release testing, but only a few bugs filed for released versions. [FLINK-13044] [1.8.1] A ticket I missed last week. Due to the way we currently shade the com.amazonaws packages in flink-s3-fs-hadoop, certain configurations are not forwarded properly to the filesystem classes. The Jira ticket contains a good explanation of the problem. [12] [12] https://issues.apache.org/jira/browse/FLINK-13044 Events, Blog Posts, Misc * *Jiangjie (Becket) Qin* is now an Apache Flink committer. Congratulations! [13] * The conference *schedule of Flink Forward Europe* has been announced. [14] The training schedule has already been available for a few weeks [15]. 
* Upcoming Meetups * On 30th of July *Berecz Dániel* of Ekata talks about a "Parameter Server on Flink, an approach for model-parallel machine learning" at the Budapest Scala Meetup [16] [13] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Jiangjie-Becket-Qin-has-been-added-as-a-committer-to-the-Flink-project-td30670.html [14] https://berlin-2019.flink-forward.org/conference-program [15] https://berlin-2019.flink-forward.org/training-program [16] https://www.meetup.com/budapest-scala/events/263025323/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
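The leader-selection proposal from the update above can be sketched as a tiny state machine (a toy model only, not Flink's actual ZooKeeper/Curator integration; names are illustrative): on a temporary SUSPENDED state the JobManager keeps leadership, and only revokes it once the connection is actually LOST.

```python
from enum import Enum

class ZkState(Enum):
    CONNECTED = 1
    SUSPENDED = 2  # connection temporarily interrupted, session may still survive
    LOST = 3       # session expired; leadership is definitely gone

class LeaderElector:
    """Toy model of the proposed behaviour: tolerate SUSPENDED, revoke on LOST."""
    def __init__(self):
        self.is_leader = True

    def on_state(self, state: ZkState) -> None:
        if state is ZkState.LOST:
            self.is_leader = False  # only now give up leadership
        # On SUSPENDED (where the current behaviour would already revoke
        # leadership) and on CONNECTED, leadership is kept.

elector = LeaderElector()
elector.on_state(ZkState.SUSPENDED)
print(elector.is_leader)  # True: a short ZooKeeper blip does not trigger a leader change
elector.on_state(ZkState.LOST)
print(elector.is_leader)  # False: the session is gone, so is leadership
```

The trade-off being debated is that during SUSPENDED the old leader cannot be sure another JobManager has not been elected, which is why this needs careful discussion.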
[ANNOUNCE] Weekly Community Update 2019/30
Dear community, happy to share this week's community update with news about Flink on PyPI, two proposals on the Table API/SQL and a bit more. As always, please feel free to add additional updates and news to this thread! Flink Development === [sql] *Temporary tables* are always registered with the built-in catalog. If the built-in catalog is not the default catalog, access to temporary tables needs to be fully qualified, which breaks current functionality, e.g. in the SQL Client. Dawid and Bowen stumbled across this during release testing and are proposing three different solutions for the upcoming release and beyond. [1] [python] The discussion on whether and how to release the *Python Table API on PyPI* is converging to publish it under the name "apache-flink" and to include the binary Java/Scala release in the Python package for convenience. [2] [fault-tolerance] In the discussion thread about supporting *"at-most-once"* delivery, Stephan proposed to work towards this feature outside of Flink core and to offer it to the community through the upcoming Flink ecosystem website. This seems to get a lot of agreement. [3] [table-api] Xuannan has started a discussion on *"FLIP-48: Pluggable Intermediate Result Storage"*. This will make the result cache for interactive programming (FLIP-36) pluggable. [4] [releases] Flink 1.9.0 release testing is still ongoing. There is no RC1 yet. 
[5] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-temporary-tables-in-SQL-API-tp30831.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Publish-the-PyFlink-into-PyPI-td30095.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Allow-at-most-once-delivery-in-case-of-failures-tp29464.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-48-Pluggable-Intermediate-Result-Storage-tp31001.html [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Progress-updates-for-Apache-Flink-1-9-0-release-tp30565.html Notable Bugs === Not too much. Still a lot of blockers and critical issues in the Flink project due to release testing. * [FLINK-13372] [1.8.1] [1.7.2] [1.6.4] There is a bug in the timestamp conversion of the Table API (due to timezone handling, of course). The issue contains a good explanation of the problem. [6] [6] https://issues.apache.org/jira/browse/FLINK-13372 Events, Blog Posts, Misc * *Kurt* is now part of the Apache Flink PMC. Congratulations! [7] * *Zhijiang* is now an Apache Flink committer. Congrats! [8] * Upcoming Meetups * On 30th of July *Berecz Dániel* of Ekata talks about a "Parameter Server on Flink, an approach for model-parallel machine learning" at the Budapest Scala Meetup [9] * On 30th of July *Grzegorz Liter* talks about “Stream processing with Apache Flink” at the EPAM tech talks in Wroclaw. 
[10] [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Kete-Young-is-now-part-of-the-Flink-PMC-tp30884p30998.html [8] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Zhijiang-Wang-has-been-added-as-a-committer-to-the-Flink-project-tp30830p30943.html [9] https://www.meetup.com/budapest-scala/events/263025323/ [10] https://www.meetup.com/EPAM-Tech-Talks-Wroc%C5%82aw/events/263027190/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
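The temporary-table problem from the update above can be sketched with a toy catalog resolver (illustrative names only, not Flink's actual catalog API): temporary tables always live in the built-in catalog, so once another catalog becomes the default, an unqualified lookup no longer finds them.

```python
BUILTIN = "default_catalog"

class CatalogManager:
    """Toy model of catalog-based table resolution."""
    def __init__(self):
        self.tables = {}               # (catalog, table) -> definition
        self.default_catalog = BUILTIN

    def register_temporary(self, name, definition):
        # Temporary tables are always registered with the built-in catalog.
        self.tables[(BUILTIN, name)] = definition

    def resolve(self, name):
        # Unqualified names resolve against the current default catalog only.
        catalog, _, table = name.rpartition(".")
        catalog = catalog or self.default_catalog
        return self.tables.get((catalog, table))

cm = CatalogManager()
cm.register_temporary("clicks", "VALUES ...")
print(cm.resolve("clicks") is not None)                   # True: built-in catalog is still the default

cm.default_catalog = "hive"                               # user switches the default catalog
print(cm.resolve("clicks") is not None)                   # False: the unqualified lookup now misses
print(cm.resolve("default_catalog.clicks") is not None)   # True: fully qualified access still works
```

This is the breakage Dawid and Bowen describe, e.g. in the SQL Client after a `USE CATALOG hive` statement.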
[ANNOUNCE] Weekly Community Update 2019/31
Dear community, happy to share this week's community update with news about Flink on PyPI, code style proposals, Flink 1.9.0 RC1, Flame Graphs in the WebUI and a bit more. As always, please feel free to add additional updates and news to this thread! Flink Development === * [development process] Andrey has opened three threads following up on the recently published "code style and quality guide". They deal with the usage of Java Optional [1] (tl;dr: only as return type in public APIs), Collections with initial capacity [2] (tl;dr: only if trivial) and how to wrap long argument lists and chained method calls [3] (tl;dr: yes, but details still open). * [python] Jincheng has started a voting thread on publishing PyFlink to PyPI. The name will be "apache-flink" and the account will be managed by the Apache Flink PMC. The vote has passed unanimously. [4] * [metrics] David proposed to add a CPU flame graph [5] for each Task to the WebUI (similar to the backpressure monitor). A CPU flame graph is a visualisation method for stack trace samples, which makes it easier to determine hot code paths. This has been well received and David is looking for a committer to shepherd the effort. [6] * [releases] Kurt announced a second preview RC (RC1) for Flink 1.9.0 to facilitate ongoing release testing. There will be no vote on this RC. [7] * [filesystems] Aljoscha proposed to remove the flink-mapr-fs module from Apache Flink due to recent problems pulling its dependencies. If removed, the MapR filesystem could still be used via Flink's HadoopFilesystem. [8] * [client] Tison has published a first design document on the recently discussed improvements to Flink's client API. 
[9] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-CODE-STYLE-Usage-of-Java-Optional-tp31240.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-CODE-STYLE-Create-collections-always-with-initial-capacity-tp31229.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-CODE-STYLE-Breaking-long-function-argument-lists-and-chained-method-calls-tp31242.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Publish-the-PyFlink-into-PyPI-tp31201.html [5] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-CPU-flame-graph-for-a-job-vertex-in-web-UI-tp31188.html [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/PREVIEW-Apache-Flink-1-9-0-release-candidate-1-tp31233.html [8] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Removing-the-flink-mapr-fs-module-tp31080.html [9] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-client-api-enhancement-for-downstream-project-tp25844.html Notable Bugs === Nothing for 1.6/1.7/1.8 that came to my attention. Events, Blog Posts, Misc * *Nico Kruber *published the second part of his series on Flink's network stack. This time about metrics, monitoring and backpressure. (This slipped through last week.) [10] [10] https://flink.apache.org/2019/07/23/flink-network-stack-2.html Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
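The flame graph idea mentioned in the update above can be sketched in a few lines (the sampled stacks here are invented, and the frame names are only Flink-flavoured placeholders): periodic stack-trace samples are folded into counts, and the width of each frame in the rendered graph is proportional to its count, which is what makes hot code paths stand out.

```python
from collections import Counter

# Pretend stack samples, outermost frame first, e.g. from periodic thread dumps.
samples = [
    ["Task.run", "StreamOperator.processElement", "UserFunction.map"],
    ["Task.run", "StreamOperator.processElement", "UserFunction.map"],
    ["Task.run", "StreamOperator.processElement", "Serializer.serialize"],
    ["Task.run", "Network.flush"],
]

# Fold each stack into the "frame;frame;frame count" format that the
# classic flamegraph.pl script consumes.
folded = Counter(";".join(stack) for stack in samples)
for stack, count in folded.most_common():
    print(stack, count)
```

Here `UserFunction.map` would render as the widest leaf frame, since half of all samples end there.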
Re: [ANNOUNCE] Progress updates for Apache Flink 1.9.0 release
Flink 1.9.0 > > > release > > > > > > > > > > > > Hi all, > > > > > > > > > > > > It's been a while since our last update for the release testing > of > > > > 1.9.0, > > > > > > so I want to bring attention to the current status of the > release. > > > > > > > > > > > > We are approaching RC1 soon, waiting on the following specific > last > > > > > ongoing > > > > > > threads to be closed: > > > > > > - FLINK-13241: This fixes a problem where when using YARN, slot > > > > > allocation > > > > > > requests may be ignored [1] > > > > > > - FLINK-13371: Potential partitions resource leak in case of > > producer > > > > > > restarts [2] > > > > > > - FLINK-13350: Distinguish between temporary tables and persisted > > > > tables > > > > > > [3]. Strictly speaking this would be a new feature, but there > was a > > > > > > discussion here [4] to include a workaround for now in 1.9.0, > and a > > > > > proper > > > > > > solution later on in 1.10.x. > > > > > > - FLINK-12858: Potential distributed deadlock in case of > > synchronous > > > > > > savepoint failure [5] > > > > > > > > > > > > The above is the critical path for moving forward with an RC1 for > > > > > official > > > > > > voting. > > > > > > All of them have PRs already, and are currently being reviewed or > > > close > > > > > to > > > > > > being merged. 
> > > > > > > > > > > > Cheers, > > > > > > Gordon > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-13241 > > > > > > [2] https://issues.apache.org/jira/browse/FLINK-13371 > > > > > > [3] https://issues.apache.org/jira/browse/FLINK-13350 > > > > > > [4] > > > > > > > > > > > > > > > > > > > > > > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-temporary-tables-in-SQL-API-td30831.html > > > > > > [5] https://issues.apache.org/jira/browse/FLINK-12858 > > > > > > > > > > > > On Tue, Jul 16, 2019 at 5:26 AM Tzu-Li (Gordon) Tai < > > > > tzuli...@apache.org > > > > > > > > > > > > wrote: > > > > > > > > > > > > > Update: RC0 for 1.9.0 has been created. Please see [1] for the > > > > preview > > > > > > > source / binary releases and Maven artifacts. > > > > > > > > > > > > > > Cheers, > > > > > > > Gordon > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/PREVIEW-Apache-Flink-1-9-0-release-candidate-0-td30583.html > > > > > > > > > > > > > > On Mon, Jul 15, 2019 at 6:39 PM Tzu-Li (Gordon) Tai < > > > > > tzuli...@apache.org > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > >> Hi Flink devs, > > > > > > >> > > > > > > >> As previously announced by Kurt [1], the release branch for > > 1.9.0 > > > > has > > > > > > >> been cut [2] and we've now started the testing phase for this > > > > release, > > > > > > as > > > > > > >> well as resolving remaining blockers. > > > > > > >> > > > > > > >> I want to quickly provide an overview of our progress here. > > > > > > >> Also, over the course of the testing phase, we will update > this > > > mail > > > > > > >> thread every 2-3 days with the overall progress of the release > > to > > > > keep > > > > > > you > > > > > > >> updated. > > > > > > >> > > > > > > >> *1. 
Remaining blockers and critical issues* > > > > > > >> You can find a link here [3] for a release Kanban board that > > > > provides > > > > > an > > > > > > >> overview of the remaining blockers and critical issues for > > > releasing > > > > > > 1.9.0. > > > > > > >> The issues listed there are high priority for the release, so > > any > > > > help > > > > > > >> with reviewing or fixing them is highly appreciated! > > > > > > >> If you do assign yourself to any unassigned issue and start > > > working > > > > on > > > > > > >> it, please make sure to pull it to the "In Progress" column to > > let > > > > > > others > > > > > > >> be aware of this. > > > > > > >> > > > > > > >> *2. Creating RC 0 for 1.9.0* > > > > > > >> We will create RC0 now to drive forward the testing efforts. > > > > > > >> This should be ready by tomorrow morning (July 16, 8am CET). > > > > > > >> Note that we will not have an official vote for RC0, as this > is > > > > mainly > > > > > > to > > > > > > >> drive testing efforts. > > > > > > >> RC1 with an official vote will be created once the blockers > > listed > > > > in > > > > > > [3] > > > > > > >> are resolved. > > > > > > >> > > > > > > >> Cheers, > > > > > > >> Gordon > > > > > > >> > > > > > > >> [1] > > > > > > >> > > > > > > > > > > > > > > > > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Flink-1-9-release-branch-has-been-created-td30500.html > > > > > > >> [2] > > > > > > >> > > > > > > > > > > > > > > > > > > > > > https://gitbox.apache.org/repos/asf?p=flink.git;a=shortlog;h=refs/heads/release-1.9 > > > > > > >> [3] > > > > > > >> > > > > > > > > > > > > > > > > > > > > > https://issues.apache.org/jira/secure/RapidBoard.jspa?projectKey=FLINK&rapidView=328 > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Planned Absences: 10.08.2019 - 31.08.2019, 05.09. 
- 06.09.2019 -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
Re: [ANNOUNCE] Progress updates for Apache Flink 1.9.0 release
Hi Till, as Fabian said, we considered the option you mentioned, but in the end decided that not maintaining a separate image has more advantages. In the context of FLIP-42 we are also revisiting the examples in general and want to clean these up a bit. So, for what it's worth, there will be an opportunity for revisiting this topic soon. Best, Konstantin On Thu, Aug 8, 2019 at 11:43 AM Fabian Hueske wrote: > The motivation for including the job as an example is to not have to > maintain a separate Docker image. > We would like to use the regular Flink 1.9 image for the playground and > avoid to maintain an image that is slightly different from the regular 1.9 > image. > > Maintaining the job in a different repository or somewhere else would mean, > that we need to have a proper release cycle for it as well. > Having it among the other examples means it's included in the regular > release. > > Best, Fabian > > > Am Do., 8. Aug. 2019 um 09:47 Uhr schrieb Till Rohrmann < > trohrm...@apache.org>: > > > Before backporting the playground PR to the release-1.9, I'd like to > > understand why the ClickEventCount job needs to be part of the Flink > > distribution. Looking at the example, it seems to only work in > combination > > with a Kafka cluster. Since it is not self-contained, it does not add > much > > value for a user who does not want to use the playgrounds. Moreover, we > > already have the StateMachineExample job which can be used to read from > > Kafka if a Kafka cluster is available. So my question would be why don't > we > > include the example job in the docker images for the playground? This > would > > be in my opinion a better separation of concerns. > > > > I've cross posted my question on the original PR as well. > > > > Cheers, > > Till > > > > On Thu, Aug 8, 2019 at 9:23 AM Kurt Young wrote: > > > > > +1 to include this in 1.9.0, adding some examples doesn't look like new > > > feature to me. 
> > > BTW, I am also trying this tutorial based on release-1.9 branch, but > > > blocked by: > > > > > > git clone --branch release-1.10-SNAPSHOT > > > g...@github.com:apache/flink-playgrounds.git > > > > > > Neither 1.10 nor 1.9 exists in flink-playground yet. > > > > > > Best, > > > Kurt > > > > > > > > > On Thu, Aug 8, 2019 at 3:18 PM Fabian Hueske > wrote: > > > > > > > Hi, > > > > I worked with Konstantin and reviewed the PR. > > > > I think the playground is a great way to get started with Flink and > > > explore > > > > it's recovery mechanism and unique features like savepoints. > > > > > > > > I'm in favor of adding the required streaming example program for the > > 1.9 > > > > release unless there's a good technical argument against it. > > > > > > > > Best, Fabian > > > > > > > > > > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Planned Absences: 10.08.2019 - 31.08.2019, 05.09. - 06.09.2019 -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
Re: [ANNOUNCE] Progress updates for Apache Flink 1.9.0 release
Hi Till, we will try to find another way to make the playground available for users soon. The discussion of whether and how to split up the Flink repository started only after we discussed the playground and flink-playgrounds repositories. I think this is the reason we went this way, not necessarily convenience. Cheers, Konstantin On Thu, Aug 8, 2019 at 2:25 PM Till Rohrmann wrote: > Just as a short addendum, there are also benefits of having the > ClickEventCount job not being part of the Flink repository. Assume there is > a bug in the job, then you would have to wait for the next Flink release to > fix it. > > On Thu, Aug 8, 2019 at 2:24 PM Till Rohrmann wrote: > > > I see that keeping the playground job in the Flink repository has a > couple > > of advantages, among other things that it's easier to keep up to date. > > However, in particular in the light of the potential repository split > where > > we want to separate connectors from Flink core, it seems very problematic > > to put the ClickEventCount which depends on Flink's Kafka connector in > > Flink's distribution. To me it seems that this was the path of least > > resistance but I'm not sure whether it stays like this. I think it would > > have been cleaner to separate the playground project from Flink core. > > > > Cheers, > > Till > > > > On Thu, Aug 8, 2019 at 1:28 PM Konstantin Knauf < > konstan...@ververica.com> > > wrote: > > > >> Hi Till, > >> > >> as Fabian said, we considered the option you mentioned, but in the end > >> decided that not maintaining a separate images has more advantages. > >> > >> In the context of FLIP-42 we are also revisiting the examples in general > >> and want to clean these up a bit. So, for what it's worth, there will be > >> an > >> opportunity for revisiting this topic soon. 
> >> > >> Best, > >> > >> Konstantin > >> > >> > >> > >> On Thu, Aug 8, 2019 at 11:43 AM Fabian Hueske > wrote: > >> > >> > The motivation for including the job as an example is to not have to > >> > maintain a separate Docker image. > >> > We would like to use the regular Flink 1.9 image for the playground > and > >> > avoid to maintain an image that is slightly different from the regular > >> 1.9 > >> > image. > >> > > >> > Maintaining the job in a different repository or somewhere else would > >> mean, > >> > that we need to have a proper release cycle for it as well. > >> > Having it among the other examples means it's included in the regular > >> > release. > >> > > >> > Best, Fabian > >> > > >> > > >> > Am Do., 8. Aug. 2019 um 09:47 Uhr schrieb Till Rohrmann < > >> > trohrm...@apache.org>: > >> > > >> > > Before backporting the playground PR to the release-1.9, I'd like to > >> > > understand why the ClickEventCount job needs to be part of the Flink > >> > > distribution. Looking at the example, it seems to only work in > >> > combination > >> > > with a Kafka cluster. Since it is not self-contained, it does not > add > >> > much > >> > > value for a user who does not want to use the playgrounds. Moreover, > >> we > >> > > already have the StateMachineExample job which can be used to read > >> from > >> > > Kafka if a Kafka cluster is available. So my question would be why > >> don't > >> > we > >> > > include the example job in the docker images for the playground? > This > >> > would > >> > > be in my opinion a better separation of concerns. > >> > > > >> > > I've cross posted my question on the original PR as well. > >> > > > >> > > Cheers, > >> > > Till > >> > > > >> > > On Thu, Aug 8, 2019 at 9:23 AM Kurt Young wrote: > >> > > > >> > > > +1 to include this in 1.9.0, adding some examples doesn't look > like > >> new > >> > > > feature to me. 
> >> > > > BTW, I am also trying this tutorial based on release-1.9 branch, > but > >> > > > blocked by: > >> > > > > >> > > > git clone --branch release-1.10-SNAPSHOT > >> > > > g...@github.com:apache/flink-playgrounds.git > >> > > >
Re: [DISCUSS] Flink Docker Playgrounds
> > > > > specific Flink job. > > > > > * We had planned to add the example job of the playground as an > > example > > > > to > > > > > the flink main repository to bundle it with the Flink distribution. > > > > Hence, > > > > > it would have been included in the Docker-hub-official (soon to be > > > > > published) Flink 1.9 Docker image [2]. > > > > > * The main motivation of adding the job to the examples module in > the > > > > flink > > > > > main repo was to avoid the maintenance overhead for a customized > > Docker > > > > > image. > > > > > > > > > > When discussing to backport the playground job (and its data > > generator) > > > > to > > > > > include it in the Flink 1.9 examples, concerns were raised about > > their > > > > > Kafka dependency which will become a problem, if the community > agrees > > > on > > > > > the recently proposed repository split, which would remove > > flink-kafka > > > > from > > > > > the main repository [3]. I think this is a fair concern, that we > did > > > not > > > > > consider when designing the playground (also the repo split was not > > > > > proposed yet). > > > > > > > > > > If we don't add the playground job to the examples, we need to put > it > > > > > somewhere else. The obvious choice would be the flink-playgrounds > [4] > > > > > repository, which was intended for the docker-compose configuration > > > > files. > > > > > However, we would not be able to include it in the > > Docker-hub-official > > > > > Flink image any more and would need to maintain a custom Docker > > image, > > > > what > > > > > we tried to avoid. The custom image would of course be based on the > > > > > Docker-hub-official Flink image. > > > > > > > > > > There are different approaches for this: > > > > > > > > > > 1) Building one (or more) official ASF images > > > > > There is an official Apache Docker Hub user [5] and a bunch of > > projects > > > > > publish Docker images via this user. 
Apache Infra seems to support > an > > > > > process that automatically builds and publishes Docker images when > a > > > > > release tag is added to a repository. This feature needs to be > > > enabled. I > > > > > haven't found detailed documentation on this but there is a bunch > of > > > > INFRA > > > > > Jira tickets that discuss this mechanism. > > > > > This approach would mean that we need a formal Apache release for > > > > > flink-playgrounds (similar to flink-shaded). The obvious benefits > are > > > > that > > > > > these images would be ASF-official Docker images. In case we can > > > publish > > > > > more than one image per repo, we could also publish images for > other > > > > > playgrounds (like the SQL playground, which could be based on the > SQL > > > > > training that I built [6] which uses an image that is published > under > > > my > > > > > user [7]). > > > > > > > > > > 2) Rely on an external image > > > > > This image could be build by somebody in the community (like me). > > > Problem > > > > > is of course, that the image is not an official image and we would > > rely > > > > on > > > > > a volunteer to build the images. > > > > > OTOH, the overhead would be pretty small. No need to roll run full > > > > > releases, integration with Infra's build process, etc. > > > > > > > > > > IMO, the first approach is clearly the better choice but also > needs a > > > > bunch > > > > > of things to be put into place. > > > > > > > > > > What do others think? > > > > > Does somebody have another idea? 
> > > > > > > > > > Cheers, > > > > > Fabian > > > > > > > > > > [0] > > > > > > > > > > > > > > > https://ci.apache.org/projects/flink/flink-docs-master/getting-started/docker-playgrounds/flink_cluster_playground.html > > > > > [1] > > > > > > > > > > > > > > > https://ci.apache.org/projects/flink/flink-docs-master/getting-started/docker-playgrounds/flink_cluster_playground.html#anatomy-of-this-playground > > > > > [2] https://hub.docker.com/_/flink > > > > > [3] > > > > > > > > > > > > > > > https://lists.apache.org/thread.html/eb841f610ef2c191b8d00b6c07b2eab513da2e4eb2d7da5c5e6846f4@%3Cdev.flink.apache.org%3E > > > > > [4] https://github.com/apache/flink-playgrounds > > > > > [5] https://hub.docker.com/u/apache > > > > > [6] https://github.com/ververica/sql-training/ > > > > > [7] > https://hub.docker.com/r/fhueske/flink-sql-client-training-1.7.2 > > > > > > > > > > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Planned Absences: 10.08.2019 - 31.08.2019, 05.09. - 06.09.2019 -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
[ANNOUNCE] Weekly Community Update 2019/32
management in Apache Flink by *Yun Tang*. [15] [12] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Hequn-becomes-a-Flink-committer-tp31378.html [13] https://www.meetup.com/seattle-flink/events/263782233 [14] https://www.eventbrite.com/e/apache-pulsar-meetup-beijing-tickets-67849484635 [15] https://www.meetup.com/acm-sf/events/263768407/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Planned Absences: 10.08.2019 - 31.08.2019, 05.09. - 06.09.2019 -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
[ANNOUNCE] Weekly Community Update 2019/33-36
e.1008284.n3.nabble.com/ANNOUNCE-Andrey-Zagrebin-becomes-a-Flink-committer-tp31735p31931.html [31] https://europe-2019.flink-forward.org/training-program [32] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/262680261/ [33] https://www.meetup.com/Apache-Flink-London-Meetup/events/264123672/ Cheers, Konstantin -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: Checkpoint metrics.
t;>> lock", which can be long if the source is blocked in trying to emit > data > >>> (again, backpressure). > >>> > >>> As part of FLIP-27 we will eliminate the checkpoint lock (have a > mailbox > >>> instead) which should lead to faster lock acquisition. > >>> > >>> The "unaligned checkpoints" discussion is looking at ways to make > >>> checkpoints much less susceptible to back pressure. > >>> > >>> > >>> > https://lists.apache.org/thread.html/fd5b6cceb4bffb635e26e7ec0787a8db454ddd64aadb40a0d08a90a8@%3Cdev.flink.apache.org%3E > >>> > >>> Hope that helps understanding what is going on. > >>> > >>> Best, > >>> Stephan > >>> > >>> > >>> On Thu, Sep 12, 2019 at 1:25 AM Seth Wiesman > >>> wrote: > >>> > >>> > Great timing, I just debugged this on Monday. E2e time is checkpoint > >>> > coordinator to checkpoint coordinator, so it includes RPC to the > >>> source and > >>> > RPC from the operator back for the JM. > >>> > > >>> > Seth > >>> > > >>> > > On Sep 11, 2019, at 6:17 PM, Jamie Grier > >>> > wrote: > >>> > > > >>> > > Hey all, > >>> > > > >>> > > I need to make sense of this behavior. Any help would be > >>> appreciated. > >>> > > > >>> > > Here’s an example of a set of Flink checkpoint metrics I don’t > >>> > understand. This is the first operator in a job and as you can see > the > >>> > end-to-end time for the checkpoint is long, but it’s not explained by > >>> > either sync, async, or alignment times. I’m not sure what to make of > >>> > this. It makes me think I don’t understand the meaning of the > metrics > >>> > themselves. In my interpretation the end-to-end time should always > be, > >>> > roughly, the sum of the other components — certainly in the case of a > >>> > source task such as this. > >>> > > > >>> > > Any thoughts or clarifications anyone can provide on this? We have > >>> many > >>> > jobs with slow checkpoints that suffer from this sort of thing with > >>> metrics > >>> > that look similar. 
> >>> > > > >>> > > -Jamie > >>> > > > >>> > > >>> > >> > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
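Stephan's and Seth's explanation in the thread above can be reduced to a small arithmetic sketch (the numbers are invented): since end-to-end time is measured coordinator-to-coordinator, the portion not covered by the sync, async, and alignment metrics is largely RPC plus the time the checkpoint trigger waited behind in-flight (backpressured) data, which is exactly the part Jamie could not account for.

```python
def unexplained_time(e2e_ms, sync_ms, async_ms, alignment_ms):
    """End-to-end duration minus the per-task component durations.
    The remainder is roughly RPC overhead plus the time the checkpoint
    spent waiting to even start on the task (e.g. behind backpressure)."""
    return e2e_ms - (sync_ms + async_ms + alignment_ms)

# A source task under heavy backpressure: tiny component times, huge e2e time.
gap = unexplained_time(e2e_ms=90_000, sync_ms=50, async_ms=1_200, alignment_ms=0)
print(gap)  # 88750: time spent before/around the snapshot, not in it
```

So for a backpressured source, the sum of the three component metrics being far below the end-to-end time is expected, not a measurement error.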
[ANNOUNCE] Weekly Community Update 2019/37
Dear community, happy to share this week's community update with the release of Flink 1.8.2, more work in the area of dynamic resource management, three proposals in the SQL space and a bit more. Flink Development == * [releases] *Flink 1.8.2* has been released. [1] * [resource management] Xintong has started a discussion on FLIP-56 "*Dynamic Slot Allocation*". With the upcoming FLIP-53 [2], different Tasks can have different resource requirements, leading to different resource requirements for the TaskSlots these Tasks are deployed into. On the other hand, TaskSlots are currently statically configured during TaskManager creation. FLIP-56 proposes to start TaskManagers without any TaskSlots initially and to dynamically create/release TaskSlots based on the resource requirements of the Tasks to be deployed. [3] * [connectors] Stephan has started a discussion to drop the Kafka connectors for *Kafka 0.9* and *0.10*. If you are relying on these connectors, it's a good idea to join the discussion. [5] * [connectors] The discussion on contributing a *Pulsar connector to Flink* seems to be converging towards adding the connector soon based on the existing source interface, but clearly documenting (perhaps marking it as experimental) that in the long term only a version based on the new source interface will be supported. [4] * [sql] As a spin-off of the discussion on reworking the function catalog, Bowen has started a discussion to support loading *external built-in functions via modules*. [6] * [sql] Dawid proposes to rework the support for *temporary objects* (tables/views, functions) in the Table API. [7] * [sql] Jark started a discussion on FLIP-66 to add support for *time attributes in Flink's SQL DDL*. This means you will be able to specify event time and processing time columns (including the watermarking strategy) via the SQL DDL. 
[8]

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-8-2-released-tp33050.html
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
[3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
[4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tp32538p33055.html
[5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-tp32954.html
[6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-modular-built-in-functions-tp32918.html
[7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-64-Support-for-Temporary-Objects-in-Table-module-tp32684.html
[8] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-66-Support-time-attribute-in-SQL-DDL-tp32766.html

Notable Bugs
==

Overall, only 20 bug tickets (excluding test instabilities) were updated this week, none of which seem to be particularly relevant for a wider audience ;)

Events, Blog Posts, Misc
===

* *Zili Chen* is now an Apache Flink committer. Congrats! [9]
* *Fabian* and *Seth* have published a blog post on the newly added *State Processor API* on the Flink blog. [10]
* *Marta* has published a Flink Community Update on the Flink blog focusing on stats, events and ongoing initiatives in the Flink community. [11]
* *Google* has open-sourced a *Kubernetes Operator for Flink* [12], which automates cluster creation and job submission on Kubernetes.
[9] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Zili-Chen-becomes-a-Flink-committer-tp32961.html
[10] https://flink.apache.org/feature/2019/09/13/state-processor-api.html
[11] https://flink.apache.org/news/2019/09/10/community-update.html
[12] https://github.com/GoogleCloudPlatform/flink-on-k8s-operator

Cheers,

Konstantin

-- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
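[Editorial sketch] To make the FLIP-66 item above more concrete, a DDL with an event-time attribute might look roughly like the following. This is only a sketch: the exact syntax was still under discussion at the time, and the table name, columns, and connector options are invented for illustration.

```sql
-- Hypothetical FLIP-66-style DDL: 'ts' is declared as an event-time
-- attribute with a 5-second bounded-out-of-orderness watermark strategy.
CREATE TABLE user_actions (
  user_id BIGINT,
  action  STRING,
  ts      TIMESTAMP(3),
  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic'     = 'user_actions'
);
```

A processing-time attribute would analogously be expressed as a computed column, e.g. `proc AS PROCTIME()`.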
Re: [VOTE][2.0] FLIP-336: Remove "now" timestamp field from REST responses
+1 (binding) Am Mo., 24. Juli 2023 um 14:15 Uhr schrieb Martijn Visser < martijnvis...@apache.org>: > +1 (binding) > > On Mon, Jul 24, 2023 at 1:08 PM Chesnay Schepler > wrote: > > > Hello, > > > > I'd like to start a vote on FLIP-336. > > > > Discussion thread: > > https://lists.apache.org/thread/ms3sk0p21n7q2oq0fjtq43koqj2pmwv4 > > FLIP: > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424789 > > > > Regards, > > Chesnay > > > -- https://twitter.com/snntrable https://github.com/knaufk
Re: [VOTE][2.0] FLIP-340: Remove rescale REST endpoint
+1 (binding) Am Mo., 24. Juli 2023 um 14:15 Uhr schrieb Martijn Visser < martijnvis...@apache.org>: > +1 (binding) > > On Mon, Jul 24, 2023 at 1:10 PM Chesnay Schepler > wrote: > > > Hello, > > > > I'd like to start a vote on FLIP-340. > > > > Discussion thread: > > https://lists.apache.org/thread/zkslk0qzttwgs8j3s951rht3v1tsyqqk > > FLIP: > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-340%3A+Remove+rescale+REST+endpoint > > > > Regards, > > Chesnay > > > -- https://twitter.com/snntrable https://github.com/knaufk
Re: [DISCUSS] Proposing an LTS Release for the 1.x Line
Hi Alex, yes, I think, it makes sense to support the last 1.x release longer than usual. This should be limited to bugfixes in my opinion. Best, Konstantin Am Di., 25. Juli 2023 um 07:07 Uhr schrieb Xintong Song < tonysong...@gmail.com>: > Hi Alex, > > Providing a longer supporting period for the last 1.x minor release makes > sense to me. > > I think we need to be more specific about what LTS means here. > >- IIUC, that means for the last 1.x minor release, we will keep >providing 1.x.y / 1.x.z bugfix release. This is a stronger support > compared >to regular minor releases which by default are only supported for 2 > minor >release cycles. >- Do we only provide bug fixes for the LTS release, or do we also allow >backporting features to that release? >- How long exactly shall we support the LTS release? > > And maybe we can make this a general convention for last minor releases for > all major releases, rather than only discuss it for the 2.0 version bump. > > @Leonard, > > I'd like to clarify that there are no community decisions yet on release > 2.0 after 1.19. It is possible to have 1.20 before 2.0. > > Best, > > Xintong > > > > On Tue, Jul 25, 2023 at 11:54 AM Leonard Xu wrote: > > > +1, it’s pretty necessary especially we deprecated so many APIs in 1.18 > > and plan to remove in 2.0. > > > > The 1.19 should be a proper version for LTS Release. > > > > Best, > > Leonard > > > > > > > On Jul 25, 2023, at 3:30 AM, Alexander Fedulov < > > alexander.fedu...@gmail.com> wrote: > > > > > > Hello everyone, > > > > > > Recently, there were a lot of discussions about the deprecation of > > various > > > APIs for the upcoming 2.0 release. It appears there are two main > > motivations > > > with opposing directions, causing these discussions to remain > unsettled. > > On > > > one hand, there's a desire to finally trim a wide range of legacy APIs, > > some > > > lingering around since the beginning of the 1.x release line (as far > > back as > > > 2016). 
On the other hand, there is a commitment to uphold our > guarantees > > to > > > the users, ensuring a smooth transition. > > > > > > I believe we could reconcile these two motivations. My proposition is > to > > > designate the final release of the 1.x timeline as a Long-Term Support > > (LTS) > > > release. By doing so, we would: > > > > > > 1. Enable more efficient cleanup and be liberated to introduce more > > breaking > > > changes, paving the way for greater innovation in the 2.0 release. > > > 2. Sustain a positive user experience by granting enough time for the > > > changes > > > introduced in 2.0 to stabilize, allowing users to confidently > > transition > > > their production code to the new release. > > > > > > I look forward to hearing your thoughts on this proposal. > > > > > > Best Regards, > > > Alex > > > > > -- https://twitter.com/snntrable https://github.com/knaufk
Re: [DISCUSS] Proposing an LTS Release for the 1.x Line
Hi Jing, let's not overindex on the Source-/SinkFunction discussion in this thread. We will generally drop/break a lot of APIs in Flink 2.0. So, naturally users will need to make more changes to their code in order to migrate from 1.x to Flink 2.0. In order to give them more time to do this, we support the last Flink 1.x release for a longer time with bug fix releases. Of course, we still encourage users to migrate to Flink 2.0, because at some point, we will stop supporting Flink 1.x. For example, if we followed Marton's proposal we would support Flink 1.x LTS for about 2 years (roughly 4 minor release cycles) instead of about 1 year (2 minor release cycles) for regular minor releases. This seems like a reasonable timeframe to me. It also gives us more time to discover and address blockers in migrating to Flink 2.x that we are not aware of right now. Best, Konstantin Am Di., 25. Juli 2023 um 12:48 Uhr schrieb Jing Ge : > Hi all, > > Overall, it is a good idea to provide the LTS release, but I'd like to > reference a concrete case as an example to understand what restrictions the > LTS should have. > > Hypothetically, Source-/Sink- Function have been deprecated in 1.x LTS and > removed in 2.0 and the issues[1] are not solved in 2.0. This is a typical > scenario that the old APIs are widely used in 1.x LTS and the new APIs in > 2.0 are not ready yet to take over all users. We will have the following > questions: > > 1. Is this scenario allowed at all? Do we all agree that there could be > some features/functionalities that only work in 1.x LTS after 2.0 has been > released? > 2. How long are we going to support 1.x LTS? 1 year? 2 years? As long as > the issues that block users from migrating to 2.0 are not solved, we can't > stop the LTS support, even if the predefined support time expires. > 3. What is the intention to release a new version with (or without) LTS? Do > we still want to engage users to migrate to the new release asap?
If the > old APIs 1.x LTS offer more than the new APIs in 2.0 or it is almost > impossible to migrate, double effort will be required to maintain those > major releases for a very long time. We will be facing many cohorts. > > IMHO, we should be clear with those questions before we start talking about > LTS. WDYT? > > Best regards, > Jing > > > [1] https://lists.apache.org/thread/734zhkvs59w2o4d1rsnozr1bfqlr6rgm > > On Tue, Jul 25, 2023 at 6:08 PM Márton Balassi > wrote: > > > Hi team, > > > > +1 for supporting the last 1.x for a longer than usual period of time and > > limiting it to bugfixes. I would suggest supporting it for double the > usual > > amount of time (4 minor releases). > > > > On Tue, Jul 25, 2023 at 9:25 AM Konstantin Knauf > > wrote: > > > > > Hi Alex, > > > > > > yes, I think, it makes sense to support the last 1.x release longer > than > > > usual. This should be limited to bugfixes in my opinion. > > > > > > Best, > > > > > > Konstantin > > > > > > Am Di., 25. Juli 2023 um 07:07 Uhr schrieb Xintong Song < > > > tonysong...@gmail.com>: > > > > > > > Hi Alex, > > > > > > > > Providing a longer supporting period for the last 1.x minor release > > makes > > > > sense to me. > > > > > > > > I think we need to be more specific about what LTS means here. > > > > > > > >- IIUC, that means for the last 1.x minor release, we will keep > > > >providing 1.x.y / 1.x.z bugfix release. This is a stronger support > > > > compared > > > >to regular minor releases which by default are only supported for > 2 > > > > minor > > > >release cycles. > > > >- Do we only provide bug fixes for the LTS release, or do we also > > > allow > > > >backporting features to that release? > > > >- How long exactly shall we support the LTS release? > > > > > > > > And maybe we can make this a general convention for last minor > releases > > > for > > > > all major releases, rather than only discuss it for the 2.0 version > > bump. 
> > > > > > > > @Leonard, > > > > > > > > I'd like to clarify that there are no community decisions yet on > > release > > > > 2.0 after 1.19. It is possible to have 1.20 before 2.0. > > > > > > > > Best, > > > > > > > > Xintong > > > > > > > > > > > > > > > > On Tue, Jul 25, 2023 at 11:54 AM Leonard Xu > w
Re: [DISCUSS] FLIP-348: Support System Columns in SQL and Table API
Hi Timo, this makes sense to me. Option 3 seems reasonable, too. Cheers, Konstantin Am Di., 25. Juli 2023 um 12:53 Uhr schrieb Timo Walther : > Hi everyone, > > I would like to start a discussion about introducing the concept of > "System Columns" in SQL and Table API. > > The subject sounds bigger than it actually is. Luckily, Flink SQL > already exposes the concept of metadata columns. And this proposal is > just a slight adjustment for how metadata columns can be used as system > columns. > > The biggest problem of metadata columns currently is that a catalog > implementation can't provide them by default because they would affect > `SELECT *` when adding another one. > > Looking forward to your feedback on FLIP-348: > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-348%3A+Support+System+Columns+in+SQL+and+Table+API > > Thanks, > Timo > -- https://twitter.com/snntrable https://github.com/knaufk
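[Editorial sketch] For readers who have not used metadata columns: the following sketch (connector, table, and column names are invented for illustration) shows the `SELECT *` problem Timo describes, i.e. why a catalog cannot add such columns by default today.

```sql
-- Metadata column as supported in Flink SQL today: 'kafka_offset' is
-- read from the Kafka record's metadata, not from the payload.
CREATE TABLE orders (
  order_id     BIGINT,
  amount       DECIMAL(10, 2),
  kafka_offset BIGINT METADATA FROM 'offset' VIRTUAL
) WITH (
  'connector' = 'kafka',
  'topic'     = 'orders'
);

-- The metadata column participates in the wildcard expansion, so a
-- catalog adding one by default would silently change existing queries:
SELECT * FROM orders;   -- order_id, amount, kafka_offset

-- The system-column idea is that such columns would only surface when
-- referenced explicitly, leaving SELECT * stable:
SELECT order_id, kafka_offset FROM orders;
```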
Re: [VOTE] Release 2.0 must-have work items - Round 2
I assume this vote includes a decision not to remove SourceFunction/SinkFunction in Flink 2.0 (as the item has been removed from the table). If this is the case, I don't think this discussion has concluded. There are multiple contributors like myself, Martijn, Alex Fedulov and Maximilian Michels, who have indicated they would be in favor of deprecating/dropping them. This Source/Sink Function discussion seems to go in circles in general. I am wondering if it makes sense to have a call about this instead of repeating mailing list discussions. Am Di., 25. Juli 2023 um 13:38 Uhr schrieb Yu Li : > +1 (binding) > > Thanks for driving this, Xintong! > > Best Regards, > Yu > > > On Sun, 23 Jul 2023 at 18:28, Yuan Mei wrote: > > > +1 (binding) > > > > Thanks for driving the discussion through and for all the efforts in > > resolving the complexities :-) > > > > Best > > Yuan > > > > On Thu, Jul 20, 2023 at 5:23 PM Xintong Song > > wrote: > > > > > Hi all, > > > > > > I'd like to start another round of VOTE for the must-have work items > for > > > release 2.0 [1]. The corresponding discussion thread is [2], and the > > > previous voting thread is [3]. All comments from the previous voting > > thread > > > have been addressed. > > > > > > Please note that once the vote is approved, any changes to the > must-have > > > items (adding / removing must-have items, changing the priority) > requires > > > another vote. Assigning contributors / reviewers, updating > descriptions / > > > progress, changes to nice-to-have items do not require another vote. > > > > > > The vote will be open until at least July 25, following the consensus > > > voting process. Votes of PMC members are binding.
> > > > > > Best, > > > > > > Xintong > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release > > > > > > [2] https://lists.apache.org/thread/l3dkdypyrovd3txzodn07lgdwtwvhgk4 > > > > > > [3] https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m > > > > > > -- https://twitter.com/snntrable https://github.com/knaufk
Re: [VOTE] Release 2.0 must-have work items - Round 2
Hi Xintong, yes, I am fine with the conclusion for SourceFunction. I chatted with Leonard a bit last night. Let's continue this vote. Thanks for the clarification, Konstantin Am Mi., 26. Juli 2023 um 04:03 Uhr schrieb Xintong Song < tonysong...@gmail.com>: > Hi Konstantin, > > It seems the offline discussion has already taken place [1], and part of > the outcome is that removal of SourceFunction would be a *nice-to-have* > item for release 2.0 which may not block this *must-have* vote. Do you have > different opinions about the conclusions in [1]? > > If there are still concerns, and the discussion around this topic needs to > be continued, then I'd suggest (as I mentioned in [2]) not to further block > this vote (i.e. the decision on other must-have items). Release 2.0 still > has a long way to go, and it is likely we need to review and update the > list every once in a while. We can update the list with another vote if > later we decide to add the removal of SourceFunction to the must-have list. > > WDYT? > > Best, > > Xintong > > > [1] https://lists.apache.org/thread/yyw52k45x2sp1jszldtdx7hc98n72w7k > [2] https://lists.apache.org/thread/j5d5022ky8k5t088ffm03727o5g9x9jr > > On Tue, Jul 25, 2023 at 8:49 PM Konstantin Knauf > wrote: > > > I assume this vote includes a decision to not removing > > SourceFunction/SinkFunction in Flink 2.0 (as it has been removed from the > > table). If this is the case, I don't think, this discussion has > concluded. > > There are multiple contributors like myself, Martijn, Alex Fedulov and > > Maximilian Michels, who have indicated they would be in favor of > > deprecating/dropping them. This Source/Sink Function discussion seems to > go > > in circles in general. I am wondering if it makes sense to have a call > > about this instead of repeating mailing list discussions. > > > > Am Di., 25. Juli 2023 um 13:38 Uhr schrieb Yu Li : > > > > > +1 (binding) > > > > > > Thanks for driving this, Xintong!
> > > > > > Best Regards, > > > Yu > > > > > > > > > On Sun, 23 Jul 2023 at 18:28, Yuan Mei wrote: > > > > > > > +1 (binding) > > > > > > > > Thanks for driving the discussion through and for all the efforts in > > > > resolving the complexities :-) > > > > > > > > Best > > > > Yuan > > > > > > > > On Thu, Jul 20, 2023 at 5:23 PM Xintong Song > > > > wrote: > > > > > > > > > Hi all, > > > > > > > > > > I'd like to start another round of VOTE for the must-have work > items > > > for > > > > > release 2.0 [1]. The corresponding discussion thread is [2], and > the > > > > > previous voting thread is [3]. All comments from the previous > voting > > > > thread > > > > > have been addressed. > > > > > > > > > > Please note that once the vote is approved, any changes to the > > > must-have > > > > > items (adding / removing must-have items, changing the priority) > > > requires > > > > > another vote. Assigning contributors / reviewers, updating > > > descriptions / > > > > > progress, changes to nice-to-have items do not require another > vote. > > > > > > > > > > The vote will be open until at least July 25, following the > consensus > > > > > voting process. Votes of PMC members are binding. > > > > > > > > > > Best, > > > > > > > > > > Xintong > > > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release > > > > > > > > > > [2] > https://lists.apache.org/thread/l3dkdypyrovd3txzodn07lgdwtwvhgk4 > > > > > > > > > > [3] > https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m > > > > > > > > > > > > > > > > > > -- > > https://twitter.com/snntrable > > https://github.com/knaufk > > > -- *Konstantin Knauf* knauf.konstan...@gmail.com
Re: [VOTE] Release 2.0 must-have work items
Hi everyone, I'd just like to add that we also said that we would continue the discussion to come up with and agree on a list of concrete blockers for the removal of SourceFunction, so that we don't need to have the same discussion again in half a year. And while we are at it, we should do the same thing for SinkFunction. Best, Konstantin Am Mi., 26. Juli 2023 um 03:35 Uhr schrieb Xintong Song < tonysong...@gmail.com>: > Thanks Leonard for driving this, and thanks everyone for the discussion. > The back-and-forth reflects the importance and complexity around this > topic. Glad to see we finally reached consensus. > > Best, > > Xintong > > > > On Wed, Jul 26, 2023 at 12:42 AM Jing Ge wrote: > > > Thanks Leonard for driving it. We are now on the same page. > > > > Best regards, > > Jing > > > > On Tue, Jul 25, 2023 at 9:19 PM Leonard Xu wrote: > > > >> We’ve detailed offline discussions with @Alexander and @Jingsong, about > >> “Remove SourceFunction” item, we’ve reached a consensus as following: > >> > >> 1. Deprecate SourceFunction in 1.18 and implement following improvement > >> subtasks of FLINK-28045[1] later is reasonable for all of us. > >> > >> 2. Deleting SourceFunction API depends on future’s work progress, thus > >> “Remove SourceFunction APIs” should be a nice to have item. Alexander > has > >> volunteered to take these subtasks and would try to finish them next, > >> thanks again. > >> > >> 3. As a nice to have item, and its READY status depends on future’s > work > >> progress, this won't block release 2.0 must-have item vote. > >> > >> Thanks again @Alexander, @Jingsong and @Xintong for driving these > things > >> forward. > >> > >> Also CC RMs for 1.18 @QingSheng @Jing @Martijn @Konstantin, I’ve > >> communicated with Alexander and would like to help review the > deprecation > >> PR again.
> >> > >> Best, > >> Leonard > >> > >> [1] https://issues.apache.org/jira/browse/FLINK-28045 > >> > >> > >> On Jul 21, 2023, at 6:09 PM, Chesnay Schepler > wrote: > >> > >> On 21/07/2023 11:45, Leonard Xu wrote: > >> > >> In this way, the user will see the deprecated API firstly but they can > >> not find a candidate if we can not finish all tasks in one minor > version . > >> > >> > >> i'm not convinced that this matters. There will be a whole bunch of APIs > >> deprecated in 1.18 (that will remain in 1.x!) without a replacement so > we > >> can remove them in 2.0. > >> We already accepted this scenario. > >> > >> > >> > -- https://twitter.com/snntrable https://github.com/knaufk
Re: [DISCUSS] Proposing an LTS Release for the 1.x Line
Hi Jing, > How could we help users and avoid this happening? I don't think we will be able to avoid this in all cases. And I think that's ok. It's always a trade-off between supporting new use cases and moving the project forward, and backwards compatibility (in a broad sense). For example, we dropped Mesos support in a minor release in the past. If your only option for running Flink was Mesos, you were stuck on Flink 1.13 or so. So, I think, it is in the end a case-by-case decision. How big is the cost of continuing to support a "legacy feature/system", and how many users are affected, and to which degree, by dropping it? Best, Konstantin Am Di., 25. Juli 2023 um 18:34 Uhr schrieb Jing Ge : > Hi Konstantin, > > I might have not made myself clear enough, apologies. The > source-/sink-function was used as a concrete example to discuss the pattern > before we decided to offer LTS. The intention was not to hijack this thread > to discuss how to deprecate them. > > We all wish that the only thing users need to migrate from Flink 1.x to 2.0 > is some code changes in their repos and we all wish users will migrate, if > LTS has long enough support time. But the question I tried to discuss is > not the wish but the "How?". We might be able to toss the high migration > effort aside (we shouldn't), since it is theoretically still doable if users > have long enough time, even if the effort is extremely high. Another > concern is that if "function regressions" is allowed in 2.0, i.e. if 2.0 > has a lack of functionalities or bugs compared to 1.x, there will be no way > for users to do the migration regardless of whether we encourage them to > migrate or they have been given enough time (how long is enough?) because > LTS has been offered. How could we help users and avoid this happening? > > Best regards, > Jing > > On Tue, Jul 25, 2023 at 6:57 PM Konstantin Knauf > wrote: > > > Hi Jing, > > > > let's not overindex on the Source-/SinkFunction discussion in this > thread.
> > > > We will generally drop/break a lot of APIs in Flink 2.0. So, naturally > > users will need to make more changes to their code in order to migrate > from > > 1.x to Flink 2.0. In order to give them more time to do this, we support > > the last Flink 1.x release for a longer time with bug fix releases. > > > > Of course, we still encourage users to migrate to Flink 2.0, because at > > some point, we will stop support Flink 1.x. For example, if we followed > > Marton's proposal we would support Flink 1.x LTS for about 2 years > (roughly > > 4 minor release cycles) instead of about 1 year (2 minor release cycles) > > for regular minor releases. This seems like a reasonable timeframe to me. > > It also gives us more time to discover and address blockers in migrating > to > > Flink 2.x that we are not aware of right now. > > > > Best, > > > > Konstantin > > > > Am Di., 25. Juli 2023 um 12:48 Uhr schrieb Jing Ge > > : > > > > > Hi all, > > > > > > Overall, it is a good idea to provide the LTS release, but I'd like to > > > reference a concrete case as an example to understand what restrictions > > the > > > LTS should have. > > > > > > Hypothetically, Source-/Sink- Function have been deprecated in 1.x LTS > > and > > > removed in 2.0 and the issues[1] are not solved in 2.0. This is a > typical > > > scenario that the old APIs are widely used in 1.x LTS and the new APIs > in > > > 2.0 are not ready yet to take over all users. We will have the > following > > > questions: > > > > > > 1. Is this scenario allowed at all? Do we all agree that there could be > > > some features/functionalities that only work in 1.x LTS after 2.0 has > > been > > > released? > > > 2. How long are we going to support 1.x LTS? 1 year? 2 years? As long > as > > > the issues that block users from migrating to 2.0 are not solved, we > > can't > > > stop the LTS support, even if the predefined support time expires. > > > 3. 
What is the intention to release a new version with (or without) > LTS? > > Do > > > we still want to engage users to migrate to the new release asap? If > the > > > old APIs 1.x LTS offer more than the new APIs in 2.0 or it is almost > > > impossible to migrate, double effort will be required to maintain those > > > major releases for a very long time. We will be facing many cohorts. > > > > > > IMHO, we should be clear with those questions before we start talking > >
Re: [ANNOUNCE] New Apache Flink PMC Member - Matthias Pohl
Congrats, Matthias! Am Fr., 4. Aug. 2023 um 09:15 Uhr schrieb Paul Lam : > Congratulation, Matthias! > > Best, > Paul Lam > > > 2023年8月4日 15:09,yuxia 写道: > > > > Congratulation, Matthias! > > > > Best regards, > > Yuxia > > > > - 原始邮件 - > > 发件人: "Yun Tang" > > 收件人: "dev" > > 发送时间: 星期五, 2023年 8 月 04日 下午 3:04:52 > > 主题: Re: [ANNOUNCE] New Apache Flink PMC Member - Matthias Pohl > > > > Congratulation, Matthias! > > > > > > Best > > Yun Tang > > > > From: Jark Wu > > Sent: Friday, August 4, 2023 15:00 > > To: dev@flink.apache.org > > Subject: Re: [ANNOUNCE] New Apache Flink PMC Member - Matthias Pohl > > > > Congratulations, Matthias! > > > > Best, > > Jark > > > > On Fri, 4 Aug 2023 at 14:59, Weihua Hu wrote: > > > >> Congratulations, Matthias! > >> > >> Best, > >> Weihua > >> > >> > >> On Fri, Aug 4, 2023 at 2:49 PM Yuxin Tan > wrote: > >> > >>> Congratulations, Matthias! > >>> > >>> Best, > >>> Yuxin > >>> > >>> > >>> Sergey Nuyanzin 于2023年8月4日周五 14:21写道: > >>> > Congratulations, Matthias! > Well deserved! > > On Fri, Aug 4, 2023 at 7:59 AM liu ron wrote: > > > Congrats, Matthias! > > > > Best, > > Ron > > > > Shammon FY 于2023年8月4日周五 13:24写道: > > > >> Congratulations, Matthias! > >> > >> On Fri, Aug 4, 2023 at 1:13 PM Samrat Deb > wrote: > >> > >>> Congrats, Matthias! > >>> > >>> > >>> On Fri, 4 Aug 2023 at 10:13 AM, Benchao Li >>> > > wrote: > >>> > Congratulations, Matthias! > > Jing Ge 于2023年8月4日周五 12:35写道: > > > Congrats! Matthias! > > > > Best regards, > > Jing > > > > On Fri, Aug 4, 2023 at 12:09 PM Yangze Guo < > >> karma...@gmail.com > > >> wrote: > > > >> Congrats, Matthias! > >> > >> Best, > >> Yangze Guo > >> > >> On Fri, Aug 4, 2023 at 11:44 AM Qingsheng Ren < > re...@apache.org> > wrote: > >>> > >>> Congratulations, Matthias! This is absolutely well > >>> deserved. > >>> > >>> Best, > >>> Qingsheng > >>> > >>> On Fri, Aug 4, 2023 at 11:31 AM Rui Fan < > 1996fan...@gmail.com> > wrote: > >>> > Congratulations Matthias, well deserved! 
> > Best, > Rui Fan > > On Fri, Aug 4, 2023 at 11:30 AM Leonard Xu < > > xbjt...@gmail.com> > > wrote: > > > Congratulations, Matthias. > > > > Well deserved ^_^ > > > > Best, > > Leonard > > > > > >> On Aug 4, 2023, at 11:18 AM, Xintong Song < > tonysong...@gmail.com > >> > wrote: > >> > >> Hi everyone, > >> > >> On behalf of the PMC, I'm very happy to announce > >> that > >>> Matthias > >> Pohl has > >> joined the Flink PMC! > >> > >> Matthias has been consistently contributing to the > > project > since > >> Sep > > 2020, > >> and became a committer in Dec 2021. He mainly works > >>> in > >>> Flink's > > distributed > >> coordination and high availability areas. He has > >>> worked > > on > >>> many > >> FLIPs > >> including FLIP195/270/285. He helped a lot with the > > release > >> management, > >> being one of the Flink 1.17 release managers and > >> also > > very > active > >> in > > Flink > >> 1.18 / 2.0 efforts. He also contributed a lot to > > improving > >>> the > >> build > >> stability. > >> > >> Please join me in congratulating Matthias! > >> > >> Best, > >> > >> Xintong (on behalf of the Apache Flink PMC) > > > > > > >> > > > > > -- > > Best, > Benchao Li > > >>> > >> > > > > > -- > Best regards, > Sergey > > >>> > >> > > -- https://twitter.com/snntrable https://github.com/knaufk
Re: [DISCUSS] [FLINK-32873] Add a config to allow disabling Query hints
Hi Bonnie, this makes sense to me, in particular, given that we already have this toggle for a different type of hints. Best, Konstantin Am Mi., 16. Aug. 2023 um 19:38 Uhr schrieb Bonnie Arogyam Varghese : > Hi Liu, > Options hints could be a security concern since users can override > settings. However, query hints specifically could affect performance. > Since we have a config to disable Options hint, I'm suggesting we also have > a config to disable Query hints. > > On Wed, Aug 16, 2023 at 9:41 AM liu ron wrote: > > > Hi, > > > > Thanks for driving this proposal. > > > > Can you explain why you would need to disable query hints because of > > security issues? I don't really understand why query hints affects > > security. > > > > Best, > > Ron > > > > Bonnie Arogyam Varghese 于2023年8月16日周三 > > 23:59写道: > > > > > Platform providers may want to disable hints completely for security > > > reasons. > > > > > > Currently, there is a configuration to disable OPTIONS hint - > > > > > > > > > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-dynamic-table-options-enabled > > > > > > However, there is no configuration available to disable QUERY hints - > > > > > > > > > https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/table/sql/queries/hints/#query-hints > > > > > > The proposal is to add a new configuration: > > > > > > Name: table.query-options.enabled > > > Description: Enable or disable the QUERY hint, if disabled, an > > > exception would be thrown if any QUERY hints are specified > > > Note: The default value will be set to true. > > > > > > -- https://twitter.com/snntrable https://github.com/knaufk
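[Editorial sketch] For context, the two hint flavors look like this in Flink SQL. The table and column names are made up; `table.dynamic-table-options.enabled` is the existing config key, while `table.query-options.enabled` is only the key proposed in this thread.

```sql
-- OPTIONS hint: overrides connector/table options per query. Already
-- gated by the existing 'table.dynamic-table-options.enabled' config.
SELECT id, name
FROM users /*+ OPTIONS('scan.startup.mode' = 'earliest-offset') */;

-- Query hint: steers the optimizer, e.g. forcing a broadcast join.
-- The proposal would gate these behind 'table.query-options.enabled',
-- throwing an exception if a query hint is specified while disabled.
SELECT /*+ BROADCAST(orders) */ o.order_id, u.name
FROM orders o JOIN users u ON o.user_id = u.id;
```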
Re: [ANNOUNCE] Release 1.18.0, release candidate #1
Hi everyone, I've just opened a PR for the release announcement [1] and I am looking forward to reviews and feedback. Cheers, Konstantin [1] https://github.com/apache/flink-web/pull/680 Am Fr., 6. Okt. 2023 um 11:03 Uhr schrieb Sergey Nuyanzin < snuyan...@gmail.com>: > sorry for not mentioning it in previous mail > > based on the reason above I'm > -1 (non-binding) > > also there is one more issue [1] > which blocks all the externalised connectors testing against the most > recent commits in > to corresponding branches > [1] https://issues.apache.org/jira/browse/FLINK-33175 > > > On Thu, Oct 5, 2023 at 11:19 PM Sergey Nuyanzin > wrote: > > > Thanks for creating RC1 > > > > * Downloaded artifacts > > * Built from sources > > * Verified checksums and gpg signatures > > * Verified versions in pom files > > * Checked NOTICE, LICENSE files > > > > The strange thing I faced is > > CheckpointAfterAllTasksFinishedITCase.testRestoreAfterSomeTasksFinished > > fails on AZP [1] > > > > which looks like it is related to [2], [3] fixed in 1.18.0 (not 100% > > sure). > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-33186 > > [2] https://issues.apache.org/jira/browse/FLINK-32996 > > [3] https://issues.apache.org/jira/browse/FLINK-32907 > > > > On Tue, Oct 3, 2023 at 2:53 PM Ferenc Csaky > > wrote: > > > >> Thanks everyone for the efforts! > >> > >> Checked the following: > >> > >> - Downloaded artifacts > >> - Built Flink from source > >> - Verified checksums/signatures > >> - Verified NOTICE, LICENSE files > >> - Deployed dummy SELECT job via SQL gateway on standalone cluster, > things > >> seemed fine according to the log files > >> > >> +1 (non-binding) > >> > >> Best, > >> Ferenc > >> > >> > >> --- Original Message --- > >> On Friday, September 29th, 2023 at 22:12, Gabor Somogyi < > >> gabor.g.somo...@gmail.com> wrote: > >> > >> > >> > > >> > > >> > Thanks for the efforts! 
> >> > > >> > +1 (non-binding) > >> > > >> > * Verified versions in the poms > >> > * Built from source > >> > * Verified checksums and signatures > >> > * Started basic workloads with kubernetes operator > >> > * Verified NOTICE and LICENSE files > >> > > >> > G > >> > > >> > On Fri, Sep 29, 2023, 18:16 Matthias Pohl > matthias.p...@aiven.io.invalid > >> > > >> > wrote: > >> > > >> > > Thanks for creating RC1. I did the following checks: > >> > > > >> > > * Downloaded artifacts > >> > > * Built Flink from sources > >> > > * Verified SHA512 checksums GPG signatures > >> > > * Compared checkout with provided sources > >> > > * Verified pom file versions > >> > > * Went over NOTICE file/pom files changes without finding anything > >> > > suspicious > >> > > * Deployed standalone session cluster and ran WordCount example in > >> batch > >> > > and streaming: Nothing suspicious in log files found > >> > > > >> > > +1 (binding) > >> > > > >> > > On Fri, Sep 29, 2023 at 10:34 AM Etienne Chauchot > >> echauc...@apache.org > >> > > wrote: > >> > > > >> > > > Hi all, > >> > > > > >> > > > Thanks to the team for this RC. > >> > > > > >> > > > I did a quick check of this RC against user pipelines (1) coded > with > >> > > > DataSet (even if deprecated and soon removed), DataStream and SQL > >> APIs > >> > > > > >> > > > based on the small scope of this test, LGTM > >> > > > > >> > > > +1 (non-binding) > >> > > > > >> > > > [1] https://github.com/echauchot/tpcds-benchmark-flink > >> > > > > >> > > > Best > >> > > > Etienne > >> > > > > >> > > > Le 28/09/2023 à 19:35, Jing Ge a écrit : > >> > > > > >> > > > > Hi everyone, > >> > > > > > >> > > > > The RC1 for Apache Flink 1.18.0 has been created. The related > >> voting > >> > > > > process will be triggered once the announcement is ready. 
The > RC1 > >> has > >> > > > > all > >> > > > > the artifacts that we would typically have for a release, except > >> for > >> > > > > the > >> > > > > release note and the website pull request for the release > >> announcement. > >> > > > > > >> > > > > The following contents are available for your review: > >> > > > > > >> > > > > - Confirmation of no benchmarks regression at the thread[1]. > >> > > > > - The preview source release and binary convenience releases > [2], > >> which > >> > > > > are signed with the key with fingerprint 96AE0E32CBE6E0753CE6 > [3]. > >> > > > > - all artifacts that would normally be deployed to the Maven > >> > > > > Central Repository [4]. > >> > > > > - source code tag "release-1.18.0-rc1" [5] > >> > > > > > >> > > > > Your help testing the release will be greatly appreciated! And > >> we'll > >> > > > > create the rc1 release and the voting thread as soon as all the > >> efforts > >> > > > > are > >> > > > > finished. > >> > > > > > >> > > > > [1] > >> https://lists.apache.org/thread/yxyphglwwvq57wcqlfrnk3qo9t3sr2ro > >> > > > > [2] > https://dist.apache.org/repos/dist/dev/flink/flink-1.18.0-rc1/ > >> > > > > [3]https://dist.apache.org/repos/dist/release/flink/KEYS > >> > > > > [4] > >> > > > > > >> https://reposi
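The verification steps that recur in the checklists above (downloading the artifacts, checking checksums and signatures) can be illustrated with a short sketch. This is only an illustration of the SHA-512 checksum step, assuming coreutils-style checksum files of the form `<hex digest>  <filename>` published next to each artifact; the actual release verification also covers GPG signatures, NOTICE/LICENSE files and pom versions.

```python
import hashlib
from pathlib import Path

def verify_sha512(artifact: Path, checksum_file: Path) -> bool:
    """Check an artifact against a coreutils-style '<digest>  <name>' checksum file."""
    expected = checksum_file.read_text().split()[0].lower()
    actual = hashlib.sha512(artifact.read_bytes()).hexdigest()
    return actual == expected
```

Signature checking would additionally require `gpg --verify` against the KEYS file referenced in the vote mail.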
Re: [ANNOUNCE] The Flink Speed Center and benchmark daily run are back online
Thanks a lot for working on this! On Thu, Oct 19, 2023 at 10:24 AM, Zakelly Lan < zakelly@gmail.com> wrote: > Hi everyone, > > Flink benchmarks [1] generate daily performance reports in the Apache > Flink slack channel (#flink-dev-benchmarks) to detect performance > regressions [2]. Those benchmarks previously ran on several > machines donated and maintained by Ververica. Unfortunately, those > machines were lost due to account issues [3], and the benchmark daily > runs have been stopped since August 24th, delaying the release of Flink 1.18 a > bit [4]. > > Ververica donated several new machines! After several weeks of work, I > have successfully re-established the codespeed panel and benchmark > daily run pipelines on them. At this time, we are pleased to announce > that the Flink Speed Center and benchmark pipelines are back online. > These new machines are under more formal management to ensure that > such accidents will not occur in the future. > > What's more, I successfully recovered historical data backed up by > Yanfei Lei [5]. So with the old domain [6] redirected to the new > machines, the old links that existed in previous records will still be > valid. Besides the benchmarks with Java8 and Java11, I also added a > pipeline for Java17 running daily. > > How to use it: > We also registered a new domain name 'flink-speed.xyz' for the Flink > Speed Center [7]. It is recommended to use the new domain in the > future. Currently, the self-service method of triggering benchmarks is > unavailable considering the lack of resources and potential > vulnerabilities of Jenkins. Please contact one of the Apache Flink PMC members to > submit a benchmark. More info is updated on the wiki [8]. > > Daily Monitoring: > The performance daily monitoring on the Apache Flink slack channel [2] > is still unavailable as the benchmark results need more time to > stabilize in the new environment.
Once the baseline results become > available for regression detection, I will enable the daily > monitoring. > > Please feel free to reach out to me if you have any suggestions or > questions. Thanks Ververica again for donating machines! > > > Best, > Zakelly > > [1] https://github.com/apache/flink-benchmarks > [2] https://lists.apache.org/thread/zok62sx4m50c79htfp18ymq5vmtgbgxj > [3] https://issues.apache.org/jira/browse/FLINK-33052 > [4] https://lists.apache.org//thread/5x28rp3zct4p603hm4zdwx6kfr101w38 > [5] https://issues.apache.org/jira/browse/FLINK-30890 > [6] http://codespeed.dak8s.net:8000 > [7] http://flink-speed.xyz > [8] > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=115511847 > -- https://twitter.com/snntrable https://github.com/knaufk
Re: [DISCUSS] FLIP-364: Improve the restart-strategy
Hi Rui, Thank you for this proposal and for working on it. I also agree that exponential backoff makes sense as a new default in general. I think restarting indefinitely (no max attempts) makes sense as the default, though, and of course allowing users to change it is valuable. So, overall +1. Cheers, Konstantin On Tue, Oct 17, 2023 at 7:11 AM, Rui Fan <1996fan...@gmail.com> wrote: > Hi all, > > I would like to start a discussion on FLIP-364: Improve the > restart-strategy [1] > > As we know, the restart-strategy is critical for Flink jobs, it mainly > has two functions: > 1. When an exception occurs in the Flink job, quickly restart the job > so that the job can return to the running state. > 2. When a job cannot be recovered after frequent restarts within > a certain period of time, Flink will not retry but will fail the job. > > The current restart-strategy support for function 2 has some issues: > 1. The exponential-delay strategy doesn't have a max-attempts mechanism, > which means that Flink will restart indefinitely even if the job fails frequently. > 2. For multi-region streaming jobs and all batch jobs, the failure of > each region will increase the total number of job failures by +1, > even if these failures occur at the same time. If the number of > failures increases too quickly, it will be difficult to set a reasonable > number of retries. > If the maximum number of failures is set too low, the job can easily > reach the retry limit, causing the job to fail. If set too high, some jobs > will never fail. > > In addition, when the above two problems are solved, we can also > discuss whether exponential-delay can replace fixed-delay as the > default restart-strategy. In theory, exponential-delay is smarter and > friendlier than fixed-delay. > > I also thank Zhu Zhu for his suggestions on the option name in > FLINK-32895 [2] in advance. > > Looking forward to everyone's feedback and suggestions, thank > you.
> > [1] https://cwiki.apache.org/confluence/x/uJqzDw > [2] https://issues.apache.org/jira/browse/FLINK-32895 > > Best, > Rui
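The exponential-delay behaviour discussed in this thread is easy to illustrate: the delay before each restart attempt grows by a multiplier until it hits a cap. The option semantics below (initial backoff, multiplier, max backoff) are a simplified sketch of the general idea, not the exact configuration interface proposed in FLIP-364.

```python
def exponential_delays(initial, multiplier, maximum, attempts):
    """Delays (in seconds) before each successive restart attempt."""
    delays, delay = [], initial
    for _ in range(attempts):
        delays.append(min(delay, maximum))
        delay *= multiplier
    return delays

# e.g. 1s initial backoff, doubling, capped at 60s:
# exponential_delays(1, 2, 60, 8) -> [1, 2, 4, 8, 16, 32, 60, 60]
```

A max-attempts mechanism, as proposed in the FLIP, would additionally fail the job once the attempt count exceeds a configured limit instead of retrying forever.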
Re: [DISCUSS] Release 1.17.2
Thank you for picking it up! +1 Cheers, Konstantin On Mon, Nov 6, 2023 at 3:48 AM, Yun Tang wrote: > Hi all, > > I would like to discuss creating a new 1.17 patch release (1.17.2). The > last 1.17 release is nearly half a year old, and since then, 79 tickets have > been closed [1], of which 15 are blocker/critical [2]. Some > of them are quite important, such as FLINK-32758 [3], FLINK-32296 [4], > FLINK-32548 [5] > and FLINK-33010 [6]. > > In addition to this, FLINK-33149 [7] is important to bump snappy-java to > 1.1.10.4. > Although FLINK-33149 is still unresolved, the snappy-java bump itself has already been done for 1.17.2. > > I am not aware of any unresolved blockers and there are no in-progress > tickets [8]. Please let me know if there are any issues you'd like to be > included in this release but still not merged. > > If the community agrees to create this new patch release, I could > volunteer as the release manager with Yu Chen. > > Since there will be another flink-1.16.3 release request during the same > time, we will work with Rui Fan since many issues will be fixed in both > releases. > > [1] > https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%201.17.2%20%20and%20resolution%20%20!%3D%20%20Unresolved%20order%20by%20priority%20DESC > [2] > https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%201.17.2%20and%20resolution%20%20!%3D%20Unresolved%20%20and%20priority%20in%20(Blocker%2C%20Critical)%20ORDER%20by%20priority%20%20DESC > [3] https://issues.apache.org/jira/browse/FLINK-32758 > [4] https://issues.apache.org/jira/browse/FLINK-32296 > [5] https://issues.apache.org/jira/browse/FLINK-32548 > [6] https://issues.apache.org/jira/browse/FLINK-33010 > [7] https://issues.apache.org/jira/browse/FLINK-33149 > [8] https://issues.apache.org/jira/projects/FLINK/versions/12353260 > > Best > Yun Tang
Re: [DISCUSS] Release Flink 1.16.3
+1. Thank you for picking it up. Yes, this will be the final bugfix release for Flink 1.16. On Mon, Nov 6, 2023 at 3:47 AM, Rui Fan <1996fan...@gmail.com> wrote: > Hi all, > > I would like to discuss creating a new 1.16 patch release (1.16.3). The > last 1.16 release is over five months old, and since then, 50 tickets have > been closed [1], of which 10 are blocker/critical [2]. Some > of them are quite important, such as FLINK-32296 [3], FLINK-32548 [4] > and FLINK-33010 [5]. > > In addition to this, FLINK-33149 [6] is important to bump snappy-java to > 1.1.10.4. > Although FLINK-33149 is still unresolved, the snappy-java bump itself has already been done for 1.16.3. > > I am not aware of any unresolved blockers and there are no in-progress > tickets [7]. Please let me know if there are any issues you'd like to be > included in this release but still not merged. > > Since 1.18.0 has been released, I'd suggest that we vote to make 1.16.3 > the final bugfix release of 1.16, looking forward to any feedback from you. > Background info can be found at [8], and thanks Jing for the information. > > If the community agrees to create this new patch release, I could > volunteer as the release manager. > > Since there will be another flink-1.17.2 release request during the same > time, > I will work with Yun and Yu since many issues will be fixed in both > releases.
> > [1] > > https://issues.apache.org/jira/browse/FLINK-32231?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%201.16.3%20%20and%20resolution%20%20!%3D%20%20Unresolved%20order%20by%20priority%20DESC > [2] > > https://issues.apache.org/jira/browse/FLINK-32231?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%201.16.3%20and%20resolution%20%20!%3D%20Unresolved%20%20and%20priority%20in%20(Blocker%2C%20Critical)%20ORDER%20by%20priority%20%20DESC > [3] https://issues.apache.org/jira/browse/FLINK-32296 > [4] https://issues.apache.org/jira/browse/FLINK-32548 > [5] https://issues.apache.org/jira/browse/FLINK-33010 > [6] https://issues.apache.org/jira/browse/FLINK-33149 > [7] https://issues.apache.org/jira/projects/FLINK/versions/12353259 > [8] https://lists.apache.org/thread/szq23kr3rlkm80rw7k9n95js5vqpsnbv > > Best, > Rui
[ANNOUNCE] Weekly Community Update 2020/06
s/2020/02/07/a-guide-for-unit-testing-in-apache-flink.html [15] https://www.flink-forward.org/sf-2020/speakers [16] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Community-Discounts-for-Flink-Forward-SF-2020-Registrations-td37055.html [17] https://www.meetup.com/Apache-Flink-London-Meetup/events/268400545/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2020/07
Dear community, happy to share this week's community digest with the release of Flink 1.10, a proposal for better changelog support in Flink SQL, a documentation style guide, the Flink Forward San Francisco schedule and a bit more. Flink Development == * [releases] Apache Flink 1.10 has been released! [1] Read all about it in Marta's release blog post [2]. With the release of Flink 1.10 (technically already Flink 1.9.2), the maintenance of the "official" Flink Docker images [3] has moved to the Apache Flink project. [4] * [releases] Moreover, you can now also find apache-flink on PyPI [5] * https://pypi.org/project/apache-flink/1.9.2/#files * https://pypi.org/project/apache-flink/1.10.0/#files * [releases] Chesnay published a third release candidate for flink-shaded 10.0 on Wednesday. Only +1s so far. [6] * [sql] Jark has published FLIP-105, which proposes to support changelog streams in Flink SQL. In essence, this means being able to interpret changelogs (Debezium, compacted topics, ...) as a dynamic table in update mode. Afterwards, the resulting continuously updating table could be directly used in (temporal table) joins and aggregations. [7] * [sql] Dawid has started a discussion on moving around packages in the Table API in order to greatly simplify the required imports for users of the Table API. [8] * [cep] Shuai Xu proposes to support notFollowedBy() as the last part of a pattern, if an additional "within" time interval is given. He is waiting for feedback on his design document. [9] * [development process] The documentation style guide [10] has finally been merged. Thanks to Marta, Robert, Aljoscha and others for driving this. * [connectors] Dawid proposes to drop the ElasticSearch 2.x and 5.x connectors. There is a consensus to drop 2.x, but some uncertainty about the current usage of the 5.x connector. Please get involved in that thread if you are using the ElasticSearch 5.x sink right now.
[11] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-10-0-released-tp37564.html [2] https://flink.apache.org/news/2020/02/11/release-1.10.0.html [3] https://hub.docker.com/_/flink [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/RESULT-VOTE-Integrate-Flink-Docker-image-publication-into-Flink-release-process-tp37096.html [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-Python-API-PyFlink-1-9-2-released-tp37597.html [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-flink-shaded-10-0-release-candidate-3-tp37567.html [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-105-Support-to-Interpret-and-Emit-Changelog-in-Flink-SQL-tp37665.html [8] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-TABLE-Issue-with-package-structure-in-the-Table-API-tp37623.html [9] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notFollowedBy-with-interval-as-the-last-part-of-a-Pattern-tp37513.html [10] https://flink.apache.org/contributing/docs-style.html [11] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Drop-connectors-for-Elasticsearch-2-x-and-5-x-tp37471.html Notable Bugs == Not today. Events, Blog Posts, Misc === * Andrew Torson of Salesforce has published a blog post on application log analysis with *Apache Flink at Salesforce*.[12] * The conference program for Flink Forward San Francisco is live now including speakers from *AWS, Bird, Cloudera, Lyft, Netflix, Splunk, Uber, Yelp, Alibaba, Ververica *and others! Use "FFSF20-MailingList" to get a 50% discount on your conference pass. 
[13] * Upcoming Meetups * On February 19th, Apache Flink Meetup London, "Monitoring and Analysing Communication & Trade Events as Graphs", hosted by *Christos Hadjinikolis* [14] [12] https://engineering.salesforce.com/application-log-intelligence-performance-insights-at-salesforce-using-flink-92955f30573f [13] https://www.flink-forward.org/sf-2020/conference-program [14] https://www.meetup.com/Apache-Flink-London-Meetup/events/268400545/ Cheers, Konstantin (@snntrable)
Re: [ANNOUNCE] Apache Flink-shaded 10.0 released
Thanks, Chesnay, for your continuous work on flink-shaded and managing this release. Best, Konstantin On Thu, Feb 20, 2020 at 2:33 PM Xingbo Huang wrote: > Thanks a lot Chesnay and all contributors to this release. > > Best, > Xingbo > > Hequn Cheng wrote on Thu, Feb 20, 2020 at 8:17 PM: > > > Thanks a lot for the release Chesnay! > > Also thanks to everyone who contributes to this release! > > > > Best, Hequn > > > > On Thu, Feb 20, 2020 at 11:11 AM Yu Li wrote: > > > Thanks Chesnay and all participants for making the release possible! > > > > > > Best Regards, > > > Yu > > > > > > > > > On Thu, 20 Feb 2020 at 09:50, Zhu Zhu wrote: > > > > Thanks Chesnay for the great work and everyone who helps with the > > > > improvements and release! > > > > > > > > Thanks, > > > > Zhu Zhu > > > > > > > > Dian Fu wrote on Thu, Feb 20, 2020 at 9:44 AM: > > > > > Thanks Chesnay for the great work and everyone involved! > > > > > > > > > > Regards, > > > > > Dian > > > > > > > > > > > On Feb 20, 2020, at 12:21 AM, Zhijiang .INVALID> > > > wrote: > > > > > > > > > > > > Thanks Chesnay for making the release efficiently and also thanks > > to > > > > all > > > > > the other participants! > > > > > > > > > > > > Best, > > > > > > Zhijiang > > > > > > > > > > > > > > > > > > > -- > > > > > > From:Till Rohrmann > > trohrm...@apache.org > > > > >> > > > > > > Send Time:2020 Feb. 19 (Wed.) 22:21 > > > > > > To:dev mailto:dev@flink.apache.org>> > > > > > > Subject:Re: [ANNOUNCE] Apache Flink-shaded 10.0 released > > > > > > > > > > > > Thanks for making the release possible Chesnay and everyone who > was > > > > > > involved! > > > > > > > > > > > > Cheers, > > > > > > Till > > > > > > > > > > > > On Wed, Feb 19, 2020 at 7:47 AM jincheng sun < > > > sunjincheng...@gmail.com > > > > > > > > > > > wrote: > > > > > > > > > > > >> Thanks a lot for the release Chesnay! > > > > > >> And thanks to everyone who made this release possible!
> > > > > >> > > > > > >> Best, > > > > > >> Jincheng > > > > > >> > > > > > >> > > > > > >> Chesnay Schepler wrote on Wed, Feb 19, 2020 at 12:45 AM: > > > > > >> > > > > > >>> The Apache Flink community is very happy to announce the release > of > > > > > >>> Apache Flink-shaded 10.0. > > > > > >>> > > > > > >>> The flink-shaded project contains a number of shaded dependencies > > > for > > > > > >>> Apache Flink. > > > > > >>> > > > > > >>> Apache Flink® is an open-source stream processing framework > for > > > > > >>> distributed, high-performing, always-available, and accurate > data > > > > > >>> streaming applications. > > > > > >>> > > > > > >>> The release is available for download at: > > > > > >>> https://flink.apache.org/downloads.html > > > > > >>> > > > > > >>> The full release notes are available in Jira: > > > > > >>> > > > > > >>> > > > > > >> > > > > > > > > > > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346746 > > > > > >>> > > > > > >>> We would like to thank all contributors of the Apache Flink > > > community > > > > > >>> who made this release possible! > > > > > >>> > > > > > >>> Regards, > > > > > >>> Chesnay
[ANNOUNCE] Weekly Community Update 2020/07
ps * On February 26th, Prateep Kumar will host an online event on log analytics with Apache Flink [19]. * On March 5th, Stephan Ewen will talk about Apache Flink Stateful Function at the Utrecht Data Engineering Meetup. [20] * Cloudera is hosting a couple of "Future of Data" events on stream processing with Apache Flink in * Budapest (February 25th, meetup) [21] * Vienna (March 4th, full-day workshop) [22] * Zurich (March 10th, full-day workshop) [23] * New Jersey (May 5th, meetup) [24] [17] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Jingsong-Lee-becomes-a-Flink-committer-tp37938p38006.html [18] https://flink.apache.org/news/2020/02/20/ddl.html [19] https://www.meetup.com/apache-flink-aws-kinesis-hyd-india/events/268907057/ [20] https://www.meetup.com/Data-Engineering-NL/events/268424399/ [21] https://www.meetup.com/futureofdata-budapest/events/268423538/ [22] https://www.meetup.com/futureofdata-vienna/events/268418974/ [23] https://www.meetup.com/futureofdata-zurich/events/268423809/ [24] https://www.meetup.com/futureofdata-princeton/events/268830725/ Cheers, Konstantin (@snntrable)
[ANNOUNCE] Weekly Community Update 2020/09
Dear community, happy to share this week's community update. It was a relatively quiet week on the dev@ mailing list (mostly votes on previously covered FLIPs), but there is always something to share. Additionally, I have decided to also feature *flink-packages.org <http://flink-packages.org> *in this newsletter going forward. Depending on the level of activity, I will cover newly added packages or introduce one of the existing packages. Flink Development == * [sql] Dawid has started a discussion to enable Table API/SQL sources to read columns from different parts of source records. With this it would, for example, be possible to read partition, timestamp or offset from a Kafka source record. Similarly, it would be possible to override the partitioning when writing to Kafka or Kinesis. [1] * [sql, python] FLIP-58 introduced Python UDFs in SQL and Table API. FLIP-79 added a Function DDL in Flink SQL to register Java & Scala UDFs in pure SQL. Based on these two FLIPs, Wei Zhong published FLIP-106 to also support Python UDFs in the SQL Function DDL. [2] * [development] Chesnay started a discussion on Eclipse support for Apache Flink (framework) development. If you are using Eclipse as an Apache Flink contributor, please get involved in the thread. [3] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-107-Reading-table-columns-from-different-parts-of-source-records-tp38277.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-106-Support-Python-UDF-in-SQL-Function-DDL-tp38107.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Remove-Eclipse-specific-plugin-configurations-tp38255.html Notable Bugs == [FLINK-16262] [1.10.0] The FlinkKafkaProducer cannot be used in EXACTLY_ONCE mode when using the user code classloader. For application clusters (per-job clusters) you can work around this issue by using the system classloader (user jar in lib/ directory). Will be fixed in 1.10.1.
[4] [4] https://issues.apache.org/jira/browse/FLINK-16262 flink-packages.org = DTStack, a Chinese cloud technology company, has recently published FlinkX [5] on flink-packages.org. The documentation is Chinese only, but it seems to be a configuration-based integration framework based on Apache Flink with an impressive set of connectors. [5] https://flink-packages.org/packages/flinkx Events, Blog Posts, Misc === * This week I stumbled across this Azure tutorial to use Event Hubs with Apache Flink. [6] * Gökce Sürenkök has written a blog post on setting up a highly available Flink cluster on Kubernetes based on Zookeeper for Flink Master failover and HDFS as checkpoint storage. [7] * Upcoming Meetups * On March 5th, Stephan Ewen will talk about Apache Flink Stateful Function at the Utrecht Data Engineering Meetup. [8] * On March 12th, Prateep Kumar will host an online event comparing Kafka Streams and Apache Flink [9]. * On April 22, Ververica will host the next Apache Flink meetup in Berlin. 
[10] * Cloudera is hosting a couple of "Future of Data" events on stream processing with Apache Flink in * Vienna (March 4th, full-day workshop) [11] * Zurich (March 10th, full-day workshop) [12] * New Jersey (May 5th, meetup) [13] [6] https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-kafka-flink-tutorial [7] https://medium.com/hepsiburadatech/high-available-flink-cluster-on-kubernetes-setup-73b2baf9200e [8] https://www.meetup.com/Data-Engineering-NL/events/268424399/ [9] https://www.meetup.com/apache-flink-aws-kinesis-hyd-india/events/268930388/ [10] https://www.meetup.com/Apache-Flink-Meetup/events/269005339/ [11] https://www.meetup.com/futureofdata-vienna/events/268418974/ [12] https://www.meetup.com/futureofdata-zurich/events/268423809/ [13] https://www.meetup.com/futureofdata-princeton/events/268830725/ Cheers, Konstantin (@snntrable)
Re: [DISCUSS] FLIP-111: Docker image unification
Hi Andrey, thanks a lot for this proposal. The variety of Docker files in the project has been causing quite some confusion. For the entrypoint, have you considered also allowing users to set configuration via environment variables, as in "docker run -e FLINK_REST_BIN_PORT=8081 ..."? This is quite common and more flexible, e.g. it makes it very easy to pass values of Kubernetes Secrets into the Flink configuration. With respect to logging, I would opt to keep this very basic and to only support logging to the console (maybe with a fix for the web user interface). For everything else, users can easily build their own images based on library/flink (provide the dependencies, change the logging configuration). Cheers, Konstantin On Thu, Mar 5, 2020 at 11:01 AM Yang Wang wrote: > Hi Andrey, > > > Thanks for driving this significant FLIP. From the user ML, we could also > know there are > many users running Flink in container environments. Then the docker image > will be the > very basic requirement. Just as you say, we should provide a unified place > for all various > usages (e.g. session, job, native k8s, swarm, etc.). > > > > About docker utils > > I really like the idea to provide some utils for the docker file and entry > point. The > `flink_docker_utils` will make it easier to build the image. I am not sure > about the > `flink_docker_utils start_jobmaster`. Do you mean when we build a docker > image, we > need to add `RUN flink_docker_utils start_jobmaster` in the docker file? > Why do we need this? > > > > About docker entry point > > I agree with you that the docker entry point could be more powerful with more > functionality. > Mostly, it is about overriding the config options. If we support dynamic > properties, I think > it is more convenient for users without any learning curve. > `docker run flink session_jobmanager -D rest.bind-port=8081` > > > > About the logging > > Updating the `log4j-console.properties` to support multiple appenders is a > better option.
> Currently, the native K8s documentation suggests that users debug the logs in this > way [1]. However, > there are also some problems. The stderr and stdout of JM/TM processes > cannot be > forwarded to the docker container console. > > > [1]. > https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html#log-files > > > Best, > Yang > > > > > Andrey Zagrebin wrote on Wed, Mar 4, 2020 at 5:34 PM: >> Hi All, >> >> If you have ever touched the docker topic in Flink, you >> probably noticed that we have multiple places in docs and repos which >> address its various concerns. >> >> We have prepared a FLIP [1] to simplify the perception of the docker topic in >> Flink by users. It mostly advocates for an approach of extending the official >> Flink image from the docker hub. For convenience, it can come with a set >> of >> bash utilities and documented examples of their usage. The utilities allow >> you to: >> >>- run the docker image in various modes (single job, session master, >>task manager etc) >>- customise the extending Dockerfile >>- and its entry point >> >> Eventually, the FLIP suggests to remove all other user-facing Dockerfiles >> and building scripts from the Flink repo, move all docker docs to >> apache/flink-docker and adjust existing docker use cases to refer to this >> new approach (mostly Kubernetes now). >> >> The first contributed version of the Flink docker integration also contained >> an example and docs for the integration with Bluemix in IBM cloud. We also >> suggest to maintain it outside of the Flink repository (cc Markus Müller).
>> >> Thanks, >> Andrey >> >> [1] >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-111%3A+Docker+image+unification >>
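The entrypoint behaviour discussed in the FLIP-111 thread above (configuration defaults overridden by environment variables, which are in turn overridden by `-D` dynamic properties) amounts to a precedence merge. The sketch below illustrates that idea only: the `FLINK_*` naming convention and the naive underscore-to-dot key mapping are assumptions for illustration (real Flink keys also contain dashes, which this mapping cannot produce), not the interface the FLIP settled on.

```python
def effective_config(defaults, argv, env):
    """Merge config with precedence: defaults < FLINK_* env vars < -D properties."""
    conf = dict(defaults)
    for name, value in env.items():
        if name.startswith("FLINK_"):
            # naive illustrative mapping: FLINK_TASKMANAGER_HOST -> taskmanager.host
            conf[name[len("FLINK_"):].lower().replace("_", ".")] = value
    args = iter(argv)
    for arg in args:
        if arg == "-D":                      # separate form: -D key=value
            key, _, value = next(args).partition("=")
            conf[key] = value
        elif arg.startswith("-D"):           # joined form: -Dkey=value
            key, _, value = arg[2:].partition("=")
            conf[key] = value
    return conf
```

With this precedence, `docker run flink session_jobmanager -D rest.bind-port=8081` would beat any environment-provided value for the same key, which matches the "no learning curve" argument for dynamic properties in the thread.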
[ANNOUNCE] Weekly Community Update 2020/10
, full-day workshop) [15] * Zurich (March 10th, full-day workshop) [16] * New Jersey (May 5th, meetup) [17] [13] https://www.meetup.com/apache-flink-aws-kinesis-hyd-india/events/268930388/ [14] https://www.meetup.com/Apache-Flink-Meetup/events/269005339/ [15] https://www.meetup.com/futureofdata-vienna/events/268418974/ [16] https://www.meetup.com/futureofdata-zurich/events/268423809/ [17] https://www.meetup.com/futureofdata-princeton/events/268830725/ Cheers, Konstantin (@snntrable)
[ANNOUNCE] Weekly Community Update 2020/11
ink blog post about the portable Apache Flink runner of Apache Beam. [15] * Upcoming Meetups: I personally believe all upcoming meetups in the regions I usually cover will be cancelled. So, no update on this today. [15] https://flink.apache.org/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html Cheers, Konstantin (@snntrable)
[ANNOUNCE] Weekly Community Update 2020/12
Dear community, happy to share this week's community digest featuring "Flink Forward Virtual Conference 2020", a small update on Flink 1.10.1, a better Filesystem connector for the Table API & SQL, new source/sink interfaces for the Table API and a bit more. Flink Development == * [releases] For an update on the outstanding tickets ("Blocker"/"Critical") planned for Apache *Flink 1.10.1* please see the overview posted by Yu Li in this release discussion thread [1]. * [sql] Timo has shared a proposal (FLIP-95) for *new TableSource and TableSink interfaces*. It is based on discussions with Jark, Dawid, Aljoscha, Kurt, Jingsong and many more. Its goals are to simplify the current interface architecture, to support changelog sources (FLIP-105) and to remove dependencies on the DataStream API as well as the planner components. [2] * [hadoop] Following up on a discussion [3] with Stephan and Till, Sivaprasanna has shared an overview of Hadoop related utility components to kick off a discussion on moving these into a separate module "flink-hadoop-utils". [4] * [sql] Jingsong Li has started a discussion on introducing a table source that in essence generates a random stream of data of a given schema to facilitate development and testing in Flink SQL [5]. * [sql] Jingsong Li has started a discussion on improving the filesystem connector for the Table API. The current filesystem connector only supports CSV format and can only be considered experimental for streaming use cases. There seems to be a consensus to build on top of the existing StreamingFileSink (DataStream API) and to focus on ORC, Parquet and better Hive interoperability. 
[6] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-10-1-tp38689.html [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-95%3A+New+TableSource+and+TableSink+interfaces [3] https://lists.apache.org/thread.html/r198f09496ba46885adbcc41fe778a7a34ad1cd685eeae8beb71e6fbb%40%3Cdev.flink.apache.org%3E [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Introduce-a-new-module-flink-hadoop-utils-tp39107.html [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Introduce-TableFactory-for-StatefulSequenceSource-tp39116.html [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-115-Filesystem-connector-in-Table-tp38870.html Notable Bugs == * [FLINK-16684] [1.10.0] [1.9.2] The builder of the StreamingFileSink does not work in Scala. This is one of the blockers to drop support for the BucketingSink (covered in last week's update). Resolved in Flink 1.10.1. [7] [7] https://issues.apache.org/jira/browse/FLINK-16684 Events, Blog Posts, Misc === * Unfortunately, we had to cancel Flink Forward SF due to the spread of SARS-CoV-2 two weeks ago. Instead, we will have a three-day virtual Flink Forward conference April 22 - 24. You can register for free at [8]. * Stefan Hausmann has published a blog post on how Apache Flink can be used for streaming ETL on AWS (Kinesis, Kafka, ElasticSearch and S3 (StreamingFileSink)). [9] * On the Ververica blog Nico Kruber presents a small benchmark comparing the overhead of SSL encryption in Flink depending on the SSL provider (JDK vs OpenSSL). The difference seems to be quite significant. [10] * Upcoming Meetups: None. 
[8] https://www.flink-forward.org/sf-2020 [9] https://aws.amazon.com/blogs/big-data/streaming-etl-with-apache-flink-and-amazon-kinesis-data-analytics [10] https://www.ververica.com/blog/how-openssl-in-ververica-platform-improves-your-flink-job-performance Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Head of Product +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: Kafka - FLink - MongoDB using Scala
cc user@f.a.o Hi Siva, I am not aware of a Flink MongoDB Connector in either Apache Flink, Apache Bahir or flink-packages.org. I assume that you are doing idempotent upserts, and hence do not require a transactional sink to achieve end-to-end exactly-once results. To build one yourself, you can implement org.apache.flink.streaming.api.functions.sink.SinkFunction (better, inherit from org.apache.flink.streaming.api.functions.sink.RichSinkFunction). Roughly speaking, you would instantiate the MongoDB client in the "open" method and write records through it in the "invoke" method. Usually, such sinks use some kind of batching to increase write performance. I suggest you also have a look at the source code of the ElasticSearch or Cassandra Sink. Best, Konstantin On Sat, Mar 28, 2020 at 1:47 PM Sivapragash Krishnan < sivapragas...@gmail.com> wrote: > Hi > > I'm working on creating a streaming pipeline which streams data from Kafka > and stores in MongoDB using Flink Scala. > > I'm able to successfully stream data from Kafka using Flink Scala. I'm not > finding any support to store the data into MongoDB, could you please help > me with the code snippet to store data into MongoDB. > > Thanks > Siva > -- Konstantin Knauf | Head of Product +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
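To make the approach from the reply above concrete, here is a rough sketch of such a batching sink in Scala. This is not an existing connector: the class name, the `toDocument` converter and the batch size are illustrative assumptions, and the Flink 1.10-era `RichSinkFunction` and mongodb-driver-sync APIs should be double-checked against the versions you actually use.

```scala
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.sink.{RichSinkFunction, SinkFunction}
import com.mongodb.client.{MongoClient, MongoClients, MongoCollection}
import org.bson.Document
import scala.collection.mutable.ArrayBuffer
import scala.collection.JavaConverters._

// Hypothetical batching MongoDB sink. `toDocument` converts your record
// type into a BSON Document; all connection settings are placeholders.
class MongoSink[T](uri: String, database: String, collection: String,
                   toDocument: T => Document, batchSize: Int = 500)
  extends RichSinkFunction[T] {

  @transient private var client: MongoClient = _
  @transient private var coll: MongoCollection[Document] = _
  @transient private var buffer: ArrayBuffer[Document] = _

  override def open(parameters: Configuration): Unit = {
    // Create the client once per parallel sink instance, not per record.
    client = MongoClients.create(uri)
    coll = client.getDatabase(database).getCollection(collection)
    buffer = ArrayBuffer.empty[Document]
  }

  override def invoke(value: T, context: SinkFunction.Context[_]): Unit = {
    buffer += toDocument(value)
    if (buffer.size >= batchSize) flush()
  }

  private def flush(): Unit =
    if (buffer.nonEmpty) {
      coll.insertMany(buffer.asJava) // one round-trip per batch
      buffer.clear()
    }

  override def close(): Unit = {
    flush() // write out any remaining buffered records
    if (client != null) client.close()
  }
}
```

Note that, as sketched, buffered records can be lost on failure; for at-least-once guarantees you would additionally implement `CheckpointedFunction` and flush the buffer in `snapshotState()` — which, combined with idempotent upserts, gives the effectively-exactly-once behavior the reply assumes.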
Re: [DISCUSS] FLIP-118: Improve Flink’s ID system
Hi Yangze, Hi Till, thank you for working on this topic. I believe it will make debugging large Apache Flink deployments much more feasible. I was wondering whether it would make sense to allow the user to specify the Resource ID in standalone setups? For example, many users still implicitly use standalone clusters on Kubernetes (the native support is still experimental) and in these cases it would be interesting to also set the PodName as the ResourceID. What do you think? Cheers, Konstantin On Thu, Mar 26, 2020 at 6:49 PM Till Rohrmann wrote: > Hi Yangze, > > thanks for creating this FLIP. I think it is a very good improvement > helping our users and ourselves understanding better what's going on in > Flink. > > Creating the ResourceIDs with host information/pod name is a good idea. > > Also deriving ExecutionGraph IDs from their superset ID is a good idea. > > The InstanceID is used for fencing purposes. I would not make it a > composition of the ResourceID + a monotonically increasing number. The > problem is that in case of a RM failure the InstanceIDs would start from 0 > again and this could lead to collisions. > > Logging more information on how the different runtime IDs are correlated is > also a good idea. > > Two other ideas for simplifying the ids are the following: > > * The SlotRequestID was introduced because the SlotPool was a separate > RpcEndpoint a while ago. With this no longer being the case I think we > could remove the SlotRequestID and replace it with the AllocationID. > * Instead of creating new SlotRequestIDs for multi task slots one could > derive them from the SlotRequestID used for requesting the underlying > AllocatedSlot. > > Given that the slot sharing logic will most likely be reworked with the > pipelined region scheduling, we might be able to resolve these two points > as part of the pipelined region scheduling effort. 
> > Cheers, > Till > > On Thu, Mar 26, 2020 at 10:51 AM Yangze Guo wrote: > > > Hi everyone, > > > > We would like to start a discussion thread on "FLIP-118: Improve > > Flink’s ID system"[1]. > > > > This FLIP mainly discusses the following issues, targeting to enhance the > > readability of IDs in logs and help users debug in case of failures: > > > > - Enhance the readability of the string literals of IDs. Most of them > > are hashcodes, e.g. ExecutionAttemptID, which do not provide much > > meaningful information and are hard to recognize and compare for > > users. > > - Log the ID’s lineage information to make debugging more convenient. > > Currently, the log fails to always show the lineage information > > between IDs. Finding out relationships between entities identified by > > given IDs is a common demand, e.g., which slot (AllocationID) was > > assigned to satisfy which slot request (SlotRequestID). In the absence of > > such lineage information, it’s impossible to track the end to end > > lifecycle of an Execution or a Task now, which makes debugging > > difficult. > > > > Key changes proposed in the FLIP are as follows: > > > > - Add location information to distributed components > > - Add topology information to graph components > > - Log the ID’s lineage information > > - Expose the identifiers of distributed components to users > > > > Please find more details in the FLIP wiki document [1]. Looking forward > to > > your feedback. 
> > > > [1] > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=148643521 > > > > Best, > > Yangze Guo > > > -- Konstantin Knauf | Head of Product +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2020/13
integrating Flink with Hive and gives an introduction to the recent improvements in Flink 1.10. [18] * Robert has published a first blogpost in the Flink "Engine Room" on the migration of Flink's CI infrastructure from Travis CI to Azure Pipelines. [19] [14] https://www.flink-forward.org/sf-2020/conference-program [15] https://www.bigmarker.com/series/flink-forward-virtual-confer1/series_summit [16] https://www.datadoghq.com/blog/monitor-apache-flink-with-datadog/ [17] https://flink.apache.org/news/2020/03/24/demo-fraud-detection-2.html [18] https://flink.apache.org/features/2020/03/27/flink-for-data-warehouse.html [19] https://cwiki.apache.org/confluence/display/FLINK/2020/03/22/Migrating+Flink%27s+CI+Infrastructure+from+Travis+CI+to+Azure+Pipelines Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Head of Product +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [VOTE] Apache Flink Stateful Functions Release 2.0.0, release candidate #3
ifacts to be deployed to the Maven Central Repository > >>>> > > >>>> > **Staging Areas to Review** > >>>> > > >>>> > The staging areas containing the above mentioned artifacts are as > >>>> follows, > >>>> > for your review: > >>>> > * All artifacts for a) and b) can be found in the corresponding dev > >>>> > repository at dist.apache.org [2] > >>>> > * All artifacts for c) can be found at the Apache Nexus Repository > [3] > >>>> > > >>>> > All artifacts are singed with the > >>>> > key 1C1E2394D3194E1944613488F320986D35C33D6A [4] > >>>> > > >>>> > Other links for your review: > >>>> > * JIRA release notes [5] > >>>> > * source code tag "release-2.0.0-rc3" [6] [7] > >>>> > > >>>> > **Extra Remarks** > >>>> > > >>>> > * Part of the release is also official Docker images for Stateful > >>>> > Functions. This can be a separate process, since the creation of > those > >>>> > relies on the fact that we have distribution jars already deployed > to > >>>> > Maven. I will follow-up with this after these artifacts are > officially > >>>> > released. > >>>> > In the meantime, there is this discussion [8] ongoing about where to > >>>> host > >>>> > the StateFun Dockerfiles. > >>>> > * The Flink Website and blog post is also being worked on (by Marta) > >>>> as > >>>> > part of the release, to incorporate the new Stateful Functions > >>>> project. We > >>>> > can follow up with a link to those changes afterwards in this vote > >>>> thread, > >>>> > but that would not block you to test and cast your votes already. > >>>> > * Since the Flink website changes are still being worked on, you > will > >>>> not > >>>> > yet be able to find the Stateful Functions docs from there. Here are > >>>> the > >>>> > links [9] [10]. > >>>> > > >>>> > **Vote Duration** > >>>> > > >>>> > The vote will be open for at least 72 hours starting Monday > >>>> > *(target end date is Wednesday, April 1st).* > >>>> > It is adopted by majority approval, with at least 3 PMC affirmative > >>>> votes. 
> >>>> > > >>>> > Thanks, > >>>> > Gordon > >>>> > > >>>> > [1] > >>>> > > >>>> > > >>>> > https://docs.google.com/document/d/1P9yjwSbPQtul0z2AXMnVolWQbzhxs68suJvzR6xMjcs/edit?usp=sharing > >>>> > [2] > >>>> > https://dist.apache.org/repos/dist/dev/flink/flink-statefun-2.0.0-rc3/ > >>>> > [3] > >>>> > > >>>> > https://repository.apache.org/content/repositories/orgapacheflink-1342/ > >>>> > [4] https://dist.apache.org/repos/dist/release/flink/KEYS > >>>> > [5] > >>>> > > >>>> > > >>>> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346878 > >>>> > [6] > >>>> > > >>>> > > >>>> > https://gitbox.apache.org/repos/asf?p=flink-statefun.git;a=commit;h=752e07fd9987ee430eb9d1c1d3fadff632ef9213 > >>>> > [7] https://github.com/apache/flink-statefun/tree/release-2.0.0-rc3 > >>>> > [8] > >>>> > > >>>> > > >>>> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Creating-a-new-repo-to-host-Stateful-Functions-Dockerfiles-td39342.html > >>>> > [9] > https://ci.apache.org/projects/flink/flink-statefun-docs-master/ > >>>> > [10] > >>>> https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/ > >>>> > > >>>> > TIP: You can create a `settings.xml` file with these contents: > >>>> > > >>>> > """ > >>>> > > >>>> > > >>>> > flink-statefun-2.0.0 > >>>> > > >>>> > > >>>> > > >>>> > flink-statefun-2.0.0 > >>>> > > >>>> > > >>>> > flink-statefun-2.0.0 > >>>> > > >>>> > > >>>> > https://repository.apache.org/content/repositories/orgapacheflink-1342/ > >>>> > > >>>> > > >>>> > > >>>> > archetype > >>>> > > >>>> > > >>>> > https://repository.apache.org/content/repositories/orgapacheflink-1342/ > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > """ > >>>> > > >>>> > And reference that in you maven commands via `--settings > >>>> > path/to/settings.xml`. > >>>> > This is useful for creating a quickstart based on the staged release > >>>> and > >>>> > for building against the staged jars. 
> >>>> > > >>>> > >>> > -- Konstantin Knauf | Head of Product +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [VOTE] Apache Flink Stateful Functions Release 2.0.0, release candidate #4
Hi Gordon, +1 (non-binding) * Maven build from source...check * Python build from source...check * Went through Walkthrough based on local builds...check Cheers, Konstantin On Mon, Mar 30, 2020 at 5:52 AM Tzu-Li (Gordon) Tai wrote: > Hi everyone, > > Please review and vote on the *release candidate #4* for the version 2.0.0 > of Apache Flink Stateful Functions, > as follows: > [ ] +1, Approve the release > [ ] -1, Do not approve the release (please provide specific comments) > > **Testing Guideline** > > You can find here [1] a doc that we can use for collaborating testing > efforts. > The listed testing tasks in the doc also serve as a guideline in what to > test for this release. > If you wish to take ownership of a testing task, simply put your name down > in the "Checked by" field of the task. > > **Release Overview** > > As an overview, the release consists of the following: > a) Stateful Functions canonical source distribution, to be deployed to the > release repository at dist.apache.org > b) Stateful Functions Python SDK distributions to be deployed to PyPI > c) Maven artifacts to be deployed to the Maven Central Repository > > **Staging Areas to Review** > > The staging areas containing the above mentioned artifacts are as follows, > for your review: > * All artifacts for a) and b) can be found in the corresponding dev > repository at dist.apache.org [2] > * All artifacts for c) can be found at the Apache Nexus Repository [3] > > All artifacts are signed with the > key 1C1E2394D3194E1944613488F320986D35C33D6A [4] > > Other links for your review: > * JIRA release notes [5] > * source code tag "release-2.0.0-rc4" [6] [7] > > **Extra Remarks** > > * Part of the release is also official Docker images for Stateful > Functions. This can be a separate process, since the creation of those > relies on the fact that we have distribution jars already deployed to > Maven. I will follow-up with this after these artifacts are officially > released. 
> In the meantime, there is this discussion [8] ongoing about where to host > the StateFun Dockerfiles. > * The Flink Website and blog post is also being worked on (by Marta) as > part of the release, to incorporate the new Stateful Functions project. We > can follow up with a link to those changes afterwards in this vote thread, > but that would not block you to test and cast your votes already. > * Since the Flink website changes are still being worked on, you will not > yet be able to find the Stateful Functions docs from there. Here are the > links [9] [10]. > > **Vote Duration** > > Since this RC only fixes licensing issues from previous RCs, > and the code itself has not been touched, > I'd like to stick with the original vote ending time. > > The vote will be open for at least 72 hours starting Monday > *(target end date is Wednesday, April 1st).* > It is adopted by majority approval, with at least 3 PMC affirmative votes. > > Thanks, > Gordon > > [1] > > https://docs.google.com/document/d/1P9yjwSbPQtul0z2AXMnVolWQbzhxs68suJvzR6xMjcs/edit?usp=sharing > [2] https://dist.apache.org/repos/dist/dev/flink/flink-statefun-2.0.0-rc4/ > [3] > https://repository.apache.org/content/repositories/orgapacheflink-1343/ > [4] https://dist.apache.org/repos/dist/release/flink/KEYS > [5] > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346878 > [6] > > https://gitbox.apache.org/repos/asf?p=flink-statefun.git;a=commit;h=5d5d62fca2dbe3c75e8157b7ce67d4d4ce12ffd9 > [7] https://github.com/apache/flink-statefun/tree/release-2.0.0-rc4 > [8] > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Creating-a-new-repo-to-host-Stateful-Functions-Dockerfiles-td39342.html > [9] https://ci.apache.org/projects/flink/flink-statefun-docs-master/ > [10] https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/ > > TIP: You can create a `settings.xml` file with these contents: > > """ > > > flink-statefun-2.0.0 > > > > 
flink-statefun-2.0.0 > > > flink-statefun-2.0.0 > > https://repository.apache.org/content/repositories/orgapacheflink-1343/ > > > > archetype > > https://repository.apache.org/content/repositories/orgapacheflink-1343/ > > > > > > > """ > > And reference that in your maven commands via `--settings > path/to/settings.xml`. > This is useful for creating a quickstart based on the staged release and > for building against the staged jars. > -- Konstantin Knauf
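The mailing-list archive stripped the XML tags from the `settings.xml` template quoted above, leaving only the element values. Based on those surviving values, the intended file most likely had roughly the following shape (a reconstruction, so verify it against the original vote email before use):

```xml
<settings>
  <activeProfiles>
    <activeProfile>flink-statefun-2.0.0</activeProfile>
  </activeProfiles>
  <profiles>
    <profile>
      <id>flink-statefun-2.0.0</id>
      <repositories>
        <repository>
          <id>flink-statefun-2.0.0</id>
          <url>https://repository.apache.org/content/repositories/orgapacheflink-1343/</url>
        </repository>
        <repository>
          <id>archetype</id>
          <url>https://repository.apache.org/content/repositories/orgapacheflink-1343/</url>
        </repository>
      </repositories>
    </profile>
  </profiles>
</settings>
```

Read this way, the repeated `flink-statefun-2.0.0` in the garbled quote corresponds to the active profile name, the profile id and the first repository id, while `archetype` names the second repository used when building a quickstart from the staged artifacts.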
[ANNOUNCE] Weekly Community Update 2020/14
Dear community, happy to share this week's community update with Flink on Zeppelin, full support for VIEWs in Flink SQL, Ververica Platform Community Edition and a bit more. Flink Development == * [releases] Four issues (3 Blockers, 1 Critical) left for Flink 1.10.1 at this point. [1] * [statefun] On Friday, Gordon published the sixth release candidate for Apache Flink Stateful Functions 2.0.0. No votes so far. [2] * [clients] With the release of Zeppelin 0.9, Flink is now available on Zeppelin! The thread also contains a series of valuable, introductory blog posts on the topic. [3] * [sql] Kurt has formally started the discussion to make the Blink planner the default planner in Flink 1.11+. A lot of positive feedback so far. [4] * [sql] Zhenghua Gao has revived the discussion on FLIP-71 to fully support VIEWs in Flink SQL (including persisting VIEWs in the Catalog). [5] * [sql] Jark started a discussion on FLIP-122 to shorten & simplify the connector property keys in Flink SQL and Table API. [6] * [python] Dian Fu proposes to support the conversion between Flink Tables and Pandas DataFrames. [7] * [docker] Andrey has updated FLIP-111 to unify Flink Docker images based on feedback by Ufuk and Patrick Lucas. It seems like this can soon go to a vote. [8] * [development process] Aljoscha reminds everyone to create Jira tickets for all test failures if no ticket exists for the test yet. 
[9] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-10-1-tp38689.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Apache-Flink-Stateful-Functions-Release-2-0-0-release-candidate-6-tp39776.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Flink-on-Zeppelin-Zeppelin-0-9-is-released-tp39498p39531.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Change-default-planner-to-blink-planner-in-1-11-tp39608p39686.html [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-71-E2E-View-support-in-Flink-SQL-tp33131p39787.html [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-122-New-Connector-Property-Keys-for-New-Factory-tp39759.html [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-120-Support-conversion-between-PyFlink-Table-and-Pandas-DataFrame-tp39611.html [8] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-111-Docker-image-unification-tp38444.html [9] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/PSA-Please-report-all-occurrences-of-test-failures-tp39793.html flink-packages.org == * On Tuesday, Ververica published Ververica Platform Community Edition on flink-packages.org. The Community Edition is a no-cost version of our enterprise offering and aims to make deploying, operating and managing Apache Flink applications easier for everyone. It requires Kubernetes and a distributed file system or object storage. [10] [10] https://flink-packages.org/packages/ververica-platform-community-edition Notable Bugs == * [FLINK-16913] [1.10.0] You currently cannot configure the RocksDbStatebackend in the native Kubernetes support of Flink 1.10. Resolved for 1.10.1. [11] [11] https://issues.apache.org/jira/browse/FLINK-16913 Events, Blog Posts, Misc === * Dawid and Zhijiang joined the Apache Flink PMC. Congratulations! 
Additionally, I have joined as a Committer :) [12] * Marta has published a Flink community update on the Flink blog which among other things looks into the number of contributions to Apache Flink over time as well as the responsiveness of the community in addressing these. [13] [12] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-New-Committers-and-PMC-member-tp39640.html [13] https://flink.apache.org/news/2020/04/01/community-update.html Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Head of Product +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [VOTE] Apache Flink Stateful Functions Release 2.0.0, release candidate #6
east next *Tuesday, April 7, > > 05:00 UTC.* > > > > It is adopted by majority approval, with at least 3 PMC affirmative > votes. > > > > Thanks, > > Gordon > > > > [1] > > > https://docs.google.com/document/d/1P9yjwSbPQtul0z2AXMnVolWQbzhxs68suJvzR6xMjcs/edit?usp=sharing > > [2] > https://dist.apache.org/repos/dist/dev/flink/flink-statefun-2.0.0-rc6/ > > [3] > > https://repository.apache.org/content/repositories/orgapacheflink-1346/ > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS > > [5] > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346878 > > [6] > > > https://gitbox.apache.org/repos/asf?p=flink-statefun.git;a=commit;h=31e4df4ebf09fd9e74ae4c49bcdff56230e089ce > > [7] https://github.com/apache/flink-statefun/tree/release-2.0.0-rc6 > > [8] https://github.com/apache/flink-web/pull/318 > > [9] https://ci.apache.org/projects/flink/flink-statefun-docs-master/ > > [10] > https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/ > > > > TIP: You can create a `settings.xml` file with these contents: > > > > """ > > > > > > flink-statefun-2.0.0 > > > > > > > > flink-statefun-2.0.0 > > > > > > flink-statefun-2.0.0 > > > > https://repository.apache.org/content/repositories/orgapacheflink-1346/ > > > > > > > > archetype > > > > https://repository.apache.org/content/repositories/orgapacheflink-1346/ > > > > > > > > > > > > > > """ > > > > And reference that in you maven commands via `--settings > > path/to/settings.xml`. > > This is useful for creating a quickstart based on the staged release and > > for building against the staged jars. 
> > > -- Konstantin Knauf | Head of Product +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [ANNOUNCE] New Flink committer: Seth Wiesman
Congratulations, Seth! Well deserved :) On Tue, Apr 7, 2020 at 8:33 AM Tzu-Li (Gordon) Tai wrote: > Hi everyone! > > On behalf of the PMC, I’m very happy to announce Seth Wiesman as a new > Flink committer. > > Seth started contributing to the project in March 2017. You may know him > from several contributions in the past. > He had helped a lot with Flink documentation, and had contributed the State > Processor API. > Over the past few months, he has also helped tremendously in writing the > majority of the > Stateful Functions documentation. > > Please join me in congratulating Seth for becoming a Flink committer! > > Thanks, > Gordon > -- Konstantin Knauf | Head of Product +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL
Hi everyone, it would be very useful if temporal tables could be created via DDL. Currently, users either need to do this in the Table API or in the environment file of the Flink CLI, both of which require the user to switch context away from the SQL CLI/editor. I recently created a ticket for this request [1]. I see two main questions: 1) What would be the DDL syntax? A Temporal Table is on the one hand a view and on the other a function, depending on how you look at it. 2) Would this temporal table view/function be stored in the catalog or only be temporary? I personally do not have much experience in this area of Flink, so I am looking forward to hearing your thoughts on this. Best, Konstantin [1] https://issues.apache.org/jira/browse/FLINK-16824 -- Konstantin Knauf
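For context, a sketch of the Table API status quo the mail refers to — registering a temporal table function programmatically, since no DDL exists for it. This is illustrative only: the table name "RatesHistory", its columns and the function name "Rates" are made up, and the calls reflect the Flink 1.10-era API, so check them against your Flink version.

```scala
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.scala.StreamTableEnvironment

object TemporalTableExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val tEnv = StreamTableEnvironment.create(env)

    // Assumes a table "RatesHistory" with columns r_currency, r_rate and
    // a time attribute r_proctime is already registered in the catalog.
    val rates = tEnv
      .from("RatesHistory")
      .createTemporalTableFunction("r_proctime", "r_currency")

    // The function must be registered via the Table API -- there is no
    // DDL equivalent yet, which is exactly the gap discussed above.
    tEnv.registerFunction("Rates", rates)

    // Afterwards it can be used from SQL, e.g.:
    //   SELECT o.amount * r.r_rate FROM Orders o,
    //     LATERAL TABLE (Rates(o.o_proctime)) r
    //   WHERE r.r_currency = o.o_currency
  }
}
```

This also illustrates the first open question: the result of `createTemporalTableFunction` is registered like a function, yet it is defined from a table, which is why both a view-like and a function-like DDL syntax are conceivable.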
Re: [PROPOSAL] Google Season of Docs 2020.
Hi Marta, Thanks for kicking off the discussion. Aljoscha has recently revived the implementation of FLIP-42 and has already moved things around quite a bit. [1] There are a lot of areas that can be improved of course, but a lot of them require very deep knowledge about the system (e.g. the "Deployment" or "Concepts" section). One area that I could imagine working well in such a format is the "Connectors" section. Aljoscha has already moved this to the top-level, but besides that it has not been touched yet in the course of FLIP-42. The documentation project could be around restructuring, standardization and generally improving the documentation of our connectors for both the DataStream API as well as the Table API/SQL. Cheers, Konstantin [1] https://ci.apache.org/projects/flink/flink-docs-master/ On Wed, Apr 15, 2020 at 12:11 PM Marta Paes Moreira wrote: > Hi, Everyone. > > Google is running its Season of Docs [1] program again this year. The goal > of the program is to pair open source organizations/projects with > professional technical writers to improve their documentation. > > The Flink community submitted an application in 2019 (led by Konstantin) > [2,3], but was unfortunately not accepted into the program. This year, I'm > volunteering to write and submit the proposal in the upcoming weeks. To > achieve this, there are a few things that need to be sorted out in advance: > >- > *Mentors *Each proposed project idea requires at least two volunteers to >mentor technical writers through the process. *Who would like to >participate as a mentor*? You can read about the responsibilities here >[4]. > > >- > *Project Ideas *We can submit as many project ideas as we'd like, but it's >unlikely that more than 2 are accepted. *What would you consider a >priority for documentation improvement*? In my opinion, reorganizing the >documentation to make it easier to navigate and more accessible to >newcomers would be a top priority. 
You can check FLIP-42/FLINK-12639 [5] >for improvements that are already under consideration and [6] for last >year's mailing list discussion. > > >- *Alternative Organization Administrator* >I volunteer as an administrator, but Google requires two. *Who would >like to join me as an application administrator*? > > The deadline is *May 4th *and the accepted projects would kick-off the work > with technical writers on *September 14th*. Let me know if you have any > questions! > > Thanks, > > Marta > > [1] https://developers.google.com/season-of-docs > [2] > > https://lists.apache.org/thread.html/3c789b6187da23ad158df59bbc598543b652e3cfc1010a14e294e16a@%3Cdev.flink.apache.org%3E > [3] > > https://docs.google.com/document/d/1Up53jNsLztApn-mP76AB6xWUVGt3nwS9p6xQTiceKXo/edit?usp=sharing > [4] https://developers.google.com/season-of-docs/docs/mentor-guide > [5] https://issues.apache.org/jira/browse/FLINK-12639 > [6] > > https://lists.apache.org/thread.html/3c789b6187da23ad158df59bbc598543b652e3cfc1010a14e294e16a@%3Cdev.flink.apache.org%3E > -- Konstantin Knauf
[ANNOUNCE] Weekly Community Update 2020/16
rchive.1008284.n3.nabble.com/Configuring-autolinks-to-Flink-JIRA-ticket-in-github-repos-tp39712.html [17] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/PSA-Please-check-your-github-email-configuration-when-merging-on-Github-tp40016.html Notable Bugs == [FLINK-16662] [1.10.0] Currently, you cannot convert a DataStream of POJOs to a Table. Fix planned for 1.10.1. [18] [18] https://issues.apache.org/jira/browse/FLINK-16662 Events, Blog Posts, Misc === * Hequn Cheng joined the Apache Flink PMC. Congratulations! [19] * Seth Wiesman is an Apache Flink Committer now. Congrats! [20] * Flink Forward San Francisco Virtual will happen next week Wed - Fri. You can still register & attend for free and listen to over 40 talks by great speakers. [21] * David would like to contribute the material of Ververica's self-paced Apache Flink training to Apache Flink. [22] The feedback was positive and the details are discussed in a follow-up thread. [23] * Abdelkrim Hadjidj has published a blog post that implements an imaginary supply chain use case with an Open Source stream processing stack including, among other tools, Apache Flink & Zeppelin. [24] * Nico started a series of posts on serialization in Apache Flink, a topic that is often crucial for performance in many DataStream API applications. His first post explains & compares the different serializers available in Apache Flink. [25] * Also on the Flink blog, Jincheng and Markos recap the latest work on Python UDF support in the Table API, explain how to get started and have a look at future work in this area. 
[26] [19] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-New-Apache-Flink-PMC-Member-Hequn-Chen-tp40374p40443.html [20] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-New-Flink-committer-Seth-Wiesman-tp39917p39974.html [21] https://www.flink-forward.org/sf-2020/conference-program [22] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/PROPOSAL-Contribute-training-materials-to-Apache-Flink-tp40075.html [23] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Integration-of-training-materials-into-Apache-Flink-tp40299.html [24] https://medium.com/@abdelkrim.hadjidj/event-driven-supply-chain-for-crisis-with-flinksql-be80cb3ad4f9 [25] https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html [26] https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html Cheers, Konstantin (@snntrable) -- Konstantin Knauf
Re: [ANNOUNCE] Apache Flink 1.9.3 released
Thanks for managing this release! On Sun, Apr 26, 2020 at 3:58 AM jincheng sun wrote: > Thanks for your great job, Dian! > > Best, > Jincheng > > > Hequn Cheng 于2020年4月25日周六 下午8:30写道: > >> @Dian, thanks a lot for the release and for being the release manager. >> Also thanks to everyone who made this release possible! >> >> Best, >> Hequn >> >> On Sat, Apr 25, 2020 at 7:57 PM Dian Fu wrote: >> >>> Hi everyone, >>> >>> The Apache Flink community is very happy to announce the release of >>> Apache Flink 1.9.3, which is the third bugfix release for the Apache Flink >>> 1.9 series. >>> >>> Apache Flink® is an open-source stream processing framework for >>> distributed, high-performing, always-available, and accurate data streaming >>> applications. >>> >>> The release is available for download at: >>> https://flink.apache.org/downloads.html >>> >>> Please check out the release blog post for an overview of the >>> improvements for this bugfix release: >>> https://flink.apache.org/news/2020/04/24/release-1.9.3.html >>> >>> The full release notes are available in Jira: >>> https://issues.apache.org/jira/projects/FLINK/versions/12346867 >>> >>> We would like to thank all contributors of the Apache Flink community >>> who made this release possible! >>> Also great thanks to @Jincheng for helping finalize this release. >>> >>> Regards, >>> Dian >>> >> -- Konstantin Knauf | Head of Product +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2020/17
Dear community,

happy to share this week's community digest with an update on Flink 1.11, Flink 1.10.1 and Flink 1.9.3, the revival of FLIP-36 to support interactive programming, a new FLIP to unify (& separate) TimestampAssigners and a bit more.

Flink Development
==

Releases
^

* [releases] Apache Flink 1.9.3 was released. [1,2]
* [releases] Stephan has proposed the 15th of May as the feature freeze date for Flink 1.11 [3]. Subsequently, Piotr also published a status update on the development progress for the upcoming release. Check it out to get an overview of which features are still planned for this release and which are not anymore. [4]
* [releases] The first release candidate for Flink 1.10.1 is out. [5]

FLIPs
^^

* [table api] Xuannan has revived the discussion on FLIP-36 to support interactive programming in the Table API. In essence, the FLIP proposes to support caching (intermediate) results of one (batch) job, so that they can be used by following (batch) jobs in the same TableEnvironment. [6]
* [time] Aljoscha proposes to a) unify punctuated and periodic watermark assigners and b) separate watermark assigners and timestamp extractors. [7]

More
^

* [configuration] Yangze started a discussion to unify the way max/min are used in the config options. Currently, there is a mix of different patterns (**.max, max-**, and more). [8]
* [connectors] Karim Mansour proposes a change to the current RabbitMQ connector in Apache Flink to make message deduplication more flexible. [9]
* [metrics] Jinhai would like to add additional labels to the metrics reported by the PrometheusReporter. [10]
* [datastream api] Stephan proposes to remove a couple of deprecated methods for state access in Flink 1.11. 
[11]

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-9-3-released-tp40730.html
[2] https://flink.apache.org/news/2020/04/24/release-1.9.3.html
[3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Exact-feature-freeze-date-tp40624.html
[4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Development-progress-of-Apache-Flink-1-11-tp40718.html
[5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-10-1-release-candidate-1-tp40724.html
[6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-36-Support-Interactive-Programming-in-Flink-Table-API-td40592.html
[7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-126-Unify-and-separate-Watermark-Assigners-tp40525p40565.html
[8] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Should-max-min-be-part-of-the-hierarchy-of-config-option-tp40578.html
[9] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-flink-connector-rabbitmq-api-changes-tp40704.html
[10] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Add-custom-labels-on-AbstractPrometheusReporter-like-PrometheusPushGatewayReporter-s-groupiny-tp40708.html
[11] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Removing-deprecated-state-methods-in-1-11-tp40651.html

Notable Bugs
==

* [FLINK-17350] [1.10.0] [1.9.3] [1.8.3] Since Flink 1.5 users can choose not to fail their Flink job on checkpoint errors. For the synchronous part of a checkpoint the implementation of this feature was incorrect, leaving operators in an inconsistent state following such a failure. Piotr proposes to always fail tasks on failures in the synchronous part of a checkpoint going forward. [12]
* [FLINK-17351] [1.10.0] [1.9.3] CheckpointFailureManager ignores checkpoint timeouts when checking against the maximally tolerable number of checkpoint failures. 
So, checkpoint failures are not discovered when they only surface in the CheckpointFailureManager as a checkpoint timeout instead of an exception. [13] Background: both issues were discovered based on a bug report by Jun Qin. [14]

[12] https://issues.apache.org/jira/browse/FLINK-17350
[13] https://issues.apache.org/jira/browse/FLINK-17351
[14] https://issues.apache.org/jira/browse/FLINK-17327

Events, Blog Posts, Misc
===

* Andrey recaps the changes and simplifications to Flink's memory management (released in Flink 1.10) on the Apache Flink blog. [15] Closely related, there is also a small tool to test different memory configurations on flink-packages.org. [16]

[15] https://flink.apache.org/news/2020/04/21/memory-management-improvements-flink-1.10.html
[16] https://flink-packages.org/packages/flink-memory-calculator

Cheers,
Konstantin

--
Konstantin Knauf
https://twitter.com/snntrable
https://github.com/knaufk
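The FLIP-36 caching idea summarized in the update above (materialize the result of one batch job so that following jobs in the same TableEnvironment can reuse it) can be illustrated with a small, purely hypothetical sketch. The names (`CachedTable`, `collect()`) are placeholders for illustration only, not the API proposed in the FLIP:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Toy model of FLIP-36-style caching: the first job that uses the cached
// table computes and stores its rows; later jobs reuse the stored result
// instead of re-running the producing job.
public class CachingSketch {
    static class CachedTable {
        private final Supplier<List<Integer>> compute; // stands in for the producing batch job
        private List<Integer> materialized;            // result kept after the first execution
        int executions = 0;

        CachedTable(Supplier<List<Integer>> compute) { this.compute = compute; }

        List<Integer> collect() {
            if (materialized == null) {   // first job: run and store the result
                executions++;
                materialized = compute.get();
            }
            return materialized;          // following jobs: reuse the stored result
        }
    }

    public static void main(String[] args) {
        CachedTable t = new CachedTable(() -> {
            List<Integer> rows = new ArrayList<>();
            for (int i = 1; i <= 3; i++) rows.add(i * i);
            return rows;
        });
        t.collect();  // job 1 computes the table
        t.collect();  // job 2 reuses it
        System.out.println(t.collect() + " executions=" + t.executions);
    }
}
```

The point of the sketch is the invariant, not the types: however the real API ends up looking, the producing job runs once and `executions` stays at 1 for all subsequent reads within the same session.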
Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation
gt; - Aggregate Functions > >>> - ... > >>> > >>> Currently, all the functions are squeezed in one page. It make the > >>> page bloated. > >>> Meanwhile, I think it would be great to enrich the built-in functions > >> with > >>> argument explanation and more clear examples like MySQL[1] and other > >>> DataBase docs. > >>> > >>> 2) +1 to the "Architecture & Internals" chapter. > >>> We already have a pull request[2] to add "Streaming Aggregation > >> Performance > >>> Tuning" page which talks about the performance tuning tips around > >> streaming > >>> aggregation and the internals. > >>> Maybe we can put it under the internal chapter or a "Performance > Tuning" > >>> chapter. > >>> > >>> 3) How about restructure SQL chapter a bit like this? > >>> > >>> SQL > >>> - Overview > >>> - Data Manipulation Statements (all operations available in SQL) > >>> - Data Definition Statements (DDL syntaxes) > >>> - Pattern Matching > >>> > >>> It renames "Full Reference" to "Data Manipulation Statements" which is > >> more > >>> align with "Data Definition Statements". > >>> > >>> > >>> Regards, > >>> Jark > >>> > >>> [1]: > >>> > >>> > >> > https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_adddate > >>> [2]: https://github.com/apache/flink/pull/9525 > >>> > >>> > >>> > >>> > >>> > >>> On Mon, 2 Sep 2019 at 17:29, Kurt Young wrote: > >>> > >>>> +1 to the general idea and thanks for driving this. I think the new > >>>> structure is > >>>> more clear than the old one, and i have some suggestions: > >>>> > >>>> 1. How about adding a "Architecture & Internals" chapter? This can > help > >>>> developers > >>>> or users who want to contribute more to have a better understanding > >> about > >>>> Table. > >>>> Essentially with blink planner, we merged a lots of codes and features > >>> but > >>>> lack of > >>>> proper user and design documents. > >>>> > >>>> 2. Add a dedicated "Hive Integration" chapter. 
We spend lots of effort > >> on > >>>> integrating > >>>> hive, and hive integration is happened in different areas, like > >> catalog, > >>>> function and > >>>> maybe ddl in the future. I think a dedicated chapter can make users > who > >>> are > >>>> interested > >>>> in this topic easier to find the information they need. > >>>> > >>>> 3. Add a chapter about how to manage, monitor or tune the Table & SQL > >>> jobs, > >>>> and > >>>> might adding something like how to migrate old version jobs to new > >>> version > >>>> in the future. > >>>> > >>>> Best, > >>>> Kurt > >>>> > >>>> > >>>> On Mon, Sep 2, 2019 at 4:17 PM vino yang > >> wrote: > >>>>> Agree with Dawid's suggestion about function. > >>>>> > >>>>> Having a Functions section to unify the built-in function and UDF > >> would > >>>> be > >>>>> better. > >>>>> > >>>>> Dawid Wysakowicz 于2019年8月30日周五 下午7:43写道: > >>>>> > >>>>>> +1 to the idea of restructuring the docs. > >>>>>> > >>>>>> My only suggestion to consider is how about moving the > >>>>>> User-Defined-Extensions subpages to corresponding broader topics? > >>>>>> > >>>>>> Sources & Sinks >> Connect to external systems > >>>>>> > >>>>>> Catalogs >> Connect to external systems > >>>>>> > >>>>>> and then have a Functions sections with subsections: > >>>>>> > >>>>>> functions > >>>>>> > >>>>>> |- built in functions > >>>>>> > >>>>>> |- user defined functions > >>>>>> > >>>>>> > >>>>>> Best, > >>>>>> > >>>>>> Dawid > >>>>>> > >>>>>> On 30/08/2019 10:59, Timo Walther wrote: > >>>>>>> Hi everyone, > >>>>>>> > >>>>>>> the Table API & SQL documentation was already in a very good > >> shape > >>> in > >>>>>>> Flink 1.8. However, in the past it was mostly presented as an > >>>> addition > >>>>>>> to DataStream API. 
As the Table and SQL world is growing quickly, > >>>>>>> stabilizes in its concepts, and is considered as another > >> top-level > >>>> API > >>>>>>> and closed ecosystem, it is time to restructure the docs a little > >>> bit > >>>>>>> to represent the vision of FLIP-32. > >>>>>>> > >>>>>>> Current state: > >>>>>>> > >> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/ > >>>>>>> We would like to propose the following FLIP-60 for a new > >> structure: > >> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=127405685 > >>>>>>> > >>>>>>> Looking forward to feedback. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> > >>>>>>> Timo > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2019/38
Dear community,

happy to share this week's community update with a FLIP for the Pulsar Connector contribution, three FLIPs for the SQL ecosystem (plugin system, computed columns, extended support for views), and a bit more. Enjoy!

Flink Development
==

* [connectors] After some discussion on the mailing list over the last weeks, Sijie has opened a FLIP to add an exactly-once Pulsar Connector (DataStream API, Table API, Catalog API) to Flink. [1]
* [sql] The discussion on supporting Hive built-in functions in Flink SQL led to FLIP-68 to extend the core table system with modular plugins. [2] While focusing on function modules as a first step, the FLIP proposes a more general plugin system also covering user defined types, operators, rules, etc. As part of this FLIP the existing functions in Flink SQL would also be migrated into a "CorePlugin". [3]
* [sql] Danny proposes to add support for computed columns in Flink SQL (as FLIP-70). [4]
* [sql] Zhenghua has started a discussion on extending support for VIEWs in Flink SQL (as FLIP-71). He proposes to add support to store views in a catalog and to add support for "SHOW VIEWS" and "DESCRIBE VIEW". 
[5]

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Modular+Plugins
[3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-68-Extend-Core-Table-System-with-Modular-Plugins-tp33161.html
[4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-70-Support-Computed-Column-for-Flink-SQL-tp33126.html
[5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-71-E2E-View-support-in-Flink-SQL-tp33131.html

Notable Bugs
==

* [FLINK-14010] [1.9.0] [1.8.2] [1.7.2] [yarn] When the Flink Yarn Application Manager receives a shut down request by the YARN Resource Manager, the Flink cluster can get into an inconsistent state, where leadership for the JobManager, ResourceManager and Dispatcher components is split between two master processes. Tison is working on a fix. [6]
* [FLINK-14107] [1.9.0] [1.8.2] [kinesis] When using event time alignment with the Kinesis Consumer the consumer might deadlock in one corner case. Fixed for 1.9.1 and 1.8.3. [7]

[6] https://issues.apache.org/jira/browse/FLINK-14010
[7] https://issues.apache.org/jira/browse/FLINK-14107

Events, Blog Posts, Misc
===

* Upcoming Meetups
  * *Enrico Canzonieri* of Yelp and *David Massart* of BNP Paribas (tentative) will share the Apache Flink user stories of Yelp and BNP Paribas at the next *Bay Area Apache Flink Meetup* on the 24th of September. [8]
* *Ana Esguerra* has published a blog post on how to run Flink on YARN with Kerberos for Kafka & YARN. 
[9] [8] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/262680261/ [9] https://medium.com/@minyodev/apache-flink-on-yarn-with-kerberos-authentication-adeb62ef47d2 Cheers, Konstantin -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2019/39
Dear community,

happy to share this week's community update with news about Flink 1.10 & 1.9.1, two FLIPs for better programmatic job and cluster control, improvements to the web user interface and a bit more.

Flink Development
==

* [releases] Jark has started a discussion on releasing a first patch release for Flink 1.9. Looking at the discussion I expect a first release candidate soonish. [1]
* [releases] Yu Li has shared a small progress report on Flink 1.10, which gives a very good overview of ongoing FLIPs including their status. [2]
* [ui] Yadong has started a discussion on contributing improvements to the Flink Web UI. It includes a variety of pretty cool changes such as a better overview of the cluster resources, flame graphs, additional backpressure monitoring and much more. Best you check it out yourself. [3]
* [python] FLIP-58 aims to support stateless Python UDFs in the Table API. Wei has shared a design document to support the usage of Python dependencies inside such UDFs and is looking for feedback. [4]
* [sql] There are currently three ongoing votes for previously covered FLIPs in the SQL ecosystem (FLIP-66 (Time-Attribute in SQL DDL), FLIP-57 (Function Catalog), FLIP-68 (Modular Plugins)). [5,6,7]
* [client] A couple of weeks ago Zili had started a discussion on how to improve Flink's API for job submission and job and cluster management. As a result of this discussion Kostas has now shared a design document to refactor the (Stream)ExecutionEnvironments. The basic idea is to move the responsibility of job submission out of the ExecutionEnvironment into Executors. This would result in only one ExecutionEnvironment per API in the future, and one Executor per environment (YARN, Standalone, ...). The ExecutionEnvironment would then choose the correct Executor based on its configuration. 
[8] * [client] The second FLIP that originated from this discussion was also shared by Zili this week and proposes to expose JobClient, ClusterClient and ClusterDescriptor to users to programmatically control Flink Jobs and Clusters during their runtime. [9] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-9-1-tp33343.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Progress-of-Apache-Flink-1-10-1-tp33570.html [3] https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit# [4] https://docs.google.com/document/d/1vq5J3TSyhscQXbpRhz-Yd3KCX62PBJeC_a_h3amUvJ4/edit#heading=h.lvy7nudjmhjd [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-66-Support-Time-Attribute-in-SQL-DDL-2-tp33513.html [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-57-Rework-FunctionCatalog-tp33373.html [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-68-Extend-Core-Table-System-with-Modular-Plugins-tp33372.html [8] https://cwiki.apache.org/confluence/display/FLINK/FLIP-73%3A+Introducing+Executors+for+job+submission [9] https://cwiki.apache.org/confluence/display/FLINK/FLIP-74:+Flink+JobClient+API Notable Bugs == [FLINK-14145] [1.9.0] When preferring checkpoints (as opposed to savepoints) for recovery, the wrong checkpoint could be chosen for recovery in certain situations. Fixed for 1.9.1. [10] [FLINK-13708] [1.9.0] When calling "execute" twice on the same table environment, transformations from the first execution will also be part of the second execution. PR pending, might get into 1.9.1. [11] [10] https://issues.apache.org/jira/browse/FLINK-14145 [11] https://issues.apache.org/jira/browse/FLINK-13708 Events, Blog Posts, Misc === * Flink Forward Europe is getting close, 7th - 9th of October. Keynotes by AWS, Cloudera, Cloud Foundry Foundation, Google and Ververica. 
[12]

* Upcoming Meetups
  * Tomorrow *Dean Shaw* and *Max McKittrick* will talk about click stream analysis at scale at Capital One at the next dataCouncil.ai NYC Data Engineering meetup. [13]

[12] https://europe-2019.flink-forward.org/
[13] https://www.meetup.com/DataCouncil-AI-NYC-Data-Engineering-Science/events/264748638/

Cheers,
Konstantin

--
Konstantin Knauf | Solutions Architect
+49 160 91394525
Follow us @VervericaData
Ververica <https://www.ververica.com/>
--
Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference
Stream Processing | Event Driven | Real Time
--
Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
--
Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
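The FLIP-73 idea described in the update above (the environment builds a pipeline and picks an Executor from its configuration; the Executor owns asynchronous submission) can be sketched with self-contained, illustrative Java. All names here (`Pipeline`, `Executor`, `LocalExecutor`, `executorFor`) are assumptions for the sketch, not the actual Flink interfaces:

```java
import java.util.concurrent.CompletableFuture;

// Minimal model of the FLIP-73 separation of concerns: execute() just
// submits a pipeline; deployment-target specifics live in Executor
// implementations chosen from configuration.
public class ExecutorSketch {
    interface Pipeline { int run(); }                     // stand-in for a job graph

    interface Executor {                                  // one implementation per deployment target
        CompletableFuture<Integer> execute(Pipeline pipeline);
    }

    static class LocalExecutor implements Executor {      // e.g. standalone; YARN etc. would be siblings
        public CompletableFuture<Integer> execute(Pipeline p) {
            return CompletableFuture.supplyAsync(p::run); // submission is asynchronous
        }
    }

    static Executor executorFor(String target) {          // the environment picks the executor
        if ("local".equals(target)) return new LocalExecutor();
        throw new IllegalArgumentException("unknown target: " + target);
    }

    public static void main(String[] args) throws Exception {
        Pipeline job = () -> 40 + 2;
        Executor executor = executorFor("local");         // chosen based on configuration
        System.out.println("result=" + executor.execute(job).get());
    }
}
```

The design benefit motivating the FLIP is visible even in this toy: the code building the job never mentions YARN or standalone specifics, so a single ExecutionEnvironment per API can serve all deployment targets.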
Re: [DISCUSS] FLIP-74: Flink JobClient API
ent/d/1E-8UjOLz4QPUTxetGWbU23OlsIH9VIdodpTsxwoQTs0/edit?disco=DnLLvM8 > >>>> [2] > >>>> > >> > https://lists.apache.org/x/thread.html/dc3a541709f96906b43df4155373af1cd09e08c3f105b0bd0ba3fca2@%3Cdev.flink.apache.org%3E > >>>> > >>>> > >>>> > >>>> > >>>> Kostas Kloudas 于2019年9月25日周三 下午9:29写道: > >>>> > >>>>> Hi Tison, > >>>>> > >>>>> Thanks for the FLIP and launching the discussion! > >>>>> > >>>>> As a first note, big +1 on providing/exposing a JobClient to the > users! > >>>>> > >>>>> Some points that would be nice to be clarified: > >>>>> 1) You mention that we can get rid of the DETACHED mode: I agree that > >>>>> at a high level, given that everything will now be asynchronous, > there > >>>>> is no need to keep the DETACHED mode but I think we should specify > >>>>> some aspects. For example, without the explicit separation of the > >>>>> modes, what happens when the job finishes. Does the client > >>>>> periodically poll for the result always or the result is pushed when > >>>>> in NON-DETACHED mode? What happens if the client disconnects and > >>>>> reconnects? > >>>>> > >>>>> 2) On the "how to retrieve a JobClient for a running Job", I think > >>>>> this is related to the other discussion you opened in the ML about > >>>>> multi-layered clients. First of all, I agree that exposing different > >>>>> "levels" of clients would be a nice addition, and actually there have > >>>>> been some discussions about doing so in the future. Now for this > >>>>> specific discussion: > >>>>> i) I do not think that we should expose the > >>>>> ClusterDescriptor/ClusterSpecification to the user, as this ties us > to > >>>>> a specific architecture which may change in the future. > >>>>> ii) I do not think it should be the Executor that will provide a > >>>>> JobClient for an already running job (only for the Jobs that it > >>>>> submits). The job of the executor should just be to execute() a > >>>>> pipeline. 
> >>>>> iii) I think a solution that respects the separation of concerns > >>>>> could be the addition of another component (in the future), something > >>>>> like a ClientFactory, or ClusterFactory that will have methods like: > >>>>> ClusterClient createCluster(Configuration), JobClient > >>>>> retrieveJobClient(Configuration , JobId), maybe even (although not > >>>>> sure) Executor getExecutor(Configuration ) and maybe more. This > >>>>> component would be responsible to interact with a cluster manager > like > >>>>> Yarn and do what is now being done by the ClusterDescriptor plus some > >>>>> more stuff. > >>>>> > >>>>> Although under the hood all these abstractions (Environments, > >>>>> Executors, ...) underneath use the same clients, I believe their > >>>>> job/existence is not contradicting but they simply hide some of the > >>>>> complexity from the user, and give us, as developers some freedom to > >>>>> change in the future some of the parts. For example, the executor > will > >>>>> take a Pipeline, create a JobGraph and submit it, instead of > requiring > >>>>> the user to do each step separately. This allows us to, for example, > >>>>> get rid of the Plan if in the future everything is DataStream. > >>>>> Essentially, I think of these as layers of an onion with the clients > >>>>> being close to the core. The higher you go, the more functionality is > >>>>> included and hidden from the public eye. > >>>>> > >>>>> Point iii) by the way is just a thought and by no means final. I also > >>>>> like the idea of multi-layered clients so this may spark up the > >>>>> discussion. > >>>>> > >>>>> Cheers, > >>>>> Kostas > >>>>> > >>>>> On Wed, Sep 25, 2019 at 2:21 PM Aljoscha Krettek < > aljos...@apache.org> > >>>>> wrote: > >>>>>> > >>>>>> Hi Tison, > >>>>>> > >>>>>> Thanks for proposing the document! I had some comments on the > >> document. 
> >>>>>> > >>>>>> I think the only complex thing that we still need to figure out is > >> how > >>>>> to get a JobClient for a job that is already running. As you > mentioned > >> in > >>>>> the document. Currently I’m thinking that its ok to add a method to > >>>>> Executor for retrieving a JobClient for a running job by providing an > >> ID. > >>>>> Let’s see what Kostas has to say on the topic. > >>>>>> > >>>>>> Best, > >>>>>> Aljoscha > >>>>>> > >>>>>>> On 25. Sep 2019, at 12:31, Zili Chen wrote: > >>>>>>> > >>>>>>> Hi all, > >>>>>>> > >>>>>>> Summary from the discussion about introducing Flink JobClient > >> API[1] > >>>>> we > >>>>>>> draft FLIP-74[2] to > >>>>>>> gather thoughts and towards a standard public user-facing > >> interfaces. > >>>>>>> > >>>>>>> This discussion thread aims at standardizing job level client API. > >>>>> But I'd > >>>>>>> like to emphasize that > >>>>>>> how to retrieve JobClient possibly causes further discussion on > >>>>> different > >>>>>>> level clients exposed from > >>>>>>> Flink so that a following thread will be started later to > >> coordinate > >>>>>>> FLIP-73 and FLIP-74 on > >>>>>>> expose issue. > >>>>>>> > >>>>>>> Looking forward to your opinions. > >>>>>>> > >>>>>>> Best, > >>>>>>> tison. 
> >>>>>>> > >>>>>>> [1] > >>>>>>> > >>>>> > >> > https://lists.apache.org/thread.html/ce99cba4a10b9dc40eb729d39910f315ae41d80ec74f09a356c73938@%3Cdev.flink.apache.org%3E > >>>>>>> [2] > >>>>>>> > >>>>> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-74%3A+Flink+JobClient+API > >>>>>> > >>>>> > >>>> > > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [DISCUSS] FLIP-74: Flink JobClient API
Hi Thomas, maybe there is a misunderstanding. There is no plan to deprecate anything in the REST API in the process of introducing the JobClient API, and it shouldn't. Since "cancel with savepoint" was already deprecated in the REST API and CLI, I am just raising the question whether to add it to the JobClient API in the first place. Best, Konstantin On Mon, Sep 30, 2019 at 1:16 AM Thomas Weise wrote: > I did not realize there was a plan to deprecate anything in the REST API? > > The REST API is super important for tooling written in non JVM languages, > that does not include a Flink client (like FlinkK8sOperator). The REST API > should continue to support all job management operations, including job > submission. > > Thomas > > > On Sun, Sep 29, 2019 at 1:37 PM Konstantin Knauf > > wrote: > > > Hi Zili, > > > > thanks for working on this topic. Just read through the FLIP and I have > two > > questions: > > > > * should we add "cancelWithSavepeoint" to a new public API, when we have > > deprecated the corresponding REST API/CLI methods? In my understanding > > there is no reason to use it anymore. > > * should we call "stopWithSavepoint" simply "stop" as "stop" always > > performs a savepoint? > > > > Best, > > > > Konstantin > > > > > > > > On Fri, Sep 27, 2019 at 10:48 AM Aljoscha Krettek > > wrote: > > > > > Hi Flavio, > > > > > > I agree that this would be good to have. But I also think that this is > > > outside the scope of FLIP-74, I think it is an orthogonal feature. > > > > > > Best, > > > Aljoscha > > > > > > > On 27. 
Sep 2019, at 10:31, Flavio Pompermaier > > > wrote: > > > > > > > > Hi all, > > > > just a remark about the Flink REST APIs (and its client as well): > > almost > > > > all the times we need a way to dynamically know which jobs are > > contained > > > in > > > > a jar file, and this could be exposed by the REST endpoint under > > > > /jars/:jarid/entry-points (a simple way to implement this would be to > > > check > > > > the value of Main-class or Main-classes inside the Manifest of the > jar > > if > > > > they exists [1]). > > > > > > > > I understand that this is something that is not strictly required to > > > > execute Flink jobs but IMHO it would ease A LOT the work of UI > > developers > > > > that could have a way to show the users all available jobs inside a > > jar + > > > > their configurable parameters. > > > > For example, right now in the WebUI, you can upload a jar and then > you > > > have > > > > to set (without any autocomplete or UI support) the main class and > > their > > > > params (for example using a string like --param1 xx --param2 yy). > > > > Adding this functionality to the REST API and the respective client > > would > > > > enable the WebUI (and all UIs interacting with a Flink cluster) to > > > prefill > > > > a dropdown list containing the list of entry-point classes (i.e. > Flink > > > > jobs) and, once selected, their required (typed) parameters. > > > > > > > > Best, > > > > Flavio > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-10864 > > > > > > > > On Fri, Sep 27, 2019 at 9:16 AM Zili Chen > > wrote: > > > > > > > >> modify > > > >> > > > >> /we just shutdown the cluster on the exit of client that running > > inside > > > >> cluster/ > > > >> > > > >> to > > > >> > > > >> we just shutdown the cluster on both the exit of client that running > > > inside > > > >> cluster and the finish of job. 
> > > >> Since client is running inside cluster we can easily wait for the > end > > of > > > >> two both in ClusterEntrypoint. > > > >> > > > >> > > > >> Zili Chen 于2019年9月27日周五 下午3:13写道: > > > >> > > > >>> About JobCluster > > > >>> > > > >>> Actually I am not quite sure what we gains from DETACHED > > configuration > > > on > > > >>> cluster side. > > > >>> We don't have a NON-DETACHED JobCluster in fact in our codebase, > > right? >
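The JobClient shape debated in this FLIP-74 thread, and Konstantin's point that "stop" always takes a savepoint (making a separate cancel-with-savepoint redundant on a new API), can be sketched as follows. This is an illustrative mock only; the method names and return types are assumptions, not the final Flink interface:

```java
import java.util.concurrent.CompletableFuture;

// Toy model of the discussed job-level client: a graceful "stop" that
// reports the resulting savepoint path, and an ungraceful "cancel" that
// takes no savepoint (recovery would fall back to the last checkpoint).
public class JobClientSketch {
    interface JobClient {
        CompletableFuture<String> stopWithSavepoint(String savepointDir); // graceful shutdown
        CompletableFuture<Void> cancel();                                 // ungraceful shutdown
    }

    static class FakeJobClient implements JobClient {
        public CompletableFuture<String> stopWithSavepoint(String dir) {
            // a real client would drain the job and return the savepoint location
            return CompletableFuture.completedFuture(dir + "/savepoint-0001");
        }
        public Void cancelMarker;
        public CompletableFuture<Void> cancel() {
            return CompletableFuture.completedFuture(null);
        }
    }

    public static void main(String[] args) throws Exception {
        JobClient client = new FakeJobClient();
        System.out.println("stopped at " + client.stopWithSavepoint("s3://savepoints").get());
    }
}
```

Everything is asynchronous (CompletableFuture) in line with the thread's observation that an explicit DETACHED mode becomes unnecessary once all client calls return futures.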
[ANNOUNCE] Weekly Community Update 2019/40
Dear community,

happy to share this week's community update including Flink 1.9.1, unaligned checkpoints, a new type inference system for UDFs in the Table API and a bit more. Enjoy.

Flink Development
==

* [releases] The vote for *Flink 1.9.1 RC1* has timed out without any votes. I guess, Flink Forward Europe preparations in combination with the Chinese national holiday... [1]
* [runtime] Arvid has published a FLIP to support *unaligned checkpoints*. Unaligned checkpoints could significantly reduce & stabilize checkpointing times in high backpressure and high load scenarios and result in lower latencies under checkpointing in general. [2]
* [runtime] Piotr has started a *survey* on dropping the *"non credit-based flow control"-mode* in Flink's network stack. No replies so far. So, if you are relying on it, now would be the time to raise your hand. [3]
* [logging] Gyula has started a discussion on adding *contextual information* (taskmanager, container, job id) to *Flink's logs* and is looking for feedback. [4]
* [sql] Timo has published FLIP-65, which proposes a *new type inference mechanism for Table API* user-defined functions. It is based on three levels of abstraction: automatic type inference, type hints via annotations and manual declaration. [5]
* [code-style] Zili proposes to drop "final" from method parameters under the assumption that all method parameters should be final anyway. Discussion ongoing. 
[6]

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-9-1-release-candidate-1-tp33637.html
[2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-76-Unaligned-checkpoints-tp33651.html
[3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Dropping-non-Credit-based-Flow-Control-tp33714.html
[4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improve-Flink-logging-with-contextual-information-tp33729.html
[5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-65-New-type-inference-for-Table-API-UDFs-tp33756.html
[6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/CODE-STYLE-Parameters-of-method-are-always-final-tp33686.html

Notable Bugs
==

Not a lot of action here, only 12 bug tickets updated this week.

* [FLINK-14315] [1.9.0] [1.8.3] A race condition in the JobManager (JobMaster) can result in an NPE during Flink master failover. [7]

[7] https://issues.apache.org/jira/browse/FLINK-14315

Events, Blog Posts, Misc
===

* *Parag Kesar* & *Ben Liu* of Pinterest have published a blog post on their Flink-based real-time experimentation platform. [8]
* *Oran Hirsch* of DynamicYield describes their data architecture including Apache Flink for stream processing in a recently published blog post. [9]
* There will be a full-day meetup with six talks in the Bangalore Kafka Group on the 2nd of November including at least three Flink talks by *Timo Walther* (Ververica), *Shashank Agarwal* (Razorpay) and *Rasyid Hakim* (GoJek). [10]
* *Flink Forward Europe* starts tomorrow in Berlin. Hope to see you there. 
[11] [8] https://medium.com/pinterest-engineering/real-time-experiment-analytics-at-pinterest-using-apache-flink-841c8df98dc2 [9] https://www.dynamicyield.com/blog/turning-messy-data-into-a-gold-mine/ [10] https://www.meetup.com/Bangalore-Apache-Kafka-Group/events/265285812/ [11] https://europe-2019.flink-forward.org Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2019/41
Dear community, happy to share a small community update this week with Flink Forward Europe, "Stateful Functions" and a bit more. Flink Development == * [api] Stephan proposes to contribute *Stateful Functions (statefun.io <http://statefun.io>)* to Apache Flink. Stateful Functions were announced by Ververica earlier this week at Flink Forward Europe. It is an Actor-like API, which makes it easier to write general-purpose event-driven applications on top of Flink. There has been a lot of positive feedback so far. [1] * [releases] The vote/verification of *Flink 1.9.1 RC1* is still ongoing. [2] * [python] Dian Fu started a discussion to *drop Python 2 support* in Flink 1.10. It looks like there is a consensus for this, due to Python 2's EOL in January 2020 and the additional external dependencies (e.g. Beam) added for the 1.10 release, which increase the burden of continued support. [3] * [config] In a previous update, I covered FLIP-54 "Evolve ConfigOption and Configuration". In the course of the discussion this FLIP has turned out to be intertwined with too many other areas and ongoing developments. Hence, Timo proposed to split it up into three FLIPs. The first of these is *FLIP-77 "Introduce ConfigOptions with Data Types"*. It proposes to add information about the described type to ConfigOptions. [4] * [checkpointing] Biao Liu has started a survey about the usage of the "*ExternallyInducedSource*" and "*WithMasterCheckpointHook*" interfaces. He is currently working on these and is looking for feedback from existing users.
[5] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/PROPOSAL-Contribute-Stateful-Functions-to-Apache-Flink-tp33913.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-9-1-release-candidate-1-tp33637.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Drop-Python-2-support-for-1-10-tp33824.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-77-Introduce-ConfigOptions-with-Data-Types-tp33902.html [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-How-do-you-use-ExternallyInducedSource-or-WithMasterCheckpointHook-tp33864.html Notable Bugs == Nothing came to my attention :) Events, Blog Posts, Misc === * This week the community gathered in Berlin for Flink Forward Europe. The recordings of the talks will probably be available as early as next week. Check https://twitter.com/flinkforward for updates. * There will be a full-day meetup with six talks in the Bangalore Kafka Group on the 2nd of November, including at least three Flink talks by *Timo Walther* (Ververica), *Shashank Agarwal* (Razorpay) and *Rasyid Hakim* (GoJek). [6] [6] https://www.meetup.com/Bangalore-Apache-Kafka-Group/events/265285812/ Cheers, Konstantin (@snntrable)
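As a side note for readers unfamiliar with the FLIP-77 idea mentioned above: carrying the data type inside the option itself lets one central piece of code handle parsing, defaults and validation. The sketch below is a deliberately simplified, hypothetical illustration in plain Java — it is not Flink's actual ConfigOption API, and the class and field names are assumptions made for this example only.

```java
import java.util.Map;

// Hypothetical, heavily simplified sketch -- NOT Flink's actual API.
// The point: the option records its type, so reading a raw String
// config map can be parsed and validated in one place.
final class ConfigOption<T> {
    final String key;
    final Class<T> type;
    final T defaultValue;

    ConfigOption(String key, Class<T> type, T defaultValue) {
        this.key = key;
        this.type = type;
        this.defaultValue = defaultValue;
    }

    // Central typed lookup: parse the raw string according to the
    // declared type, or fall back to the typed default.
    T get(Map<String, String> conf) {
        String raw = conf.get(key);
        if (raw == null) {
            return defaultValue;
        }
        if (type == Integer.class) {
            return type.cast(Integer.valueOf(raw));
        }
        if (type == Boolean.class) {
            return type.cast(Boolean.valueOf(raw));
        }
        return type.cast(raw);
    }
}

public class TypedOptionsDemo {
    static final ConfigOption<Integer> PARALLELISM =
            new ConfigOption<>("parallelism.default", Integer.class, 1);

    public static void main(String[] args) {
        // Configured value is parsed to the declared type.
        System.out.println(PARALLELISM.get(Map.of("parallelism.default", "4"))); // 4
        // Missing key falls back to the typed default.
        System.out.println(PARALLELISM.get(Map.of())); // 1
    }
}
```

The actual FLIP discusses considerably more (nested types, validation, documentation generation), but the core idea of attaching the type to the option is captured above.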
[ANNOUNCE] Weekly Community Update 2019/42
Dear community, happy to share this week's community update with the release of Flink 1.9.1, a couple of threads about our development process, the SQL ecosystem and more. Flink Development == * [releases] *Apache Flink 1.9.1* was released. [1,2] * [statefun] Stephan has started a separate discussion on whether to maintain *Stateful Functions* in a separate repository or in the Flink main repository after its contribution to Apache Flink. The majority seems to prefer a separate repository, e.g. to enable quicker iterations on the new code base and not to overwhelm new contributors to Stateful Functions. [3] * [development process] The *NOTICE* file and the directory for licenses of bundled dependencies for binary releases are now auto-generated during the release process. [4] * [development process] According to our *FLIP process* the introduction of a new config option requires a FLIP (and vote). Aljoscha has started a discussion to clarify this point, as this is currently not always the case. Looks like the majority leans towards a vote for every configuration change, but possibly making it more lightweight than a proper FLIP. [5] * [development process] Xiyuan gave an update on *Flink's ARM support.* Travis ARM support is in alpha now (an alternative to the previously proposed OpenLab), and regardless of the CI system Xiyuan points the community to a list of PRs/Jiras which need to be solved/reviewed. [6] * [configuration] The discussion on FLIP-59 to make the execution configuration (ExecutionConfig et al.) configurable via the Flink Configuration has been revived a bit and now focuses on alignment with FLIP-73 and on the naming of the different configurations. [7] * [sql] Based on feedback from the user community, Timo proposes to rename the "ANY" data type to "OPAQUE", highlighting that a field of this type does not hold an arbitrary type, but a data type that is unknown to Flink. [8] * [sql] Jark has started a discussion on FLIP-80 [9] about how to de/serialize expressions in catalogs.
[10] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-9-1-released-tp34170.html [2] https://flink.apache.org/news/2019/10/18/release-1.9.1.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Stateful-Functions-in-which-form-to-contribute-same-or-different-repository-tp34034.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/NOTICE-Binary-licensing-is-now-auto-generated-tp34121.html [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-policy-for-introducing-config-option-keys-tp34011p34045.html [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ARM-support-Travis-ARM-CI-is-now-in-Alpha-Release-tp34039.html [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-59-Enable-execution-configuration-from-Configuration-object-tp32359.html [8] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Rename-the-SQL-ANY-type-to-OPAQUE-type-tp34162.html [9] https://docs.google.com/document/d/1LxPEzbPuEVWNixb1L_USv0gFgjRMgoZuMsAecS_XvdE/edit?usp=sharing [10] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISUCSS-FLIP-80-Expression-String-Serializable-and-Deserializable-tp34146.html Notable Bugs == * [FLINK-14429] [1.9.1] [1.8.2] When you run a batch job on YARN in non-detached mode, it will be reported as SUCCEEDED even when it actually FAILED. [11] [11] https://issues.apache.org/jira/browse/FLINK-14429 Events, Blog Posts, Misc === * This discussion on the dev@ mailing list might be interesting to follow for people using the StreamingFileSink or BucketingSink with S3. [12] * There will be a full-day meetup with six talks in the Bangalore Kafka Group on the 2nd of November, including at least three Flink talks by *Timo Walther* (Ververica), *Shashank Agarwal* (Razorpay) and *Rasyid Hakim* (GoJek).
[13] [12] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/performances-of-S3-writing-with-many-buckets-in-parallel-tp34021p34050.html [13] https://www.meetup.com/Bangalore-Apache-Kafka-Group/events/265285812/ Cheers, Konstantin (@snntrable)
[ANNOUNCE] Weekly Community Update 2019/43
go-area-Hadoop-User-Group-CHUG/events/265675851/ [12] https://www.meetup.com/Athens-Big-Data/events/265957761/ [13] https://www.meetup.com/Bangalore-Apache-Kafka-Group/events/265285812/ Cheers, Konstantin (@snntrable)
Re: [VOTE] Accept Stateful Functions into Apache Flink
+1 (binding) votes so far from Fabian Hueske, Thomas Weise, Timo Walther, Till Rohrmann and Robert Metzger, with further replies from Kurt and Jingsong Lee.

Stephan Ewen's original announcement:

> This is the official vote whether to accept the Stateful Functions code contribution to Apache Flink.
>
> The current Stateful Functions code, documentation, and website can be found here:
> https://statefun.io/
> https://github.com/ververica/stateful-functions
>
> This vote should capture whether the Apache Flink community is interested in accepting, maintaining, and evolving Stateful Functions. Reiterating my original motivation, I believe that this project is a great match for Apache Flink, because it helps Flink to grow the community into a new set of use cases. We see current users interested in such use cases, but they are not well supported by Flink as it currently is. I also personally commit to put time into making sure this integrates well with Flink and that we grow contributors and committers to maintain this new component well.
>
> This is a "Adoption of a new Codebase" vote as per the Flink bylaws [1]. Only PMC votes are binding. The vote will be open at least 6 days (excluding weekends), meaning until Tuesday Oct. 29th 12:00 UTC, or until we achieve the 2/3rd majority.
>
> Happy voting!
>
> Best,
> Stephan
>
> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
[ANNOUNCE] Weekly Community Update 2019/44
Dear community, happy to share this week's community digest with updates on the ongoing release cycle, end-to-end performance testing, changes to the Table API and a discussion on the Flink Per-Job Mode (aka Application Clusters). Flink Development == * [releases] We have about one month until the planned feature freeze for *Flink 1.10* and Gary has shared another status update. Check it out for a good overview of the ongoing development threads. [1] * [development process] Yu proposes to integrate *end-to-end performance tests* into our build process. For this he has published FLIP-83 [2], which describes two benchmark jobs and a list of configuration scenarios to consider. In a first step the tests would focus on throughput in a small standalone cluster. [3] * [development process] Yu reminds all FLIP authors to keep the *FLIP status* in Confluence up-to-date to facilitate release planning. [4] * [sql] Terry Wang proposes a few changes to the *API of the TableEnvironment* (TableEnvironment#execute/sqlQuery/sqlUpdate etc.) to better support asynchronous submission and multi-line SQL statements, and to make the API more consistent overall. [5] * [clients] Tison has started a discussion on *Flink's per-job mode* and highlights two shortcomings of the current implementation. First, the per-job mode only allows a single JobGraph to be executed. Second, on YARN, the JobGraph is compiled on the client side, not in the Flink Master as in the Standalone per-job mode. There has already been quite some feedback on the proposal, which mostly emphasizes the need for multiple JobGraphs in per-job mode and the advantages of the current client-side compilation of the JobGraph. [6] * [connectors] *flink-orc* has been moved from "flink-connectors" to "flink-formats".
[7,8] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Progress-of-Apache-Flink-1-10-2-tp34585.html [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-83-Flink-End-to-end-Performance-Testing-Framework-tp34517.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Reminder-please-update-FLIP-document-to-latest-status-tp34555.html [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-84-Improve-Refactor-API-of-Table-Module-tp34537.html [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Semantic-and-implementation-of-per-job-mode-tp34502p34520.html [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Move-flink-orc-to-flink-formats-from-flink-connectors-tp34438.html [8] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Move-flink-orc-to-flink-formats-from-flink-connectors-tp34496.html Notable Bugs == * [FLINK-14546] [1.9.1] [1.8.2] The *RabbitMQSource* might leave open consumers around after a job is cancelled or stopped. PR available. [9] [9] https://issues.apache.org/jira/browse/FLINK-14546 Events, Blog Posts, Misc === * *Becket Qin* is now a member of the Apache Flink PMC. Congratulations! [10] * *Eura Nova* has published a Flink Forward Europe recap on their blog, including key takeaways and summaries of selected talks. [11] * *Preetdeep Kumar* has published the first part of a series of articles on Streaming ETL with Apache Flink on DZone. [12] * Upcoming Meetups * There will be a Flink/Spark talk at the next Chicago Big Data [13] on the 7th of November. No idea what it will be about (can't join the group) :) * At the next Athens Big Data Group on the 14th of November *Chaoran Yu* of Lightbend will talk about Flink and Spark on Kubernetes.
[14] * We will have our next Apache Flink Meetup in Munich on November 27th with talks by *Heiko Udluft & Giuseppe Sirigu*, Airbus, and *Konstantin Knauf* (on Stateful Functions). [15] [10] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Becket-Qin-joins-the-Flink-PMC-tp34400p34452.html [11] https://research.euranova.eu/flink-forward-the-key-takeaways/ [12] https://dzone.com/articles/introduction-to-streaming-etl-with-apache-flink [13] https://www.meetup.com/Chicago-area-Hadoop-User-Group-CHUG/events/265675851/ [14] https://www.meetup.com/Athens-Big-Data/events/265957761/ [15] https://www.meetup.com/Apache-Flink-Meetup-Munich/events/266072196/ Cheers, Konstantin (@snntrable)
[ANNOUNCE] Weekly Community Update 2019/45
Dear community, happy to share this week's community digest with updates on Stateful Functions & Flink 1.8.3, a discussion on a connector for Hortonworks/Cloudera's schema registry, a couple of meetups and a bit more. Flink Development == * [stateful functions] After the successful vote to accept *Stateful Functions* into Flink, Igal has started a thread to discuss a few details of the contribution like repository name, mailing lists, component name, etc. [1] * [releases] Jincheng has started a conversation about the release of *Flink 1.8.3*. Probably still waiting for a few fixes to come in, but it looks like there could be a first RC soon. [2] * [connectors] Őrhidi Mátyás and Gyula propose to contribute a connector for the *Hortonworks/Cloudera Schema Registry*, which can be used during de-/serialization in Flink's Kafka connector. [3] * [python] Apache Flink will start Python processes to execute *Python UDFs* in the Table API, planned for 1.10. Dian Fu has published a proposal on how to integrate the resource requirements of these Python processes into the unified memory configuration framework, which is currently being introduced in FLIP-49. [4] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Stateful-Functions-Contribution-Details-tp34737.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-8-3-tp34811.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Avro-Cloudera-Registry-FLINK-14577-tp34647.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-PyFlink-User-Defined-Function-Resource-Management-tp34631.html Notable Bugs == * [FLINK-14382] [1.9.1] [1.8.2] [yarn] In Flink 1.9+ filesystem dependencies can be loaded via a plugin mechanism, each with its own classloader. This currently does not work on YARN, where the plugin directory is directly added to the classpath instead.
[5] [5] https://issues.apache.org/jira/browse/FLINK-14382 Events, Blog Posts, Misc === * *Jark Wu* is now a member of the Apache Flink PMC. Congratulations! [6] * This blog post by *Sreekanth Krishnavajjala & Vinod Kataria (AWS)* includes a hands-on introduction to Apache Flink on AWS EMR. [7] * Upcoming Meetups * At the next Athens Big Data Group on the 14th of November *Chaoran Yu* of Lightbend will talk about Flink and Spark on Kubernetes. [8] * *Bowen Li* will speak about "The Rise of Apache Flink and Stream Processing" at the next Big Data Bellevue in Seattle on the 20th of November. [9] * The next edition of the Bay Area Apache Flink meetup will happen on the 20th of November with talks by *Gyula Fora (Cloudera)* and *Lakshmi Rao (Lyft)*. [10] * We will have our next Apache Flink Meetup in Munich on November 27th with talks by *Heiko Udluft & Giuseppe Sirigu*, Airbus, and *Konstantin Knauf* (on Stateful Functions). [11] [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Jark-Wu-is-now-part-of-the-Flink-PMC-tp34768.html [7] https://idk.dev/extract-oracle-oltp-data-in-real-time-with-goldengate-and-query-from-amazon-athena/ [8] https://www.meetup.com/Athens-Big-Data/events/265957761/ [9] https://www.meetup.com/Big-Data-Bellevue-BDB/events/fxbnllyzpbbc/ [10] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/266226960/ [11] https://www.meetup.com/Apache-Flink-Meetup-Munich/events/266072196/ Cheers, Konstantin (@snntrable)
Re: [DISCUSS] FLIP-86: Improve Connector Properties
Hi Dawid, Hi Jark, in my experience it is very important to be able to forward arbitrary properties to the underlying KafkaClient. This seems to be possible in both cases. I am leaning towards Jark's original suggestion. Flink's documentation would only need to state that it forwards everything under "connector.properties" to the KafkaClient. If the external system has a different formatting scheme, we could still document this on a per-connector basis (use Flink's formatting scheme and document how it is translated to the external system's native properties) or if possible use the external system's scheme under "connector.properties". I don't really know the limitations on our side w.r.t this. Cheers, Konstantin On Wed, Nov 13, 2019 at 3:36 PM Dawid Wysakowicz wrote: > Hi Jark, > > Majority of the changes make sense to me. I think they will be helpful for > the users. I have a slight concern about the > > kafka's connector.properties property :) . I wonder if we should have > them under a single key: > > connector.properties: > `zookeeper.connect`:`localhost:2181`;`bootstrap.servers`:`localhost:9092` > > As far as I understand it, these are mostly the properties that are > forwarded straight to the underlying KafkaClient. Such properties are > mostly defined and documented by the kafka itself rather than flink. They > might also follow a different formatting scheme than we have for our > properties. Moreover how do we decide which properties are put into the > Properties object and which are not? I would be happy to hear what others > think about that part, as I am not convinced myself about that part. > > Best, > > Dawid > On 13/11/2019 13:22, Jark Wu wrote: > > Hi everyone, > > Connector properties is a very basic component which is used to construct a > connector table via YAML, descriptor API, or DDL. It is also the > serialization representation when persisting into catalog. 
However, we have > encountered some problems when using connector properties, especially in > DDL. This FLIP is aiming to solve following two issues: > > - FLINK-14645: Data types defined in DDL loses precision and nullability > when converting to properties. > - FLINK-14649: Some properties structure is hard to define in DDL, e.g. > List and Map structure. > > This FLIP proposes new properties to represent data types in schema and to > flatten properties in existing connectors. > > FLIP-86:https://docs.google.com/document/d/14MlXFB45NUmUtesFMFZFjFhd4pqmYamxbfNwUPrfdL4/edit# > > Thanks for any feedback! > > Best, > Jark > >
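To make the "forward everything under connector.properties" idea from the thread above concrete, here is a minimal sketch in plain Java: collect every table option under an agreed prefix, strip the prefix, and hand the remainder to the Kafka client untouched. The prefix constant and method names below are illustrative assumptions for this example, not Flink's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Illustrative sketch of prefix-based property forwarding.
// "connector.properties." and the method names are assumptions.
public class PropertyForwarding {
    static final String PREFIX = "connector.properties.";

    // Extract everything under the prefix into a Properties object
    // that could be passed straight to a Kafka client. Keys that are
    // defined and documented by Kafka itself are not interpreted at all.
    static Properties extractClientProperties(Map<String, String> tableOptions) {
        Properties clientProps = new Properties();
        for (Map.Entry<String, String> e : tableOptions.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                // Strip the prefix; the remainder is the native client key.
                clientProps.setProperty(
                        e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return clientProps;
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("connector.type", "kafka");
        options.put("connector.properties.bootstrap.servers", "localhost:9092");
        options.put("connector.properties.zookeeper.connect", "localhost:2181");

        Properties kafka = extractClientProperties(options);
        System.out.println(kafka.getProperty("bootstrap.servers")); // localhost:9092
        System.out.println(kafka.containsKey("connector.type"));    // false
    }
}
```

This also illustrates Dawid's question above: with a plain prefix, the decision of which options go into the Properties object is purely syntactic (everything under the prefix), which is simple but leaves validation entirely to the external system.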
Re: [DISCUSSION] Kafka Metrics Reporter
Hi Gyula, thank you for proposing this. +1 for adding a KafkaMetricsReporter. In terms of the dependency we could go a similar route as for the "universal" Flink Kafka Connector which to my knowledge always tracks the latest Kafka version as of the Flink release and relies on compatibility of the underlying KafkaClient. JSON sounds good to me. Cheers, Konstantin On Sun, Nov 17, 2019 at 1:46 PM Gyula Fóra wrote: > Hi all! > > Several users have asked in the past about a Kafka based metrics reporter > which can serve as a natural connector between arbitrary metric storage > systems and a straightforward way to process Flink metrics downstream. > > I think this would be an extremely useful addition but I would like to hear > what others in the dev community think about it before submitting a proper > proposal. > > There are at least 3 questions to discuss here: > > > *1. Do we want the Kafka metrics reporter in the Flink repo?*As it is > much more generic than other metrics reporters already included, I would > say yes. Also as almost everyone uses Flink with Kafka it would be a > natural reporter choice for a lot of users. > *2. How should we handle the Kafka dependency of the connector?* > I think it would be an overkill to add different Kafka versions here, > so I would use Kafka 2.+ which has the best compatibility and is future > proof > *3. What message format should we use?* > I would go with JSON for readability and compatibility > > There is a relevant JIRA open for this already. > https://issues.apache.org/jira/browse/FLINK-14531 > > We at Cloudera also promote this as a scalable way of pushing metrics to > other systems so we are very happy to contribute an implementation or > cooperate with others on building it. > > Please let me know what you think! 
> > Cheers, > Gyula >
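As a rough illustration of the JSON message format Gyula suggests, such a reporter could serialize each metric update along these lines. All field names and the overall schema here are assumptions made for the sake of the example; a real implementation would publish these records to a configurable topic via a KafkaProducer and use a proper JSON library for escaping.

```java
// Hedged sketch of a JSON payload for a Kafka-based metrics reporter.
// Field names ("scope", "name", "value", "timestamp") are assumptions,
// not a format defined anywhere in Flink.
public class MetricJson {

    static String toJson(String scope, String name, double value, long timestamp) {
        // Naive formatting for illustration only; production code should
        // use a JSON library to handle escaping of scope/name strings.
        return String.format(
                "{\"scope\":\"%s\",\"name\":\"%s\",\"value\":%s,\"timestamp\":%d}",
                scope, name, value, timestamp);
    }

    public static void main(String[] args) {
        // The reporter would periodically emit one such record per metric.
        System.out.println(
                toJson("taskmanager.myjob.mysource", "numRecordsIn", 42.0, 1574000000000L));
    }
}
```

A JSON payload like this keeps the downstream contract readable and language-agnostic, which matches the compatibility argument made in the thread.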
Weekly Community Update 2019/46
Dear community, happy to share this week's community update. While we are slowly approaching the planned feature freeze for Flink 1.10, things have calmed down a bit on the dev mailing list. Still, there are a couple of interesting topics to cover. This week's digest includes an update on the Flink 1.8.3 release, a proposal for a KafkaMetricsReporter, a FLIP to improve the configuration of Table API connectors and a bit more. Flink Development == * [releases] With respect to the release of *Flink 1.8.3* the main question seems to be whether to wait for flink-shaded 9.0. If not, a first release candidate can be expected within the next week. [1] * [releases] Chesnay proposes to release *flink-shaded 9.0* soon and is looking for someone to manage this release. [2] * [connectors] Stephan has shared a quick update on the progress of *FLIP-27 (New Source Interface)*: a first version will likely be available in 1.10, but it will probably take another release until connectors are migrated and things settle down a bit. [3] * [connectors, sql] Jark has started a FLIP discussion to *improve the properties format* of Table API/SQL connectors. It contains a whole list of smaller and larger improvements and it seems this will be targeted for the 1.11 release. [4] * [metrics] Gyula has started a discussion on contributing a *MetricsReporter* to write Flink's metrics to *Apache Kafka*. [5] * [development process] Dian Fu proposes to introduce a *security@f.a.o* mailing list for users to report security-related issues. There is a lot of positive feedback, but also some concerns, e.g. because there is already a cross-project secur...@apache.org mailing list.
[6] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-8-3-tp34811.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Release-flink-shaded-9-0-tp35041.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-tp24952.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-86-Improve-Connector-Properties-tp34922.html [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSSION-Kafka-Metrics-Reporter-tp35067.html [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Expose-or-setup-a-security-flink-apache-org-mailing-list-for-security-report-and-discussion-tp34950p34951.html Notable Bugs == Not much happening here. So, let's look at two edge cases, which might help one or the other user. * [FLINK-13184] [1.9.1] [1.8.2] If you are deploying to YARN and start a lot of Taskmanagers (1000s), the Resourcemanager might be blocked/unresponsive for quite some time, so that heartbeats of Taskmanagers start timing out. Yang Wang is working on a fix, which might get into 1.8.3. [7] * [FLINK-14066] If you try to *build* PyFlink on Windows, this will fail as we use a UNIX-specific path to the local Flink distribution's build target. The fix is contained in this ticket. [8] [7] https://issues.apache.org/jira/browse/FLINK-13184 [8] https://issues.apache.org/jira/browse/FLINK-14066 Events, Blog Posts, Misc === * I missed this excellent *Flink Forward Europe* recap by my colleagues *Robert* and *Fabian*, published on the Ververica blog on the 1st of November, so here it is. [9] * Upcoming Meetups * *Bowen Li* will speak about "The Rise of Apache Flink and Stream Processing" at the next Big Data Bellevue in Seattle on the 20th of November.
[10] * The next edition of the Bay Area Apache Flink meetup will happen on the 20th of November with talks by *Gyula Fora (Cloudera)* and *Lakshmi Rao (Lyft)*. [11] * We will have our next Apache Flink Meetup in Munich on November 27th with talks by *Heiko Udluft & Giuseppe Sirigu*, Airbus, and *Konstantin Knauf* (on Stateful Functions). [12] * There will be an introduction to Apache Flink, use cases and best practices at the next Uber Engineering meetup in Toronto. If you live in Toronto, it's an excellent opportunity to get started with Flink or to meet local Flink users. [13] [9] https://www.ververica.com/blog/flink-forward-europe-2019-recap [10] https://www.meetup.com/Big-Data-Bellevue-BDB/events/fxbnllyzpbbc/ [11] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/266226960/ [12] https://www.meetup.com/Apache-Flink-Meetup-Munich/events/266072196/ [13] https://www.meetup.com/Uber-Engineering-Events-Toronto/events/266264176/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg:
[ANNOUNCE] Weekly Community Update 2019/47
Dear community, happy to share this week's community digest with flink-packages.org, an update on the release of Flink 1.8.3, the Flink Forward San Francisco CfP and a couple of smaller discussions and proposals. Flink Development == * [ecosystem] Ververica has announced the launch of flink-packages.org. It is a registry of Apache Flink ecosystem projects (connectors, metrics reporters, operational tools, ...) to make it easier to find and promote these projects in the community. Everyone can add their own projects and vote on listed projects. [1] * [releases] Now that the vote for flink-shaded 9.0 has passed [2], I expect to see a first release candidate for Flink 1.8.3 soon as well [3]. * [releases] The feature freeze for Flink 1.10 was announced for December 8th. The release branch will be cut on that date. [4] * [connectors] As part of the Pulsar connector contribution, Yijie Shen has shared a first design document for the integration of Pulsar's catalog with Apache Flink. [5] * [sql] Dawid has published a FLIP (FLIP-87) to support primary key constraints in Apache Flink SQL. Primary keys (and unique constraints) can be used for query optimization, and primary keys are natural keys for upsert streams. [6] * [checkpointing] Shuwen Zhou has started a discussion on a more flexible configuration of checkpoint timing (e.g. cron-style). [7] * [webui] Chesnay proposes to remove the old web user interface. It was replaced by a new implementation in Flink 1.9 and was kept around as a backup. 
[8] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Launch-of-flink-packages-org-A-website-to-foster-the-Flink-Ecosystem-tp35091.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-flink-shaded-9-0-release-candidate-1-tp35146.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-8-3-tp34811.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Feature-freeze-for-Apache-Flink-1-10-0-release-tp35139.html [5] https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Primary-keys-in-Table-API-tp35138.html [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Cron-style-for-checkpoint-tp35194.html [8] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Remove-old-WebUI-tp35218.html Notable Bugs == Nothing worth mentioning came to my attention. Events, Blog Posts, Misc === * The Call for Presentations for Flink Forward San Francisco is now open. The conference will happen March 23rd - 25th (two days of conference, one day of training). It is the first time that we will have two full conference days in San Francisco. [9] * Upcoming Meetups * We will have our next Apache Flink Meetup in Munich on November 27th with talks by *Heiko Udluft & Giuseppe Sirigu*, Airbus, and *Konstantin Knauf* (on Stateful Functions). [10] * There will be an introduction to Apache Flink, use cases and best practices at the next Uber Engineering meetup in Toronto. If you live in Toronto, it's an excellent opportunity to get started with Flink or to meet local Flink users. [11] * The first meetup of the Apache Flink Meetup Chicago on the 5th of December comes with four talks(!) highlighting different deployment methods of Apache Flink (AWS EMR, AWS Kinesis Analytics, Ververica Platform, IBM Kubernetes). 
Talks by *Trevor Grant*, *Seth Wiesman*, *Joe Olson* and *Margarita Uk*. [12] [9] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Flink-Forward-North-America-2020-Call-for-Presentations-open-until-January-12th-2020-tp35187.html [10] https://www.meetup.com/Apache-Flink-Meetup-Munich/events/266072196/ [11] https://www.meetup.com/Uber-Engineering-Events-Toronto/events/266264176/ [12] https://www.meetup.com/Chicago-Apache-Flink-Meetup-CHAF/events/266609828/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2019/48
Dear community, happy to share a short community update this week. With one week to go to the planned feature freeze for Flink 1.10 and Flink Forward Asia in Beijing, the dev@ mailing list has been pretty quiet these days. Flink Development == * [releases] Hequn has started the vote on RC1 for Flink 1.8.3, which unfortunately has already received a -1 due to wrong/missing license information. Expecting a new RC soon. [1] * [sql] In the past, timestamp fields in Flink SQL were internally represented as longs, and it was recommended to use longs directly in user-defined functions. With the introduction of the new TimestampType the situation has changed, and conversion between long and TIMESTAMP will be disabled. [2] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-8-3-release-candidate-1-tp35401p35407.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Disable-conversion-between-TIMESTAMP-and-Long-in-parameters-and-results-of-UDXs-tp35269p35271.html Notable Bugs == * [FLINK-14930] [1.9.1] The OSS filesystem did not allow the configuration of additional credentials providers due to a shading-related bug. Resolved for 1.9.2. [3] [3] https://issues.apache.org/jira/browse/FLINK-14930 Events, Blog Posts, Misc === * Flink Forward Asia took place this week at the National Congress Center in Beijing, organized by Alibaba. Talks by Ververica, Tencent, Baidu, Alibaba, Dell, Lyft, Netflix, Cloudera and many other members of the Chinese Apache Flink community, and more than 1500 attendees as far as I heard. Impressive! [4] * At Flink Forward Asia, Alibaba announced that it has open sourced Alink, a machine learning library on top of Apache Flink. [5,6] * Upcoming Meetups * The first meetup of the Apache Flink Meetup Chicago on the 5th of December comes with four talks(!) highlighting different deployment methods of Apache Flink (AWS EMR, AWS Kinesis Analytics, Ververica Platform, IBM Kubernetes). 
Talks by *Trevor Grant*, *Seth Wiesman*, *Joe Olson* and *Margarita Uk*. [7] * On December 17th there will be the second Apache Flink meetup in Seoul. Maybe Dongwon can share the list of speakers in this thread, my Korean is a bit rusty. [8] [4] https://m.aliyun.com/markets/aliyun/developer/ffa2019 [5] https://technode.com/2019/11/28/alibaba-cloud-machine-learning-platform-open-source/ [6] https://github.com/alibaba/Alink/blob/master/README.en-US.md [7] https://www.meetup.com/Chicago-Apache-Flink-Meetup-CHAF/events/266609828/ [8] https://www.meetup.com/Seoul-Apache-Flink-Meetup/events/266824815/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
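The TIMESTAMP change mentioned in the update above mainly affects UDF signatures: code that consumed the internal long representation should move to a proper timestamp class. A plain-Java sketch of the before/after conversion (method names are hypothetical, no Flink dependencies):

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class TimestampMigration {

    // Old style: the UDF parameter was the internal representation,
    // a long holding epoch milliseconds.
    static String describeLegacy(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    // New style: with a dedicated TimestampType, the UDF accepts a
    // proper timestamp class and the long-based signature goes away.
    static String describe(LocalDateTime timestamp) {
        return timestamp.toInstant(ZoneOffset.UTC).toString();
    }

    public static void main(String[] args) {
        // The same instant through both signatures;
        // both print 1970-01-01T00:00:00Z.
        System.out.println(describeLegacy(0L));
        System.out.println(describe(LocalDateTime.of(1970, 1, 1, 0, 0)));
    }
}
```

The point of the migration is that the second signature stays correct even if the internal encoding of TIMESTAMP changes again.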
Re: [DISCUSS] Drop vendor specific deployment documentation.
+1 from my side to drop. On Mon, Dec 2, 2019 at 6:34 PM Seth Wiesman wrote: > Hi all, > > I'd like to discuss dropping vendor-specific deployment documentation from > Flink's official docs. To be clear, I am *NOT* suggesting we drop any of > the filesystem documentation, but the following three pages. > > AWS: > > https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/aws.html > Google Compute Engine: > > https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/gce_setup.html > MapR: > > https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/mapr_setup.html > > Unlike the filesystems, these docs do not refer to components maintained by > the Apache Flink community, but external commercial services and products. > None of these pages are well maintained and I do not think the open-source > community can reasonably be expected to keep them up to date. In > particular, > > >- The AWS page contains sparse information and mostly just links to the >official EMR docs. >- The Google Compute Engine page is out of date and the commands do not >work. >- MapR contains some relevant information but the community has already >dropped the MapR filesystem so I am not sure that deployment would work > (I >have not tested). > > There is also a larger question of which vendor products should be included > and which should not. That is why I would like to suggest dropping these > pages and referring users to vendor maintained documentation whenever they > are using one of these services. 
> > Seth Wiesman > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [DISCUSS] Drop Kafka 0.8/0.9
Hi Chesnay, +1 for dropping. I have not heard from any user using 0.8 or 0.9 for a long while. Cheers, Konstantin On Wed, Dec 4, 2019 at 1:57 PM Chesnay Schepler wrote: > Hello, > > What's everyone's take on dropping the Kafka 0.8/0.9 connectors from the > Flink codebase? > > We haven't touched either of them for the 1.10 release, and it seems > quite unlikely that we will do so in the future. > > We could finally close a number of test stability tickets that have been > lingering for quite a while. > > > Regards, > > Chesnay > > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2019/49
Ververica blog including a short summary of the keynotes. [14] * Upcoming Meetups * On December 17th there will be the second Apache Flink meetup in Seoul. [15] *Dongwon* has shared a detailed agenda in last week's community update. [16] [14] https://www.ververica.com/blog/flink-forward-asia-2019-summary [15] https://www.meetup.com/Seoul-Apache-Flink-Meetup/events/266824815/ [16] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Weekly-Community-Update-2019-48-td35423.html Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [DISCUSS] Improve documentation / tooling around security of Flink
Hi Robert, we could also add a warning (or a general "security" section) to the "production readiness checklist" in the documentation. Generally, I like d) in combination with an informative log message. Do you think this would cause a lot of friction? Cheers, Konstantin On Fri, Dec 13, 2019 at 2:06 PM Chesnay Schepler wrote: > Another proposal that was brought up was to provide a script for > generating an SSL certificate with the distribution. > > On 12/12/2019 17:45, Robert Metzger wrote: > > Hi all, > > > > There was recently a private report to the Flink PMC, as well as publicly > > [1] about Flink's ability to execute arbitrary code. In scenarios where > > Flink is accessible by somebody unauthorized, this can lead to issues. > > The PMC received a similar report in November 2018. > > > > I believe it would be good to warn our users a bit more prominently about > > the risks of accidentally opening up Flink to the public internet, or > other > > unauthorized entities. > > > > I have collected the following potential solutions discussed so far: > > > > a) Add a check-security.sh script, or a check into the frontend if the > > JobManager can be reached on the public internet > > b) Add a prominent warning to the download page > > c) add an opt-out warning to the Flink logs / UI that can be disabled via > > the config. > > d) Bind the REST endpoint to localhost only, by default > > > > > > I'm curious to hear if others have other ideas what to do. > > I personally like to kick things off with b). 
> > > > > > Best, > > Robert > > > > > > [1] https://twitter.com/pyn3rd/status/1197397475897692160 > > > > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
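Option d) from the thread above amounts to a conservative configuration default. A minimal sketch of what that could look like in flink-conf.yaml; note that `rest.bind-address` is an existing Flink option, while shipping `localhost` as the default is exactly what the proposal suggests, not current behavior:

```yaml
# Sketch: bind the REST endpoint (and thus the Web UI) to local
# connections only, so the JobManager is not reachable from the
# public internet by accident.
rest.bind-address: localhost
rest.port: 8081
```

Operators who need remote access would then consciously override the bind address and put authentication (e.g. a reverse proxy) in front of the endpoint.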
[ANNOUNCE] Weekly Community Update 2019/50
Dear community, happy to share this week's brief community digest with updates on Flink 1.8.3 and Flink 1.10, a discussion on how to facilitate easier Flink/Hive setups, a couple of blog posts and a bit more. *Personal Note:* Thank you for reading these updates since I started them early this year. I will take a three week Christmas break and will be back with a Holiday season community update on the 12th of January. Flink Development == * [releases] Apache Flink 1.8.3 was released on Wednesday. [1,2] * [releases] The feature freeze for Apache Flink 1.10 took place on Monday. The community is now working on testing, bug fixes and improving the documentation in order to create a first release candidate soon. [3] * [development process] Seth has revived the discussion on a past PR by Marta, which added a documentation style guide to the contributor guide. Please check it out [4] if you are contributing documentation to Apache Flink. [5] * [security] Following a recent report to the Flink PMC of "exploiting" the Flink Web UI for remote code execution, Robert has started a discussion on how to improve the tooling/documentation to make users aware of this possibility and recommend securing this interface in production setups. [6] * [sql] Bowen has started a discussion on how to simplify the Flink/Hive setup for new users, as currently users need to add some additional dependencies to the classpath manually. The discussion seems to be converging on providing a single additional hive-uber jar, which contains all the required dependencies. 
[7] [1] https://flink.apache.org/news/2019/12/11/release-1.8.3.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-8-3-released-tp35868.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Feature-freeze-for-Apache-Flink-1-10-0-release-tp35139.html [4] https://github.com/apache/flink-web/pull/240 [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Docs-Style-Guide-Review-tp35758.html [6] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improve-documentation-tooling-around-security-of-Flink-tp35898.html [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-have-separate-Flink-distributions-with-built-in-Hive-dependencies-tp35918.html Notable Bugs == [FLINK-15152] [1.9.1] When a "stop" action on a job fails because not all tasks are in "RUNNING" state, the job no longer checkpoints afterwards. [8] [8] https://issues.apache.org/jira/browse/FLINK-15152 Events, Blog Posts, Misc === * Zhu Zhu is now an Apache Flink Committer. Congratulations! [9] * Gerred Dillon has published a blog post on the Apache Flink blog on how to run Flink on Kubernetes with a KUDO Flink operator. [10] * In this blog post, Apache Flink PMC member Sun Jincheng outlines the reasons and motivation for his and his colleagues' work to provide world-class Python support for Apache Flink's Table API. [11] * Upcoming Meetups * On December 17th there will be the second Apache Flink meetup in Seoul. [12] *Dongwon* has shared a detailed agenda in last week's community update. [13] * On December 18th Alexander Fedulov will talk about Stateful Stream Processing with Apache Flink at the Java Professionals Meetup in Minsk. 
[14] [9] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Zhu-Zhu-becomes-a-Flink-committer-tp35944.html [10] https://flink.apache.org/news/2019/12/09/flink-kubernetes-kudo.html [11] https://developpaper.com/why-will-apache-flink-1-9-0-support-the-python-api/ [12] https://www.meetup.com/Seoul-Apache-Flink-Meetup/events/266824815/ [13] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Weekly-Community-Update-2019-48-td35423.html [14] https://www.meetup.com/Apache-Flink-Meetup-Minsk/events/267134296/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [ANNOUNCE] Weekly Community Update 2019/50
Hi Hequn, thanks, and thanks for the offer. Of course, you can cover the holiday break, i.e. the next three weeks. Looking forward to your updates! Cheers, Konstantin On Mon, Dec 16, 2019 at 5:53 AM Hequn Cheng wrote: > Hi Konstantin, > > Happy holidays and thanks a lot for your great job on the updates > continuously. > With the updates, it is easier for us to catch up with what's going on in > the community, which I think is quite helpful. > > I'm wondering if I can do some help and cover this during your vacation. :) > > Best, > Hequn
Re: [DISCUSS] Flink docs vendor table
+1 This gives a better overview of the deployment targets and shows our prospective users that they can rely on a broad set of vendors, if help is needed. I guess Robert means whether the vendor offers a managed service (like AWS Kinesis Analytics) or licenses software (like Ververica Platform). This would be beneficial, but on the other hand the categories/terms (managed, hosted, "serverless", self-managed) are not so well-defined in my experience. On Tue, Dec 17, 2019 at 10:46 PM Seth Wiesman wrote: > Happy to see there seems to be a consensus. > > Robert, can you elaborate on what you mean by "deployment model"? > > Seth > > On Tue, Dec 17, 2019 at 12:19 PM Robert Metzger > wrote: > > +1 to the general idea > > > > Maybe we could add "Deployment Model" in addition to "Supported > > Environments" as properties for the vendors. > > I'd say Cloudera, Eventador and Huawei [1] are missing from this page > > > > [1]https://www.huaweicloud.com/en-us/product/cs.html > > > > On Tue, Dec 17, 2019 at 5:05 PM Stephan Ewen wrote: > > > +1 for your proposed solution, Seth! > > > > > > On Tue, Dec 17, 2019 at 3:05 PM Till Rohrmann > > > wrote: > > > > Thanks for continuing this discussion Seth. I like the mockup and I > > think > > > > this is a good improvement. Modulo the completeness check, +1 for > > > offering > > > > links to 3rd party integrations. > > > > > > > > Cheers, > > > > Till > > > > > > > > On Mon, Dec 16, 2019 at 6:04 PM Seth Wiesman > > > wrote: > > > > > > > > > This discussion is a follow up to the previous thread on dropping > > > > > vendor-specific documentation[1]. > > > > > > > > > > The conversation ended unresolved on the question of what we should > > > > provide > > > > > on the Apache Flink docs. The consensus seemed to be moving towards > > > > > offering a table with links to 3rd parties. After an offline > > > conversation > > > > > with Robert, I have drafted a mock-up of what that might look > > like[2]. 
> > > > > Please note that I included a few vendors that I could think of off > > the > > > > top > > > > > of my head, the list in this picture is not complete but that is > not > > > the > > > > > conversation we are having here. > > > > > > > > > > There are three competing goals that we are trying to achieve here. > > > > > > > > > > 1) Provide information to users that vendor support is available as > > it > > > > can > > > > > be important in growing adoption within enterprises > > > > > 2) Be maintainable by the open-source Flink community > > > > > 3) Remain neutral > > > > > > > > > > Please let me know what you think > > > > > > > > > > Seth > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Drop-vendor-specific-deployment-documentation-td35457.html > > > > > [2] > > > > > > > > > > > > > > > > > > > > https://gist.githubusercontent.com/sjwiesman/bb90f0765148c15051bcc91092367851/raw/42c0a1e9240f1c5808a053f8ff5965828cca96d5/mockup.png > > > > > > > > > > > > > > > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [DISCUSS] Integrate Flink Docker image publication into Flink release process
work with the > > > > Docker Hub maintainers to make sure we continue to work within their > > > > guidelines and expectations. > > > > > > > > While it seems intuitive that integrating these images into the Flink > > > > release process is a good thing, I don’t believe it is strictly > > > necessary, > > > > since the images only package approved and signed Flink releases, and > > do > > > > not themselves build Flink from source. However, there are some > > concrete > > > > advantages: > > > > > > > >- > > > > > > > >Putting the Docker images on (almost) equal footing with Flink > > binary > > > >release artifacts will help the legitimacy of and user confidence > in > > > >running Flink in containerized environments > > > >- > > > > > > > >By publishing release candidate (and possibly nightly) images, the > > > >release testing and automated testing processes could be improved > > > >- > > > > > > > >The delay between Flink releases and when the corresponding Docker > > > >images are available will be reduced > > > > > > > > > > > > Considering all of this, I propose the following: > > > > > > > >- > > > > > > > >We move the Git repository containing the Dockerfiles from the > > > >docker-flink GitHub organization to Apache, placing it under > control > > > of > > > > the > > > >Flink PMC > > > >- > > > > > > > >We codify updating these Dockerfiles and notifying Docker Hub into > > the > > > >Flink release process > > > >- > > > > > > > > For release candidates, Dockerfiles should be added to a > special > > > > directory which will be automatically built and pushed to the > > > > Apache Docker > > > > Hub organization[7], e.g. apache/flink-rc:1.10.0-rc1 > > > > - > > > > > > > > Upon release, the appropriate “release” Dockerfiles are added > > (e.g. 
> > > > under the 1.10 directory) and release candidate Dockerfiles > > > removed, > > > > and > > > > then a pull request opened on the > docker-library/official-images > > > > repository > > > > - > > > > > > > >Optionally, we introduce “nightly” builds, with an automated > process > > > >building and pushing images to the Apache Docker Hub organization, > > > e.g. > > > >apache/flink-dev:1.10-SNAPSHOT > > > > > > > > > > > > If we choose to move forward in this direction, there are some > further > > > > steps we could take to improve the experience of both developing and > > > using > > > > Flink with Docker (these are actually mostly orthogonal to the > proposed > > > > changes above, but I think this is a natural first step and should > make > > > the > > > > following ideas easier to implement). > > > > > > > > First, there are important differences between images meant for > running > > > > Flink and those meant for development: the former should strictly > > package > > > > only released distributions of software and be as thin of a layer as > > > > possible over the software itself, while the latter can be used > during > > > > development and testing, and can easily be rebuilt from a “working > > copy” > > > of > > > > the software’s source code. > > > > > > > > By standardizing on defining such “production” images in the > > docker-flink > > > > repository and “development” image(s) in the Flink repository itself, > > it > > > is > > > > much clearer to developers and users what the right Dockerfile or > image > > > > they should use for a given purpose. To that end, we could introduce > > one > > > or > > > > more documented Maven goals or Make targets for building a Docker > image > > > > from the current source tree or a specific release (including > > unreleased > > > or > > > > unsupported versions). 
> > > > > > > > Additionally, there has been discussion among Flink contributors for > > some > > > > time about the confusing state of Dockerfiles within the Flink > > > repository, > > > > each meant for a different way of running Flink. I’m not completely > up > > to > > > > speed about these different efforts, but we could possibly solve this > > by > > > > either building additional “official” images with different > entrypoints > > > for > > > > these various purposes, or by developing an improved entrypoint > script > > > that > > > > conveniently supports all cases. I defer to Till Rohrmann, Konstantin > > > > Knauf, or Stephan Ewen for further discussion on this point. > > > > > > > > I apologize again for the wall of text, but if you made it this far, > > > thank > > > > you! These improvements have been a long time coming, and I hope we > can > > > > find a solution that serves the Flink and Docker communities well. > > Please > > > > don’t hesitate to ask any questions. > > > > > > > > -- > > > > > > > > Patrick Lucas > > > > > > > > [1] https://hub.docker.com/_/flink > > > > > > > > [2] > > > > > > > > > > > > > > https://lists.apache.org/thread.html/c50297f8659aaa59d4f2ae327b69c4d46d1ab8ecc53138e659e4fe91%40%3Cdev.flink.apache.org%3E > > > > > > > > [3] On page 2 at the time we went to press: > > > > https://hub.docker.com/search?q=&type=image&image_filter=official > > > > > > > > [4] https://github.com/docker-flink/docker-flink > > > > > > > > [5] > > > > > > > > > > > > > > https://github.com/docker-library/official-images/pulls?q=is%3Apr+label%3Alibrary%2Fflink > > > > > > > > [6] I looked at the 25 most popular “official” images (see [3]) as > well > > > as > > > > “official” images of Apache software from the top 125; all use a > > > dedicated > > > > repo > > > > [7] https://hub.docker.com/u/apache > > > > > > > > > > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- 
Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2020/02
Dear community, happy new year from my side, too, and thanks a lot to Hequn for helping out with the weekly updates during the last three weeks! I enjoyed reading these myself for a change. This week's community digest features an update on Flink 1.10 release testing, a proposal for a SQL catalog to read the schema of relational databases and the Call for Presentations of Flink Forward San Francisco. Flink Development == * [releases] The community is still testing and fixing bugs for *Flink 1.10*. You can follow the effort on the release burndown board [1]. It should not be too long until a first RC is ready. * [sql] Bowen proposes to add a *JDBC and Postgres Catalog* to the Table API. With this, Flink could automatically derive tables corresponding to the tables in relational databases. Currently, users need to manually create the corresponding tables (incl. schema) on the Flink side. [2,3] * [configuration] Xintong proposes to change some of the default values for Flink's memory configuration following his work on *FLIP-49* and is looking for feedback. [4] * [datastream api] Congxian proposes to unify the handling of adding "null" values to *AppendingState* across state backends. The proposed behavior is to make all state backends refuse "null" values. [5] [1] https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=349&projectKey=FLINK [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-92%3A+JDBC+catalog+and+Postgres+catalog [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-92-JDBC-catalog-and-Postgres-catalog-tp36505.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discuss-Tuning-FLIP-49-configuration-default-values-td36528.html [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Make-AppendingState-add-refuse-to-add-null-element-tp36493.html Notable Bugs == A lot of activity due to release testing, but I did not catch any new notable bugs for already released versions. 
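As an editorial aside, the proposed AppendingState behavior is easy to illustrate. Below is a minimal, hypothetical sketch in Python - the class and method names only loosely mirror Flink's `AppendingState#add` and are not its actual Java API:

```python
class AppendingState:
    """Illustrative stand-in for Flink's AppendingState.

    Sketches the proposed unified behavior: add() refuses null (None)
    elements across all state backends, instead of each backend
    handling null values differently.
    """

    def __init__(self):
        self._elements = []

    def add(self, value):
        # Proposed behavior: uniformly reject null elements.
        if value is None:
            raise ValueError("AppendingState#add does not accept null elements")
        self._elements.append(value)

    def get(self):
        return list(self._elements)
```

The point of the proposal is exactly this uniformity: as discussed in the linked thread, what happens on `add(null)` today depends on the state backend.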
Events, Blog Posts, Misc === * The *Flink Forward San Francisco Call for Presentations* is ending soon, but you still have a chance to submit your talk to the one (and possibly only) Apache Flink community conference in North America. In case of questions, or if you are unsure whether to submit, feel free to reach out to me personally. [6] * Upcoming Meetups * On January 18th *Preetdeep Kumar* will give an introduction to the basics of Flink's DataStream API, followed by a hands-on demo. This will be an online event; more details can be found via the meetup link. [7] * On January 22 my colleague *Alexander Fedulov* will talk about Fraud Detection with Apache Flink at the Apache Flink Meetup in Madrid [8]. [6] https://www.flink-forward.org/sf-2020 [7] https://www.meetup.com/Hyderabad-Apache-Flink-Meetup-Group/events/267610014/ [8] https://www.meetup.com/Meetup-de-Apache-Flink-en-Madrid/events/267744681/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2020/03
Dear community, happy to share this week's community digest with a release candidate for Flink 1.10, a Pulsar Catalog for Flink, a 50% discount code for Flink Forward SF and a bit more. Flink Development == * [releases] The first (preview) *release candidate for Flink 1.10* has been created. Any help in testing the release candidate is highly appreciated. [1] * [sql] I believe I have never covered the contribution of a *Pulsar Catalog* to Flink's SQL ecosystem before, so here it is. Apache Pulsar includes a schema registry out-of-the-box. With Flink's Pulsar catalog, Pulsar topics are automatically available as tables in Flink, and Pulsar namespaces are mapped to databases in Flink. [2,3] * [deployment] At the end of last year, Patrick Lucas started a discussion on integrating the *Flink Docker images* into the Apache Flink project. The discussion has stalled a bit by now, but there seems to be a consensus that a) the Flink Docker images should move to Apache Flink and b) the Dockerfiles in apache/flink need to be consolidated. [4] * [statebackends] When using the *RocksDBStateBackend*, *timer* state is currently - by default - still stored on the Java heap. Stephan proposes to change this behaviour to store timers in RocksDB by default and has received a lot of positive feedback. 
[5] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-10-0-release-candidate-0-tp36770.html [2] https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-72-Introduce-Pulsar-Connector-tp33283.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Integrate-Flink-Docker-image-publication-into-Flink-release-process-tp36139.html [5] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Change-default-for-RocksDB-timers-Java-Heap-in-RocksDB-tp36720.html Notable Bugs == * [FLINK-15577] [1.9.1] When using two different windows within one SQL query, Flink might generate an invalid physical execution plan, as it incorrectly considers the two window transformations equivalent. More details in the ticket. [6] [6] https://issues.apache.org/jira/browse/FLINK-15577 Events, Blog Posts, Misc === * *Dian Fu* is now an Apache Flink Committer. Congratulations! [7] * *Bird* has published an in-depth blog post on how they use Apache Flink to detect offline scooters. [8] * On the Flink Blog, Alexander Fedulov has published the first blog post of a series on how to implement a *financial fraud detection use case* with Apache Flink. [9] * The extended *Call for Presentations for Flink Forward San Francisco* ends today. Make sure to submit your talk in time. [10,11] * Still on the fence about whether to attend *Flink Forward San Francisco*? Let me help you with that: when registering, use discount code *FFSF20-MailingList* to get a *50% discount* on your conference ticket. * Upcoming Meetups * On January 22 my colleague *Alexander Fedulov* will talk about Fraud Detection with Apache Flink at the Apache Flink Meetup in Madrid [12]. * On February 6 *Alexander Fedulov* will talk about Stateful Stream Processing with Apache Flink at the R-Ladies meetup in Kyiv. 
[13] [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Dian-Fu-becomes-a-Flink-committer-tp36696p36760.html [8] https://medium.com/bird-engineering/replayable-process-functions-in-flink-time-ordering-and-timers-28007a0210e1 [9] https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html [10] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Flink-Forward-San-Francisco-2020-Call-for-Presentation-extended-tp36595.html [11] https://www.flink-forward.org/sf-2020 [12] https://www.meetup.com/Meetup-de-Apache-Flink-en-Madrid/events/267744681/ [13] https://www.meetup.com/rladies-kyiv/events/267948988/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2020/04
Dear community, happy to share a brief community digest after a rather quiet week, including updates on Flink 1.9.2 and Flink 1.10, and the ongoing votes on FLIP-27 (New Source Interface) and FLIP-92 (N-Ary Stream Operator). Flink Development == * [releases] Hequn has published the first release candidate for Apache Flink 1.9.2 and is waiting for votes and feedback. [1] * [releases] No feedback on the RC0 for Flink 1.10 so far. Still, I assume a first (non-preview) release candidate for Flink 1.10 is just around the corner, as only a few blockers remain on the board for 1.10. [2] * [runtime] Piotr has started a vote on adding an N-Ary Stream Operator (FLIP-92). [3] * [connectors] Becket has resumed the vote on the highly anticipated FLIP-27, the new source interface. [4] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-9-2-release-candidate-1-tp36943.html [2] https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=349&projectKey=FLINK [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-92-Add-N-Ary-Stream-Operator-in-Flink-tp36539.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-27-Refactor-Source-Interface-tp35569p36850.html Notable Bugs == [FLINK-15575] [1.9.1] Incomplete shading/relocation of some dependencies of the Azure filesystem can result in classloading conflicts. This only applies to users using Application Clusters (Job Clusters) prior to Flink 1.10, as these do not use a dedicated classloader for user code. Resolved in 1.9.2 and 1.10. [5] [FLINK-13758] [1.8.3] [1.9.1] As a user, you can use Flink's distributed cache to make files available to all user-defined functions running in the cluster. This did not work for files stored in HDFS for a while. Resolved in 1.8.4, 1.9.2 and 1.10. [6] [5] https://issues.apache.org/jira/browse/FLINK-15575 [6] https://issues.apache.org/jira/browse/FLINK-13758 Events, Blog Posts, Misc === * Yu Li is now an Apache Flink Committer. 
Congratulations! [7] * Apache has published its "Apache in 2019 - By the Digits" blog post, with Apache Flink ranking *3rd in #commits*, *1st in user@ mailing list activity*, and *2nd in dev@ mailing list activity*. Congrats to the community! [8] * Upcoming Meetups * On January 30th *Preetdeep Kumar* is hosting an introductory online meetup in the Hyderabad Apache Flink Meetup group on Windows and Functions in Apache Flink. [9] * On February 6 *Alexander Fedulov* will talk about Stateful Stream Processing with Apache Flink at the R-Ladies meetup in Kyiv. [10] [7] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Yu-Li-became-a-Flink-committer-tp36904.html [8] https://blogs.apache.org/foundation/entry/apache-in-2019-by-the [9] https://www.meetup.com/Hyderabad-Apache-Flink-Meetup-Group/events/268080082/ [10] https://www.meetup.com/rladies-kyiv/events/267948988/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [VOTE] Integrate Flink Docker image publication into Flink release process
+1 (non-binding) On Mon, Jan 27, 2020 at 5:50 PM Ufuk Celebi wrote: > Hey all, > > there is a proposal to contribute the Dockerfiles and scripts of > https://github.com/docker-flink/docker-flink to the Flink project. The > discussion corresponding to this vote outlines the reasoning for the > proposal and can be found here: [1]. > > The proposal is as follows: > * Request a new repository apache/flink-docker > * Migrate all files from docker-flink/docker-flink to apache/flink-docker > * Update the release documentation to describe how to update > apache/flink-docker for new releases > > Please review and vote on this proposal as follows: > [ ] +1, Approve the proposal > [ ] -1, Do not approve the proposal (please provide specific comments) > > The vote will be open for at least 3 days, ending the earliest on: January > 30th 2020, 17:00 UTC. > > Cheers, > > Ufuk > > PS: I'm treating this proposal similar to a "Release Plan" as mentioned in > the project bylaws [2]. Please let me know if you consider this a different > category. > > [1] > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Integrate-Flink-Docker-image-publication-into-Flink-release-process-td36139.html > [2] > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026 > -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
[ANNOUNCE] Weekly Community Update 2020/05
Dear community, comparatively quiet times on the dev@ mailing list, so I will keep it brief and mostly share release-related updates today. Flink Development == * [releases] Apache Flink 1.9.2 was released on Thursday. Check out the release blog post [1] for details. [2] * [releases] Gary has published and started a vote on the first release candidate for Flink 1.10. The vote has failed due to incorrect license information in one of the newly added modules. The community also found a handful of additional issues, some of which might need to be fixed for the next RC. [3] * [docker] The Apache Flink community has unanimously approved the integration of the Docker image publication into the Apache Flink release process, which means https://github.com/docker-flink/docker-flink will move under the Apache Flink project and the Apache Flink Docker images will become official Flink releases approved by the Apache Flink PMC. [4] [1] https://flink.apache.org/news/2020/01/30/release-1.9.2.html [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-9-2-released-tp37102.html [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-10-0-release-candidate-1-tp36985.html [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/RESULT-VOTE-Integrate-Flink-Docker-image-publication-into-Flink-release-process-tp37096.html Notable Bugs == I am not aware of any notable user-facing bugs in any of the released versions. Events, Blog Posts, Misc === * My colleague *Seth Wiesman* has published an excellent blog post on *state evolution* in Flink, covering state schema evolution, the State Processor API and a look ahead. [5] * Upcoming Meetups * On February 6 *Alexander Fedulov* will talk about Stateful Stream Processing with Apache Flink at the R-Ladies meetup in Kyiv. 
[6] [5] https://flink.apache.org/news/2020/01/29/state-unlocked-interacting-with-state-in-apache-flink.html [6] https://www.meetup.com/rladies-kyiv/events/267948988/ Cheers, Konstantin (@snntrable) -- Konstantin Knauf | Solutions Architect +49 160 91394525 Follow us @VervericaData Ververica <https://www.ververica.com/> -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Tony) Cheng
Re: [DISCUSS] Standard / Convention for common connector metrics
Hi Becket, I like the idea of providing a standard set of metrics, which sources can choose to implement/expose. In addition, I think sources like the Kafka or Kinesis Source should continue to forward the original consumer metrics under their original names, so that users familiar with Kafka/Kinesis can relate to them easily. Cheers, Konstantin On Fri, Feb 1, 2019 at 3:54 AM Becket Qin wrote: > Thanks for the connector metric url, Chesnay :) > > @Robert, as you can see, the metrics from different connectors are quite > different. And there are different names for similar metrics, which is a > little frustrating when users want to do monitoring / alerting. > > Thanks, > > Jiangjie (Becket) Qin > > On Thu, Jan 31, 2019 at 6:41 PM Chesnay Schepler > wrote: > > > @Robert: > > > > > https://ci.apache.org/projects/flink/flink-docs-master/monitoring/metrics.html#connectors > > > > On 31.01.2019 11:03, Robert Metzger wrote: > > > Hey Becket, > > > thanks a lot for your proposal! > > > > > > Do you have an overview over the current situation of the metrics in the > > > connectors? > > > Which connectors expose metrics at all? > > > Are they different? > > > > > > On Thu, Jan 31, 2019 at 8:44 AM Becket Qin > wrote: > > > > > >> Hi folks, > > >> > > >> I was trying to add some metrics to Kafka connectors and realized that > > >> right now Flink does not have a common metric definition for the > > >> connectors. This complicates the monitoring and operation because the > > >> monitoring / alerts need to be set case by case. > > >> > > >> To address this issue, I would like to see if it is possible to have a set of > > >> standardized common metrics for all sources and sinks. The following doc > > >> describes the proposal. Feedback is very welcome. 
> > >> > > >> > > >> > > > https://docs.google.com/document/d/1q86bgj_3T6WFbSUoxLDJJXmUcBOUcvWfh2RZvHG-nPU/edit# > > >> > > >> Thanks, > > >> > > >> Jiangjie (Becket) Qin > > >> > > > > > -- -- Konstantin Knauf | Solution Architect -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Data Artisans GmbH | Stresemannstr. 121A,10963 Berlin, Germany <https://maps.google.com/?q=Stresemannstr.+121A,10963+Berlin,+Germany&entry=gmail&source=g> data Artisans, Inc. | 1161 Mission Street, San Francisco, CA-94103, USA <https://maps.google.com/?q=1161+Mission+Street,+San+Francisco,+CA-94103,+USA&entry=gmail&source=g> -- Data Artisans GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
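To make the idea of a standard metric set a bit more tangible, here is a small, purely illustrative sketch - the metric names below are hypothetical placeholders, not the names from Becket's proposal document:

```python
# Hypothetical standard metric names every source could expose, so that
# monitoring / alerting rules can be written once for all connectors.
STANDARD_SOURCE_METRICS = ("numRecordsIn", "numBytesIn", "currentFetchLatency")


class SourceMetrics:
    """Illustrative per-source metrics holder using the standard names."""

    def __init__(self):
        self.counters = {name: 0 for name in STANDARD_SOURCE_METRICS}

    def inc(self, name, delta=1):
        # A connector may still expose extra metrics under its native
        # names (as suggested above); this sketch only guards the
        # standardized set.
        if name not in self.counters:
            raise KeyError(f"not a standard metric: {name}")
        self.counters[name] += delta
```

With a shared name set like this, one alerting rule on, say, `numRecordsIn` would cover every connector instead of being configured case by case.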
Re: One source is much slower than the other side when join history data
Hi, this topic has been discussed a lot recently in the community as "Event Time Alignment/Synchronization" [1,2]. These discussions should provide a starting point. Cheers, Konstantin [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Sharing-state-between-subtasks-td24489.html [2] https://issues.apache.org/jira/browse/FLINK-10886 On Wed, Feb 27, 2019 at 3:03 AM 刘建刚 wrote: > When consuming history data in a join operator with event time, reading > data from one source is much slower than from the other. As a result, the join > operator will cache a lot of data from the faster source in order to wait for the > slower source. > The question is: how can I keep the difference between the consumers' > speeds small? > -- Konstantin Knauf | Solutions Architect +49 160 91394525 <https://www.ververica.com/> Follow us @VervericaData -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Data Artisans GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
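For readers new to the topic, the gist of event time alignment can be sketched in a few lines. All names and the threshold below are illustrative, not an actual Flink API:

```python
def should_throttle(local_watermark, global_min_watermark, max_ahead_ms=30_000):
    """Return True if this source subtask should pause reading.

    A subtask throttles itself when its local event-time watermark runs
    more than max_ahead_ms ahead of the slowest subtask's watermark.
    This bounds how much data a downstream event-time join has to
    buffer while waiting for the slower source.
    """
    return local_watermark - global_min_watermark > max_ahead_ms
```

The hard part in practice - and the subject of the linked discussions - is sharing `global_min_watermark` across subtasks in a distributed setting.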
Re: [DISCUSS] Remove forceAvro() and forceKryo() from the ExecutionConfig
Hi Stephan, I am in favor of renaming forceKryo() instead of removing it, because users might plug in their Protobuf/Thrift serializers via Kryo, as advertised in our documentation [1]. For this, Kryo needs to be used for POJO types as well, if I am not mistaken. Cheers, Konstantin [1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/custom_serializers.html On Tue, Mar 26, 2019 at 10:03 AM Stephan Ewen wrote: > Compatibility is really important for checkpointed state. > For that, you can always directly specify GenericTypeInfo or AvroTypeInfo > if you want to continue to treat a type via Kryo or Avro. > > Alternatively, once https://issues.apache.org/jira/browse/FLINK-11917 is > implemented, this should happen automatically. > > On Tue, Mar 26, 2019 at 8:33 AM Yun Tang wrote: >> Hi Stephan >> >> I prefer to remove 'enableForceKryo' since the Kryo serializer does not work >> well out-of-the-box for schema evolution stories due to its mutable >> properties, and our built-in POJO serializer has already supported schema >> evolution. >> >> On the other hand, what's the backward compatibility plan for >> enableForceAvro() and enableForceKryo()? I think if >> https://issues.apache.org/jira/browse/FLINK-11917 is merged, we could >> support migrating state which was a POJO but serialized using Kryo. >> >> Best >> Yun Tang >> -- >> *From:* Stephan Ewen >> *Sent:* Tuesday, March 26, 2019 2:31 >> *To:* dev; user >> *Subject:* [DISCUSS] Remove forceAvro() and forceKryo() from the >> ExecutionConfig >> >> Hi all! >> >> The ExecutionConfig has some very old settings: forceAvro() and >> forceKryo(), which are actually misleadingly named. They cause POJOs to use >> Avro or Kryo rather than the POJO serializer. >> >> I think we do not have a good case any more to use Avro for POJOs. POJOs >> that are also Avro types go through the Avro serializer anyway. >> >> There may be a case to use Kryo for POJOs if you don't like the Flink >> POJO serializer. 
>> >> I would suggest to remove the "forceAvro()" option completely. >> For "forceKryo()", I am torn between removing it completely or renaming >> it to "setUseKryoForPOJOs()". >> >> What are the opinion on that out there? >> >> Best, >> Stephan >> >> -- Konstantin Knauf | Solutions Architect +49 160 91394525 <https://www.ververica.com/> Follow us @VervericaData -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Data Artisans GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
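For readers skimming the thread, the serializer precedence under discussion can be summarized in a small decision sketch. This is simplified and illustrative - it assumes forceAvro() is removed as proposed and ignores registered custom serializers and the details of Flink's actual type extraction:

```python
def choose_serializer(is_avro_type, is_pojo, force_kryo):
    """Sketch of which serializer a type would get.

    Mirrors the discussion: Avro types always go through the Avro
    serializer; POJOs use the POJO serializer unless Kryo is forced
    (the behavior a renamed setUseKryoForPOJOs() would control); and
    everything else falls back to Kryo as a generic type.
    """
    if is_avro_type:
        return "avro"
    if is_pojo:
        return "kryo" if force_kryo else "pojo"
    return "kryo"
```

Konstantin's point above fits this picture: users who plug Protobuf/Thrift serializers in via Kryo rely on the `force_kryo` branch applying to POJO types as well.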
Re: [Discuss] Semantics of event time for state TTL
>> real time. For the data privacy use case, it might be better because we want >> state to be unavailable at a particular moment of real time since the associated >> piece of data was created in event time. For long-term approximate garbage >> collection, it might not be a problem either. For quick expiration, the >> time skew between event and processing time can again lead to premature >> deletion of late data, and the user cannot delay it. >> >> We could also make this behaviour configurable. Another option is to make the >> time provider pluggable for users. The interface can give users context >> (currently processed record, watermark, etc.) and ask them which timestamp to >> use. This is more complicated though. >> >> Looking forward to your feedback. >> >> Best, >> Andrey >> >> [1] https://issues.apache.org/jira/browse/FLINK-12005 >> [2] >> >> https://docs.google.com/document/d/1SI_WoXAfOd4_NKpGyk4yh3mf59g12pSGNXRtNFi-tgM >> [3] >> >> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/State-TTL-in-Flink-1-6-0-td22509.html >> > -- Konstantin Knauf | Solutions Architect +49 160 91394525 <https://www.ververica.com/> Follow us @VervericaData -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Data Artisans GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
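The trade-off Andrey describes - which notion of time should drive TTL expiration - can be sketched as a pluggable time provider. The names below are illustrative, not Flink's actual StateTtlConfig API:

```python
def is_expired(last_access_ts, ttl_ms, now_ts):
    """State expires once the chosen clock passes last access + TTL."""
    return now_ts - last_access_ts > ttl_ms


def ttl_clock(current_watermark, processing_ts, mode):
    """Pluggable time provider: pick which timestamp drives expiration.

    'processing': the wall clock, e.g. for data-privacy deadlines that
    must hold in real time.
    'event': the watermark, so late data is judged in event time - at
    the cost of skew between event time and real time.
    """
    if mode == "processing":
        return processing_ts
    if mode == "event":
        return current_watermark
    raise ValueError(f"unknown mode: {mode}")
```

A fully pluggable provider, as suggested in the mail, would additionally receive the currently processed record as context when choosing the timestamp.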
Re: [DISCUSS] Apache Flink at Season of Docs
Hi all, I am happy to start our application process and help as a mentor. For our application, we need a second organization administrator (assuming I am the first). The tasks of an administrator are listed in [1] (basically only guiding the organization through the SoD process). Is anyone interested? We need at least two mentors. The responsibilities of a mentor are laid out in [2]. As I understand it, David would be up for it. @Fabian, @Stephan: Should we put you down as mentors as well? I guess it cannot hurt to have more than two... Is anyone else interested in mentoring? Please reach out soon, so that we can start working on the list of project ideas. Best, Konstantin [1] https://developers.google.com/season-of-docs/docs/admin-guide [2] https://developers.google.com/season-of-docs/docs/mentor-guide On Wed, Apr 10, 2019 at 9:29 AM Stephan Ewen wrote: > Hi all! > > This is a really great initiative; I think the Flink docs could benefit > big time from this. > Aside from some nice tutorials, I think that especially the docs about > concepts, architecture, and deployment need a lot of work, so that users > understand how powerful Flink really is. > > I quickly connected offline with Konstantin and David, who contributed to > docs in the past and work a lot with Flink users as part of training > (giving them good insight into where the docs could be improved) - both > seemed excited about this opportunity. > > I would be happy to help as well - I probably won't have time to be a good > mentor, but I can help out with ideas on what to improve and brief the writers > on Flink concepts. > > Best, > Stephan > > > On Tue, Apr 9, 2019 at 6:49 PM Fabian Hueske wrote: >> Hi Aizhamal, hi everyone, >> >> Thanks for sharing this opportunity with our community! >> I can think of a few projects that could be done as part of GSoD. >> >> 1) Flink's user documentation is (IMO) fairly extensive and mostly >> complete, but its structure could be improved in some areas (event time, >> state management). 
>> 2) We are lacking some good tutorials to get started with Flink >> (Docker-based setup, SQL + SQL Client, Table API, ...) >> 3) More documentation on Flink's internals (mostly relevant for people >> contributing to Flink) would be great to have. >> >> Looking at the timeline for GSoD, I see that the documentation work would >> start in September. >> Although there is a good chance that some of these issues will still be >> present five months from now, I don't think we would hold them back in >> case somebody wants to start an effort to work on them. >> There have been some efforts and also a PR [1] to improve some aspects of >> the documentation. Unfortunately, the linked PR was not merged yet due to >> lack of committer involvement. >> >> Obviously, we would need at least two mentors (I think at least one should >> be a committer) who dedicate a good amount of their time to the GSoD >> project. >> Is there somebody in the community who would be interested in being a >> mentor for GSoD? >> >> Best, >> Fabian >> >> [1] https://github.com/apache/flink/pull/6481 >> >> On Sat., Apr. 6, 2019 at 00:44, Aizhamal Nurmamat kyzy wrote: >> >> > Hello everyone, >> > >> > TL;DR If you need some improvements for Flink documentation, apply to >> > Season of Docs before April 23rd. >> > >> > Background: >> > >> > Season of Docs is like Google Summer of Code, but for documentation. >> > Projects write ideas on how they would like to improve their documentation; >> > then, if they are accepted to the program, they will get a professional tech >> > writer to work on the project's documentation for 3 months. Technical >> > writers get a stipend from Google. >> > >> > If you think that Apache Flink could benefit from it, please submit the >> > application before April 23rd. 
>> > The program requires two administrators, to manage the organization's >> > participation in SoD, and at least two mentors to onboard tech writers >> > to the project and work with them closely during the 3-month period [2]. To be a >> > mentor in this program, you don't have to be a technical writer, but you >> > must know Flink and the open source well to onboard/introduce tech writers >> > to the project, and be able to support them during the whole process. >> > >> > I am an administrator for 2 Apache projects, and will be more than happy to >> > share my knowledge on this, if yo
Re: [DISCUSS] Apache Flink at Season of Docs
documentation, see FLIP-35 [1]. > > > > > > I think it's a good idea to propose the Chinese translation as a project. > > > It's a good chance to improve the localization user experience of Flink > > > documentation. > > > I can help as a mentor if we want to submit such a project. > > > > > > Thanks, > > > Jark Wu > > > > > > [1]: > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-35%3A+Support+Chinese+Documents+and+Website > > > > > > On Thu, 11 Apr 2019 at 02:11, Aizhamal Nurmamat kyzy > > > wrote: > > > > > >> Hello everyone, > > >> > > >> @Fabian Hueske - the SoD setup is a little bit > > >> different. The ASF determined that each project would be allowed to apply > > >> individually [1], rather than applying as a single large organization. > > >> > > >> Each project applies as an org, with two organizers (administrators) and at > > >> least two mentors. As Konstantin pointed out, one can be both an > > >> administrator and a mentor. You don't need to coordinate with other projects or the ASF at > > >> all. If accepted to the program, you will receive project proposals from > > >> tech writers [2]. You will choose one or two proposals that you want to > > >> mentor [3]. > > >> > > >> @Ken - as for the language, there aren't any limitations in that regard, so > > >> work on the Chinese translation for the website is definitely an > > >> acceptable project. 
> > >> > > >> Thanks, > > >> > > >> Aizhamal > > >> > > >> > > >> [1] > > >> > > >> > > > https://lists.apache.org/thread.html/67e1c2e6041cff1e7f198b615407401f032795130e796adfaacf8071@%3Cdev.community.apache.org%3E > > >> > > >> [2] https://developers.google.com/season-of-docs/docs/ > > >> > > >> [3] > > https://developers.google.com/season-of-docs/docs/faq#slot-allocation > > >> > > >> > > >> On Wed, Apr 10, 2019 at 8:32 AM Ken Krugler < > > kkrugler_li...@transpac.com> > > >> wrote: > > >> > > >> > Hi Aizhamal, > > >> > > > >> > I assume SoD is language-agnostic, so one possible project would be > to > > >> get > > >> > a tech writer for the Chinese versions of all of the Flink > > >> documentation, > > >> > yes? > > >> > > > >> > Regards, > > >> > > > >> > — Ken > > >> > > > >> > > On Apr 5, 2019, at 3:43 PM, Aizhamal Nurmamat kyzy > > >> > wrote: > > >> > > > > >> > > Hello everyone, > > >> > > > > >> > > TL;DR If you need some improvements for Flink documentation, apply > > to > > >> > > Season of Docs before April 23rd. > > >> > > > > >> > > Background: > > >> > > > > >> > > Season of Docs is like Google Summer of Code, but for > documentation. > > >> > > Projects write ideas on how they would like to improve their > > >> > documentation, > > >> > > then if they are accepted to the program, they will get a > > professional > > >> > tech > > >> > > writer to work on the project’s documentation for 3 months. > > Technical > > >> > > writer’s get stipend from Google. > > >> > > > > >> > > If you think that Apache Flink could benefit from it, please > submit > > >> the > > >> > > application before April 23rd. > > >> > > > > >> > > The program requires two administrators, to manage the > > organization's > > >> > > participation in SoD, and at least two mentors to onboard tech > > >> writers to > > >> > > the project, and work with them closely during 3 months period > [2]. 
> > >> > > To be a mentor in this program, you don't have to be a technical writer, but you must know Flink and open source well to onboard/introduce tech writers to the project, and be able to support them during the whole process. > > >> > > I am an administrator for 2 Apache projects, and will be more than happy to share my knowledge on this if you, as an organization, decide to apply. > > >> > > I think it will be great if Flink participates in it too! > > >> > > Thanks, > > >> > > Aizhamal > > >> > > [1] https://developers.google.com/season-of-docs/ > > >> > > [2] https://developers.google.com/season-of-docs/docs/timeline > > >> > -- > > >> > Ken Krugler > > >> > +1 530-210-6378 > > >> > http://www.scaleunlimited.com > > >> > Custom big data solutions & training > > >> > Flink, Solr, Hadoop, Cascading & Cassandra -- Konstantin Knauf | Solutions Architect +49 160 91394525 <https://www.ververica.com/> Follow us @VervericaData -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Data Artisans GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
Re: [DISCUSS] Apache Flink at Season of Docs
Hi all, thanks Ken & Jark for your comments regarding the Chinese translation proposal. It seems there is agreement that project (1) (improve the English documentation) has a higher priority for us than project (2) (Chinese translation). We could still write two proposals and indicate priorities accordingly in our application. What do you think? Regarding (1): I agree with both Stephan & Fabian, and I think we can achieve both. We shouldn't split it up, but still try to be as specific as possible in the subsections. We can then add a sentence along the lines of "The priorities between the proposed subsections, or a focus on a specific subsection, will be agreed upon between the organization and the technical writer prior to the start of the project, according to the preferences and background of the technical writer." to our proposal. Best, Konstantin On Fri, Apr 12, 2019 at 11:57 AM Fabian Hueske wrote: > Yes, I think we would get at most one project accepted. > Having all options in a rather generic proposal gives us the most flexibility to decide what to work on once the proposal is accepted. > On the other hand, a more concrete proposal might look more attractive to candidates. > I'm fine either way, but my gut feeling is that a well-scoped proposal gives better chances of finding a writer (which might be the biggest challenge). > On Fri, Apr 12, 2019 at 11:39 AM Stephan Ewen wrote: > > I would suggest making one proposal and having the subsections only in the project plan. > > My understanding is that we need to indicate priorities between proposals and might get only one, so it would be good not to subdivide. 
> > > > On Fri, Apr 12, 2019 at 9:58 AM Fabian Hueske wrote: > > > > > Hi everyone, > > > I think we can split the first project that Stephan proposed into smaller ones: > > > > (1) Create or rework setup / tutorials / concepts docs > > > > (2) Complete (or advance) the Chinese translation > > > 1.1 Improving (extracting) the documentation of stream processing concepts: event-time, timers, state, state backends, checkpointing, savepoints. > > > Right now, the relevant information is scattered across several pages and mixed with the implementation / APIs / configuration options. > > > 1.2 Improving & extending the documentation of deployments > > > 1.3 Adding documentation on the internals: distributed architecture, recovery, operators, job translation, execution, etc. > > > This documentation would be targeted at Flink developers. > > > I thought again about the idea of improving the tutorials, and I'm no longer sure if this would fit SoD well. > > > The reason is that creating good tutorials requires a good portion of coding / configuration (creating Docker images, example programs, etc.). > > > Also, I'd like to start improving the situation of the tutorials earlier than September. > > > What do others think? > > > Cheers, Fabian > > > On Fri, Apr 12, 2019 at 4:29 AM Jark Wu wrote: > > > > Hi Konstantin, Ken, > > > > I agree that the Chinese documentation is mainly a translation. > > > > > Does anyone from the Blink team have input on whether there is existing, original Chinese documentation which should be translated to English? > > > > There is a public Blink documentation [1], which is in English. > > > > We have internal Blink documentation in Chinese, but I think we need to rewrite it in English and restructure it when contributing Blink to Flink. 
> > > > Regarding the Chinese translation project, I agree with Ken's opinion. > > > > From my translation experience, translation is work that requires understanding the original English sentence correctly and then expressing it in Chinese in an easily understandable way. It is not a simple matter of translating word by word. The one we need is not a "professional translator", but a "technical writer who is familiar with both languages". I also agree that the writer who wrote the initial documentation has a better ability than a translator to distill complex technical concepts.
Re: [DISCUSS] Apache Flink at Season of Docs
> Event-time, timers, state, state backends, checkpointing, savepoints. > > > > > Right now, the relevant information is scattered across several pages and mixed with the implementation / APIs / configuration options. > > > > > 1.2 Improving & extending the documentation of deployments > > > > > 1.3 Adding documentation on the internals: distributed architecture, recovery, operators, job translation, execution, etc. > > > > > This documentation would be targeted at Flink developers. > > > > > I thought again about the idea of improving the tutorials, and I'm no longer sure if this would fit SoD well. > > > > > The reason is that creating good tutorials requires a good portion of coding / configuration (creating Docker images, example programs, etc.). > > > > > Also, I'd like to start improving the situation of the tutorials earlier than September. > > > > > What do others think? > > > > > Cheers, Fabian > > > > > On Fri, Apr 12, 2019 at 4:29 AM Jark Wu <imj...@gmail.com> wrote: > > > > > > Hi Konstantin, Ken, > > > > > > I agree that the Chinese documentation is mainly a translation. > > > > > > > Does anyone from the Blink team have input on whether there is existing, original Chinese documentation which should be translated to English? > > > > > > There is a public Blink documentation [1], which is in English. > > > > > > We have internal Blink documentation in Chinese, but I think we need to rewrite it in English and restructure it when contributing Blink to Flink. > > > > > > Regarding the Chinese translation project, I agree with Ken's opinion. 
> > > > > > From my translation experience, translation is work that requires understanding the original English sentence correctly and then expressing it in Chinese in an easily understandable way. It is not a simple matter of translating word by word. The one we need is not a "professional translator", but a "technical writer who is familiar with both languages". I also agree that the writer who wrote the initial documentation has a better ability than a translator to distill complex technical concepts. > > > > > > [1]: https://flink-china.org/doc/blink > > > > > > On Fri, 12 Apr 2019 at 02:40, Ken Krugler <kkrugler_li...@transpac.com> wrote: > > > > > > > Hi Konstantin, > > > > > > > Comments inline below… > > > > > > > — Ken > > > > > > > > On Apr 11, 2019, at 9:05 AM, Konstantin Knauf <konstan...@ververica.com> wrote: > > > > > > > > Hi everyone, > > > > > > > > I will start going through the registration process tomorrow (CET). > > > > > > > > Jincheng (cc) reached out to me directly and offered to be the second organization administrator. So, we are all set in that regard. > > > > > > > > In terms of mentors, we now have > > > > > > > > * myself > &
Re: [DISCUSS] Apache Flink at Season of Docs
Hi everyone, thanks @Aizhamal Nurmamat kyzy. As we only have one week left until the application deadline, I went ahead and created a document for the project ideas [1]. I have added the descriptions for the "stream processing concepts" and the "deployment & operations documentation" project ideas. Please let me know what you think, and feel free to edit & comment. We also need descriptions for the other two projects (Table API/SQL & Flink Internals). @Fabian/@Jark/@Stephan, can you chime in? Any more project ideas? Best, Konstantin [1] https://docs.google.com/document/d/1Up53jNsLztApn-mP76AB6xWUVGt3nwS9p6xQTiceKXo/edit?usp=sharing On Fri, Apr 12, 2019 at 6:50 PM Aizhamal Nurmamat kyzy wrote: > Hello everyone, > @Konstantin Knauf - yes, you are correct. > Between steps 1 and 2 though, the open source organization, in this case Flink, has to be selected by SoD as one of the participating orgs *fingers crossed*. > One tip about organizing ideas is that you want to communicate potential projects to the tech writers that are applying. Just make sure the scope of the project is clear to them. SoD wants to set up the tech writers for success by making sure the work can be done in the allotted time. > Hope it helps. > Aizhamal > On Fri, Apr 12, 2019 at 7:37 AM Konstantin Knauf wrote: >> Hi all, >> I read through the SoD documentation again, and now I think it would actually make sense to split (1) up into multiple project ideas. Let me summarize the overall process: >> 1. We create & publish a list of project ideas, e.g. in a blog post. (This can be any number of ideas.) >> 2. Potential technical writers look at our list of ideas and send a proposal for a particular project to Google. During that time they can reach out to us for clarification. >> 3. Google forwards all proposals for our project ideas to us and we send back a prioritized list of the proposals which we would like to accept. >> 4. 
Of all these proposals, Google accepts 50 for SoD 2019. Per organization, Google will only accept a maximum of two proposals. >> @Aizhamal Nurmamat kyzy Please correct me! >> For me this means we should split this up in a way that each project is a) still relevant in September and b) makes sense as a 3-month project. Based on the ideas we have right now, these could for example be: >> (I) Rework/extract/improve the documentation of stream processing concepts >> (II) Improve & extend Apache Flink's documentation for deployment and operations (incl. configuration) >> (III) Add documentation for Flink internals >> (IV) Rework the Table API / SQL documentation >> We would then potentially get proposals for all of these topics and could decide which of these proposals we would send back to Google. My feeling is that a technical writer could easily spend three months on any of these projects. What do others think? Any other project ideas? >> Cheers, >> Konstantin >> On Fri, Apr 12, 2019 at 1:47 PM Jark Wu wrote: >>> Hi all, >>> I'm fine with only preparing the first proposal. I think it's reasonable because the first proposal is more attractive and maybe there are not enough Chinese writers. We can focus on one project to come up with a concrete and attractive project plan. >>> One possible subproject could be reworking the Table SQL docs: >>> (1). Improve the concepts in Table SQL. >>> (2). A more detailed introduction to built-in functions; currently we only have a simple explanation for each function. We should add more descriptions, especially more concrete examples, and maybe some notes. We can take the MySQL doc [1] as a reference. >>> (3). As Flink SQL is evolving rapidly and features from Blink are being merged, for example, SQL DDL, Hive integration, Python Table API, Interactive Programming, SQL optimization and tuning, etc... 
We can redesign the doc structure of >>> Table SQL with a higher-level vision. >>> Cheers, >>> Jark >>> [1]: >>> https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_bin >>> On Fri, 12 Apr 2019 at 18:19, jincheng sun >>> wrote: >>> > I am honored to have the opportunity to do a second organization >>&g
Re: [DISCUSS] A more restrictive JIRA workflow
gt; >> wrote: > > > > > > >>>>> Hi devs, > > > > > > >>>>> Just now I found that someone who is not a contributor can file issues and participate in discussions. > > > > > > >>>>> Once someone becomes a contributor, they can additionally assign an issue to a person and modify fields of any issue. > > > > > > >>>>> For a more restrictive JIRA workflow, maybe we can achieve it by being a bit more restrictive in granting contributor permission? > > > > > > >>>>> Best, > > > > > > >>>>> tison. > > > > > > >>>>> On Wed, Feb 27, 2019 at 9:53 PM, Robert Metzger wrote: > > > > > > >>>>>> I like this idea and I would like to try it to see if it solves the problem. > > > > > > >>>>>> I can also offer to add functionality to the Flinkbot to automatically close pull requests which have been opened against an unassigned JIRA ticket. > > > > > > >>>>>> Being rejected by an automated system, which just applies a rule, is nicer than being rejected by a person. > > > > > > >>>>>> On Wed, Feb 27, 2019 at 1:45 PM Stephan Ewen <se...@apache.org> wrote: > > > > > > >>>>>>> @Chesnay - yes, this is possible, according to infra. > > > > > > >>>>>>> On Wed, Feb 27, 2019 at 11:09 AM ZiLi Chen <wander4...@gmail.com> wrote: > > > > > > >>>>>>>> Hi, > > > > > > >>>>>>>> @Hequn > > > > > > >>>>>>>> It might be hard to separate JIRAs into conditional and unconditional ones. 
> > > > > > >>>>>>>> Even if INFRA supports such a separation, we face the problem of whether a contributor is allowed to decide the type of a JIRA. If so, then contributors might tend to create JIRAs as unconditional; and if not, we fall back to a contributor asking a committer to set the JIRA as unconditional, which is no better than asking a committer to assign it to the contributor. > > > > > > >>>>>>>> @Timo > > > > > > >>>>>>>> "More discussion before opening a PR" sounds good. >