Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

2020-02-09 Thread Till Rohrmann
Hi Yadong,

I think it would be fine to simply link to this discussion thread to keep
the discussion history. Maybe an easier way would be to create top-level
FLIPs for the individual changes proposed in FLIP-75. The reason I'm
proposing this is that it would be easier to vote on it and to implement it
because the scope is smaller. But maybe I'm wrong here and others could
chime in to voice their opinion.

Cheers,
Till

On Fri, Feb 7, 2020 at 9:58 AM Yadong Xie  wrote:

> Hi Till
>
> FLIP-75 has been open since September, and the design doc has been iterated
> over 3 versions and more than 20 patches.
> I had a try, but it is hard to split the design docs into sub FLIPs and keep
> all the discussion history at the same time.
>
> Maybe it is better to start another discussion to talk about the individual
> sub FLIP voting? and make the next FLIP follow the new practice if
> possible.
>
> Till Rohrmann wrote on Mon, Feb 3, 2020 at 6:28 PM:
>
> > I think there is no such description because we never did it before. I
> just
> > figured that FLIP-75 could actually be a good candidate to start this
> > practice. We would need a community discussion first, though.
> >
> > Cheers,
> > Till
> >
> > On Mon, Feb 3, 2020 at 10:28 AM Yadong Xie  wrote:
> >
> > > Hi Till
> > > I didn’t find how to create a sub FLIP at cwiki.apache.org
> > > do you mean to create 9 more FLIPs instead of FLIP-75?
> > >
> > > Till Rohrmann wrote on Thu, Jan 30, 2020 at 11:12 PM:
> > >
> > > > Would it be easier if FLIP-75 would be the umbrella FLIP and we would
> > > vote
> > > > on the individual improvements as sub FLIPs? Decreasing the scope
> > should
> > > > make things easier.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Thu, Jan 30, 2020 at 2:35 PM Robert Metzger 
> > > > wrote:
> > > >
> > > > > Thanks a lot for this work! I believe the web UI is very important,
> > in
> > > > > particular to new users. I'm very happy to see that you are putting
> > > > effort
> > > > > into improving the visibility into Flink through the proposed
> > changes.
> > > > >
> > > > > I can not judge if all the changes make total sense, but the
> > discussion
> > > > has
> > > > > been open since September, and a good number of people have
> commented
> > > in
> > > > > the document.
> > > > > I wonder if we can move this FLIP to the voting stage?
> > > > >
> > > > > On Wed, Jan 22, 2020 at 6:27 PM Till Rohrmann <
> trohrm...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Thanks for the update Yadong. Big +1 for the proposed
> improvements
> > > for
> > > > > > Flink's web UI. I think they will be super helpful for our users.
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Tue, Jan 7, 2020 at 10:00 AM Yadong Xie 
> > > > wrote:
> > > > > >
> > > > > > > Hi everyone
> > > > > > >
> > > > > > > We have spent some time updating the documentation since the
> last
> > > > > > > discussion.
> > > > > > >
> > > > > > > In short, the latest FLIP-75 contains the following proposals
> > > > > > > (including both frontend and REST API):
> > > > > > >
> > > > > > >1. Job Level
> > > > > > >   - better job backpressure detection
> > > > > > >   - a "load more" feature in job exceptions
> > > > > > >   - show attempt history in the subtask
> > > > > > >   - show attempt timeline
> > > > > > >   - add pending slots
> > > > > > >2. Task Manager Level
> > > > > > >   - add more metrics
> > > > > > >   - better log display
> > > > > > >3. Job Manager Level
> > > > > > >   - add metrics tab
> > > > > > >   - better log display
> > > > > > >
> > > > > > > To help everyone better understand the proposal, we put effort
> > > > > > > into making an online POC.
> > > > > > >
> > > > > > > Now you can compare the difference between the new and old
> > > > Web/RestAPI
> > > > > > (the
> > > > > > > link is inside the doc)!
> > > > > > >
> > > > > > > Here is the latest FLIP-75 doc:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit#
> > > > > > >
> > > > > > > Looking forward to your feedback
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Yadong
> > > > > > >
> > > > > > > lining jing wrote on Thu, Oct 24, 2019 at 2:11 PM:
> > > > > > >
> > > > > > > > Hi all, I have updated the backend design in FLIP-75
> > > > > > > > <
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing
> > > > > > > > >
> > > > > > > > .
> > > > > > > >
> > > > > > > > Here are some brief introductions:
> > > > > > > >
> > > > > > > >- Add metric for managed memory FLINK-14406
> > > > > > > >.
> > > > > > > >- Expose TaskExecutor resource configurations to REST API
> > > > > > FLINK-14422
> >

Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-09 Thread Yu Li
Thanks for the efforts Aihua! These could definitely improve our RC test
coverage!

Just to confirm, that the stability tests were executed with the same test
suite for Alibaba production usage, and the e2e performance one was
executed with the test suite proposed in FLIP-83 [1] and FLINK-14917 [2],
and the result could also be observed from our performance code-speed
center [3], right?

Thanks.

Best Regards,
Yu

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
[2] https://issues.apache.org/jira/browse/FLINK-14917
[3] https://s.apache.org/nglhm


On Sun, 9 Feb 2020 at 11:20, aihua li  wrote:

> +1 (non-binding)
>
> I ran stability tests and end-to-end performance tests on branch
> release-1.10.0-rc3; both of them passed.
>
> Stability test: It mainly checks that the Flink job can recover from
> various abnormal situations, including disk full, network interruption,
> ZooKeeper unable to connect, RPC message timeout, etc.
> If the job can't be recovered, the test fails.
> The test passed after running 5 hours.
>
> End-to-end performance test: It contains the 32 test scenarios designed
> in FLIP-83.
> Test results: Performance regresses by about 3% from 1.9.1 when using
> the default parameters.
> If FLIP-49 is skipped (by adding the parameters
> taskmanager.memory.managed.fraction: 0 and
> taskmanager.memory.flink.size: 1568m to flink-conf.yaml),
> performance improves by about 5% from 1.9.1.
>
> I confirmed with @Xintong Song that the result makes sense.
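For reference, the FLIP-49 bypass described in the test report above amounts to two entries in flink-conf.yaml, with exactly the values quoted:

```yaml
# flink-conf.yaml — settings used in the test to skip the FLIP-49 memory model
taskmanager.memory.managed.fraction: 0
taskmanager.memory.flink.size: 1568m
```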
>
> On Feb 8, 2020 at 5:54 AM, Gary Yao wrote:
>
> Hi everyone,
> Please review and vote on the release candidate #3 for the version 1.10.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.10.0-rc3" [5],
> * website pull request listing the new release and adding announcement blog
> post [6][7].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Yu & Gary
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc3/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1333
> [5] https://github.com/apache/flink/releases/tag/release-1.10.0-rc3
> [6] https://github.com/apache/flink-web/pull/302
> [7] https://github.com/apache/flink-web/pull/301
>
>
>
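For anyone verifying the candidate, the checks behind such a vote follow the standard ASF procedure: import the KEYS file, verify the GPG signatures, and check the SHA-512 checksums. A minimal sketch of the checksum half in Python (the signature check itself is done separately with `gpg --verify`; the artifact filename below is a local stand-in, not an actual release file):

```python
# Sketch: verifying a SHA-512 checksum of a downloaded release artifact.
# Network and GPG steps (gpg --import KEYS; gpg --verify <artifact>.asc) omitted.
import hashlib

def sha512_matches(path: str, expected_hex: str) -> bool:
    """Stream the file and compare its SHA-512 digest to the expected hex string."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex

# Local demonstration with a stand-in artifact instead of a real release tarball:
with open("artifact.tgz", "wb") as f:
    f.write(b"demo artifact")
expected = hashlib.sha512(b"demo artifact").hexdigest()
assert sha512_matches("artifact.tgz", expected)
```

In a real verification, `expected` would come from the staged `.sha512` file next to the artifact under the dist.apache.org URL in [2].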


Re: [DISCUSS] FLINK-15831: Add Docker image publication to release documentation

2020-02-09 Thread Till Rohrmann
Sounds good to me Patrick. +1 for these changes.

Cheers,
Till

On Fri, Feb 7, 2020 at 3:25 PM Patrick Lucas  wrote:

> Hi all,
>
> For FLINK-15831[1], I think the way to start is for the flink-docker
> repo[2] itself to sufficiently document the workflow for publishing new
> Dockerfiles, and then update the Flink release guide in the wiki to refer
> to this documentation and to include this step in the "Finalize the
> release" checklist.
>
> To the first point, I have opened a PR[3] on flink-docker to improve its
> documentation.
>
> And for updating the release guide, I propose the following changes:
>
> 1. Add a new subsection to "Finalize the release", prior to "Checklist to
> proceed to the next step" with the following content:
>
> Publish the Dockerfiles for the new release
> >
> > Note: the official Dockerfiles fetch the binary distribution of the
> target
> > Flink version from an Apache mirror. After publishing the binary release
> > artifacts, mirrors can take some hours to start serving the new
> artifacts,
> > so you may want to wait to do this step until you are ready to continue
> > with the "Promote the release" steps below.
> >
> > Follow the instructions in the [flink-docker] repo to build the new
> > Dockerfiles and send an updated manifest to Docker Hub so the new images
> > are built and published.
> >
>
> 2. Add an entry to the "Checklist to proceed to the next step" subsection
> of "Finalize the release":
>
> >
> >- Dockerfiles in flink-docker updated for the new Flink release and
> >pull request opened on the Docker official-images with an updated
> manifest
> >
> > Please let me know if you have any questions or suggestions to improve
> this proposal.
>
> Thanks,
> Patrick
>
> [1]https://issues.apache.org/jira/browse/FLINK-15831
> [2]https://github.com/apache/flink-docker
> [3]https://github.com/apache/flink-docker/pull/5
>


[VOTE] Release flink-shaded 10.0, release candidate #1

2020-02-09 Thread Chesnay Schepler

Hi everyone,
Please review and vote on the release candidate #1 for the version 10.0, 
as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org 
[2], which are signed with the key with fingerprint 11D464BA [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-10.0-rc1" [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.


Thanks,
Chesnay

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346746

[2] https://dist.apache.org/repos/dist/dev/flink/flink-shaded-10.0-rc1/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1334
[5] 
https://gitbox.apache.org/repos/asf?p=flink-shaded.git;a=tag;h=refs/tags/release-10.0-rc1

[6] https://github.com/apache/flink-web/pull/304



Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

2020-02-09 Thread Yadong Xie
Hi Till
I got your point; I will create sub FLIPs and votes according to
FLIP-75 and the previous discussion soon.

Till Rohrmann wrote on Sun, Feb 9, 2020 at 5:27 PM:

> Hi Yadong,
>
> I think it would be fine to simply link to this discussion thread to keep
> the discussion history. Maybe an easier way would be to create top-level
> FLIPs for the individual changes proposed in FLIP-75. The reason I'm
> proposing this is that it would be easier to vote on it and to implement it
> because the scope is smaller. But maybe I'm wrong here and others could
> chime in to voice their opinion.
>
> Cheers,
> Till
>
> On Fri, Feb 7, 2020 at 9:58 AM Yadong Xie  wrote:
>
> > Hi Till
> >
> > FLIP-75 has been open since September, and the design doc has been
> iterated
> > over 3 versions and more than 20 patches.
> > I had a try, but it is hard to split the design docs into sub FLIPs and
> > keep all the discussion history at the same time.
> >
> > Maybe it is better to start another discussion to talk about the
> individual
> > sub FLIP voting? and make the next FLIP follow the new practice if
> > possible.
> >
> > Till Rohrmann wrote on Mon, Feb 3, 2020 at 6:28 PM:
> >
> > > I think there is no such description because we never did it before. I
> > just
> > > figured that FLIP-75 could actually be a good candidate to start this
> > > practice. We would need a community discussion first, though.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Mon, Feb 3, 2020 at 10:28 AM Yadong Xie 
> wrote:
> > >
> > > > Hi Till
> > > > I didn’t find how to create a sub FLIP at cwiki.apache.org
> > > > do you mean to create 9 more FLIPs instead of FLIP-75?
> > > >
> > > > Till Rohrmann wrote on Thu, Jan 30, 2020 at 11:12 PM:
> > > >
> > > > > Would it be easier if FLIP-75 would be the umbrella FLIP and we
> would
> > > > vote
> > > > > on the individual improvements as sub FLIPs? Decreasing the scope
> > > should
> > > > > make things easier.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Thu, Jan 30, 2020 at 2:35 PM Robert Metzger <
> rmetz...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Thanks a lot for this work! I believe the web UI is very
> important,
> > > in
> > > > > > particular to new users. I'm very happy to see that you are
> putting
> > > > > effort
> > > > > > into improving the visibility into Flink through the proposed
> > > changes.
> > > > > >
> > > > > > I can not judge if all the changes make total sense, but the
> > > discussion
> > > > > has
> > > > > > been open since September, and a good number of people have
> > commented
> > > > in
> > > > > > the document.
> > > > > > I wonder if we can move this FLIP to the voting stage?
> > > > > >
> > > > > > On Wed, Jan 22, 2020 at 6:27 PM Till Rohrmann <
> > trohrm...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for the update Yadong. Big +1 for the proposed
> > improvements
> > > > for
> > > > > > > Flink's web UI. I think they will be super helpful for our
> users.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Till
> > > > > > >
> > > > > > > On Tue, Jan 7, 2020 at 10:00 AM Yadong Xie <
> vthink...@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > Hi everyone
> > > > > > > >
> > > > > > > > We have spent some time updating the documentation since the
> > last
> > > > > > > > discussion.
> > > > > > > >
> > > > > > > > In short, the latest FLIP-75 contains the following proposals
> > > > > > > > (including both frontend and REST API):
> > > > > > > >
> > > > > > > >1. Job Level
> > > > > > > >   - better job backpressure detection
> > > > > > > >   - a "load more" feature in job exceptions
> > > > > > > >   - show attempt history in the subtask
> > > > > > > >   - show attempt timeline
> > > > > > > >   - add pending slots
> > > > > > > >2. Task Manager Level
> > > > > > > >   - add more metrics
> > > > > > > >   - better log display
> > > > > > > >3. Job Manager Level
> > > > > > > >   - add metrics tab
> > > > > > > >   - better log display
> > > > > > > >
> > > > > > > > To help everyone better understand the proposal, we put effort
> > > > > > > > into making an online POC.
> > > > > > > >
> > > > > > > > Now you can compare the difference between the new and old
> > > > > Web/RestAPI
> > > > > > > (the
> > > > > > > > link is inside the doc)!
> > > > > > > >
> > > > > > > > Here is the latest FLIP-75 doc:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit#
> > > > > > > >
> > > > > > > > Looking forward to your feedback
> > > > > > > >
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yadong
> > > > > > > >
> > > > > > > > lining jing wrote on Thu, Oct 24, 2019 at 2:11 PM:
> > > > > > > >
> > > > > > > > > Hi all, I have updated the backend design in FLIP-75
> > > > > > > > > <
> > > > > > > > >
> > > > >

Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-09 Thread aihua li
Yes, but the results you see in the performance code-speed center [3] skip
FLIP-49; the results for the default configuration are overwritten by the
latest results.

> On Feb 9, 2020 at 5:29 PM, Yu Li wrote:
> 
> Thanks for the efforts Aihua! These could definitely improve our RC test 
> coverage!
> 
> Just to confirm, that the stability tests were executed with the same test 
> suite for Alibaba production usage, and the e2e performance one was executed 
> with the test suite proposed in FLIP-83 [1] and FLINK-14917 [2], and the 
> result could also be observed from our performance code-speed center [3], 
> right?
> 
> Thanks.
> 
> Best Regards,
> Yu
> 
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> [2] https://issues.apache.org/jira/browse/FLINK-14917
> [3] https://s.apache.org/nglhm
> 
> On Sun, 9 Feb 2020 at 11:20, aihua li wrote:
> +1 (non-binding)
> 
> I ran stability tests and end-to-end performance tests on branch
> release-1.10.0-rc3; both of them passed.
> 
> Stability test: It mainly checks that the Flink job can recover from
> various abnormal situations, including disk full, network interruption,
> ZooKeeper unable to connect, RPC message timeout, etc.
> If the job can't be recovered, the test fails.
> The test passed after running 5 hours.
> 
> End-to-end performance test: It contains the 32 test scenarios designed
> in FLIP-83.
> Test results: Performance regresses by about 3% from 1.9.1 when using
> the default parameters.
> If FLIP-49 is skipped (by adding the parameters
> taskmanager.memory.managed.fraction: 0 and
> taskmanager.memory.flink.size: 1568m to flink-conf.yaml),
> performance improves by about 5% from 1.9.1.
> 
> I confirmed with @Xintong Song that the result makes sense.
> 
>> On Feb 8, 2020 at 5:54 AM, Gary Yao wrote:
>> 
>> Hi everyone,
>> Please review and vote on the release candidate #3 for the version 1.10.0,
>> as follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>> 
>> 
>> The complete staging area is available for your review, which includes:
>> * JIRA release notes [1],
>> * the official Apache source release and binary convenience releases to be
>> deployed to dist.apache.org [2], which are signed with the key with
>> fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "release-1.10.0-rc3" [5],
>> * website pull request listing the new release and adding announcement blog
>> post [6][7].
>> 
>> The vote will be open for at least 72 hours. It is adopted by majority
>> approval, with at least 3 PMC affirmative votes.
>> 
>> Thanks,
>> Yu & Gary
>> 
>> [1]
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc3/
>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1333
>> [5] https://github.com/apache/flink/releases/tag/release-1.10.0-rc3
>> [6] https://github.com/apache/flink-web/pull/302
>> [7] https://github.com/apache/flink-web/pull/301
>> 
> 



Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-09 Thread Zhu Zhu
The commit info is shown as  on the web UI and in logs.
Not sure if it's a common issue or just happens to my build only.

Thanks,
Zhu Zhu

aihua li wrote on Sun, Feb 9, 2020 at 7:42 PM:

> Yes, but the results you see in the performance code-speed center [3] skip
> FLIP-49; the results for the default configuration are overwritten by the
> latest results.
>
> > On Feb 9, 2020 at 5:29 PM, Yu Li wrote:
> >
> > Thanks for the efforts Aihua! These could definitely improve our RC test
> coverage!
> >
> > Just to confirm, that the stability tests were executed with the same
> test suite for Alibaba production usage, and the e2e performance one was
> executed with the test suite proposed in FLIP-83 [1] and FLINK-14917 [2],
> and the result could also be observed from our performance code-speed
> center [3], right?
> >
> > Thanks.
> >
> > Best Regards,
> > Yu
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> > [2] https://issues.apache.org/jira/browse/FLINK-14917
> > [3] https://s.apache.org/nglhm
> >
> > On Sun, 9 Feb 2020 at 11:20, aihua li wrote:
> > +1 (non-binding)
> >
> > I ran stability tests and end-to-end performance tests on branch
> > release-1.10.0-rc3; both of them passed.
> >
> > Stability test: It mainly checks that the Flink job can recover from
> > various abnormal situations, including disk full, network interruption,
> > ZooKeeper unable to connect, RPC message timeout, etc.
> > If the job can't be recovered, the test fails.
> > The test passed after running 5 hours.
> >
> > End-to-end performance test: It contains the 32 test scenarios designed
> > in FLIP-83.
> > Test results: Performance regresses by about 3% from 1.9.1 when using
> > the default parameters.
> > If FLIP-49 is skipped (by adding the parameters
> > taskmanager.memory.managed.fraction: 0 and
> > taskmanager.memory.flink.size: 1568m to flink-conf.yaml),
> > performance improves by about 5% from 1.9.1.
> >
> > I confirmed with @Xintong Song
> > (https://cwiki.apache.org/confluence/display/~xintongsong) that the
> > result makes sense.
> >
> >> On Feb 8, 2020 at 5:54 AM, Gary Yao wrote:
> >>
> >> Hi everyone,
> >> Please review and vote on the release candidate #3 for the version
> 1.10.0,
> >> as follows:
> >> [ ] +1, Approve the release
> >> [ ] -1, Do not approve the release (please provide specific comments)
> >>
> >>
> >> The complete staging area is available for your review, which includes:
> >> * JIRA release notes [1],
> >> * the official Apache source release and binary convenience releases to be
> >> deployed to dist.apache.org [2], which are signed with the key with
> >> fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> >> * all artifacts to be deployed to the Maven Central Repository [4],
> >> * source code tag "release-1.10.0-rc3" [5],
> >> * website pull request listing the new release and adding announcement
> blog
> >> post [6][7].
> >>
> >> The vote will be open for at least 72 hours. It is adopted by majority
> >> approval, with at least 3 PMC affirmative votes.
> >>
> >> Thanks,
> >> Yu & Gary
> >>
> >> [1]
> >> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
> >> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc3/
> >> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> >> [4] https://repository.apache.org/content/repositories/orgapacheflink-1333
> >> [5] https://github.com/apache/flink/releases/tag/release-1.10.0-rc3
> >> [6] https://github.com/apache/flink-web/pull/302
> >> [7] https://github.com/apache/flink-web/pull/301
> >
>
>


[ANNOUNCE] Weekly Community Update 2020/06

2020-02-09 Thread Konstantin Knauf
Dear community,

happy to share this week's community update. Activity on the dev@ mailing
list has picked up quite a bit this week and more and more concrete design
proposals for Flink 1.11 are brought up for discussion. Besides that, Flink
1.10 and flink-shaded 10.0 are both close to being released.

Flink Development
==

* [releases] Since Friday, the community has been voting on release candidate
#3 for *Flink 1.10*. No blockers have been found so far. [1]

* [releases] Chesnay has kicked off the release of *flink-shaded 10.0*.
This release will allow Apache Flink to support Zookeeper 3.5 alongside
Zookeeper 3.4 (currently supported version). [2,3]

* [connectors] As part of his work on a JDBC exactly-once sink, Roman
proposes to redesign the *TwoPhaseCommitSinkFunction* abstraction in
FLIP-94. The new abstraction would work for both the current exactly-once
KafkaProducer, the new JDBC sink and could potentially also subsume
existing WAL sinks like the Cassandra sink. No feedback so far. [4,5]
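For readers unfamiliar with the abstraction under discussion, the general two-phase-commit sink contract can be sketched as follows. This is an illustrative, framework-free sketch of the protocol, not the FLIP-94 proposal or Flink's `TwoPhaseCommitSinkFunction` API:

```python
# Generic sketch of a two-phase-commit sink: buffer writes in a transaction,
# pre-commit (persist) on checkpoint, and only commit (make visible) once the
# checkpoint has completed. Names are illustrative, not Flink API.
class TwoPhaseCommitSink:
    def __init__(self):
        self.committed = []   # data visible to downstream readers
        self._txn = None      # current open transaction

    def begin_transaction(self):
        self._txn = []        # start buffering writes

    def write(self, record):
        self._txn.append(record)

    def pre_commit(self):
        # Phase 1: hand out the pending transaction so it can be persisted
        # and survive a failure between checkpoint and commit.
        pending, self._txn = self._txn, None
        return pending

    def commit(self, pending):
        # Phase 2: make the data visible, after the checkpoint completes.
        self.committed.extend(pending)

    def abort(self, pending):
        pending.clear()       # roll back on failure


sink = TwoPhaseCommitSink()
sink.begin_transaction()
sink.write("a")
sink.write("b")
pending = sink.pre_commit()   # checkpoint barrier arrives
sink.commit(pending)          # checkpoint-complete notification arrives
assert sink.committed == ["a", "b"]
```

The redesign discussed in the thread concerns how this contract is exposed so that Kafka, JDBC, and WAL-style sinks can all share it.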

* [ecosystem] Gyula has started an interesting discussion on *Apache Atlas
integration* for Flink. The integration itself would live in Apache Atlas,
but Flink would need to provide the required hooks and expose job metadata
(particularly of sources & sinks). [6]

* [web ui] FLIP-75, a collection of *improvements to Flink's web user
interface*, will be split up into multiple "sub-FLIPs" for voting. [7]

* [python, sql] In Flink 1.10, the Python Table API will support Python
UDFs for the first time. Xingbo has prepared a proposal for *User Defined
Table Function* support in the *Python* Table API. [8]

* [python, sql] Dian Fu proposes to support scalar, *vectorized* *UDFs* in
the *Python* Table API. Instead of exchanging serialized rows between Java
and Python process, the Flink operator would send batches of rows in a
columnar format to the Python process. Thereby, the proposal aims to
improve performance and increase interoperability with libraries such as
Pandas. [9]
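The performance idea behind that proposal can be sketched without any Flink code: the runtime invokes a vectorized UDF once per batch (column) of rows rather than once per row, amortizing the per-call overhead of crossing the Java/Python boundary. A minimal, framework-free illustration (names are illustrative, not Flink API):

```python
# Contrast per-row ("scalar") invocation with per-batch ("vectorized")
# invocation — the core idea behind the vectorized Python UDF proposal.
def scalar_add_one(x):
    # Called once per row: N rows cost N Python calls (plus N serializations
    # when rows cross a process boundary).
    return x + 1

def vectorized_add_one(batch):
    # Called once per batch: N rows cost 1 Python call on a whole column.
    return [x + 1 for x in batch]

rows = [1, 2, 3, 4]
per_row = [scalar_add_one(x) for x in rows]   # 4 calls
per_batch = vectorized_add_one(rows)          # 1 call
assert per_row == per_batch == [2, 3, 4, 5]
```

In the actual proposal the batch would be a columnar (e.g. Pandas/Arrow-style) structure rather than a Python list, which is what enables the interoperability mentioned above.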

* [python] Hequn proposes to support the *Flink's machine learning API* on
top of the *Python* Table API. No feedback so far. [10]

* [distribution] Hequn has started a discussion about moving the binaries
of Flink's new *machine learning *modules into *opt/ *directory of the
Flink distribution. While everyone seems to agree that a) we need a long
term plan of how to distribute components of the ever growing Flink project
and b) the Flink distribution should be rather lean, it is not yet clear
how to deal with the machine learning modules at this point in time. [11]

* [development process] Removing a deprecated interface does not require a
FLIP. [12]

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-10-0-release-candidate-3-tp37405.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-flink-shaded-10-0-release-candidate-1-tp37417.html
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Release-flink-shaded-10-0-tp37306.html
[4]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-94-Rework-2-phase-commit-abstractions-tp37164.html
[5]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-94%3A+Rework+2-phase+commit+abstractions
[6]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-Job-generation-submission-hooks-Atlas-integration-tp37298.html
[7]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-75-Flink-Web-UI-Improvement-Proposal-tp33540.html
[8]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-User-Defined-Table-Function-in-PyFlink-tp37149.html
[9]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-scalar-vectorized-Python-UDF-in-PyFlink-tp37264.html
[10]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-Python-ML-Pipeline-API-tp37291.html
[11]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Include-flink-ml-api-and-flink-ml-lib-in-opt-tp36476.html
[12]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Does-removing-deprecated-interfaces-needs-another-FLIP-tp37365p37370.html

Notable Bugs
==

* [FLINK-15918] [1.9.2] The "uptime" metric is not reset during restarts,
when jobmanager.execution.failover-strategy: region is configured (which is
the default). [13]

[13] https://issues.apache.org/jira/browse/FLINK-15918

Events, Blog Posts, Misc
===

* *Kartik Khare* has published a guide on unit testing Apache Flink on the
Apache Flink blog. [14]

* The first set of speakers for Flink Forward San Francisco has been
announced [15]. For a 50% discount check out this thread [16].

* Upcoming Meetups
   * On February 19th, Apache Flink Meetup London, "Monitoring and
Analysing Communication & Trade Events as Graphs", hosted by *Christos
Hadjinikolis* [17]

[14]
https://flink.apache.org/news/2020/02/07/a-guide-for-unit-testing-in-apache-flink.html

[jira] [Created] (FLINK-15960) support creating Hive tables, views, functions within Flink

2020-02-09 Thread Bowen Li (Jira)
Bowen Li created FLINK-15960:


 Summary: support creating Hive tables, views, functions within 
Flink
 Key: FLINK-15960
 URL: https://issues.apache.org/jira/browse/FLINK-15960
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Bowen Li
Assignee: Jingsong Lee
 Fix For: 1.11.0


support creating Hive tables, views, functions within Flink, to achieve higher 
interoperability between Flink and Hive, and not requiring users to switch 
between Flink and Hive CLIs.

We have heard such requests from multiple Flink-Hive users.

 

cc [~ykt836] [~lirui]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Improve TableFactory to add Context

2020-02-09 Thread Zhenghua Gao
+1 (non-binding). Thanks for driving this.

*Best Regards,*
*Zhenghua Gao*


On Fri, Feb 7, 2020 at 5:05 PM Leonard Xu  wrote:

> +1 (non-binding), nice design,
> after reading the full discussion thread.
>
> Best,
> Leonard Xu
>
> > On Feb 6, 2020 at 23:12, Timo Walther wrote:
> >
> > +1
> >
> > On 06.02.20 05:54, Bowen Li wrote:
> >> +1, LGTM
> >> On Tue, Feb 4, 2020 at 11:28 PM Jark Wu  wrote:
> >>> +1 from my side.
> >>> Thanks for driving this.
> >>>
> >>> Btw, could you also attach a JIRA issue with the changes described in
> it,
> >>> so that users can find the issue through the mailing list in the
> future.
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>> On Wed, 5 Feb 2020 at 13:38, Kurt Young  wrote:
> >>>
>  +1 from my side.
> 
>  Best,
>  Kurt
> 
> 
>  On Wed, Feb 5, 2020 at 10:59 AM Jingsong Li 
>  wrote:
> 
> > Hi all,
> >
> > Interface updated.
> > Please re-vote.
> >
> > Best,
> > Jingsong Lee
> >
> > On Tue, Feb 4, 2020 at 1:28 PM Jingsong Li 
>  wrote:
> >
> >> Hi all,
> >>
> >> I would like to start the vote for the improvement of
> >> TableFactory, which was discussed and
> >> reached a consensus in the discussion thread [1].
> >>
> >> The vote will be open for at least 72 hours. I'll try to close it
> >> unless there is an objection or not enough votes.
> >>
> >> [1]
> >>
> >
> 
> >>>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improve-TableFactory-td36647.html
> >>
> >> Best,
> >> Jingsong Lee
> >>
> >
> >
> > --
> > Best, Jingsong Lee
> >
> 
> >>>
> >
>
>


Re: [VOTE] Improve TableFactory to add Context

2020-02-09 Thread Benchao Li
+1 (non-binding). This is really helpful, thanks Jingsong for driving this.

Zhenghua Gao wrote on Mon, Feb 10, 2020 at 9:47 AM:

> +1 (non-binding). Thanks for driving this.
>
> *Best Regards,*
> *Zhenghua Gao*
>
>
> On Fri, Feb 7, 2020 at 5:05 PM Leonard Xu  wrote:
>
> > +1 (non-binding), nice design,
> > after reading the full discussion thread.
> >
> > Best,
> > Leonard Xu
> >
> > > On Feb 6, 2020 at 23:12, Timo Walther wrote:
> > >
> > > +1
> > >
> > > On 06.02.20 05:54, Bowen Li wrote:
> > >> +1, LGTM
> > >> On Tue, Feb 4, 2020 at 11:28 PM Jark Wu  wrote:
> > >>> +1 form my side.
> > >>> Thanks for driving this.
> > >>>
> > >>> Btw, could you also attach a JIRA issue with the changes described in
> > it,
> > >>> so that users can find the issue through the mailing list in the
> > future.
> > >>>
> > >>> Best,
> > >>> Jark
> > >>>
> > >>> On Wed, 5 Feb 2020 at 13:38, Kurt Young  wrote:
> > >>>
> >  +1 from my side.
> > 
> >  Best,
> >  Kurt
> > 
> > 
> >  On Wed, Feb 5, 2020 at 10:59 AM Jingsong Li  >
> >  wrote:
> > 
> > > Hi all,
> > >
> > > Interface updated.
> > > Please re-vote.
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Tue, Feb 4, 2020 at 1:28 PM Jingsong Li  >
> >  wrote:
> > >
> > >> Hi all,
> > >>
> > >> I would like to start the vote for the improve of
> > >> TableFactory, which is discussed and
> > >> reached a consensus in the discussion thread[2].
> > >>
> > >> The vote will be open for at least 72 hours. I'll try to close it
> > >> unless there is an objection or not enough votes.
> > >>
> > >> [1]
> > >>
> > >
> > 
> > >>>
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improve-TableFactory-td36647.html
> > >>
> > >> Best,
> > >> Jingsong Lee
> > >>
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> > >
> > 
> > >>>
> > >
> >
> >
>


-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn


[jira] [Created] (FLINK-15961) Introduce Python Physical Correlate RelNodes which are containers for Python TableFunction

2020-02-09 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-15961:


 Summary: Introduce Python Physical Correlate RelNodes which are 
containers for Python TableFunction
 Key: FLINK-15961
 URL: https://issues.apache.org/jira/browse/FLINK-15961
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Huang Xingbo
 Fix For: 1.11.0


Dedicated Python Physical Correlate RelNodes should be introduced for Python 
TableFunction execution. These nodes exist as containers for Python 
TableFunctions, which can be executed in batches so that we can employ the 
PythonTableFunctionOperator for Python TableFunction execution.
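The batched table-function execution described above can be sketched in a few lines. This is an illustration only: the function and helper names below are assumptions for the example, not Flink planner internals.

```python
# Illustrative sketch: evaluating a Python table function on a whole batch
# of input rows at once (the way a dedicated operator would), instead of
# one Java<->Python round trip per input row.

def split_words(line):
    """A table function: one input row expands to zero or more output rows."""
    return line.split()

def correlate_batch(rows, table_function):
    """Join each input row with every row the table function produces for it."""
    out = []
    for row in rows:  # the whole batch crosses to the Python side once
        for produced in table_function(row):
            out.append((row, produced))
    return out

result = correlate_batch(["a b", "c"], split_words)
print(result)  # [('a b', 'a'), ('a b', 'b'), ('c', 'c')]
```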



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLINK-15831: Add Docker image publication to release documentation

2020-02-09 Thread Yang Wang
+1 to making the flink-docker repository self-contained, including the
documentation, and having other places refer to it.


Best,
Yang

Till Rohrmann  wrote on Sun, Feb 9, 2020 at 5:35 PM:

> Sounds good to me Patrick. +1 for these changes.
>
> Cheers,
> Till
>
> On Fri, Feb 7, 2020 at 3:25 PM Patrick Lucas 
> wrote:
>
> > Hi all,
> >
> > For FLINK-15831[1], I think the way to start is for the flink-docker
> > repo[2] itself to sufficiently document the workflow for publishing new
> > Dockerfiles, and then update the Flink release guide in the wiki to refer
> > to this documentation and to include this step in the "Finalize the
> > release" checklist.
> >
> > To the first point, I have opened a PR[3] on flink-docker to improve its
> > documentation.
> >
> > And for updating the release guide, I propose the following changes:
> >
> > 1. Add a new subsection to "Finalize the release", prior to "Checklist to
> > proceed to the next step" with the following content:
> >
> > Publish the Dockerfiles for the new release
> > >
> > > Note: the official Dockerfiles fetch the binary distribution of the
> > target
> > > Flink version from an Apache mirror. After publishing the binary
> release
> > > artifacts, mirrors can take some hours to start serving the new
> > artifacts,
> > > so you may want to wait to do this step until you are ready to continue
> > > with the "Promote the release" steps below.
> > >
> > > Follow the instructions in the [flink-docker] repo to build the new
> > > Dockerfiles and send an updated manifest to Docker Hub so the new
> images
> > > are built and published.
> > >
> >
> > 2. Add an entry to the "Checklist to proceed to the next step" subsection
> > of "Finalize the release":
> >
> > >
> > >- Dockerfiles in flink-docker updated for the new Flink release and
> > >pull request opened on the Docker official-images with an updated
> > manifest
> > >
> > > Please let me know if you have any questions or suggestions to improve
> > this proposal.
> >
> > Thanks,
> > Patrick
> >
> > [1]https://issues.apache.org/jira/browse/FLINK-15831
> > [2]https://github.com/apache/flink-docker
> > [3]https://github.com/apache/flink-docker/pull/5
> >
>


[Question] Anyone know where I can find performance test result?

2020-02-09 Thread 闫旭
Hi there,

I was exploring the Apache Flink git repo and found the performance tests. I
have already run them on my local machine; I'm wondering whether results are
published online somewhere?

Thanks

Regards

Xu Yan

Re: [DISCUSS] Support Python ML Pipeline API

2020-02-09 Thread jincheng sun
Hi Hequn,

Thanks for bringing up this discussion.

+1 for adding the Python ML Pipeline API, even though the Java pipeline API
may change.

I would like to suggest creating a FLIP for these API changes. :)

Best,
Jincheng


Hequn Cheng  wrote on Wed, Feb 5, 2020 at 5:24 PM:

> Hi everyone,
>
> FLIP-39[1] rebuilds the Flink ML pipeline on top of TableAPI and introduces
> a new set of Java APIs. As Python is widely used in ML areas, providing
> Python ML Pipeline APIs for Flink can not only make it easier to write ML
> jobs for Python users but also broaden the adoption of Flink ML.
>
> Given this, Jincheng and I discussed offline about the support of Python ML
> Pipeline API and drafted a design doc[2]. We'd like to achieve three goals
> for supporting Python Pipeline API:
> - Add Python pipeline API according to Java pipeline API(we will adapt the
> Python pipeline API if Java pipeline API changes).
> - Support native Python Transformer/Estimator/Model, i.e., users can write
> not only Python Transformer/Estimator/Model wrappers for calling Java ones
> but also can write native Python Transformer/Estimator/Models.
> - Ease of use. Support keyword arguments when defining parameters.
>
> More details can be found in the design doc and we are looking forward to
> your feedback.
>
> Best,
> Hequn
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs
> [2]
>
> https://docs.google.com/document/d/1fwSO5sRNWMoYuvNgfQJUV6N2n2q5UEVA4sezCljKcVQ/edit?usp=sharing
>
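The three goals above (mirror the Java pipeline API, allow native Python Transformer/Estimator/Model implementations, and keyword-argument parameters) can be illustrated with a minimal sketch. All class and method names below are assumptions modeled on the FLIP-39 Java API, not the final Python API.

```python
# Minimal sketch of a FLIP-39-style pipeline in Python (names are assumptions).

class Transformer:
    def transform(self, data):
        raise NotImplementedError

class Model(Transformer):
    """A Transformer produced by fitting an Estimator."""

class Estimator:
    def fit(self, data):
        raise NotImplementedError

class Pipeline(Estimator):
    """Chains stages; fit() replaces each Estimator by its fitted Model."""
    def __init__(self, stages=None):
        self.stages = list(stages or [])

    def fit(self, data):
        fitted = []
        for stage in self.stages:
            if isinstance(stage, Estimator):
                model = stage.fit(data)
                fitted.append(model)
                data = model.transform(data)
            else:
                fitted.append(stage)
                data = stage.transform(data)
        return Pipeline(fitted)

    def transform(self, data):
        for stage in self.stages:
            data = stage.transform(data)
        return data

# A native Python Estimator with a keyword-only parameter ("ease of use"
# goal): learns the mean in fit() and shifts values toward `center`.
class MeanShift(Estimator):
    def __init__(self, *, center=0.0):
        self.center = center

    def fit(self, data):
        mean = sum(data) / len(data)

        class MeanShiftModel(Model):
            def transform(self, d, _m=mean, _c=self.center):
                return [x - _m + _c for x in d]
        return MeanShiftModel()

pipeline = Pipeline([MeanShift(center=0.0)])
model = pipeline.fit([1.0, 2.0, 3.0])
print(model.transform([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

A Java-backed stage would plug into the same `Pipeline` as a wrapper class exposing the same `fit`/`transform` methods.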


Re: [DISCUSS] Support scalar vectorized Python UDF in PyFlink

2020-02-09 Thread jincheng sun
Hi Dian,

Thanks for bringing up this discussion. This is very important for the
PyFlink ecosystem. Adding Pandas support greatly enriches the available
UDF library of PyFlink and greatly improves its usability!

+1 for supporting scalar vectorized Python UDFs.

I think we should create a FLIP for this big enhancement. :)

What do you think?

Best,
Jincheng



dianfu  wrote on Wed, Feb 5, 2020 at 6:01 PM:

> Hi Jingsong,
>
> Thanks a lot for the valuable feedback.
>
> 1. The configurations "python.fn-execution.bundle.size" and
> "python.fn-execution.arrow.batch.size" are used for separate purposes and I
> think they are both needed. If they are unified, the Python operator has to
> wait for the execution results of the previous batch of elements before
> processing the next batch. This means that the Python UDF execution cannot
> be pipelined between batches. With separate configurations, there will be no
> such problems.
> 2. It means that the Java operator will convert input elements to the Arrow
> memory format and then send them to the Python worker, and vice versa.
> Regarding the zero-copy benefits provided by Arrow, we gain them
> automatically by using Arrow.
> 3. Good point! As all the classes of the Python module are written in Java
> and it's not suggested to introduce new Scala classes, I guess it's not easy
> to do right now. But I think this is definitely a good improvement we
> can do in the future.
> 4. You're right and we will add a series of Arrow ColumnVectors for each
> type supported.
>
> Thanks,
> Dian
>
> > > On Feb 5, 2020, at 4:57 PM, Jingsong Li  wrote:
> >
> > Hi Dian,
> >
> > +1 for this, thanks for driving it.
> > The documentation looks very good. I can imagine a huge performance
> improvement
> > and better integration with other Python libraries.
> >
> > A few thoughts:
> > - About data splitting: "python.fn-execution.arrow.batch.size", can we unify
> it
> > with "python.fn-execution.bundle.size"?
> > - Use of Apache Arrow as the exchange format: do you mean Arrow supports
> > zero-copy between Java and Python?
> > - ArrowFieldWriter: it seems we could implement it by code generation, but it
> is
> > OK for the initial version to use virtual function calls.
> > - ColumnarRow for vectorized reading seems to require implementing
> > ArrowColumnVectors.
> >
> > Best,
> > Jingsong Lee
> >
> > On Wed, Feb 5, 2020 at 12:45 PM dianfu  wrote:
> >
> >> Hi all,
> >>
> >> Scalar Python UDF has already been supported in the coming release 1.10
> >> (FLIP-58[1]). It operates one row at a time. It works in the way that
> the
> >> Java operator serializes one input row to bytes and sends them to the
> >> Python worker; the Python worker deserializes the input row and
> evaluates
> >> the Python UDF with it; the result row is serialized and sent back to
> the
> >> Java operator.
> >>
> >> It suffers from the following problems:
> >> 1) High serialization/deserialization overhead
> >> 2) It’s difficult to leverage the popular Python libraries used by data
> >> scientists, such as Pandas, Numpy, etc which provide high performance
> data
> >> structure and functions.
> >>
> >> Jincheng and I have discussed offline and we want to introduce
> vectorized
> >> Python UDF to address the above problems. This feature has also been
> >> mentioned in the discussion thread about the Python API plan[2]. For
> >> vectorized Python UDF, a batch of rows are transferred between JVM and
> >> Python VM in columnar format. The batch of rows will be converted to a
> >> collection of Pandas.Series and given to the vectorized Python UDF which
> >> could then leverage the popular Python libraries such as Pandas, Numpy,
> etc
> >> for the Python UDF implementation.
> >>
> >> Please refer the design doc[3] for more details and welcome any
> feedback.
> >>
> >> Regards,
> >> Dian
> >>
> >> [1]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table
> >> [2]
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-What-parts-of-the-Python-API-should-we-focus-on-next-tt36119.html
> >> [3]
> >>
> https://docs.google.com/document/d/1ls0mt96YV1LWPHfLOh8v7lpgh1KsCNx8B9RrW1ilnoE/edit#heading=h.qivada1u8hwd
> >>
> >>
> >
> > --
> > Best, Jingsong Lee
>
>
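The row-at-a-time versus vectorized execution models discussed in this thread can be sketched as follows. This is a dependency-free illustration: plain Python lists stand in for Arrow record batches / pandas.Series, and all names are illustrative, not the proposed PyFlink API.

```python
# Contrast of the two execution models for a scalar Python UDF.

def plus_one_scalar(x):
    """Row-at-a-time UDF: one serialize/evaluate round trip per row."""
    return x + 1

def plus_one_vectorized(series):
    """Vectorized UDF: receives a whole batch, returns a batch."""
    return [x + 1 for x in series]  # with pandas this would simply be `series + 1`

def run_scalar(rows, udf):
    # Each row individually crosses the JVM <-> Python boundary.
    return [udf(row) for row in rows]

def run_vectorized(rows, udf, batch_size):
    # Rows are shipped in columnar batches: one round trip per batch,
    # amortizing the serialization overhead across batch_size rows.
    out = []
    for i in range(0, len(rows), batch_size):
        out.extend(udf(rows[i:i + batch_size]))
    return out

rows = list(range(10))
assert run_scalar(rows, plus_one_scalar) == run_vectorized(rows, plus_one_vectorized, batch_size=4)
```

The vectorized path also lets the UDF body call into Pandas/Numpy, which operate on whole batches natively.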


Re: [VOTE] Improve TableFactory to add Context

2020-02-09 Thread Jingsong Li
Thanks everyone,

The vote has passed. The result is following:

+1 (Binding): 4, Kurt, Jark, Bowen, Timo.
+1 (Non-Binding): 3, Leonard, Zhenghua, Benchao.

Best,
Jingsong Lee

On Mon, Feb 10, 2020 at 10:12 AM Benchao Li  wrote:

> +1 (non-binding). This is really helpful, thanks Jingsong for driving this.
>


-- 
Best, Jingsong Lee


Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-09 Thread Hequn Cheng
Hi Rong,

Thanks a lot for joining the discussion!

It would be great if we can have a long term plan. My intention is to
provide a way for users to add the dependencies of Flink ML, either through
opt or the download page. This will become more and more critical along with
the improvement of Flink ML; as you said, there are multiple PRs under
review, and I'm also going to support the Python Pipeline API soon[1].

Meanwhile, it also makes sense to include the API in opt, so it would
probably not break the long term plan.
However, even if we find something wrong in the future, we can revisit this
easily instead of blocking the improvement for users. What do you think?

Best,
Hequn

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-Python-ML-Pipeline-API-td37291.html

On Sat, Feb 8, 2020 at 1:57 AM Rong Rong  wrote:

> CC @Xu Yang 
>
> Thanks for starting the discussion @Hequn Cheng  and
> sorry for joining the discussion late.
>
> I've mainly helped merging the code in flink-ml-api and flink-ml-lib in
> the past several months.
> IMO flink-ml-api is an extension on top of the Table API, and I agree
> that it should be treated as a part of the "core" code.
>
> However, given the fact that there are multiple PRs still under
> review [1], would it be a better idea to come up with a long term plan first
> before making the decision to move it to /opt now?
>
>
> --
> Rong
>
> [1]
> https://github.com/apache/flink/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+label%3Acomponent%3DLibrary%2FMachineLearning+
>
> On Fri, Feb 7, 2020 at 5:54 AM Hequn Cheng  wrote:
>
>> Hi,
>>
>> @Till Rohrmann  Thanks for the great inputs. I
>> agree
>> with you that we should have a long term plan for this. It definitely
>> deserves another discussion.
>> @Jeff Zhang  Thanks for your reports and ideas. It's a
>> good idea to improve the error messages. Do we have any JIRAs for it? If
>> not, maybe we can create one.
>>
>> Thank you again for your feedback and suggestions. I will go on with the
>> PR. Thanks!
>>
>> Best,
>> Hequn
>>
>> On Thu, Feb 6, 2020 at 11:51 PM Jeff Zhang  wrote:
>>
>> > I have another concern which may not be closely related to this thread.
>> > Since flink doesn't include all the necessary jars, I think it is
>> critical
>> > for Flink to display a meaningful error message when any class is missing.
>> > E.g., here's the error message when I use Kafka but forget to
>> > include flink-json. To be honest, this kind of error message is hard
>> to
>> > understand for new users.
>> >
>> >
>> > Reason: No factory implements
>> > 'org.apache.flink.table.factories.DeserializationSchemaFactory'. The
>> > following properties are requested:
>> > connector.properties.bootstrap.servers=localhost:9092
>> > connector.properties.group.id=testGroup
>> > connector.properties.zookeeper.connect=localhost:2181
>> > connector.startup-mode=earliest-offset connector.topic=generated.events
>> > connector.type=kafka connector.version=universal format.type=json
>> > schema.0.data-type=VARCHAR(2147483647) schema.0.name=status
>> > schema.1.data-type=VARCHAR(2147483647) schema.1.name=direction
>> > schema.2.data-type=BIGINT schema.2.name=event_ts update-mode=append The
>> > following factories have been considered:
>> > org.apache.flink.table.catalog.hive.factories.HiveCatalogFactory
>> > org.apache.flink.table.module.hive.HiveModuleFactory
>> > org.apache.flink.table.module.CoreModuleFactory
>> > org.apache.flink.table.catalog.GenericInMemoryCatalogFactory
>> > org.apache.flink.table.sources.CsvBatchTableSourceFactory
>> > org.apache.flink.table.sources.CsvAppendTableSourceFactory
>> > org.apache.flink.table.sinks.CsvBatchTableSinkFactory
>> > org.apache.flink.table.sinks.CsvAppendTableSinkFactory
>> > org.apache.flink.table.planner.delegation.BlinkPlannerFactory
>> > org.apache.flink.table.planner.delegation.BlinkExecutorFactory
>> > org.apache.flink.table.planner.StreamPlannerFactory
>> > org.apache.flink.table.executor.StreamExecutorFactory
>> > org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory
>> at
>> >
>> >
>> org.apache.flink.table.factories.TableFactoryService.filterByFactoryClass(TableFactoryService.java:238)
>> > at
>> >
>> >
>> org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryService.java:185)
>> > at
>> >
>> >
>> org.apache.flink.table.factories.TableFactoryService.findSingleInternal(TableFactoryService.java:143)
>> > at
>> >
>> >
>> org.apache.flink.table.factories.TableFactoryService.find(TableFactoryService.java:113)
>> > at
>> >
>> >
>> org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactoryBase.getDeserializationSchema(KafkaTableSourceSinkFactoryBase.java:277)
>> > at
>> >
>> >
>> org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactoryBase.createStreamTableSource(KafkaTableSourceSinkFactoryBase.java:161)
>> > at
>> >
>> >
>> org.apache.flink.table.factories.StreamTableSourceFactory.createTableSource(StreamTableSourceFactor
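The kind of hint Jeff is asking for could look roughly like the sketch below: when no factory matches, inspect the requested properties and suggest the likely missing jar. The format-to-module table and function name are purely illustrative, not Flink code.

```python
# Sketch: derive a user-friendly hint from the requested properties when
# no DeserializationSchemaFactory matches. The mapping is illustrative.
FORMAT_TO_MODULE = {
    "json": "flink-json",
    "csv": "flink-csv",
    "avro": "flink-avro",
}

def missing_factory_hint(requested_properties):
    """Return a human-readable hint for a 'no factory implements ...' error."""
    fmt = requested_properties.get("format.type")
    module = FORMAT_TO_MODULE.get(fmt)
    if module is None:
        return "No matching factory found; please check the connector/format properties."
    return (f"No factory supports format.type={fmt!r}. "
            f"Did you forget to add {module} to the classpath (e.g. under lib/)?")

hint = missing_factory_hint({"connector.type": "kafka", "format.type": "json"})
print(hint)
```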

Re: [DISCUSS] Support scalar vectorized Python UDF in PyFlink

2020-02-09 Thread Hequn Cheng
Hi Dian,

Thanks a lot for bringing up the discussion!

It is great to see the Pandas UDFs feature is going to be introduced. I
think this would improve the performance and also the usability of
user-defined functions (UDFs) in Python.
One little suggestion: maybe it would be nice if we could add some
performance explanation in the document? (I'm just very curious :))

+1 to create a FLIP for this big enhancement.

Best,
Hequn

On Mon, Feb 10, 2020 at 11:15 AM jincheng sun 
wrote:

> Hi Dian,
>
> Thanks for bringing up this discussion. This is very important for the
> PyFlink ecosystem. Adding Pandas support greatly enriches the available
> UDF library of PyFlink and greatly improves its usability!
>
> +1 for supporting scalar vectorized Python UDFs.
>
> I think we should create a FLIP for this big enhancement. :)
>
> What do you think?
>
> Best,
> Jincheng
>
>


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-09 Thread Zhu Zhu
My bad. The missing commit info is caused by building from the src code zip
which does not contain the git info.
So this is not a problem.

+1 (binding) for rc3
Here's what was verified:
 * built successfully from the source code
 * ran a sample streaming job and a batch job with parallelism=1000 on a YARN
cluster, with both the new scheduler and the legacy scheduler; the jobs ran well
(tuned some resource configs to enable the jobs to work well)
 * killed TMs to trigger failures, the jobs can finally recover from the
failures

Thanks,
Zhu Zhu

Zhu Zhu  wrote on Mon, Feb 10, 2020 at 12:31 AM:

> The commit info is shown as  on the web UI and in logs.
> Not sure if it's a common issue or just happens to my build only.
>
> Thanks,
> Zhu Zhu
>
aihua li  wrote on Sun, Feb 9, 2020 at 7:42 PM:
>
>> Yes, but the results you see in the Performance Code Speed Center [3]
>> skip FLIP-49.
>>  The results of the default configurations are overwritten by the latest
>> results.
>>
>> > On Feb 9, 2020, at 5:29 PM, Yu Li  wrote:
>> >
>> > Thanks for the efforts Aihua! These could definitely improve our RC
>> test coverage!
>> >
>> > Just to confirm, that the stability tests were executed with the same
>> test suite for Alibaba production usage, and the e2e performance one was
>> executed with the test suite proposed in FLIP-83 [1] and FLINK-14917 [2],
>> and the result could also be observed from our performance code-speed
>> center [3], right?
>> >
>> > Thanks.
>> >
>> > Best Regards,
>> > Yu
>> >
>> > [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
>> > [2] https://issues.apache.org/jira/browse/FLINK-14917
>> > [3] https://s.apache.org/nglhm
>> >
>> > On Sun, 9 Feb 2020 at 11:20, aihua li wrote:
>> > +1 (non-binding)
>> >
>> > I ran stability tests and end-to-end performance tests on branch
>> release-1.10.0-rc3; both of them passed.
>> >
>> > Stability test: it mainly checks that the Flink job can recover from
>> various abnormal situations, including disk full,
>> > network interruption, ZK unable to connect, RPC message timeout, etc.
>> > If a job can't be recovered, the test fails.
>> > The test passed after running for 5 hours.
>> >
>> > End-to-end performance test: it contains 32 test scenarios, which were
>> designed in FLIP-83.
>> > Test results: performance regresses about 3% from 1.9.1 when using the
>> default parameters;
>> > The result:
>> >
>> >  if FLIP-49 is skipped (adding parameters taskmanager.memory.managed.fraction:
>> 0, taskmanager.memory.flink.size: 1568m in flink-conf.yaml),
>> >  performance improves about 5% over 1.9.1. The result:
>> >
>> >
>> > I confirmed it with @Xintong Song (
>> https://cwiki.apache.org/confluence/display/~xintongsong) that the
>> result makes sense.
>> >
>> >> On Feb 8, 2020, at 5:54 AM, Gary Yao  wrote:
>> >>
>> >> Hi everyone,
>> >> Please review and vote on the release candidate #3 for the version
>> 1.10.0,
>> >> as follows:
>> >> [ ] +1, Approve the release
>> >> [ ] -1, Do not approve the release (please provide specific comments)
>> >>
>> >>
>> >> The complete staging area is available for your review, which includes:
>> >> * JIRA release notes [1],
>> >> * the official Apache source release and binary convenience releases
>> to be
>> >> deployed to dist.apache.org [2], which are
>> signed with the key with
>> >> fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
>> >> * all artifacts to be deployed to the Maven Central Repository [4],
>> >> * source code tag "release-1.10.0-rc3" [5],
>> >> * website pull request listing the new release and adding announcement
>> blog
>> >> post [6][7].
>> >>
>> >> The vote will be open for at least 72 hours. It is adopted by majority
>> >> approval, with at least 3 PMC affirmative votes.
>> >>
>> >> Thanks,
>> >> Yu & Gary
>> >>
>> >> [1]
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
>> >> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc3/
>> >> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>> >> [4] https://repository.apache.org/content/repositories/orgapacheflink-1333
>> >> [5] https://github.com/apache/flink/releases/tag/release-1.10.0-rc3
>> >> [6] https://github.com/apache/flink-web/pull/302
>> >> [7] https://github.com/apache/flink-web/pull/301

Re: [DISCUSS] Support Python ML Pipeline API

2020-02-09 Thread Dian Fu
Hi Hequn,

Thanks for bringing up the discussion. +1 to this feature. The design LGTM.
It's great that the Python ML users could use both the Java Pipeline
Transformer/Estimator/Model classes and the Python
Pipeline Transformer/Estimator/Model in the same job.

Regards,
Dian

On Mon, Feb 10, 2020 at 11:08 AM jincheng sun 
wrote:

> Hi Hequn,
>
> Thanks for bringing up this discussion.
>
> +1 for adding the Python ML Pipeline API, even though the Java pipeline API
> may change.
>
> I would like to suggest creating a FLIP for these API changes. :)
>
> Best,
> Jincheng
>
>


Re: [DISCUSS] Support scalar vectorized Python UDF in PyFlink

2020-02-09 Thread Jingsong Li
Thanks Dian for your reply.

+1 to create a FLIP too.

About "python.fn-execution.bundle.size" and
"python.fn-execution.arrow.batch.size", I got what you mean about
"pipelining". I agree.
It seems that a batch should always be contained in a bundle: the bundle size
should always be bigger than the batch size (if a batch cannot cross bundles).
Can you explain this relationship in the document?

I think the default values are very important; we can discuss:
- In the batch world, the vectorization batch size is about 1024+. What do you
think about the default value of "batch"?
- Can we configure only one parameter and calculate the other automatically?
For example, if we just want to pipeline, "bundle.size" could be twice as much
as "batch.size"; would this work?

Best,
Jingsong Lee
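The constraint being discussed (an Arrow batch must fit inside a bundle, so batch.size is capped by bundle.size) can be sketched as a small derivation. Deriving batch.size as half of bundle.size is only the example floated in this thread, not a decided default, and the function name is illustrative.

```python
# Sketch of the bundle.size / arrow.batch.size relationship discussed above.

def effective_sizes(bundle_size, batch_size=None):
    """Return (bundle_size, batch_size), deriving and capping batch_size.

    If batch_size is unset, derive it as bundle_size // 2 (the example from
    this thread) so that two batches can be in flight within one bundle,
    keeping the Python UDF execution pipelined.
    """
    if batch_size is None:
        batch_size = max(1, bundle_size // 2)
    # A batch cannot cross bundles, so it can never exceed the bundle size.
    return bundle_size, min(batch_size, bundle_size)

print(effective_sizes(2048))        # (2048, 1024)
print(effective_sizes(1000, 4096))  # (1000, 1000) -- capped at the bundle size
```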

On Mon, Feb 10, 2020 at 11:55 AM Hequn Cheng  wrote:

> Hi Dian,
>
> Thanks a lot for bringing up the discussion!
>
> It is great to see the Pandas UDFs feature is going to be introduced. I
> think this would improve the performance and also the usability of
> user-defined functions (UDFs) in Python.
> One little suggestion: maybe it would be nice if we could add some
> performance explanation in the document? (I'm just very curious :))
>
> +1 to create a FLIP for this big enhancement.
>
> Best,
> Hequn
>
> On Mon, Feb 10, 2020 at 11:15 AM jincheng sun 
> wrote:
>
> > Hi Dian,
> >
> > Thanks for bringing up this discussion. This is very important for the
> > PyFlink ecosystem. Adding Pandas support greatly enriches the available
> > UDF library of PyFlink and greatly improves the usability of PyFlink!
> >
> > +1 for Support scalar vectorized Python UDF.
> >
> > I think we should create a FLIP for this big enhancement. :)
> >
> > What do you think?
> >
> > Best,
> > Jincheng
> >
> >
> >
> > dianfu  于2020年2月5日周三 下午6:01写道:
> >
> > > Hi Jingsong,
> > >
> > > Thanks a lot for the valuable feedback.
> > >
> > > 1. The configurations "python.fn-execution.bundle.size" and
> > > "python.fn-execution.arrow.batch.size" are used for separate purposes
> > and I
> > > think they are both needed. If they are unified, the Python operator
> has
> > to
> > > wait the execution results of the previous batch of elements before
> > > processing the next batch. This means that the Python UDF execution can
> > not
> > > be pipelined between batches. With separate configuration, there will
> be
> > no
> > > such problems.
> > > 2. It means that the Java operator will convert input elements to Arrow
> > > memory format and then send them to the Python worker, and vice versa.
> > > Regarding to the zero-copy benefits provided by Arrow, we can gain them
> > > automatically using Arrow.
> > > 3. Good point! As all the classes of the Python module are written in
> > > Java and it's not advisable to introduce new Scala classes, I guess
> > > it's not easy to do so right now. But I think this is definitely a good
> > > improvement we can make in the future.
> > > 4. You're right and we will add a series of Arrow ColumnVectors for
> each
> > > type supported.
> > >
> > > Thanks,
> > > Dian
> > >
> > > > 在 2020年2月5日,下午4:57,Jingsong Li  写道:
> > > >
> > > > Hi Dian,
> > > >
> > > > +1 for this, thanks driving.
> > > > Documentation looks very good. I can imagine a huge performance
> > > > improvement and better integration with other Python libraries.
> > > >
> > > > A few thoughts:
> > > > - About data split: "python.fn-execution.arrow.batch.size", can we
> > unify
> > > it
> > > > with "python.fn-execution.bundle.size"?
> > > > - Use of Apache Arrow as the exchange format: Do you mean Arrow
> support
> > > > zero-copy between Java and Python?
> > > > - ArrowFieldWriter: it seems we could implement it via code
> > > > generation, but it is OK for the initial version to use virtual
> > > > function calls.
> > > > - ColumnarRow for vectorized reading: it seems we need to implement
> > > > ArrowColumnVectors.
> > > >
> > > > Best,
> > > > Jingsong Lee
> > > >
> > > > On Wed, Feb 5, 2020 at 12:45 PM dianfu  wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> Scalar Python UDF has already been supported in the coming release
> > 1.10
> > > >> (FLIP-58[1]). It operates one row at a time. It works in the way
> that
> > > the
> > > >> Java operator serializes one input row to bytes and sends them to
> the
> > > >> Python worker; the Python worker deserializes the input row and
> > > evaluates
> > > >> the Python UDF with it; the result row is serialized and sent back
> to
> > > the
> > > >> Java operator.
> > > >>
> > > >> It suffers from the following problems:
> > > >> 1) High serialization/deserialization overhead
> > > >> 2) It’s difficult to leverage the popular Python libraries used by
> > > >> data scientists, such as Pandas, Numpy, etc., which provide
> > > >> high-performance data structures and functions.
> > > >>
> > > >> Jincheng and I have discussed offline and we want to introduce
> > > vectorized
> > > >> Python UDF to address the above problems. This feature has also been
> > > >> mentioned i

Notice: moving fix version of unresolved issues to 1.11.0/1.10.1

2020-02-09 Thread Yu Li
Hi All,

Since we're about to release 1.10.0 (the vote on RC3 [1] is in good shape
although still not concluded), we plan to change the fix version of the
unresolved issues to 1.11.0/1.10.1. We will start from issues with "Open"
status, and then the "In Progress" ones. If you have any concerns, please
let us know. Thanks.

Best Regards,
Gary & Yu

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-10-0-release-candidate-3-td37405.html


Re: [DISCUSS] Support Python ML Pipeline API

2020-02-09 Thread Hequn Cheng
Hi everyone,

Thanks a lot for your feedback. I have created the FLIP[1].

Best,
Hequn

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP+96%3A+Support+Python+ML+Pipeline+API

On Mon, Feb 10, 2020 at 12:29 PM Dian Fu  wrote:

> Hi Hequn,
>
> Thanks for bringing up the discussion. +1 to this feature. The design LGTM.
> It's great that the Python ML users could use both the Java Pipeline
> Transformer/Estimator/Model classes and the Python
> Pipeline Transformer/Estimator/Model in the same job.
>
> Regards,
> Dian
>
> On Mon, Feb 10, 2020 at 11:08 AM jincheng sun 
> wrote:
>
> > Hi Hequn,
> >
> > Thanks for bringing up this discussion.
> >
> > +1 for adding the Python ML Pipeline API, even though the Java pipeline
> > API may change.
> >
> > I would like to suggest creating a FLIP for these API changes. :)
> >
> > Best,
> > Jincheng
> >
> >
> > Hequn Cheng  于2020年2月5日周三 下午5:24写道:
> >
> > > Hi everyone,
> > >
> > > FLIP-39[1] rebuilds the Flink ML pipeline on top of TableAPI and
> > introduces
> > > a new set of Java APIs. As Python is widely used in ML areas, providing
> > > Python ML Pipeline APIs for Flink can not only make it easier to write
> ML
> > > jobs for Python users but also broaden the adoption of Flink ML.
> > >
> > > Given this, Jincheng and I discussed offline about the support of
> Python
> > ML
> > > Pipeline API and drafted a design doc[2]. We'd like to achieve three
> > goals
> > > for supporting Python Pipeline API:
> > > - Add Python pipeline API according to Java pipeline API(we will adapt
> > the
> > > Python pipeline API if Java pipeline API changes).
> > > - Support native Python Transformer/Estimator/Model, i.e., users can
> > write
> > > not only Python Transformer/Estimator/Model wrappers for calling Java
> > ones
> > > but also can write native Python Transformer/Estimator/Models.
> > > - Ease of use. Support keyword arguments when defining parameters.
> > >
> > > More details can be found in the design doc and we are looking forward
> to
> > > your feedback.
> > >
> > > Best,
> > > Hequn
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs
> > > [2]
> > >
> > >
> >
> https://docs.google.com/document/d/1fwSO5sRNWMoYuvNgfQJUV6N2n2q5UEVA4sezCljKcVQ/edit?usp=sharing
> > >
> >
>


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-09 Thread Congxian Qiu
+1 (non-binding) for rc3

- built the source successfully (including tests)
- ran e2e tests locally
- tested the POJO serializer upgrade manually by running a Flink job

Best,
Congxian


Zhu Zhu  于2020年2月10日周一 下午12:28写道:

> My bad. The missing commit info is caused by building from the src code zip
> which does not contain the git info.
> So this is not a problem.
>
> +1 (binding) for rc3
> Here's what was verified:
>  * built successfully from the source code
>  * ran a sample streaming job and a batch job with parallelism=1000 on a
> yarn cluster, with both the new scheduler and the legacy scheduler; the
> jobs ran well (tuned some resource configs to enable the jobs to work well)
>  * killed TMs to trigger failures; the jobs could finally recover from the
> failures
>
> Thanks,
> Zhu Zhu
>
> Zhu Zhu  于2020年2月10日周一 上午12:31写道:
>
> > The commit info is shown as  on the web UI and in logs.
> > Not sure if it's a common issue or just happens to my build only.
> >
> > Thanks,
> > Zhu Zhu
> >
> > aihua li  于2020年2月9日周日 下午7:42写道:
> >
> >> Yes, but the results you see in the Performance Code Speed Center [3]
> >> skip FLIP-49.
> >>  The results of the default configurations are overwritten by the latest
> >> results.
> >>
> >> > 2020年2月9日 下午5:29,Yu Li  写道:
> >> >
> >> > Thanks for the efforts Aihua! These could definitely improve our RC
> >> test coverage!
> >> >
> >> > Just to confirm, that the stability tests were executed with the same
> >> test suite for Alibaba production usage, and the e2e performance one was
> >> executed with the test suite proposed in FLIP-83 [1] and FLINK-14917
> [2],
> >> and the result could also be observed from our performance code-speed
> >> center [3], right?
> >> >
> >> > Thanks.
> >> >
> >> > Best Regards,
> >> > Yu
> >> >
> >> > [1]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> >> <
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> >> >
> >> > [2] https://issues.apache.org/jira/browse/FLINK-14917 <
> >> https://issues.apache.org/jira/browse/FLINK-14917>
> >> > [3] https://s.apache.org/nglhm 
> >> >
> >> > On Sun, 9 Feb 2020 at 11:20, aihua li  >> liaihua1...@gmail.com>> wrote:
> >> > +1 (non-binging)
> >> >
> >> > I ran stability tests and end-to-end performance tests in branch
> >> release-1.10.0-rc3,both of them passed.
> >> >
> >> > Stability test: it mainly checks that the Flink job can recover from
> >> > various abnormal situations, including full disk,
> >> > network interruption, inability to connect to ZooKeeper, RPC message
> >> > timeout, etc. If a job can't be recovered, the test is considered failed.
> >> > The test passed after running for 5 hours.
> >> >
> >> > End-to-end performance test: it contains the 32 test scenarios
> >> > designed in FLIP-83.
> >> > Test results: performance regresses about 3% from 1.9.1 when using the
> >> > default parameters.
> >> >
> >> > If FLIP-49 is skipped (by adding the parameters
> >> > taskmanager.memory.managed.fraction: 0 and
> >> > taskmanager.memory.flink.size: 1568m in flink-conf.yaml),
> >> > the performance improves about 5% from 1.9.1.
> >> >
> >> > I confirmed with @Xintong Song
> >> > <https://cwiki.apache.org/confluence/display/~xintongsong> that the
> >> > result makes sense.
> >> >
> >> >> 2020年2月8日 上午5:54,Gary Yao mailto:g...@apache.org>>
> >> 写道:
> >> >>
> >> >> Hi everyone,
> >> >> Please review and vote on the release candidate #3 for the version
> >> 1.10.0,
> >> >> as follows:
> >> >> [ ] +1, Approve the release
> >> >> [ ] -1, Do not approve the release (please provide specific comments)
> >> >>
> >> >>
> >> >> The complete staging area is available for your review, which
> includes:
> >> >> * JIRA release notes [1],
> >> >> * the official Apache source release and binary convenience releases
> >> to be
> >> >> deployed to dist.apache.org  [2], which are
> >> signed with the key with
> >> >> fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> >> >> * all artifacts to be deployed to the Maven Central Repository [4],
> >> >> * source code tag "release-1.10.0-rc3" [5],
> >> >> * website pull request listing the new release and adding
> announcement
> >> blog
> >> >> post [6][7].
> >> >>
> >> >> The vote will be open for at least 72 hours. It is adopted by
> majority
> >> >> approval, with at least 3 PMC affirmative votes.
> >> >>
> >> >> Thanks,
> >> >> Yu & Gary
> >> >>
> >> >> [1]
> >> >>
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
> >> <
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
> >> >
> >> >> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc3/ <
> >> https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc3/>
> >> >> [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
> >> https://dist.apache.org/repos/dist/release/flink/KEYS

[jira] [Created] (FLINK-15962) Reduce the default chunk size in netty stack

2020-02-09 Thread zhijiang (Jira)
zhijiang created FLINK-15962:


 Summary: Reduce the default chunk size in netty stack
 Key: FLINK-15962
 URL: https://issues.apache.org/jira/browse/FLINK-15962
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Reporter: zhijiang


The current default chunk size inside the netty stack is 16MB, and one arena 
creates multiple chunks on demand. But a new chunk is not necessarily created 
only after the previous one has been fully exhausted, which can further 
increase the direct memory overhead.

In order to decrease the total memory overhead caused by netty, we can reduce 
the default chunk size, measure the effect via the existing e2e tests, and 
also verify the performance concerns via the benchmarks.

This improvement is orthogonal to the 
[FLINK-10742|https://issues.apache.org/jira/browse/FLINK-10742] 
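
As a hedged illustration of where the 16MB figure comes from: netty's pooled
allocator derives its chunk size as pageSize << maxOrder, and the defaults
(8 KiB pages, maxOrder 11) yield 16 MiB. The class below is a standalone
sketch of that arithmetic, not netty code:

```java
// Sketch of how netty derives its arena chunk size (chunkSize = pageSize << maxOrder).
// Reducing maxOrder is one way to shrink the chunk size, as this issue proposes.
public class ChunkSizeSketch {
    static int chunkSize(int pageSize, int maxOrder) {
        return pageSize << maxOrder;
    }

    public static void main(String[] args) {
        System.out.println(chunkSize(8192, 11)); // default: 16 MiB
        System.out.println(chunkSize(8192, 9));  // reduced maxOrder: 4 MiB
    }
}
```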



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15963) Reuse the same ByteBuf while writing the BufferResponse header

2020-02-09 Thread zhijiang (Jira)
zhijiang created FLINK-15963:


 Summary: Reuse the same ByteBuf while writing the BufferResponse 
header
 Key: FLINK-15963
 URL: https://issues.apache.org/jira/browse/FLINK-15963
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Reporter: zhijiang


On the sender side, while writing a BufferResponse message, a new direct 
ByteBuf is always allocated from the netty allocator to write the header part 
of every message.

Considering that only one message is written to a channel at a time, we can 
use a fixed ByteBuf to write the header part of all the BufferResponse 
messages. We can verify how this affects performance in practice/benchmarks.
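
The reuse pattern can be sketched with java.nio.ByteBuffer standing in for
netty's ByteBuf: one fixed header buffer is cleared and rewritten per message
instead of allocating a fresh buffer each time. The header field layout here
is hypothetical, purely to show the pattern:

```java
import java.nio.ByteBuffer;

public class HeaderReuseSketch {
    private static final int HEADER_LENGTH = 4 + 8; // hypothetical: body length + sequence number

    // One fixed direct buffer, reused for every message's header.
    private final ByteBuffer headerBuf = ByteBuffer.allocateDirect(HEADER_LENGTH);

    ByteBuffer writeHeader(int bodyLength, long sequenceNumber) {
        headerBuf.clear();            // reset position/limit instead of reallocating
        headerBuf.putInt(bodyLength);
        headerBuf.putLong(sequenceNumber);
        headerBuf.flip();             // make the header readable for the send path
        return headerBuf;
    }

    public static void main(String[] args) {
        HeaderReuseSketch s = new HeaderReuseSketch();
        ByteBuffer first = s.writeHeader(1024, 1L);
        ByteBuffer second = s.writeHeader(2048, 2L);
        System.out.println(first == second); // same buffer instance reused
        System.out.println(second.getInt()); // body length of the latest header
    }
}
```

This only works because at most one message's header is being written per
channel at a time, which is exactly the assumption the issue states.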



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[VOTE] FLIP-55: Introduction of a Table API Java Expression DSL

2020-02-09 Thread Dawid Wysakowicz
Hi all,

I wanted to resurrect the thread about introducing a Java Expression
DSL. Please see the updated FLIP page [1]. Most of the FLIP was concluded
in the previous discussion thread. The major changes since then are:

* accepting java.lang.Object in the Java DSL

* adding $ interpolation for a column in the Scala DSL

I think it's important to move these changes forward as they make it
easier to transition to the new type system (the Java parser supports
only the old type system stack for now) that we have been working on for
the past releases.

Because the previous discussion thread was rather conclusive, I want to
start directly with a vote. If you think we need another round of
discussion, feel free to say so.
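
To make the two changes concrete, here is a toy sketch of the builder shape
behind such a DSL. These are NOT Flink's actual classes, just an illustration
of a $(column) reference and a lit(...) literal that accepts java.lang.Object,
composing into an expression tree:

```java
public class ExprDslSketch {
    static final class Expr {
        private final String repr;
        private Expr(String repr) { this.repr = repr; }
        // Each operation wraps its operands into a new node of the tree.
        Expr plus(Expr other) { return new Expr("plus(" + repr + ", " + other.repr + ")"); }
        @Override public String toString() { return repr; }
    }

    // Column reference, analogous to the proposed $ factory.
    static Expr $(String column) { return new Expr("$('" + column + "')"); }

    // Literal accepting java.lang.Object, as the FLIP proposes for the Java DSL.
    static Expr lit(Object value) { return new Expr("lit(" + value + ")"); }

    public static void main(String[] args) {
        System.out.println($("amount").plus(lit(1)));
        // -> plus($('amount'), lit(1))
    }
}
```

The point of such a DSL is that expressions are built as typed Java objects
rather than parsed from strings, so they survive refactoring and need no
string parser.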


The vote will last for at least 72 hours, following the consensus voting
process.

FLIP wiki:

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-55%3A+Introduction+of+a+Table+API+Java+Expression+DSL


Discussion thread:

https://lists.apache.org/thread.html/eb5e7b0579e5f1da1e9bf1ab4e4b86dba737946f0261d94d8c30521e@%3Cdev.flink.apache.org%3E








[jira] [Created] (FLINK-15964) Getting previous stage in notFollowedBy may throw exception

2020-02-09 Thread shuai.xu (Jira)
shuai.xu created FLINK-15964:


 Summary: Getting previous stage in notFollowedBy may throw 
exception
 Key: FLINK-15964
 URL: https://issues.apache.org/jira/browse/FLINK-15964
 Project: Flink
  Issue Type: Bug
  Components: Library / CEP
Affects Versions: 1.9.0
Reporter: shuai.xu


In a notFollowedBy() condition, an exception may be thrown when trying to get 
values from a previous stage for comparison.

For example:

Pattern<Event, Event> pattern = Pattern.<Event>begin("start",
        AfterMatchSkipStrategy.skipPastLastEvent())
    .notFollowedBy("not").where(new IterativeCondition<Event>() {
        private static final long serialVersionUID = -4702359359303151881L;

        @Override
        public boolean filter(Event value, Context<Event> ctx) throws Exception {
            return value.getName().equals(
                ctx.getEventsForPattern("start").iterator().next().getName());
        }
    })
    .followedBy("middle").where(new IterativeCondition<Event>() {
        @Override
        public boolean filter(Event value, Context<Event> ctx) throws Exception {
            return value.getName().equals("b");
        }
    });

with inputs:
Event a = new Event(40, "a", 1.0);
Event b1 = new Event(41, "a", 2.0);
Event b2 = new Event(43, "b", 3.0);

It will throw:

org.apache.flink.util.FlinkRuntimeException: Failure happened in filter function.
 at org.apache.flink.cep.nfa.NFA.findFinalStateAfterProceed(NFA.java:698)
 at org.apache.flink.cep.nfa.NFA.computeNextStates(NFA.java:628)
 at org.apache.flink.cep.nfa.NFA.doProcess(NFA.java:292)
 at org.apache.flink.cep.nfa.NFA.process(NFA.java:228)
 at org.apache.flink.cep.utils.NFATestHarness.consumeRecord(NFATestHarness.java:107)
 at org.apache.flink.cep.utils.NFATestHarness.feedRecord(NFATestHarness.java:84)
 at org.apache.flink.cep.utils.NFATestHarness.feedRecords(NFATestHarness.java:77)
 at org.apache.flink.cep.nfa.NFAITCase.testEndWithNotFollow(NFAITCase.java:2914)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
 at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
 at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
 at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
 at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.util.NoSuchElementException
 at java.util.Collections$EmptyIterator.next(Collections.java:4189)
 at org.apache.flink.cep.nfa.NFAITCase$154.filter(NFAITCase.java:2884)
 at org.apache.flink.cep.nfa.NFAITCase$154.filter(NFAITCase.java:2879)
 at org.apache.flink.cep.nfa.NFA.checkFilterCondition(NFA.java:752)
 at org.apache.flink.cep.nfa.NFA.findFinalStateAfterProceed(NFA.java:688)
 ... 33 more



--
This message was sent by Atlassian Jira
(v8.3.4#803005)