Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Rui Li
+1 (non-binding) for rc3


   - Built from source (release tag)
   - Downloaded binaries and verified Hive catalog/connector by manually
   running some queries against Hive tables


On Mon, Feb 10, 2020 at 2:28 PM Congxian Qiu  wrote:

> +1 (non-binding) for rc3
>
> - built from source successfully (including tests)
> - ran e2e tests locally
> - tested POJO serializer upgrade manually by running a Flink job.
>
> Best,
> Congxian
>
>
> Zhu Zhu  于2020年2月10日周一 下午12:28写道:
>
> > My bad. The missing commit info is caused by building from the src code
> zip
> > which does not contain the git info.
> > So this is not a problem.
> >
> > +1 (binding) for rc3
> > Here's what was verified:
> >  * built successfully from the source code
> >  * run a sample streaming and a batch job with parallelism=1000 on yarn
> > cluster, with the new scheduler and legacy scheduler, the job runs well
> > (tuned some resource configs to enable the jobs to work well)
> >  * killed TMs to trigger failures, the jobs can finally recover from the
> > failures
> >
> > Thanks,
> > Zhu Zhu
> >
> > Zhu Zhu  于2020年2月10日周一 上午12:31写道:
> >
> > > The commit info is shown as  on the web UI and in logs.
> > > Not sure if it's a common issue or just happens to my build only.
> > >
> > > Thanks,
> > > Zhu Zhu
> > >
> > > aihua li  于2020年2月9日周日 下午7:42写道:
> > >
> > >> Yes, but the results you see in the Performance Code Speed Center [3]
> > >> skip FLIP-49.
> > >>  The results of the default configurations are overwritten by the
> latest
> > >> results.
> > >>
> > >> > 2020年2月9日 下午5:29,Yu Li  写道:
> > >> >
> > >> > Thanks for the efforts Aihua! These could definitely improve our RC
> > >> test coverage!
> > >> >
> > >> > Just to confirm, that the stability tests were executed with the
> same
> > >> test suite for Alibaba production usage, and the e2e performance one
> was
> > >> executed with the test suite proposed in FLIP-83 [1] and FLINK-14917
> > [2],
> > >> and the result could also be observed from our performance code-speed
> > >> center [3], right?
> > >> >
> > >> > Thanks.
> > >> >
> > >> > Best Regards,
> > >> > Yu
> > >> >
> > >> > [1]
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> > >> <
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> > >> >
> > >> > [2] https://issues.apache.org/jira/browse/FLINK-14917 <
> > >> https://issues.apache.org/jira/browse/FLINK-14917>
> > >> > [3] https://s.apache.org/nglhm 
> > >> >
> > >> > On Sun, 9 Feb 2020 at 11:20, aihua li   > >> liaihua1...@gmail.com>> wrote:
> > >> > +1 (non-binding)
> > >> >
> > >> > I ran stability tests and end-to-end performance tests on branch
> > >> > release-1.10.0-rc3; both of them passed.
> > >> >
> > >> > Stability test: It mainly checks that the Flink job can recover from
> > >> > various abnormal situations, including disk full, network interruption,
> > >> > ZK unable to connect, RPC message timeout, etc.
> > >> > If the job can't be recovered, the test fails.
> > >> > The test passed after running for 5 hours.
> > >> >
> > >> > End-to-end performance test: It contains 32 test scenarios designed
> > >> > in FLIP-83.
> > >> > Test results: Performance regresses about 3% from 1.9.1 when using
> > >> > default parameters;
> > >> > The result:
> > >> >
> > >> >  If FLIP-49 is skipped (by adding the parameters
> > >> > taskmanager.memory.managed.fraction: 0 and
> > >> > taskmanager.memory.flink.size: 1568m to flink-conf.yaml),
> > >> >  the performance improves about 5% from 1.9.1. The result:
> > >> >
> > >> >
> > >> > I confirmed with @Xintong Song <
> > >> > https://cwiki.apache.org/confluence/display/~xintongsong> that the
> > >> > result makes sense.
> > >> >
> > >> >> 2020年2月8日 上午5:54,Gary Yao mailto:g...@apache.org
> >>
> > >> 写道:
> > >> >>
> > >> >> Hi everyone,
> > >> >> Please review and vote on the release candidate #3 for the version
> > >> 1.10.0,
> > >> >> as follows:
> > >> >> [ ] +1, Approve the release
> > >> >> [ ] -1, Do not approve the release (please provide specific
> comments)
> > >> >>
> > >> >>
> > >> >> The complete staging area is available for your review, which
> > includes:
> > >> >> * JIRA release notes [1],
> > >> >> * the official Apache source release and binary convenience
> releases
> > >> to be
> > >> >> deployed to dist.apache.org  [2], which
> are
> > >> signed with the key with
> > >> >> fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> > >> >> * all artifacts to be deployed to the Maven Central Repository [4],
> > >> >> * source code tag "release-1.10.0-rc3" [5],
> > >> >> * website pull request listing the new release and adding
> > announcement
> > >> blog
> > >> >> post [6][7].
> > >> >>
> > >> >> The vote will be open for at least 72 hours. It is adopted by
> > majority
> > >> >> approval, with at least 3 PMC affirmative votes.
> > >> >>
> > >> >> Thanks,
> > >> >

[jira] [Created] (FLINK-15965) DATE literal issue in static partition spec

2020-02-10 Thread Rui Li (Jira)
Rui Li created FLINK-15965:
--

 Summary: DATE literal issue in static partition spec
 Key: FLINK-15965
 URL: https://issues.apache.org/jira/browse/FLINK-15965
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Reporter: Rui Li






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [Question] Anyone know where I can find performance test result?

2020-02-10 Thread Yu Li
Hi Xu,

Do you mean the flink-benchmarks [1] project for micro benchmarks? If so, you
can find the daily run results in our code-speed center [2].

Please note that the benchmark results are hardware-sensitive, so your
local results may differ from the online ones.

Best Regards,
Yu

[1] https://github.com/dataArtisans/flink-benchmarks
[2] http://codespeed.dak8s.net:8000/


On Mon, 10 Feb 2020 at 10:57, 闫旭  wrote:

> Hi there,
>
> I am just exploring the Apache Flink git repo and found the performance
> tests. I have already tested them on my local machine; I'm wondering if we
> have online results?
>
> Thanks
>
> Regards
>
> Xu Yan


Re: [DISCUSS] Support scalar vectorized Python UDF in PyFlink

2020-02-10 Thread jincheng sun
Hi Jingsong,

Thanks for your feedback! I would like to share my thoughts regarding the
following question:

>> - Can we only configure one parameter and calculate the other
automatically? For example, if we just want to "pipeline", "bundle.size" is
twice as much as "batch.size"; would that work?

I don't think this works. These two configurations serve different purposes
and there is no direct relationship between them, so I don't think we can
infer one configuration from the other.
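
To make the distinction concrete, here is a rough, purely illustrative sketch
in plain Python (the helper names are invented stand-ins, not PyFlink or Beam
APIs) of how the operator could buffer elements:
"python.fn-execution.arrow.batch.size" decides how many rows go into one Arrow
batch handed to the Python worker, while "python.fn-execution.bundle.size"
decides how many elements are processed before the operator blocks and waits
for all outstanding results, which is why neither value can be derived from
the other.

def send_arrow_batch(batch):
    # Stand-in for handing one Arrow batch to the Python worker asynchronously.
    return list(batch)

def wait_for_all(pending_results):
    # Stand-in for blocking until all outstanding batch results have arrived
    # (e.g. at a bundle boundary or a checkpoint barrier).
    return pending_results

def process_elements(elements, arrow_batch_size, bundle_size):
    pending_results = []      # batches already sent, results not yet awaited
    current_batch = []
    elements_in_bundle = 0

    for element in elements:
        current_batch.append(element)
        elements_in_bundle += 1

        # batch.size: flush an Arrow batch without waiting for its result,
        # so execution stays pipelined between batches.
        if len(current_batch) == arrow_batch_size:
            pending_results.append(send_arrow_batch(current_batch))
            current_batch = []

        # bundle.size: synchronize, i.e. wait for every outstanding result.
        if elements_in_bundle == bundle_size:
            if current_batch:  # a batch never crosses a bundle boundary
                pending_results.append(send_arrow_batch(current_batch))
                current_batch = []
            wait_for_all(pending_results)
            pending_results = []
            elements_in_bundle = 0

With this picture, forcing "bundle.size" to be a fixed multiple of
"batch.size" would only tie the two knobs together without removing the need
for both.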

Best,
Jincheng


On Mon, Feb 10, 2020 at 1:53 PM Jingsong Li  wrote:

> Thanks Dian for your reply.
>
> +1 to create a FLIP too.
>
> About "python.fn-execution.bundle.size" and
> "python.fn-execution.arrow.batch.size", I got what are you mean about
> "pipeline". I agree.
> It seems that a batch should always in a bundle. Bundle size should always
> bigger than batch size. (if a batch can not cross bundle).
> Can you explain this relationship to the document?
>
> I think default value is a very important thing, we can discuss:
> - In the batch world, vectorization batch size is about 1024+. What do you
> think about the default value of "batch"?
> - Can we only configure one parameter and calculate the other automatically?
> For example, if we just want to "pipeline", "bundle.size" is twice as much
> as "batch.size"; would that work?
>
> Best,
> Jingsong Lee
>
> On Mon, Feb 10, 2020 at 11:55 AM Hequn Cheng  wrote:
>
> > Hi Dian,
> >
> > Thanks a lot for bringing up the discussion!
> >
> > It is great to see the Pandas UDFs feature is going to be introduced. I
> > think this would improve the performance and also the usability of
> > user-defined functions (UDFs) in Python.
> > One little suggestion: maybe it would be nice if we can add some
> > performance explanation in the document? (I just very curious:))
> >
> > +1 to create a FLIP for this big enhancement.
> >
> > Best,
> > Hequn
> >
> > On Mon, Feb 10, 2020 at 11:15 AM jincheng sun 
> > wrote:
> >
> > > Hi Dian,
> > >
> > > Thanks for bringing up this discussion. This is very important for the
> > > PyFlink ecosystem. Adding Pandas support greatly enriches the available
> > > UDF library of PyFlink and greatly improves the usability of PyFlink!
> > >
> > > +1 for supporting scalar vectorized Python UDFs.
> > >
> > > I think we should create a FLIP for this big enhancement. :)
> > >
> > > What do you think?
> > >
> > > Best,
> > > Jincheng
> > >
> > >
> > >
> > > dianfu  于2020年2月5日周三 下午6:01写道:
> > >
> > > > Hi Jingsong,
> > > >
> > > > Thanks a lot for the valuable feedback.
> > > >
> > > > 1. The configurations "python.fn-execution.bundle.size" and
> > > > "python.fn-execution.arrow.batch.size" are used for separate purposes
> > > and I
> > > > think they are both needed. If they are unified, the Python operator
> > has
> > > to
> > > > wait for the execution results of the previous batch of elements before
> > > > processing the next batch. This means that the Python UDF execution
> can
> > > not
> > > > be pipelined between batches. With separate configuration, there will
> > be
> > > no
> > > > such problems.
> > > > 2. It means that the Java operator will convert input elements to Arrow
> > > > memory format and then send them to the Python worker, and vice versa.
> > > > Regarding the zero-copy benefits provided by Arrow, we gain them
> > > > automatically by using Arrow.
> > > > 3. Good point! As all the classes of the Python module are written in
> > > > Java and it's not suggested to introduce new Scala classes, I guess it's
> > > > not easy to do so right now. But I think this is definitely a good
> > > > improvement we can do in the future.
> > > > 4. You're right and we will add a series of Arrow ColumnVectors for
> > each
> > > > type supported.
> > > >
> > > > Thanks,
> > > > Dian
> > > >
> > > > > 在 2020年2月5日,下午4:57,Jingsong Li  写道:
> > > > >
> > > > > Hi Dian,
> > > > >
> > > > > +1 for this, thanks driving.
> > > > > Documentation looks very good. I can imagine a huge performance
> > > > improvement
> > > > > and better integration to other Python libraries.
> > > > >
> > > > > A few thoughts:
> > > > > - About data split: "python.fn-execution.arrow.batch.size", can we
> > > unify
> > > > it
> > > > > with "python.fn-execution.bundle.size"?
> > > > > - Use of Apache Arrow as the exchange format: Do you mean Arrow
> > > > > supports zero-copy between Java and Python?
> > > > > - It seems we can implement ArrowFieldWriter by code generation, but
> > > > > it is OK for the initial version to use virtual function calls.
> > > > > - For ColumnarRow vectorized reading, it seems we need to implement
> > > > > ArrowColumnVectors.
> > > > >
> > > > > Best,
> > > > > Jingsong Lee
> > > > >
> > > > > On Wed, Feb 5, 2020 at 12:45 PM dianfu  wrote:
> > > > >
> > > > >> Hi all,
> > > > >>
> > > > >> Scalar Python UDF has already been supported in the coming release
> > > 1.10
> > > > >> (FLIP-58[1]). It operates one row at a time. It works in the way

Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Jingsong Li
Hi,


+1 (non-binding) Thanks for driving this, Gary & Yu.


There is an unfriendly error here: "OutOfMemoryError: Direct buffer memory"
in FileChannelBoundedData$FileBufferReader.

It forces our batch users to configure
"taskmanager.memory.task.off-heap.size" in production jobs, and it is hard
for users to know how much memory they need to configure.

Even for us developers, it is hard to say how much memory is needed; it
depends on the tasks left over from the previous stage and the parallelism.


It is not a blocker, but I hope to resolve it in 1.11.


- Verified signatures and checksums

- Maven build from source with tests skipped

- Verified pom files point to the 1.10.0 version

- Tested Hive integration and SQL client: both work well


Best,

Jingsong Lee

On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu  wrote:

> My bad. The missing commit info is caused by building from the src code zip
> which does not contain the git info.
> So this is not a problem.
>
> +1 (binding) for rc3
> Here's what was verified:
>  * built successfully from the source code
>  * run a sample streaming and a batch job with parallelism=1000 on yarn
> cluster, with the new scheduler and legacy scheduler, the job runs well
> (tuned some resource configs to enable the jobs to work well)
>  * killed TMs to trigger failures, the jobs can finally recover from the
> failures
>
> Thanks,
> Zhu Zhu
>
> Zhu Zhu  于2020年2月10日周一 上午12:31写道:
>
> > The commit info is shown as  on the web UI and in logs.
> > Not sure if it's a common issue or just happens to my build only.
> >
> > Thanks,
> > Zhu Zhu
> >
> > aihua li  于2020年2月9日周日 下午7:42写道:
> >
> >> Yes, but the results you see in the Performance Code Speed Center [3]
> >> skip FLIP-49.
> >>  The results of the default configurations are overwritten by the latest
> >> results.
> >>
> >> > 2020年2月9日 下午5:29,Yu Li  写道:
> >> >
> >> > Thanks for the efforts Aihua! These could definitely improve our RC
> >> test coverage!
> >> >
> >> > Just to confirm, that the stability tests were executed with the same
> >> test suite for Alibaba production usage, and the e2e performance one was
> >> executed with the test suite proposed in FLIP-83 [1] and FLINK-14917
> [2],
> >> and the result could also be observed from our performance code-speed
> >> center [3], right?
> >> >
> >> > Thanks.
> >> >
> >> > Best Regards,
> >> > Yu
> >> >
> >> > [1]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> >> <
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> >> >
> >> > [2] https://issues.apache.org/jira/browse/FLINK-14917 <
> >> https://issues.apache.org/jira/browse/FLINK-14917>
> >> > [3] https://s.apache.org/nglhm 
> >> >
> >> > On Sun, 9 Feb 2020 at 11:20, aihua li  >> liaihua1...@gmail.com>> wrote:
> >> > +1 (non-binding)
> >> >
> >> > I ran stability tests and end-to-end performance tests on branch
> >> > release-1.10.0-rc3; both of them passed.
> >> >
> >> > Stability test: It mainly checks that the Flink job can recover from
> >> > various abnormal situations, including disk full, network interruption,
> >> > ZK unable to connect, RPC message timeout, etc.
> >> > If the job can't be recovered, the test fails.
> >> > The test passed after running for 5 hours.
> >> >
> >> > End-to-end performance test: It contains 32 test scenarios designed
> >> > in FLIP-83.
> >> > Test results: Performance regresses about 3% from 1.9.1 when using
> >> > default parameters;
> >> > The result:
> >> >
> >> >  if skips FLIP-49 (add parameters:taskmanager.memory.managed.fraction:
> >> 0,taskmanager.memory.flink.size: 1568m in flink-conf.yaml),
> >> >  the performance improves about 5% from 1.9.1. The result:
> >> >
> >> >
> >> > I confirm it with @Xintong Song <
> >> https://cwiki.apache.org/confluence/display/~xintongsong> that the
> >> result  makes sense.
> >> >
> >> >> 2020年2月8日 上午5:54,Gary Yao mailto:g...@apache.org>>
> >> 写道:
> >> >>
> >> >> Hi everyone,
> >> >> Please review and vote on the release candidate #3 for the version
> >> 1.10.0,
> >> >> as follows:
> >> >> [ ] +1, Approve the release
> >> >> [ ] -1, Do not approve the release (please provide specific comments)
> >> >>
> >> >>
> >> >> The complete staging area is available for your review, which
> includes:
> >> >> * JIRA release notes [1],
> >> >> * the official Apache source release and binary convenience releases
> >> to be
> >> >> deployed to dist.apache.org  [2], which are
> >> signed with the key with
> >> >> fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> >> >> * all artifacts to be deployed to the Maven Central Repository [4],
> >> >> * source code tag "release-1.10.0-rc3" [5],
> >> >> * website pull request listing the new release and adding
> announcement
> >> blog
> >> >> post [6][7].
> >> >>
> >> >> The vote will be open for at least 72 hours. It is adopted by
> majority
> >> >> approval, with at least

Re: [DISCUSS] Support scalar vectorized Python UDF in PyFlink

2020-02-10 Thread Dian Fu
Hi Jincheng, Hequn & Jingsong,

Thanks a lot for your suggestions. I have created FLIP-97[1] for this
feature.

> One little suggestion: maybe it would be nice if we could add some
performance explanation to the document? (I'm just very curious :))
Thanks for the suggestion. I have updated the "Background" section of the
design doc to explain where the performance gains come from.

> It seems that a batch should always be within a bundle. Bundle size should
always be bigger than batch size (if a batch cannot cross bundles).
Can you explain this relationship in the document?
I have updated the design doc to explain more about these two
configurations.

> In the batch world, vectorization batch size is about 1024+. What do you
think about the default value of "batch"?
Is there a link explaining where this value comes from? I have performed a
simple test with a Pandas UDF that performs a simple +1 operation. The
performance is best when the batch size is set to 5000. I think it depends
on the data type of each column, the functionality the Pandas UDF performs,
etc. However, I agree with you that we could give a meaningful default value
for the "batch" size that works in most scenarios.

> Can we only configure one parameter and calculate the other automatically?
For example, if we just want to "pipeline", "bundle.size" is twice as much
as "batch.size"; would that work?
I agree with Jincheng that this is not feasible. I think that giving a
meaningful default value for "batch.size" that works in most scenarios
is enough. What are your thoughts?

Thanks,
Dian

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-97%3A+Support+Scalar+Vectorized+Python+UDF+in+PyFlink


On Mon, Feb 10, 2020 at 4:25 PM jincheng sun 
wrote:

> Hi Jingsong,
>
> Thanks for your feedback! I would like to share my thoughts regarding the
> following question:
>
> >> - Can we only configure one parameter and calculate another
> automatically? For example, if we just want to "pipeline", "bundle.size" is
> twice as much as "batch.size", is this work?
>
> I don't think this works. These two configurations serve different purposes
> and there is no direct relationship between them, so I don't think we can
> infer one configuration from the other.
>
> Best,
> Jincheng
>
>
> Jingsong Li  于2020年2月10日周一 下午1:53写道:
>
> > Thanks Dian for your reply.
> >
> > +1 to create a FLIP too.
> >
> > About "python.fn-execution.bundle.size" and
> > "python.fn-execution.arrow.batch.size", I got what are you mean about
> > "pipeline". I agree.
> > It seems that a batch should always in a bundle. Bundle size should
> always
> > bigger than batch size. (if a batch can not cross bundle).
> > Can you explain this relationship to the document?
> >
> > I think default value is a very important thing, we can discuss:
> > - In the batch world, vectorization batch size is about 1024+. What do
> you
> > think about the default value of "batch"?
> > - Can we only configure one parameter and calculate another
> automatically?
> > For example, if we just want to "pipeline", "bundle.size" is twice as
> much
> > as "batch.size", is this work?
> >
> > Best,
> > Jingsong Lee
> >
> > On Mon, Feb 10, 2020 at 11:55 AM Hequn Cheng  wrote:
> >
> > > Hi Dian,
> > >
> > > Thanks a lot for bringing up the discussion!
> > >
> > > It is great to see the Pandas UDFs feature is going to be introduced. I
> > > think this would improve the performance and also the usability of
> > > user-defined functions (UDFs) in Python.
> > > One little suggestion: maybe it would be nice if we can add some
> > > performance explanation in the document? (I just very curious:))
> > >
> > > +1 to create a FLIP for this big enhancement.
> > >
> > > Best,
> > > Hequn
> > >
> > > On Mon, Feb 10, 2020 at 11:15 AM jincheng sun <
> sunjincheng...@gmail.com>
> > > wrote:
> > >
> > > > Hi Dian,
> > > >
> > > > Thanks for bring up this discussion. This is very important for the
> > > > ecological of PyFlink. Add support Pandas greatly enriches the
> > available
> > > > UDF library of PyFlink and greatly improves the usability of PyFlink!
> > > >
> > > > +1 for Support scalar vectorized Python UDF.
> > > >
> > > > I think we should to create a FLIP for this big enhancements. :)
> > > >
> > > > What do you think?
> > > >
> > > > Best,
> > > > Jincheng
> > > >
> > > >
> > > >
> > > > dianfu  于2020年2月5日周三 下午6:01写道:
> > > >
> > > > > Hi Jingsong,
> > > > >
> > > > > Thanks a lot for the valuable feedback.
> > > > >
> > > > > 1. The configurations "python.fn-execution.bundle.size" and
> > > > > "python.fn-execution.arrow.batch.size" are used for separate
> purposes
> > > > and I
> > > > > think they are both needed. If they are unified, the Python
> operator
> > > has
> > > > to
> > > > > wait the execution results of the previous batch of elements before
> > > > > processing the next batch. This means that the Python UDF execution
> > can
> > > > not
> > > > > be pipelined between batches. With separate configuration, the

Re: [DISCUSS] FLINK-15831: Add Docker image publication to release documentation

2020-02-10 Thread Ufuk Celebi
+1 to have the README as source of truth and link to the repository from
the Wiki page.

– Ufuk


On Mon, Feb 10, 2020 at 3:48 AM Yang Wang  wrote:

> +1 to make the flink-docker repository self-contained, including the
> documentation, and have others refer to it.
>
>
> Best,
> Yang
>
> Till Rohrmann  于2020年2月9日周日 下午5:35写道:
>
> > Sounds good to me Patrick. +1 for these changes.
> >
> > Cheers,
> > Till
> >
> > On Fri, Feb 7, 2020 at 3:25 PM Patrick Lucas 
> > wrote:
> >
> > > Hi all,
> > >
> > > For FLINK-15831[1], I think the way to start is for the flink-docker
> > > repo[2] itself to sufficiently document the workflow for publishing new
> > > Dockerfiles, and then update the Flink release guide in the wiki to
> refer
> > > to this documentation and to include this step in the "Finalize the
> > > release" checklist.
> > >
> > > To the first point, I have opened a PR[3] on flink-docker to improve
> its
> > > documentation.
> > >
> > > And for updating the release guide, I propose the following changes:
> > >
> > > 1. Add a new subsection to "Finalize the release", prior to "Checklist
> to
> > > proceed to the next step" with the following content:
> > >
> > > Publish the Dockerfiles for the new release
> > > >
> > > > Note: the official Dockerfiles fetch the binary distribution of the
> > > target
> > > > Flink version from an Apache mirror. After publishing the binary
> > release
> > > > artifacts, mirrors can take some hours to start serving the new
> > > artifacts,
> > > > so you may want to wait to do this step until you are ready to
> continue
> > > > with the "Promote the release" steps below.
> > > >
> > > > Follow the instructions in the [flink-docker] repo to build the new
> > > > Dockerfiles and send an updated manifest to Docker Hub so the new
> > images
> > > > are built and published.
> > > >
> > >
> > > 2. Add an entry to the "Checklist to proceed to the next step"
> subsection
> > > of "Finalize the release":
> > >
> > > >
> > > >- Dockerfiles in flink-docker updated for the new Flink release
> and
> > > >pull request opened on the Docker official-images with an updated
> > > manifest
> > > >
> > > > Please let me know if you have any questions or suggestions to
> improve
> > > this proposal.
> > >
> > > Thanks,
> > > Patrick
> > >
> > > [1]https://issues.apache.org/jira/browse/FLINK-15831
> > > [2]https://github.com/apache/flink-docker
> > > [3]https://github.com/apache/flink-docker/pull/5
> > >
> >
>


Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

2020-02-10 Thread Yadong Xie
Hi all,

I have drafted the docs of top-level FLIPs for the individual changes
proposed in FLIP-75.
I will update them on the cwiki page and start the voting stage soon if
there are no objections.

   - FLIP-98: Better Back Pressure Detection
   

   - FLIP-99: Make Max Exception Configurable
   

   - FLIP-100: Add Attempt Information
   

   - FLIP-101: Add Pending Slots Tab in Job Detail
   

   - FLIP-102: Add More Metrics to TaskManager
   

   - FLIP-103: Better Taskmanager Log Display
   

   - FLIP-104: Add More Metrics to Jobmanager
   

   - FLIP-105: Better Jobmanager Log Display
   



On Sun, Feb 9, 2020 at 7:24 PM Yadong Xie  wrote:

> Hi Till,
> I got your point; I will create sub-FLIPs and start votes according to
> FLIP-75 and the previous discussion soon.
>
> Till Rohrmann  于2020年2月9日周日 下午5:27写道:
>
>> Hi Yadong,
>>
>> I think it would be fine to simply link to this discussion thread to keep
>> the discussion history. Maybe an easier way would be to create top-level
>> FLIPs for the individual changes proposed in FLIP-75. The reason I'm
>> proposing this is that it would be easier to vote on it and to implement
>> it
>> because the scope is smaller. But maybe I'm wrong here and others could
>> chime in to voice their opinion.
>>
>> Cheers,
>> Till
>>
>> On Fri, Feb 7, 2020 at 9:58 AM Yadong Xie  wrote:
>>
>> > Hi Till
>> >
>> > FLIP-75 has been open since September, and the design doc has been
>> iterated
>> > over 3 versions and more than 20 patches.
>> > I had a try, but it is hard to split the design docs into sub FLIP and
>> keep
>> > all the discussion history at the same time.
>> >
>> > Maybe it is better to start another discussion to talk about the
>> individual
>> > sub FLIP voting? and make the next FLIP follow the new practice if
>> > possible.
>> >
>> > Till Rohrmann  于2020年2月3日周一 下午6:28写道:
>> >
>> > > I think there is no such description because we never did it before. I
>> > just
>> > > figured that FLIP-75 could actually be a good candidate to start this
>> > > practice. We would need a community discussion first, though.
>> > >
>> > > Cheers,
>> > > Till
>> > >
>> > > On Mon, Feb 3, 2020 at 10:28 AM Yadong Xie 
>> wrote:
>> > >
>> > > > Hi Till
>> > > > I didn't find how to create a sub-FLIP at cwiki.apache.org.
>> > > > Do you mean to create 9 more FLIPs instead of FLIP-75?
>> > > >
>> > > > Till Rohrmann  于2020年1月30日周四 下午11:12写道:
>> > > >
>> > > > > Would it be easier if FLIP-75 would be the umbrella FLIP and we
>> would
>> > > > vote
>> > > > > on the individual improvements as sub FLIPs? Decreasing the scope
>> > > should
>> > > > > make things easier.
>> > > > >
>> > > > > Cheers,
>> > > > > Till
>> > > > >
>> > > > > On Thu, Jan 30, 2020 at 2:35 PM Robert Metzger <
>> rmetz...@apache.org>
>> > > > > wrote:
>> > > > >
>> > > > > > Thanks a lot for this work! I believe the web UI is very
>> important,
>> > > in
>> > > > > > particular to new users. I'm very happy to see that you are
>> putting
>> > > > > effort
>> > > > > > into improving the visibility into Flink through the proposed
>> > > changes.
>> > > > > >
>> > > > > > I can not judge if all the changes make total sense, but the
>> > > discussion
>> > > > > has
>> > > > > > been open since September, and a good number of people have
>> > commented
>> > > > in
>> > > > > > the document.
>> > > > > > I wonder if we can move this FLIP to the VOTing stage?
>> > > > > >
>> > > > > > On Wed, Jan 22, 2020 at 6:27 PM Till Rohrmann <
>> > trohrm...@apache.org>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Thanks for the update Yadong. Big +1 for the proposed
>> > improvements
>> > > > for
>> > > > > > > Flink's web UI. I think they will be super helpful for our
>> users.
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > > Till
>> > > > > > >
>> > > > > > > On Tue, Jan 7, 2020 at 10:00 AM Yadong Xie <
>> vthink...@gmail.com>
>> > > > > wrote:
>> > > > > > >
>> > > > > > > > Hi everyone
>> > > > > > > >
>> > > > > > > > We have spent some time updating the documentation since the
>> > last
>> > > > > > > > discussion.
>> > > > > > > >
>> > > > > > > > In short, the latest FLIP-75 contains the following
>> > > > > proposal(including
>> > > > > > > both
>> > > > > > > > frontend and 

Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Yang Wang
 +1 non-binding


- Building from source with all tests skipped
- Build a custom image with 1.10-rc3
- K8s tests
* Deploy a standalone session cluster on K8s and submit multiple jobs
* Deploy a standalone per-job cluster
* Deploy a native session cluster on K8s with/without HA configured,
kill TM and jobs could recover successfully


Best,
Yang

On Mon, Feb 10, 2020 at 4:29 PM Jingsong Li  wrote:

> Hi,
>
>
> +1 (non-binding) Thanks for driving this, Gary & Yu.
>
>
> There is an unfriendly error here: "OutOfMemoryError: Direct buffer memory"
> in FileChannelBoundedData$FileBufferReader.
>
> It forces our batch users to configure
> "taskmanager.memory.task.off-heap.size" in production jobs. And users are
> hard to know how much memory they need configure.
>
> Even for us developers, it is hard to say how much memory, it depends on
> tasks left over from the previous stage and the parallelism.
>
>
> It is not a blocker, but hope to resolve it in 1.11.
>
>
> - Verified signatures and checksums
>
> - Maven build from source skip tests
>
> - Verified pom files point to the 1.10.0 version
>
> - Test Hive integration and SQL client: work well
>
>
> Best,
>
> Jingsong Lee
>
> On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu  wrote:
>
> > My bad. The missing commit info is caused by building from the src code
> zip
> > which does not contain the git info.
> > So this is not a problem.
> >
> > +1 (binding) for rc3
> > Here's what was verified:
> >  * built successfully from the source code
> >  * run a sample streaming and a batch job with parallelism=1000 on yarn
> > cluster, with the new scheduler and legacy scheduler, the job runs well
> > (tuned some resource configs to enable the jobs to work well)
> >  * killed TMs to trigger failures, the jobs can finally recover from the
> > failures
> >
> > Thanks,
> > Zhu Zhu
> >
> > Zhu Zhu  于2020年2月10日周一 上午12:31写道:
> >
> > > The commit info is shown as  on the web UI and in logs.
> > > Not sure if it's a common issue or just happens to my build only.
> > >
> > > Thanks,
> > > Zhu Zhu
> > >
> > > aihua li  于2020年2月9日周日 下午7:42写道:
> > >
> > >> Yes, but the results you see in the Performance Code Speed Center [3]
> > >> skip FLIP-49.
> > >>  The results of the default configurations are overwritten by the
> latest
> > >> results.
> > >>
> > >> > 2020年2月9日 下午5:29,Yu Li  写道:
> > >> >
> > >> > Thanks for the efforts Aihua! These could definitely improve our RC
> > >> test coverage!
> > >> >
> > >> > Just to confirm, that the stability tests were executed with the
> same
> > >> test suite for Alibaba production usage, and the e2e performance one
> was
> > >> executed with the test suite proposed in FLIP-83 [1] and FLINK-14917
> > [2],
> > >> and the result could also be observed from our performance code-speed
> > >> center [3], right?
> > >> >
> > >> > Thanks.
> > >> >
> > >> > Best Regards,
> > >> > Yu
> > >> >
> > >> > [1]
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> > >> <
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> > >> >
> > >> > [2] https://issues.apache.org/jira/browse/FLINK-14917 <
> > >> https://issues.apache.org/jira/browse/FLINK-14917>
> > >> > [3] https://s.apache.org/nglhm 
> > >> >
> > >> > On Sun, 9 Feb 2020 at 11:20, aihua li   > >> liaihua1...@gmail.com>> wrote:
> > >> > +1 (non-binding)
> > >> >
> > >> > I ran stability tests and end-to-end performance tests on branch
> > >> > release-1.10.0-rc3; both of them passed.
> > >> >
> > >> > Stability test: It mainly checks that the Flink job can recover from
> > >> > various abnormal situations, including disk full, network interruption,
> > >> > ZK unable to connect, RPC message timeout, etc.
> > >> > If the job can't be recovered, the test fails.
> > >> > The test passed after running for 5 hours.
> > >> >
> > >> > End-to-end performance test: It contains 32 test scenarios designed
> > >> > in FLIP-83.
> > >> > Test results: Performance regresses about 3% from 1.9.1 when using
> > >> > default parameters;
> > >> > The result:
> > >> >
> > >> >  if skips FLIP-49 (add
> parameters:taskmanager.memory.managed.fraction:
> > >> 0,taskmanager.memory.flink.size: 1568m in flink-conf.yaml),
> > >> >  the performance improves about 5% from 1.9.1. The result:
> > >> >
> > >> >
> > >> > I confirm it with @Xintong Song <
> > >> https://cwiki.apache.org/confluence/display/~xintongsong> that the
> > >> result  makes sense.
> > >> >
> > >> >> 2020年2月8日 上午5:54,Gary Yao mailto:g...@apache.org
> >>
> > >> 写道:
> > >> >>
> > >> >> Hi everyone,
> > >> >> Please review and vote on the release candidate #3 for the version
> > >> 1.10.0,
> > >> >> as follows:
> > >> >> [ ] +1, Approve the release
> > >> >> [ ] -1, Do not approve the release (please provide specific
> comments)
> > >> >>
> > >> >>
> > >> >> The complete staging area is available for your review, w

[VOTE] Release Flink Python API(PyFlink) 1.9.2 to PyPI, release candidate #1

2020-02-10 Thread jincheng sun
Hi everyone,

Please review and vote on the release candidate #1 for the PyFlink version
1.9.2, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:

* the official Apache binary convenience releases to be deployed to
dist.apache.org [1], which are signed with the key with fingerprint
8FEA1EE9D0048C0CCC70B7573211B0703B79EA0E [2] and built from source code [3].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Jincheng

[1] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.2-rc1/
[2] https://dist.apache.org/repos/dist/release/flink/KEYS
[3] https://github.com/apache/flink/tree/release-1.9.2
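
For anyone who wants a quick local sanity check of the Python artifacts,
something like the following could be used (a hedged sketch: install the
downloaded sdist/wheel with pip first; the module paths below are the ones
used in the PyFlink 1.9 examples and may need adjusting):

# Assumes the RC artifact from [1] has been installed locally, e.g.:
#   pip install <path-to-downloaded-apache-flink-sdist-or-wheel>
from pyflink.dataset import ExecutionEnvironment
from pyflink.table import BatchTableEnvironment, TableConfig

exec_env = ExecutionEnvironment.get_execution_environment()
t_env = BatchTableEnvironment.create(exec_env, TableConfig())
print("PyFlink imports and TableEnvironment creation work")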


Re: [DISCUSS] FLINK-15831: Add Docker image publication to release documentation

2020-02-10 Thread Patrick Lucas
Thanks for the feedback.

Could someone (Ufuk or Till?) grant me access to the FLINK space in
Confluence so I can make these changes? My Confluence username is plucas.

Thanks,
Patrick

On Mon, Feb 10, 2020 at 9:54 AM Ufuk Celebi  wrote:

> +1 to have the README as source of truth and link to the repository from
> the Wiki page.
>
> – Ufuk
>
>
> On Mon, Feb 10, 2020 at 3:48 AM Yang Wang  wrote:
>
>> +1 to make flink-docker repository self-contained, including the document.
>> And others refer
>> to it.
>>
>>
>> Best,
>> Yang
>>
>> Till Rohrmann  于2020年2月9日周日 下午5:35写道:
>>
>> > Sounds good to me Patrick. +1 for these changes.
>> >
>> > Cheers,
>> > Till
>> >
>> > On Fri, Feb 7, 2020 at 3:25 PM Patrick Lucas 
>> > wrote:
>> >
>> > > Hi all,
>> > >
>> > > For FLINK-15831[1], I think the way to start is for the flink-docker
>> > > repo[2] itself to sufficiently document the workflow for publishing
>> new
>> > > Dockerfiles, and then update the Flink release guide in the wiki to
>> refer
>> > > to this documentation and to include this step in the "Finalize the
>> > > release" checklist.
>> > >
>> > > To the first point, I have opened a PR[3] on flink-docker to improve
>> its
>> > > documentation.
>> > >
>> > > And for updating the release guide, I propose the following changes:
>> > >
>> > > 1. Add a new subsection to "Finalize the release", prior to
>> "Checklist to
>> > > proceed to the next step" with the following content:
>> > >
>> > > Publish the Dockerfiles for the new release
>> > > >
>> > > > Note: the official Dockerfiles fetch the binary distribution of the
>> > > target
>> > > > Flink version from an Apache mirror. After publishing the binary
>> > release
>> > > > artifacts, mirrors can take some hours to start serving the new
>> > > artifacts,
>> > > > so you may want to wait to do this step until you are ready to
>> continue
>> > > > with the "Promote the release" steps below.
>> > > >
>> > > > Follow the instructions in the [flink-docker] repo to build the new
>> > > > Dockerfiles and send an updated manifest to Docker Hub so the new
>> > images
>> > > > are built and published.
>> > > >
>> > >
>> > > 2. Add an entry to the "Checklist to proceed to the next step"
>> subsection
>> > > of "Finalize the release":
>> > >
>> > > >
>> > > >- Dockerfiles in flink-docker updated for the new Flink release
>> and
>> > > >pull request opened on the Docker official-images with an updated
>> > > manifest
>> > > >
>> > > > Please let me know if you have any questions or suggestions to
>> improve
>> > > this proposal.
>> > >
>> > > Thanks,
>> > > Patrick
>> > >
>> > > [1]https://issues.apache.org/jira/browse/FLINK-15831
>> > > [2]https://github.com/apache/flink-docker
>> > > [3]https://github.com/apache/flink-docker/pull/5
>> > >
>> >
>>
>


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Piotr Nowojski
+1 non-binding

I’ve re-done the manual EMR tests, including running some examples.

Piotrek

> On 10 Feb 2020, at 11:37, Yang Wang  wrote:
> 
> +1 non-binding
> 
> 
> - Building from source with all tests skipped
> - Build a custom image with 1.10-rc3
> - K8s tests
>* Deploy a standalone session cluster on K8s and submit multiple jobs
>* Deploy a standalone per-job cluster
>* Deploy a native session cluster on K8s with/without HA configured,
> kill TM and jobs could recover successfully
> 
> 
> Best,
> Yang
> 
> Jingsong Li  于2020年2月10日周一 下午4:29写道:
> 
>> Hi,
>> 
>> 
>> +1 (non-binding) Thanks for driving this, Gary & Yu.
>> 
>> 
>> There is an unfriendly error here: "OutOfMemoryError: Direct buffer memory"
>> in FileChannelBoundedData$FileBufferReader.
>> 
>> It forces our batch users to configure
>> "taskmanager.memory.task.off-heap.size" in production jobs. And users are
>> hard to know how much memory they need configure.
>> 
>> Even for us developers, it is hard to say how much memory, it depends on
>> tasks left over from the previous stage and the parallelism.
>> 
>> 
>> It is not a blocker, but hope to resolve it in 1.11.
>> 
>> 
>> - Verified signatures and checksums
>> 
>> - Maven build from source skip tests
>> 
>> - Verified pom files point to the 1.10.0 version
>> 
>> - Test Hive integration and SQL client: work well
>> 
>> 
>> Best,
>> 
>> Jingsong Lee
>> 
>> On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu  wrote:
>> 
>>> My bad. The missing commit info is caused by building from the src code
>> zip
>>> which does not contain the git info.
>>> So this is not a problem.
>>> 
>>> +1 (binding) for rc3
>>> Here's what was verified:
>>> * built successfully from the source code
>>> * run a sample streaming and a batch job with parallelism=1000 on yarn
>>> cluster, with the new scheduler and legacy scheduler, the job runs well
>>> (tuned some resource configs to enable the jobs to work well)
>>> * killed TMs to trigger failures, the jobs can finally recover from the
>>> failures
>>> 
>>> Thanks,
>>> Zhu Zhu
>>> 
>>> Zhu Zhu  于2020年2月10日周一 上午12:31写道:
>>> 
 The commit info is shown as  on the web UI and in logs.
 Not sure if it's a common issue or just happens to my build only.
 
 Thanks,
 Zhu Zhu
 
 aihua li  于2020年2月9日周日 下午7:42写道:
 
> Yes, but the results you see in the Performance Code Speed Center [3]
> skip FLIP-49.
> The results of the default configurations are overwritten by the
>> latest
> results.
> 
>> 2020年2月9日 下午5:29,Yu Li  写道:
>> 
>> Thanks for the efforts Aihua! These could definitely improve our RC
> test coverage!
>> 
>> Just to confirm, that the stability tests were executed with the
>> same
> test suite for Alibaba production usage, and the e2e performance one
>> was
> executed with the test suite proposed in FLIP-83 [1] and FLINK-14917
>>> [2],
> and the result could also be observed from our performance code-speed
> center [3], right?
>> 
>> Thanks.
>> 
>> Best Regards,
>> Yu
>> 
>> [1]
> 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> <
> 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
>> 
>> [2] https://issues.apache.org/jira/browse/FLINK-14917 <
> https://issues.apache.org/jira/browse/FLINK-14917>
>> [3] https://s.apache.org/nglhm 
>> 
>> On Sun, 9 Feb 2020 at 11:20, aihua li >  liaihua1...@gmail.com>> wrote:
>> +1 (non-binding)
>>
>> I ran stability tests and end-to-end performance tests on branch
>> release-1.10.0-rc3; both of them passed.
>>
>> Stability test: It mainly checks that the Flink job can recover from
>> various abnormal situations, including disk full, network interruption,
>> ZK unable to connect, RPC message timeout, etc.
>> If the job can't be recovered, the test fails.
>> The test passed after running for 5 hours.
>>
>> End-to-end performance test: It contains 32 test scenarios designed
>> in FLIP-83.
>> Test results: Performance regresses about 3% from 1.9.1 when using
>> default parameters;
>> The result:
>> 
>> if skips FLIP-49 (add
>> parameters:taskmanager.memory.managed.fraction:
> 0,taskmanager.memory.flink.size: 1568m in flink-conf.yaml),
>> the performance improves about 5% from 1.9.1. The result:
>> 
>> 
>> I confirm it with @Xintong Song <
> https://cwiki.apache.org/confluence/display/~xintongsong> that the
> result  makes sense.
>> 
>>> 2020年2月8日 上午5:54,Gary Yao mailto:g...@apache.org
 
> 写道:
>>> 
>>> Hi everyone,
>>> Please review and vote on the release candidate #3 for the version
> 1.10.0,
>>> as follows:
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not appr

Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Kurt Young
+1 (binding)

- verified signatures and checksums
- started a local cluster, ran some examples, randomly played with some SQL
via the SQL client; no suspicious error/warn logs found in the log files
- repeated the above operations with both the Scala 2.11 and 2.12 binaries

Best,
Kurt


On Mon, Feb 10, 2020 at 6:38 PM Yang Wang  wrote:

>  +1 non-binding
>
>
> - Building from source with all tests skipped
> - Build a custom image with 1.10-rc3
> - K8s tests
> * Deploy a standalone session cluster on K8s and submit multiple jobs
> * Deploy a standalone per-job cluster
> * Deploy a native session cluster on K8s with/without HA configured,
> kill TM and jobs could recover successfully
>
>
> Best,
> Yang
>
> Jingsong Li  于2020年2月10日周一 下午4:29写道:
>
> > Hi,
> >
> >
> > +1 (non-binding) Thanks for driving this, Gary & Yu.
> >
> >
> > There is an unfriendly error here: "OutOfMemoryError: Direct buffer
> memory"
> > in FileChannelBoundedData$FileBufferReader.
> >
> > It forces our batch users to configure
> > "taskmanager.memory.task.off-heap.size" in production jobs. And users are
> > hard to know how much memory they need configure.
> >
> > Even for us developers, it is hard to say how much memory, it depends on
> > tasks left over from the previous stage and the parallelism.
> >
> >
> > It is not a blocker, but hope to resolve it in 1.11.
> >
> >
> > - Verified signatures and checksums
> >
> > - Maven build from source skip tests
> >
> > - Verified pom files point to the 1.10.0 version
> >
> > - Test Hive integration and SQL client: work well
> >
> >
> > Best,
> >
> > Jingsong Lee
> >
> > On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu  wrote:
> >
> > > My bad. The missing commit info is caused by building from the src code
> > zip
> > > which does not contain the git info.
> > > So this is not a problem.
> > >
> > > +1 (binding) for rc3
> > > Here's what was verified:
> > >  * built successfully from the source code
> > >  * run a sample streaming and a batch job with parallelism=1000 on yarn
> > > cluster, with the new scheduler and legacy scheduler, the job runs well
> > > (tuned some resource configs to enable the jobs to work well)
> > >  * killed TMs to trigger failures, the jobs can finally recover from
> the
> > > failures
> > >
> > > Thanks,
> > > Zhu Zhu
> > >
> > > Zhu Zhu  于2020年2月10日周一 上午12:31写道:
> > >
> > > > The commit info is shown as  on the web UI and in logs.
> > > > Not sure if it's a common issue or just happens to my build only.
> > > >
> > > > Thanks,
> > > > Zhu Zhu
> > > >
> > > > aihua li  于2020年2月9日周日 下午7:42写道:
> > > >
> > > >> Yes, but the results you see in the Performance Code Speed Center
> [3]
> > > >> skip FLIP-49.
> > > >>  The results of the default configurations are overwritten by the
> > latest
> > > >> results.
> > > >>
> > > >> > 2020年2月9日 下午5:29,Yu Li  写道:
> > > >> >
> > > >> > Thanks for the efforts Aihua! These could definitely improve our
> RC
> > > >> test coverage!
> > > >> >
> > > >> > Just to confirm, that the stability tests were executed with the
> > same
> > > >> test suite for Alibaba production usage, and the e2e performance one
> > was
> > > >> executed with the test suite proposed in FLIP-83 [1] and FLINK-14917
> > > [2],
> > > >> and the result could also be observed from our performance
> code-speed
> > > >> center [3], right?
> > > >> >
> > > >> > Thanks.
> > > >> >
> > > >> > Best Regards,
> > > >> > Yu
> > > >> >
> > > >> > [1]
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> > > >> <
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> > > >> >
> > > >> > [2] https://issues.apache.org/jira/browse/FLINK-14917 <
> > > >> https://issues.apache.org/jira/browse/FLINK-14917>
> > > >> > [3] https://s.apache.org/nglhm 
> > > >> >
> > > >> > On Sun, 9 Feb 2020 at 11:20, aihua li  >  > > >> liaihua1...@gmail.com>> wrote:
> > > >> > +1 (non-binding)
> > > >> >
> > > >> > I ran stability tests and end-to-end performance tests on branch
> > > >> > release-1.10.0-rc3; both of them passed.
> > > >> >
> > > >> > Stability test: It mainly checks that the Flink job can recover from
> > > >> > various abnormal situations, including disk full, network interruption,
> > > >> > ZK unable to connect, RPC message timeout, etc.
> > > >> > If the job can't be recovered, the test fails.
> > > >> > The test passed after running for 5 hours.
> > > >> >
> > > >> > End-to-end performance test: It contains 32 test scenarios designed
> > > >> > in FLIP-83.
> > > >> > Test results: Performance regresses about 3% from 1.9.1 when using
> > > >> > default parameters;
> > > >> > The result:
> > > >> >
> > > >> >  if skips FLIP-49 (add
> > parameters:taskmanager.memory.managed.fraction:
> > > >> 0,taskmanager.memory.flink.size: 1568m in flink-conf.yaml),
> > > >> >  the performance improves about 5% from 1.9.1. The result:

Re: [DISCUSS] FLINK-15831: Add Docker image publication to release documentation

2020-02-10 Thread Chesnay Schepler

@Patrick You should have the required wiki permissions now.

On 10/02/2020 12:12, Patrick Lucas wrote:

Thanks for the feedback.

Could someone (Ufuk or Till?) grant me access to the FLINK space in
Confluence so I can make these changes? My Confluence username is plucas.

Thanks,
Patrick

On Mon, Feb 10, 2020 at 9:54 AM Ufuk Celebi  wrote:


+1 to have the README as source of truth and link to the repository from
the Wiki page.

– Ufuk


On Mon, Feb 10, 2020 at 3:48 AM Yang Wang  wrote:


+1 to make flink-docker repository self-contained, including the document.
And others refer
to it.


Best,
Yang

Till Rohrmann  于2020年2月9日周日 下午5:35写道:


Sounds good to me Patrick. +1 for these changes.

Cheers,
Till

On Fri, Feb 7, 2020 at 3:25 PM Patrick Lucas 
wrote:


Hi all,

For FLINK-15831[1], I think the way to start is for the flink-docker
repo[2] itself to sufficiently document the workflow for publishing

new

Dockerfiles, and then update the Flink release guide in the wiki to

refer

to this documentation and to include this step in the "Finalize the
release" checklist.

To the first point, I have opened a PR[3] on flink-docker to improve

its

documentation.

And for updating the release guide, I propose the following changes:

1. Add a new subsection to "Finalize the release", prior to

"Checklist to

proceed to the next step" with the following content:

Publish the Dockerfiles for the new release

Note: the official Dockerfiles fetch the binary distribution of the

target

Flink version from an Apache mirror. After publishing the binary

release

artifacts, mirrors can take some hours to start serving the new

artifacts,

so you may want to wait to do this step until you are ready to

continue

with the "Promote the release" steps below.

Follow the instructions in the [flink-docker] repo to build the new
Dockerfiles and send an updated manifest to Docker Hub so the new

images

are built and published.


2. Add an entry to the "Checklist to proceed to the next step"

subsection

of "Finalize the release":


- Dockerfiles in flink-docker updated for the new Flink release

and

pull request opened on the Docker official-images with an updated

manifest

Please let me know if you have any questions or suggestions to

improve

this proposal.

Thanks,
Patrick

[1]https://issues.apache.org/jira/browse/FLINK-15831
[2]https://github.com/apache/flink-docker
[3]https://github.com/apache/flink-docker/pull/5





[jira] [Created] (FLINK-15966) Capture the call stack of RPC ask() calls.

2020-02-10 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-15966:


 Summary: Capture the call stack of RPC ask() calls.
 Key: FLINK-15966
 URL: https://issues.apache.org/jira/browse/FLINK-15966
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.10.1, 1.11.0


Currently, when an RPC ask() call fails, we get a rather unhelpful exception
with a stack trace from Akka's internal scheduler.

Instead, we should capture the call stack during the invocation and use it to
give a helpful error message when the RPC call fails. This is especially
helpful in cases where the future (and future handlers) are passed on for
later asynchronous result handling, which is the common case in most
components (JM / TM / RM).

The option should have a flag to turn it off, because with a lot of
concurrent ask() calls (hundreds of thousands, during large deploy phases), it
may be possible that the captured call.
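
The actual change would live in Flink's Java/Akka RPC layer; purely as an
illustration of the idea (the names and APIs below are invented for this
sketch and are not Flink code), capturing the caller's stack when the ask is
issued and attaching it to the eventual failure could look roughly like this:

import asyncio
import traceback

class RpcAskError(Exception):
    """Failure of an ask(), enriched with the call stack captured at invocation."""

async def ask(remote_call, capture_call_stack=True):
    # Capture the caller's stack now, while it still points at the call site;
    # the flag allows turning this off for very large numbers of concurrent asks.
    call_stack = "".join(traceback.format_stack()[:-1]) if capture_call_stack else None
    try:
        return await remote_call()
    except Exception as cause:
        message = "RPC ask() failed"
        if call_stack is not None:
            message += "; the ask() was issued at:\n" + call_stack
        raise RpcAskError(message) from cause

async def failing_remote():
    raise TimeoutError("no response from remote endpoint")

async def main():
    try:
        await ask(failing_remote)
    except RpcAskError as error:
        print(error)

asyncio.run(main())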



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Jark Wu
+1 (binding)

- built the source release with Scala 2.12 and Scala 2.11 successfully
- checked/verified signatures and hashes
- started clusters for both Scala 2.11 and 2.12, ran examples, verified web
UI and log output, nothing unexpected
- started a cluster and ran some e2e SQL queries; all of them work well and
the results are as expected:
  - read from kafka source, aggregate, write into mysql
  - read from kafka source with watermark defined in ddl, window aggregate,
write into mysql
  - read from kafka with computed column defined in ddl, temporal join with
a mysql table, write into kafka

Cheers,
Jark


On Mon, 10 Feb 2020 at 19:23, Kurt Young  wrote:

> +1 (binding)
>
> - verified signatures and checksums
> - start local cluster, run some examples, randomly play some sql with sql
> client, no suspicious error/warn log found in log files
> - repeat above operation with both scala 2.11 and 2.12 binary
>
> Best,
> Kurt
>
>
> On Mon, Feb 10, 2020 at 6:38 PM Yang Wang  wrote:
>
> >  +1 non-binding
> >
> >
> > - Building from source with all tests skipped
> > - Build a custom image with 1.10-rc3
> > - K8s tests
> > * Deploy a standalone session cluster on K8s and submit multiple jobs
> > * Deploy a standalone per-job cluster
> > * Deploy a native session cluster on K8s with/without HA configured,
> > kill TM and jobs could recover successfully
> >
> >
> > Best,
> > Yang
> >
> > Jingsong Li  于2020年2月10日周一 下午4:29写道:
> >
> > > Hi,
> > >
> > >
> > > +1 (non-binding) Thanks for driving this, Gary & Yu.
> > >
> > >
> > > There is an unfriendly error here: "OutOfMemoryError: Direct buffer
> > memory"
> > > in FileChannelBoundedData$FileBufferReader.
> > >
> > > It forces our batch users to configure
> > > "taskmanager.memory.task.off-heap.size" in production jobs. And users
> are
> > > hard to know how much memory they need configure.
> > >
> > > Even for us developers, it is hard to say how much memory, it depends
> on
> > > tasks left over from the previous stage and the parallelism.
> > >
> > >
> > > It is not a blocker, but hope to resolve it in 1.11.
> > >
> > >
> > > - Verified signatures and checksums
> > >
> > > - Maven build from source skip tests
> > >
> > > - Verified pom files point to the 1.10.0 version
> > >
> > > - Test Hive integration and SQL client: work well
> > >
> > >
> > > Best,
> > >
> > > Jingsong Lee
> > >
> > > On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu  wrote:
> > >
> > > > My bad. The missing commit info is caused by building from the src
> code
> > > zip
> > > > which does not contain the git info.
> > > > So this is not a problem.
> > > >
> > > > +1 (binding) for rc3
> > > > Here's what was verified:
> > > >  * built successfully from the source code
> > > >  * run a sample streaming and a batch job with parallelism=1000 on
> yarn
> > > > cluster, with the new scheduler and legacy scheduler, the job runs
> well
> > > > (tuned some resource configs to enable the jobs to work well)
> > > >  * killed TMs to trigger failures, the jobs can finally recover from
> > the
> > > > failures
> > > >
> > > > Thanks,
> > > > Zhu Zhu
> > > >
> > > > Zhu Zhu  于2020年2月10日周一 上午12:31写道:
> > > >
> > > > > The commit info is shown as  on the web UI and in logs.
> > > > > Not sure if it's a common issue or just happens to my build only.
> > > > >
> > > > > Thanks,
> > > > > Zhu Zhu
> > > > >
> > > > > aihua li  于2020年2月9日周日 下午7:42写道:
> > > > >
> > > > >> Yes, but the results you see in the Performance Code Speed Center
> > [3]
> > > > >> skip FLIP-49.
> > > > >>  The results of the default configurations are overwritten by the
> > > latest
> > > > >> results.
> > > > >>
> > > > >> > 2020年2月9日 下午5:29,Yu Li  写道:
> > > > >> >
> > > > >> > Thanks for the efforts Aihua! These could definitely improve our
> > RC
> > > > >> test coverage!
> > > > >> >
> > > > >> > Just to confirm, that the stability tests were executed with the
> > > same
> > > > >> test suite for Alibaba production usage, and the e2e performance
> one
> > > was
> > > > >> executed with the test suite proposed in FLIP-83 [1] and
> FLINK-14917
> > > > [2],
> > > > >> and the result could also be observed from our performance
> > code-speed
> > > > >> center [3], right?
> > > > >> >
> > > > >> > Thanks.
> > > > >> >
> > > > >> > Best Regards,
> > > > >> > Yu
> > > > >> >
> > > > >> > [1]
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> > > > >> <
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework
> > > > >> >
> > > > >> > [2] https://issues.apache.org/jira/browse/FLINK-14917 <
> > > > >> https://issues.apache.org/jira/browse/FLINK-14917>
> > > > >> > [3] https://s.apache.org/nglhm 
> > > > >> >
> > > > >> > On Sun, 9 Feb 2020 at 11:20, aihua li  > >  > > > >> liaihua1...@gmail.com>> wrote:
> > > > >> > +1 (non-binding)
> > > > >>

[jira] [Created] (FLINK-15967) Examples use nightly kafka connector

2020-02-10 Thread Zili Chen (Jira)
Zili Chen created FLINK-15967:
-

 Summary: Examples use nightly kafka connector
 Key: FLINK-15967
 URL: https://issues.apache.org/jira/browse/FLINK-15967
 Project: Flink
  Issue Type: Improvement
  Components: Examples
Reporter: Zili Chen
Assignee: Zili Chen
 Fix For: 1.11.0


{{StateMachineExample}} & {{KafkaEventsGeneratorJob}} still use the kafka010
connector; it would be an improvement to show the ability of the nightly
connector version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLINK-15831: Add Docker image publication to release documentation

2020-02-10 Thread Patrick Lucas
Thanks, Chesnay.

I've updated the release guide with the new step.

--
Patrick

On Mon, Feb 10, 2020 at 12:34 PM Chesnay Schepler 
wrote:

> @Patrick You should have the required wiki permissions now.
>
> On 10/02/2020 12:12, Patrick Lucas wrote:
> > Thanks for the feedback.
> >
> > Could someone (Ufuk or Till?) grant me access to the FLINK space in
> > Confluence so I can make these changes? My Confluence username is plucas.
> >
> > Thanks,
> > Patrick
> >
> > On Mon, Feb 10, 2020 at 9:54 AM Ufuk Celebi  wrote:
> >
> >> +1 to have the README as source of truth and link to the repository from
> >> the Wiki page.
> >>
> >> – Ufuk
> >>
> >>
> >> On Mon, Feb 10, 2020 at 3:48 AM Yang Wang 
> wrote:
> >>
> >>> +1 to make flink-docker repository self-contained, including the
> document.
> >>> And others refer
> >>> to it.
> >>>
> >>>
> >>> Best,
> >>> Yang
> >>>
> >>> Till Rohrmann  于2020年2月9日周日 下午5:35写道:
> >>>
>  Sounds good to me Patrick. +1 for these changes.
> 
>  Cheers,
>  Till
> 
>  On Fri, Feb 7, 2020 at 3:25 PM Patrick Lucas 
>  wrote:
> 
> > Hi all,
> >
> > For FLINK-15831[1], I think the way to start is for the flink-docker
> > repo[2] itself to sufficiently document the workflow for publishing
> >>> new
> > Dockerfiles, and then update the Flink release guide in the wiki to
> >>> refer
> > to this documentation and to include this step in the "Finalize the
> > release" checklist.
> >
> > To the first point, I have opened a PR[3] on flink-docker to improve
> >>> its
> > documentation.
> >
> > And for updating the release guide, I propose the following changes:
> >
> > 1. Add a new subsection to "Finalize the release", prior to
> >>> "Checklist to
> > proceed to the next step" with the following content:
> >
> > Publish the Dockerfiles for the new release
> >> Note: the official Dockerfiles fetch the binary distribution of the
> > target
> >> Flink version from an Apache mirror. After publishing the binary
>  release
> >> artifacts, mirrors can take some hours to start serving the new
> > artifacts,
> >> so you may want to wait to do this step until you are ready to
> >>> continue
> >> with the "Promote the release" steps below.
> >>
> >> Follow the instructions in the [flink-docker] repo to build the new
> >> Dockerfiles and send an updated manifest to Docker Hub so the new
>  images
> >> are built and published.
> >>
> > 2. Add an entry to the "Checklist to proceed to the next step"
> >>> subsection
> > of "Finalize the release":
> >
> >> - Dockerfiles in flink-docker updated for the new Flink release
> >>> and
> >> pull request opened on the Docker official-images with an
> updated
> > manifest
> >> Please let me know if you have any questions or suggestions to
> >>> improve
> > this proposal.
> >
> > Thanks,
> > Patrick
> >
> > [1]https://issues.apache.org/jira/browse/FLINK-15831
> > [2]https://github.com/apache/flink-docker
> > [3]https://github.com/apache/flink-docker/pull/5
> >
>
>


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Xintong Song
+1 (non-binding)

- build from source (with tests)
- run nightly e2e tests
- run example jobs in local/standalone/yarn setups
- play around with memory configurations on local/standalone/yarn setups

Thank you~

Xintong Song



On Mon, Feb 10, 2020 at 7:55 PM Jark Wu  wrote:

> +1 (binding)
>
> - build the source release with Scala 2.12 and Scala 2.11 successfully
> - checked/verified signatures and hashes
> - started cluster for both Scala 2.11 and 2.12, ran examples, verified web
> ui and log output, nothing unexpected
> - started cluster and run some e2e sql queries, all of them works well and
> the results are as expected:
>   - read from kafka source, aggregate, write into mysql
>   - read from kafka source with watermark defined in ddl, window aggregate,
> write into mysql
>   - read from kafka with computed column defined in ddl, temporal join with
> a mysql table, write into kafka
>
> Cheers,
> Jark
>
>
> On Mon, 10 Feb 2020 at 19:23, Kurt Young  wrote:
>
> > +1 (binding)
> >
> > - verified signatures and checksums
> > - start local cluster, run some examples, randomly play some sql with sql
> > client, no suspicious error/warn log found in log files
> > - repeat above operation with both scala 2.11 and 2.12 binary
> >
> > Best,
> > Kurt
> >
> >
> > On Mon, Feb 10, 2020 at 6:38 PM Yang Wang  wrote:
> >
> > >  +1 non-binding
> > >
> > >
> > > - Building from source with all tests skipped
> > > - Build a custom image with 1.10-rc3
> > > - K8s tests
> > > * Deploy a standalone session cluster on K8s and submit multiple
> jobs
> > > * Deploy a standalone per-job cluster
> > > * Deploy a native session cluster on K8s with/without HA
> configured,
> > > kill TM and jobs could recover successfully
> > >
> > >
> > > Best,
> > > Yang
> > >
> > > Jingsong Li  于2020年2月10日周一 下午4:29写道:
> > >
> > > > Hi,
> > > >
> > > >
> > > > +1 (non-binding) Thanks for driving this, Gary & Yu.
> > > >
> > > >
> > > > There is an unfriendly error here: "OutOfMemoryError: Direct buffer
> > > memory"
> > > > in FileChannelBoundedData$FileBufferReader.
> > > >
> > > > It forces our batch users to configure
> > > > "taskmanager.memory.task.off-heap.size" in production jobs. And users
> > are
> > > > hard to know how much memory they need configure.
> > > >
> > > > Even for us developers, it is hard to say how much memory, it depends
> > on
> > > > tasks left over from the previous stage and the parallelism.
> > > >
> > > >
> > > > It is not a blocker, but hope to resolve it in 1.11.
> > > >
> > > >
> > > > - Verified signatures and checksums
> > > >
> > > > - Maven build from source skip tests
> > > >
> > > > - Verified pom files point to the 1.10.0 version
> > > >
> > > > - Test Hive integration and SQL client: work well
> > > >
> > > >
> > > > Best,
> > > >
> > > > Jingsong Lee
> > > >
> > > > On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu  wrote:
> > > >
> > > > > My bad. The missing commit info is caused by building from the src
> > code
> > > > zip
> > > > > which does not contain the git info.
> > > > > So this is not a problem.
> > > > >
> > > > > +1 (binding) for rc3
> > > > > Here's what's were verified :
> > > > >  * built successfully from the source code
> > > > >  * run a sample streaming and a batch job with parallelism=1000 on
> > yarn
> > > > > cluster, with the new scheduler and legacy scheduler, the job runs
> > well
> > > > > (tuned some resource configs to enable the jobs to work well)
> > > > >  * killed TMs to trigger failures, the jobs can finally recover
> from
> > > the
> > > > > failures
> > > > >
> > > > > Thanks,
> > > > > Zhu Zhu
> > > > >
> > > > > Zhu Zhu  于2020年2月10日周一 上午12:31写道:
> > > > >
> > > > > > The commit info is shown as  on the web UI and in logs.
> > > > > > Not sure if it's a common issue or just happens to my build only.
> > > > > >
> > > > > > Thanks,
> > > > > > Zhu Zhu
> > > > > >
> > > > > > aihua li  于2020年2月9日周日 下午7:42写道:
> > > > > >
> > > > > >> Yes, but the results you see in the Performance Code Speed
> Center
> > > [3]
> > > > > >> skip FLIP-49.
> > > > > >>  The results of the default configurations are overwritten by
> the
> > > > latest
> > > > > >> results.
> > > > > >>
> > > > > >> > 2020年2月9日 下午5:29,Yu Li  写道:
> > > > > >> >
> > > > > >> > Thanks for the efforts Aihua! These could definitely improve
> our
> > > RC
> > > > > >> test coverage!
> > > > > >> >
> > > > > >> > Just to confirm, that the stability tests were executed with
> the
> > > > same
> > > > > >> test suite for Alibaba production usage, and the e2e performance
> > one
> > > > was
> > > > > >> executed with the test suite proposed in FLIP-83 [1] and
> > FLINK-14917
> > > > > [2],
> > > > > >> and the result could also be observed from our performance
> > > code-speed
> > > > > >> center [3], right?
> > > > > >> >
> > > > > >> > Thanks.
> > > > > >> >
> > > > > >> > Best Regards,
> > > > > >> > Yu
> > > > > >> >
> > > > > >> > [1]
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/conf

[jira] [Created] (FLINK-15968) LegacyTypeInfoDataTypeConverter should support conversion between BINARY/VARBINARY and BYTE_PRIMITIVE_ARRAY_TYPE_INFO

2020-02-10 Thread Zhenghua Gao (Jira)
Zhenghua Gao created FLINK-15968:


 Summary: LegacyTypeInfoDataTypeConverter should support conversion 
between BINARY/VARBINARY and BYTE_PRIMITIVE_ARRAY_TYPE_INFO
 Key: FLINK-15968
 URL: https://issues.apache.org/jira/browse/FLINK-15968
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Legacy Planner, Table SQL / Planner
Affects Versions: 1.11.0
Reporter: Zhenghua Gao


Currently LegacyTypeInfoDataTypeConverter only supports conversion between 
DataTypes.BYTES and BYTE_PRIMITIVE_ARRAY_TYPE_INFO. When we update connectors 
to the new type system, we need to convert BINARY(n) or VARBINARY(n) to 
BYTE_PRIMITIVE_ARRAY_TYPE_INFO.

The Hive connector currently achieves this by depending on the Blink planner's 
conversion logic, which is odd because a planner dependency should not be 
necessary for connectors.
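
A minimal sketch of the desired mapping (a hypothetical helper for illustration 
only; the actual change belongs in LegacyTypeInfoDataTypeConverter and may look 
different):

{code}
import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.table.types.DataType;
import org.apache.flink.table.types.logical.LogicalTypeRoot;

public final class BinaryTypeMappingSketch {
    // BINARY(n) / VARBINARY(n) should map to byte[] just like DataTypes.BYTES()
    public static TypeInformation<?> toLegacyTypeInfo(DataType dataType) {
        LogicalTypeRoot root = dataType.getLogicalType().getTypeRoot();
        if (root == LogicalTypeRoot.BINARY || root == LogicalTypeRoot.VARBINARY) {
            return PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO;
        }
        throw new UnsupportedOperationException("Only BINARY/VARBINARY handled in this sketch");
    }
}
{code}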

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Benchao Li
+1 (non-binding)

- build from source
- start standalone cluster, and run some examples
- played with sql-client with some simple sql
- run tests in IDE
- ran some SQL queries from our internal 1.9 version against 1.10.0-rc3; 1.10
seems to behave well.

Xintong Song  于2020年2月10日周一 下午8:13写道:

> +1 (non-binding)
>
> - build from source (with tests)
> - run nightly e2e tests
> - run example jobs in local/standalone/yarn setups
> - play around with memory configurations on local/standalone/yarn setups
>
> Thank you~
>
> Xintong Song
>
>
>
> On Mon, Feb 10, 2020 at 7:55 PM Jark Wu  wrote:
>
> > +1 (binding)
> >
> > - build the source release with Scala 2.12 and Scala 2.11 successfully
> > - checked/verified signatures and hashes
> > - started cluster for both Scala 2.11 and 2.12, ran examples, verified
> web
> > ui and log output, nothing unexpected
> > - started cluster and run some e2e sql queries, all of them works well
> and
> > the results are as expected:
> >   - read from kafka source, aggregate, write into mysql
> >   - read from kafka source with watermark defined in ddl, window
> aggregate,
> > write into mysql
> >   - read from kafka with computed column defined in ddl, temporal join
> with
> > a mysql table, write into kafka
> >
> > Cheers,
> > Jark
> >
> >
> > On Mon, 10 Feb 2020 at 19:23, Kurt Young  wrote:
> >
> > > +1 (binding)
> > >
> > > - verified signatures and checksums
> > > - start local cluster, run some examples, randomly play some sql with
> sql
> > > client, no suspicious error/warn log found in log files
> > > - repeat above operation with both scala 2.11 and 2.12 binary
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Mon, Feb 10, 2020 at 6:38 PM Yang Wang 
> wrote:
> > >
> > > >  +1 non-binding
> > > >
> > > >
> > > > - Building from source with all tests skipped
> > > > - Build a custom image with 1.10-rc3
> > > > - K8s tests
> > > > * Deploy a standalone session cluster on K8s and submit multiple
> > jobs
> > > > * Deploy a standalone per-job cluster
> > > > * Deploy a native session cluster on K8s with/without HA
> > configured,
> > > > kill TM and jobs could recover successfully
> > > >
> > > >
> > > > Best,
> > > > Yang
> > > >
> > > > Jingsong Li  于2020年2月10日周一 下午4:29写道:
> > > >
> > > > > Hi,
> > > > >
> > > > >
> > > > > +1 (non-binding) Thanks for driving this, Gary & Yu.
> > > > >
> > > > >
> > > > > There is an unfriendly error here: "OutOfMemoryError: Direct buffer
> > > > memory"
> > > > > in FileChannelBoundedData$FileBufferReader.
> > > > >
> > > > > It forces our batch users to configure
> > > > > "taskmanager.memory.task.off-heap.size" in production jobs. And
> users
> > > are
> > > > > hard to know how much memory they need configure.
> > > > >
> > > > > Even for us developers, it is hard to say how much memory, it
> depends
> > > on
> > > > > tasks left over from the previous stage and the parallelism.
> > > > >
> > > > >
> > > > > It is not a blocker, but hope to resolve it in 1.11.
> > > > >
> > > > >
> > > > > - Verified signatures and checksums
> > > > >
> > > > > - Maven build from source skip tests
> > > > >
> > > > > - Verified pom files point to the 1.10.0 version
> > > > >
> > > > > - Test Hive integration and SQL client: work well
> > > > >
> > > > >
> > > > > Best,
> > > > >
> > > > > Jingsong Lee
> > > > >
> > > > > On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu 
> wrote:
> > > > >
> > > > > > My bad. The missing commit info is caused by building from the
> src
> > > code
> > > > > zip
> > > > > > which does not contain the git info.
> > > > > > So this is not a problem.
> > > > > >
> > > > > > +1 (binding) for rc3
> > > > > > Here's what's were verified :
> > > > > >  * built successfully from the source code
> > > > > >  * run a sample streaming and a batch job with parallelism=1000
> on
> > > yarn
> > > > > > cluster, with the new scheduler and legacy scheduler, the job
> runs
> > > well
> > > > > > (tuned some resource configs to enable the jobs to work well)
> > > > > >  * killed TMs to trigger failures, the jobs can finally recover
> > from
> > > > the
> > > > > > failures
> > > > > >
> > > > > > Thanks,
> > > > > > Zhu Zhu
> > > > > >
> > > > > > Zhu Zhu  于2020年2月10日周一 上午12:31写道:
> > > > > >
> > > > > > > The commit info is shown as  on the web UI and in
> logs.
> > > > > > > Not sure if it's a common issue or just happens to my build
> only.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Zhu Zhu
> > > > > > >
> > > > > > > aihua li  于2020年2月9日周日 下午7:42写道:
> > > > > > >
> > > > > > >> Yes, but the results you see in the Performance Code Speed
> > Center
> > > > [3]
> > > > > > >> skip FLIP-49.
> > > > > > >>  The results of the default configurations are overwritten by
> > the
> > > > > latest
> > > > > > >> results.
> > > > > > >>
> > > > > > >> > 2020年2月9日 下午5:29,Yu Li  写道:
> > > > > > >> >
> > > > > > >> > Thanks for the efforts Aihua! These could definitely improve
> > our
> > > > RC
> > > > > > >> test coverage!
> > > > > > >> >
> > > > > >

[jira] [Created] (FLINK-15969) Do not multiplex both PersistedValue and PersistedTable with a single MapState state handle

2020-02-10 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-15969:
---

 Summary: Do not multiplex both PersistedValue and PersistedTable 
with a single MapState state handle
 Key: FLINK-15969
 URL: https://issues.apache.org/jira/browse/FLINK-15969
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Affects Versions: statefun-1.1
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai


Currently in Stateful Functions, {{PersistedValue}}s and {{PersistedTable}}s 
are multiplexed under a single {{MapState}}. I propose to split them up and 
multiplex them into two separate {{MapState}}s (see the sketch after the list 
below), for the following reasons:
* There's already a problem with the (to-be-introduced) state reader / 
analyzer, that to read a single function's persisted state values, you have to 
iterate through ALL keys (which includes state of other functions) since we 
multiplex everything into a single handle.
* If you multiplex both tables and values into a single state handle, this will 
become even more of a problem in the future, say when the user just wants to 
read table state and not value state.
* If we do decide to separate the handles, we can slim down the 
{{MultiplexedStateKey}} type a bit, by having a separate 
{{MultiplexedTableStateKey}} that has a {{ByteString userKey}} field and a 
{{MultiplexedStateKey prefix}} field. There's already a minor concern with the 
way we use {{MultiplexedStateKey}}: do protobuf repeated fields require some 
extra metadata to be written? If yes, it's a tad redundant size-wise in this case 
since we only ever have 1 user key added.
* When multiplexing both value states and table state under the same state 
handle, the key is essentially ambiguous - there is a possibility that a value 
state's key in {{MapState}} can be set up to overwrite another key of a table 
state.
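
A minimal sketch of the proposed split, using hypothetical descriptor names and 
key/value types (the real handles and key types live in the internal state 
binder and will differ):

{code}
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;

public final class SplitStateHandlesSketch {
    // one handle multiplexing all PersistedValue state
    static final MapStateDescriptor<byte[], byte[]> VALUE_STATES =
        new MapStateDescriptor<>(
            "persisted-values",
            PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO,
            PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO);

    // a second, separate handle multiplexing all PersistedTable state,
    // so that table keys can never collide with value keys
    static final MapStateDescriptor<byte[], byte[]> TABLE_STATES =
        new MapStateDescriptor<>(
            "persisted-tables",
            PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO,
            PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO);
}
{code}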



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15970) Optimize the Python UDF execution to only serialize the value

2020-02-10 Thread Dian Fu (Jira)
Dian Fu created FLINK-15970:
---

 Summary: Optimize the Python UDF execution to only serialize the 
value
 Key: FLINK-15970
 URL: https://issues.apache.org/jira/browse/FLINK-15970
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Dian Fu
 Fix For: 1.11.0


Currently, the window/timestamp/pane info is also serialized and sent between 
the Java operator and the Python worker. This information is not needed, and 
after bumping Beam to 2.19.0 (BEAM-7951) it becomes possible to skip 
serializing these fields.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Patrick Lucas
Now that [FLINK-15828] Integrate docker-flink/docker-flink into Flink
release process  is
complete, the Dockerfiles for 1.10.0 can be published as part of the
release process.

@Gary/@Yu: please let me know if you have any questions regarding the
workflow or its documentation.

--
Patrick

On Mon, Feb 10, 2020 at 1:29 PM Benchao Li  wrote:

> +1 (non-binding)
>
> - build from source
> - start standalone cluster, and run some examples
> - played with sql-client with some simple sql
> - run tests in IDE
> - run some sqls running in 1.9 internal version with 1.10.0-rc3, seems 1.10
> behaves well.
>
> Xintong Song  于2020年2月10日周一 下午8:13写道:
>
> > +1 (non-binding)
> >
> > - build from source (with tests)
> > - run nightly e2e tests
> > - run example jobs in local/standalone/yarn setups
> > - play around with memory configurations on local/standalone/yarn setups
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Mon, Feb 10, 2020 at 7:55 PM Jark Wu  wrote:
> >
> > > +1 (binding)
> > >
> > > - build the source release with Scala 2.12 and Scala 2.11 successfully
> > > - checked/verified signatures and hashes
> > > - started cluster for both Scala 2.11 and 2.12, ran examples, verified
> > web
> > > ui and log output, nothing unexpected
> > > - started cluster and run some e2e sql queries, all of them works well
> > and
> > > the results are as expected:
> > >   - read from kafka source, aggregate, write into mysql
> > >   - read from kafka source with watermark defined in ddl, window
> > aggregate,
> > > write into mysql
> > >   - read from kafka with computed column defined in ddl, temporal join
> > with
> > > a mysql table, write into kafka
> > >
> > > Cheers,
> > > Jark
> > >
> > >
> > > On Mon, 10 Feb 2020 at 19:23, Kurt Young  wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > - verified signatures and checksums
> > > > - start local cluster, run some examples, randomly play some sql with
> > sql
> > > > client, no suspicious error/warn log found in log files
> > > > - repeat above operation with both scala 2.11 and 2.12 binary
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > >
> > > > On Mon, Feb 10, 2020 at 6:38 PM Yang Wang 
> > wrote:
> > > >
> > > > >  +1 non-binding
> > > > >
> > > > >
> > > > > - Building from source with all tests skipped
> > > > > - Build a custom image with 1.10-rc3
> > > > > - K8s tests
> > > > > * Deploy a standalone session cluster on K8s and submit
> multiple
> > > jobs
> > > > > * Deploy a standalone per-job cluster
> > > > > * Deploy a native session cluster on K8s with/without HA
> > > configured,
> > > > > kill TM and jobs could recover successfully
> > > > >
> > > > >
> > > > > Best,
> > > > > Yang
> > > > >
> > > > > Jingsong Li  于2020年2月10日周一 下午4:29写道:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > >
> > > > > > +1 (non-binding) Thanks for driving this, Gary & Yu.
> > > > > >
> > > > > >
> > > > > > There is an unfriendly error here: "OutOfMemoryError: Direct
> buffer
> > > > > memory"
> > > > > > in FileChannelBoundedData$FileBufferReader.
> > > > > >
> > > > > > It forces our batch users to configure
> > > > > > "taskmanager.memory.task.off-heap.size" in production jobs. And
> > users
> > > > are
> > > > > > hard to know how much memory they need configure.
> > > > > >
> > > > > > Even for us developers, it is hard to say how much memory, it
> > depends
> > > > on
> > > > > > tasks left over from the previous stage and the parallelism.
> > > > > >
> > > > > >
> > > > > > It is not a blocker, but hope to resolve it in 1.11.
> > > > > >
> > > > > >
> > > > > > - Verified signatures and checksums
> > > > > >
> > > > > > - Maven build from source skip tests
> > > > > >
> > > > > > - Verified pom files point to the 1.10.0 version
> > > > > >
> > > > > > - Test Hive integration and SQL client: work well
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Jingsong Lee
> > > > > >
> > > > > > On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu 
> > wrote:
> > > > > >
> > > > > > > My bad. The missing commit info is caused by building from the
> > src
> > > > code
> > > > > > zip
> > > > > > > which does not contain the git info.
> > > > > > > So this is not a problem.
> > > > > > >
> > > > > > > +1 (binding) for rc3
> > > > > > > Here's what's were verified :
> > > > > > >  * built successfully from the source code
> > > > > > >  * run a sample streaming and a batch job with parallelism=1000
> > on
> > > > yarn
> > > > > > > cluster, with the new scheduler and legacy scheduler, the job
> > runs
> > > > well
> > > > > > > (tuned some resource configs to enable the jobs to work well)
> > > > > > >  * killed TMs to trigger failures, the jobs can finally recover
> > > from
> > > > > the
> > > > > > > failures
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Zhu Zhu
> > > > > > >
> > > > > > > Zhu Zhu  于2020年2月10日周一 上午12:31写道:
> > > > > > >
> > > > > > > > The commit info is shown as  on the web UI and in
> > lo

[jira] [Created] (FLINK-15971) Adjust the default value of bundle size and bundle time

2020-02-10 Thread Dian Fu (Jira)
Dian Fu created FLINK-15971:
---

 Summary: Adjust the default value of bundle size and bundle time
 Key: FLINK-15971
 URL: https://issues.apache.org/jira/browse/FLINK-15971
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Dian Fu
 Fix For: 1.11.0


Currently the default value for "python.fn-execution.bundle.size" is 1000 and 
the default value for "python.fn-execution.bundle.time" is 1000ms. We should 
try to find meaningful default values that work well in most scenarios.
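
Until better defaults are chosen, users can override the two options themselves, 
e.g. in flink-conf.yaml (the values below are purely illustrative, not 
recommendations):

{code}
python.fn-execution.bundle.size: 10000
python.fn-execution.bundle.time: 1000
{code}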



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15972) Add Python building blocks to make sure the basic functionality of Python TableFunction could work

2020-02-10 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-15972:


 Summary: Add Python building blocks to make sure the basic 
functionality of Python TableFunction could work
 Key: FLINK-15972
 URL: https://issues.apache.org/jira/browse/FLINK-15972
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Huang Xingbo
 Fix For: 1.11.0


We need to add a few Python building blocks such as TableFunctionOperation, 
TableFunctionRowCoder, etc. for Python TableFunction execution. 
TableFunctionOperation is a subclass of Operation in Beam, and 
TableFunctionRowCoder, etc. are subclasses of Coder in Beam. These classes will 
be registered into Beam's portability framework to make sure they take 
effect.

This PR makes sure that a basic end-to-end Python UDTF can be executed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-55: Introduction of a Table API Java Expression DSL

2020-02-10 Thread Timo Walther

+1 for this.

It will also help in making a TableEnvironment.fromElements() possible 
and reduces technical debt: one less entry point for TypeInformation in 
the API.


Regards,
Timo


On 10.02.20 08:31, Dawid Wysakowicz wrote:

Hi all,

I wanted to resurrect the thread about introducing a Java Expression
DSL. Please see the updated flip page[1]. Most of the flip was concluded
in previous discussion thread. The major changes since then are:

* accepting java.lang.Object in the Java DSL

* adding $ interpolation for a column in the Scala DSL

I think it's important to move those changes forward, as it makes it
easier to transition to the new type system (the Java parser supports only
the old type system stack for now) that we have been working on for the past
releases.
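
For reference, a rough sketch of what the proposed Java DSL could look like
(illustrative only; the method names follow the FLIP and may still change, and
"orders" is just a placeholder Table obtained from the TableEnvironment):

    import static org.apache.flink.table.api.Expressions.$;
    import static org.apache.flink.table.api.Expressions.lit;

    Table result = orders
        .select($("user"), $("amount").plus(lit(10)).as("amountPlusTen"))
        .where($("amount").isGreater(lit(0)));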

Because the previous discussion thread was rather conclusive, I want to
start with a vote right away. If you think we need another round of
discussion, feel free to say so.


The vote will last for at least 72 hours, following the consensus voting
process.

FLIP wiki:

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-55%3A+Introduction+of+a+Table+API+Java+Expression+DSL


Discussion thread:

https://lists.apache.org/thread.html/eb5e7b0579e5f1da1e9bf1ab4e4b86dba737946f0261d94d8c30521e@%3Cdev.flink.apache.org%3E








[jira] [Created] (FLINK-15973) Optimize the execution plan where it refers the Python UDF result field in the where clause

2020-02-10 Thread Dian Fu (Jira)
Dian Fu created FLINK-15973:
---

 Summary: Optimize the execution plan where it refers the Python 
UDF result field in the where clause
 Key: FLINK-15973
 URL: https://issues.apache.org/jira/browse/FLINK-15973
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Dian Fu
 Fix For: 1.11.0


For the following job:
{code}
t_env.register_function("inc", inc)

table.select("inc(id) as inc_id") \
 .where("inc_id > 0") \
 .insert_into("sink")
{code}

The execution plan is as follows:
{code}
StreamExecPythonCalc(select=inc(f0) AS inc_id))
+- StreamExecCalc(select=id AS f0, where=>(f0, 0))
+--- StreamExecPythonCalc(select=id, inc(f0) AS f0))
+-StreamExecCalc(select=id, id AS f0))
+---StreamExecTableSourceScan(fields=id)
{code}

The plan is not optimal. It should be as follows:
{code}
StreamExecPythonCalc(select=f0)
+- StreamExecCalc(select=f0, where=>(f0, 0))
+--- StreamExecPythonCalc(select=inc(f0) AS f0))
+-StreamExecCalc(select=id, id AS f0))
+---StreamExecTableSourceScan(fields=id)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Drop connectors for Elasticsearch 2.x and 5.x

2020-02-10 Thread Dawid Wysakowicz
Hi all,

As described in the https://issues.apache.org/jira/browse/FLINK-11720
ticket, our Elasticsearch 5.x connector does not work out of the box on
some systems and requires a version bump. This also affects our e2e tests.
We cannot bump the version in the ES 5.x connector, because the 5.x connector
shares a common class with 2.x that uses an API that was replaced in 5.2.

Both versions have long been EOL: https://www.elastic.co/support/eol

I suggest dropping both the 5.x and 2.x connectors. If it is too much to drop
both of them, I would strongly suggest dropping at least the 2.x connector
and updating the 5.x line to a working ES client module.

What do you think? Should we drop both versions? Drop only the 2.x
connector? Or keep them both?

Best,

Dawid




signature.asc
Description: OpenPGP digital signature


Re: [DISCUSS] Drop connectors for Elasticsearch 2.x and 5.x

2020-02-10 Thread Robert Metzger
Thanks for starting this discussion!

+1 to drop both

On Mon, Feb 10, 2020 at 2:45 PM Dawid Wysakowicz 
wrote:

> Hi all,
>
> As described in this https://issues.apache.org/jira/browse/FLINK-11720
> ticket our elasticsearch 5.x connector does not work out of the box on
> some systems and requires a version bump. This also happens for our e2e.
> We cannot bump the version in es 5.x connector, because 5.x connector
> shares a common class with 2.x that uses an API that was replaced in 5.2.
>
> Both versions are already long eol: https://www.elastic.co/support/eol
>
> I suggest to drop both connectors 5.x and 2.x. If it is too much to drop
> both of them, I would strongly suggest dropping at least 2.x connector
> and update the 5.x line to a working es client module.
>
> What do you think? Should we drop both versions? Drop only the 2.x
> connector? Or keep them both?
>
> Best,
>
> Dawid
>
>
>


[jira] [Created] (FLINK-15974) Support to use the Python UDF directly in the Python Table API

2020-02-10 Thread Dian Fu (Jira)
Dian Fu created FLINK-15974:
---

 Summary: Support to use the Python UDF directly in the Python 
Table API
 Key: FLINK-15974
 URL: https://issues.apache.org/jira/browse/FLINK-15974
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Dian Fu
 Fix For: 1.11.0


Currently, a Python UDF has to be registered before it can be used in the 
Python Table API, e.g.
{code}
t_env.register_function("inc", inc)
table.select("inc(id)") \
 .insert_into("sink")
{code}

It would be great if we could support using Python UDFs directly in the Python 
Table API, e.g.
{code}
table.select(inc("id")) \
 .insert_into("sink")
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Drop connectors for Elasticsearch 2.x and 5.x

2020-02-10 Thread Flavio Pompermaier
+1 for dropping all Elasticsearch connectors < 6.x

On Mon, Feb 10, 2020 at 2:45 PM Dawid Wysakowicz 
wrote:

> Hi all,
>
> As described in this https://issues.apache.org/jira/browse/FLINK-11720
> ticket our elasticsearch 5.x connector does not work out of the box on
> some systems and requires a version bump. This also happens for our e2e.
> We cannot bump the version in es 5.x connector, because 5.x connector
> shares a common class with 2.x that uses an API that was replaced in 5.2.
>
> Both versions are already long eol: https://www.elastic.co/support/eol
>
> I suggest to drop both connectors 5.x and 2.x. If it is too much to drop
> both of them, I would strongly suggest dropping at least 2.x connector
> and update the 5.x line to a working es client module.
>
> What do you think? Should we drop both versions? Drop only the 2.x
> connector? Or keep them both?
>
> Best,
>
> Dawid
>
>


Re: [DISCUSS] Drop connectors for Elasticsearch 2.x and 5.x

2020-02-10 Thread Benchao Li
+1 for dropping 2.x - 5.x.

FYI, currently only the 6.x and 7.x ES connectors are supported by the Table API.

Flavio Pompermaier  于2020年2月10日周一 下午10:03写道:

> +1 for dropping all Elasticsearch connectors < 6.x
>
> On Mon, Feb 10, 2020 at 2:45 PM Dawid Wysakowicz 
> wrote:
>
> > Hi all,
> >
> > As described in this https://issues.apache.org/jira/browse/FLINK-11720
> > ticket our elasticsearch 5.x connector does not work out of the box on
> > some systems and requires a version bump. This also happens for our e2e.
> > We cannot bump the version in es 5.x connector, because 5.x connector
> > shares a common class with 2.x that uses an API that was replaced in 5.2.
> >
> > Both versions are already long eol: https://www.elastic.co/support/eol
> >
> > I suggest to drop both connectors 5.x and 2.x. If it is too much to drop
> > both of them, I would strongly suggest dropping at least 2.x connector
> > and update the 5.x line to a working es client module.
> >
> > What do you think? Should we drop both versions? Drop only the 2.x
> > connector? Or keep them both?
> >
> > Best,
> >
> > Dawid
> >
> >
>


-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn


Re: [DISCUSS] Drop connectors for Elasticsearch 2.x and 5.x

2020-02-10 Thread Aljoscha Krettek

+1 for dropping them, this stuff is quite old by now.

On 10.02.20 15:04, Benchao Li wrote:

+1 for dropping 2.x - 5.x.

FYI currently only 6.x and 7.x ES Connectors are supported by table api.

Flavio Pompermaier  于2020年2月10日周一 下午10:03写道:


+1 for dropping all Elasticsearch connectors < 6.x

On Mon, Feb 10, 2020 at 2:45 PM Dawid Wysakowicz 
wrote:


Hi all,

As described in this https://issues.apache.org/jira/browse/FLINK-11720
ticket our elasticsearch 5.x connector does not work out of the box on
some systems and requires a version bump. This also happens for our e2e.
We cannot bump the version in es 5.x connector, because 5.x connector
shares a common class with 2.x that uses an API that was replaced in 5.2.

Both versions are already long eol: https://www.elastic.co/support/eol

I suggest to drop both connectors 5.x and 2.x. If it is too much to drop
both of them, I would strongly suggest dropping at least 2.x connector
and update the 5.x line to a working es client module.

What do you think? Should we drop both versions? Drop only the 2.x
connector? Or keep them both?

Best,

Dawid









Re: [DISCUSS] Drop connectors for Elasticsearch 2.x and 5.x

2020-02-10 Thread Itamar Syn-Hershko
+1 for dropping old versions because of jar hells etc. However, in the
wild there are still a lot of 2.x clusters and definitely 5.x clusters that
are having a hard time upgrading. We know because we assist those on a
daily basis.

It is very easy to create an HTTP based connector that works with all ES
versions, though. As Elasticsearch consultants and experts we have done
that many times before. For example see this simplified client that has
zero dependencies and can be easily brought in to Flink to use as a sink
for all ES versions:
https://github.com/BigDataBoutique/log4j2-elasticsearch-http/blob/master/src/main/java/com/bigdataboutique/logging/log4j2/ElasticsearchHttpClient.java

Will be happy to assist in such effort

On Mon, Feb 10, 2020 at 3:45 PM Dawid Wysakowicz 
wrote:

> Hi all,
>
> As described in this https://issues.apache.org/jira/browse/FLINK-11720
> ticket our elasticsearch 5.x connector does not work out of the box on
> some systems and requires a version bump. This also happens for our e2e.
> We cannot bump the version in es 5.x connector, because 5.x connector
> shares a common class with 2.x that uses an API that was replaced in 5.2.
>
> Both versions are already long eol: https://www.elastic.co/support/eol
>
> I suggest to drop both connectors 5.x and 2.x. If it is too much to drop
> both of them, I would strongly suggest dropping at least 2.x connector
> and update the 5.x line to a working es client module.
>
> What do you think? Should we drop both versions? Drop only the 2.x
> connector? Or keep them both?
>
> Best,
>
> Dawid
>
>
>

-- 

Itamar Syn-Hershko
CTO, Founder

ita...@bigdataboutique.com
https://bigdataboutique.com





Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Kostas Kloudas
Hi all,

+1 (binding)

- Built Flink locally
- Tested quickstart by writing simple, WordCount-like streaming jobs
- Submitted them to Yarn both "per-job" and "session" mode
- used the yarn-session CLI to start/stop sessions

Thanks a lot Gary and Yu for managing the release.

Cheers,
Kostas



On Mon, Feb 10, 2020 at 1:57 PM Patrick Lucas  wrote:
>
> Now that [FLINK-15828] Integrate docker-flink/docker-flink into Flink
> release process  is
> complete, the Dockerfiles for 1.10.0 can be published as part of the
> release process.
>
> @Gary/@Yu: please let me know if you have any questions regarding the
> workflow or its documentation.
>
> --
> Patrick
>
> On Mon, Feb 10, 2020 at 1:29 PM Benchao Li  wrote:
>
> > +1 (non-binding)
> >
> > - build from source
> > - start standalone cluster, and run some examples
> > - played with sql-client with some simple sql
> > - run tests in IDE
> > - run some sqls running in 1.9 internal version with 1.10.0-rc3, seems 1.10
> > behaves well.
> >
> > Xintong Song  于2020年2月10日周一 下午8:13写道:
> >
> > > +1 (non-binding)
> > >
> > > - build from source (with tests)
> > > - run nightly e2e tests
> > > - run example jobs in local/standalone/yarn setups
> > > - play around with memory configurations on local/standalone/yarn setups
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Mon, Feb 10, 2020 at 7:55 PM Jark Wu  wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > - build the source release with Scala 2.12 and Scala 2.11 successfully
> > > > - checked/verified signatures and hashes
> > > > - started cluster for both Scala 2.11 and 2.12, ran examples, verified
> > > web
> > > > ui and log output, nothing unexpected
> > > > - started cluster and run some e2e sql queries, all of them works well
> > > and
> > > > the results are as expected:
> > > >   - read from kafka source, aggregate, write into mysql
> > > >   - read from kafka source with watermark defined in ddl, window
> > > aggregate,
> > > > write into mysql
> > > >   - read from kafka with computed column defined in ddl, temporal join
> > > with
> > > > a mysql table, write into kafka
> > > >
> > > > Cheers,
> > > > Jark
> > > >
> > > >
> > > > On Mon, 10 Feb 2020 at 19:23, Kurt Young  wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > - verified signatures and checksums
> > > > > - start local cluster, run some examples, randomly play some sql with
> > > sql
> > > > > client, no suspicious error/warn log found in log files
> > > > > - repeat above operation with both scala 2.11 and 2.12 binary
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > >
> > > > > On Mon, Feb 10, 2020 at 6:38 PM Yang Wang 
> > > wrote:
> > > > >
> > > > > >  +1 non-binding
> > > > > >
> > > > > >
> > > > > > - Building from source with all tests skipped
> > > > > > - Build a custom image with 1.10-rc3
> > > > > > - K8s tests
> > > > > > * Deploy a standalone session cluster on K8s and submit
> > multiple
> > > > jobs
> > > > > > * Deploy a standalone per-job cluster
> > > > > > * Deploy a native session cluster on K8s with/without HA
> > > > configured,
> > > > > > kill TM and jobs could recover successfully
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Yang
> > > > > >
> > > > > > Jingsong Li  于2020年2月10日周一 下午4:29写道:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > >
> > > > > > > +1 (non-binding) Thanks for driving this, Gary & Yu.
> > > > > > >
> > > > > > >
> > > > > > > There is an unfriendly error here: "OutOfMemoryError: Direct
> > buffer
> > > > > > memory"
> > > > > > > in FileChannelBoundedData$FileBufferReader.
> > > > > > >
> > > > > > > It forces our batch users to configure
> > > > > > > "taskmanager.memory.task.off-heap.size" in production jobs. And
> > > users
> > > > > are
> > > > > > > hard to know how much memory they need configure.
> > > > > > >
> > > > > > > Even for us developers, it is hard to say how much memory, it
> > > depends
> > > > > on
> > > > > > > tasks left over from the previous stage and the parallelism.
> > > > > > >
> > > > > > >
> > > > > > > It is not a blocker, but hope to resolve it in 1.11.
> > > > > > >
> > > > > > >
> > > > > > > - Verified signatures and checksums
> > > > > > >
> > > > > > > - Maven build from source skip tests
> > > > > > >
> > > > > > > - Verified pom files point to the 1.10.0 version
> > > > > > >
> > > > > > > - Test Hive integration and SQL client: work well
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Jingsong Lee
> > > > > > >
> > > > > > > On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu 
> > > wrote:
> > > > > > >
> > > > > > > > My bad. The missing commit info is caused by building from the
> > > src
> > > > > code
> > > > > > > zip
> > > > > > > > which does not contain the git info.
> > > > > > > > So this is not a problem.
> > > > > > > >
> > > > > > > > +1 (binding) for rc3
> > > > > > > > Here's what's were verified :
> > > > > > > > 

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-10 Thread Rong Rong
Yes. I think the argument is fairly valid - we can always adjust the API in
the future, in fact most of the APIs are labeled publicEvolving at this
moment.
I was only trying to provide the info for others when voting that the
interfaces in flink-ml-api might change in the near future.

In fact, I am actually always +1 on moving flink-ml-api to /opt :-)
Regarding the Python ML API: sorry for not noticing it earlier, as I haven't
given it a deep look yet. Will do very soon!

--
Rong

On Sun, Feb 9, 2020 at 7:33 PM Hequn Cheng  wrote:

> Hi Rong,
>
> Thanks a lot for joining the discussion!
>
> It would be great if we can have a long term plan. My intention is to
> provide a way for users to add dependencies of Flink ML, either through the
> opt or download page. This would be more and more critical along with the
> improvement of the Flink ML, as you said there are multiple PRs under
> review and I'm also going to support Python Pipeline API recently[1].
>
> Meanwhile, it also makes sense to include the API into the opt, so it
> would probably not break the long term plan.
> However, even find something wrong in the future, we can revisit this
> easily instead of blocking the improvement for users. What do you think?
>
> Best,
> Hequn
>
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-Python-ML-Pipeline-API-td37291.html
>
> On Sat, Feb 8, 2020 at 1:57 AM Rong Rong  wrote:
>
>> CC @Xu Yang 
>>
>> Thanks for starting the discussion @Hequn Cheng  and
>> sorry for joining the discussion late.
>>
>> I've mainly helped merging the code in flink-ml-api and flink-ml-lib in
>> the past several months.
>> IMO the flink-ml-api are an extension on top of the table API and agree
>> that it should be treated as a part of the "core" core.
>>
>> However, I think given the fact that there are multiple PRs still under
>> review [1], is it a better idea to come up with a long term plan first
>> before make the decision to moving it to /opt now?
>>
>>
>> --
>> Rong
>>
>> [1]
>> https://github.com/apache/flink/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+label%3Acomponent%3DLibrary%2FMachineLearning+
>>
>> On Fri, Feb 7, 2020 at 5:54 AM Hequn Cheng  wrote:
>>
>>> Hi,
>>>
>>> @Till Rohrmann  Thanks for the great inputs. I
>>> agree
>>> with you that we should have a long term plan for this. It definitely
>>> deserves another discussion.
>>> @Jeff Zhang  Thanks for your reports and ideas. It's a
>>> good idea to improve the error messages. Do we have any JIRAs for it or
>>> maybe we can create one for it.
>>>
>>> Thank you again for your feedback and suggestions. I will go on with the
>>> PR. Thanks!
>>>
>>> Best,
>>> Hequn
>>>
>>> On Thu, Feb 6, 2020 at 11:51 PM Jeff Zhang  wrote:
>>>
>>> > I have another concern which may not be closely related to this thread.
>>> > Since flink doesn't include all the necessary jars, I think it is
>>> critical
>>> > for flink to display meaningful error message when any class is
>>> missing.
>>> > e.g. Here's the error message when I use kafka but miss
>>> > including flink-json.  To be honest, the kind of error message is hard
>>> to
>>> > understand for new users.
>>> >
>>> >
>>> > Reason: No factory implements
>>> > 'org.apache.flink.table.factories.DeserializationSchemaFactory'. The
>>> > following properties are requested:
>>> > connector.properties.bootstrap.servers=localhost:9092
>>> > connector.properties.group.id=testGroup
>>> > connector.properties.zookeeper.connect=localhost:2181
>>> > connector.startup-mode=earliest-offset connector.topic=generated.events
>>> > connector.type=kafka connector.version=universal format.type=json
>>> > schema.0.data-type=VARCHAR(2147483647) schema.0.name=status
>>> > schema.1.data-type=VARCHAR(2147483647) schema.1.name=direction
>>> > schema.2.data-type=BIGINT schema.2.name=event_ts update-mode=append
>>> The
>>> > following factories have been considered:
>>> > org.apache.flink.table.catalog.hive.factories.HiveCatalogFactory
>>> > org.apache.flink.table.module.hive.HiveModuleFactory
>>> > org.apache.flink.table.module.CoreModuleFactory
>>> > org.apache.flink.table.catalog.GenericInMemoryCatalogFactory
>>> > org.apache.flink.table.sources.CsvBatchTableSourceFactory
>>> > org.apache.flink.table.sources.CsvAppendTableSourceFactory
>>> > org.apache.flink.table.sinks.CsvBatchTableSinkFactory
>>> > org.apache.flink.table.sinks.CsvAppendTableSinkFactory
>>> > org.apache.flink.table.planner.delegation.BlinkPlannerFactory
>>> > org.apache.flink.table.planner.delegation.BlinkExecutorFactory
>>> > org.apache.flink.table.planner.StreamPlannerFactory
>>> > org.apache.flink.table.executor.StreamExecutorFactory
>>> >
>>> org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory at
>>> >
>>> >
>>> org.apache.flink.table.factories.TableFactoryService.filterByFactoryClass(TableFactoryService.java:238)
>>> > at
>>> >
>>> >
>>> org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryService.java:185)
>>> > at

[jira] [Created] (FLINK-15975) Use LinkedHashMap for deterministic iterations

2020-02-10 Thread testfixer0 (Jira)
testfixer0 created FLINK-15975:
--

 Summary: Use LinkedHashMap for deterministic iterations
 Key: FLINK-15975
 URL: https://issues.apache.org/jira/browse/FLINK-15975
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: testfixer0


The test `testMap` in `HiveGenericUDFTest` invokes two methods, 
`HiveGenericUDFTest.init` and `HiveScalarFunction.eval`. These two methods 
depend on `getConversion` and `toFlinkObject` in the class `HiveInspectors`. When 
the `inspector` is an instance of `MapObjectInspector`, it can return a 
`HashMap`. However, `HashMap` does not guarantee any specific order of entries, 
so the test can fail due to a different iteration order.

In this PR, we propose to use a `LinkedHashMap` instead to guarantee a 
deterministic iteration order.
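
A quick standalone illustration of the difference (not the actual test code; the 
class and keys are made up for the example):

{code}
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class IterationOrderSketch {
    public static void main(String[] args) {
        Map<String, Integer> hash = new HashMap<>();         // iteration order is unspecified
        Map<String, Integer> linked = new LinkedHashMap<>(); // preserves insertion order
        for (String k : new String[]{"one", "two", "three"}) {
            hash.put(k, k.length());
            linked.put(k, k.length());
        }
        // may print the entries in any order; the order can change across JDK versions
        System.out.println(hash);
        // always prints {one=3, two=3, three=5}
        System.out.println(linked);
    }
}
{code}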



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Thomas Weise
+1 (binding)

- verified signatures and hashes
- rebased internal deploy to rc3 and verified that previously reported
issues are resolved

Thank you for addressing these issues promptly!

Thomas

On Fri, Feb 7, 2020 at 1:54 PM Gary Yao  wrote:

> Hi everyone,
> Please review and vote on the release candidate #3 for the version 1.10.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.10.0-rc3" [5],
> * website pull request listing the new release and adding announcement
> blog post [6][7].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Yu & Gary
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc3/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1333
> [5] https://github.com/apache/flink/releases/tag/release-1.10.0-rc3
> [6] https://github.com/apache/flink-web/pull/302
> [7] https://github.com/apache/flink-web/pull/301
>


Total recovery time estimation after checkpoint recovery

2020-02-10 Thread Woods, Jessica Hui
Hi,

I am working with Flink at the moment and am interested in knowing how one 
could estimate the Total Recovery Time for an application after checkpoint 
recovery. What I am specifically interested in is knowing the time needed for 
the recovery of the state + the catch-up phase (since the application's source 
tasks are reset to an earlier input position after recovery, this would be the 
data it processed before the failure and data that accumulated while the 
application was down).
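
In other words, a rough back-of-envelope model of what I am trying to estimate
(my own simplification, just to make the question concrete) would be something
like:

    class RecoveryEstimate {
        // totalRecovery ≈ stateRestoreTime + catchUpTime
        // catchUpTime ≈ backlogRecords / (processingRate - inputRate), only valid
        // if the job can process faster than new data arrives
        static double totalRecoverySeconds(double restoreSeconds, double backlogRecords,
                                           double processingRatePerSec, double inputRatePerSec) {
            return restoreSeconds + backlogRecords / (processingRatePerSec - inputRatePerSec);
        }
    }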

My questions are, What important considerations should I take into account to 
estimate this time and which parts of the codebase would this modification 
involve?

Thanks,
Jessica



[jira] [Created] (FLINK-15976) Build snapshots for Scala 2.12

2020-02-10 Thread Leonid Ilyevsky (Jira)
Leonid Ilyevsky created FLINK-15976:
---

 Summary: Build snapshots for Scala 2.12
 Key: FLINK-15976
 URL: https://issues.apache.org/jira/browse/FLINK-15976
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.10.0, 1.10.1
Reporter: Leonid Ilyevsky


In the snapshot repository 
[https://repository.apache.org/content/repositories/snapshots/org/apache/flink/]
 I don't see versions for Scala 2.12, only 2.10 and 2.11.

I need to test my code against Flink 1.10, and I use Scala 2.12.

Could you please compile it with 2.12?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15977) Update pull request template to include Kubernetes as deployment candidates

2020-02-10 Thread Zili Chen (Jira)
Zili Chen created FLINK-15977:
-

 Summary: Update pull request template to include Kubernetes as 
deployment candidates
 Key: FLINK-15977
 URL: https://issues.apache.org/jira/browse/FLINK-15977
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Zili Chen
Assignee: Zili Chen






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread tison
+1 (non-binding)

- source doesn't contain binary files & build from source
- run against stability test cases
- manually verify YARN CLI & deployment work as expected

Best,
tison.


Thomas Weise  于2020年2月11日周二 上午2:48写道:

> +1 (binding)
>
> - verified signatures and hashes
> - rebased internal deploy to rc3 and verified that previously reported
> issues are resolved
>
> Thank you for addressing these issues promptly!
>
> Thomas
>
> On Fri, Feb 7, 2020 at 1:54 PM Gary Yao  wrote:
>
> > Hi everyone,
> > Please review and vote on the release candidate #3 for the version
> 1.10.0,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release and binary convenience releases to
> be
> > deployed to dist.apache.org [2], which are signed with the key with
> > fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-1.10.0-rc3" [5],
> > * website pull request listing the new release and adding announcement
> > blog post [6][7].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Yu & Gary
> >
> > [1]
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc3/
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1333
> > [5] https://github.com/apache/flink/releases/tag/release-1.10.0-rc3
> > [6] https://github.com/apache/flink-web/pull/302
> > [7] https://github.com/apache/flink-web/pull/301
> >
>


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread jincheng sun
+1 (binding)

- Download the source code [SUCCESS]
- Build the source release with Scala 2.12 and Scala 2.11 [SUCCESS]
- Checked/verified signatures and hashes [SUCCESS]
- Pip install PyFlink by `pip3 install apache-flink-1.10.0.tar.gz` [SUCCESS]
- Run a word_count.py both in command line and PyCharm [SUCCESS]

Best,
Jincheng


tison  于2020年2月11日周二 上午7:56写道:

> +1 (non-binding)
>
> - source doesn't contain binary files & build from source
> - run against stability test cases
> - manually verify YARN CLI & deployment work as expected
>
> Best,
> tison.
>
>
> Thomas Weise  于2020年2月11日周二 上午2:48写道:
>
> > +1 (binding)
> >
> > - verified signatures and hashes
> > - rebased internal deploy to rc3 and verified that previously reported
> > issues are resolved
> >
> > Thank you for addressing these issues promptly!
> >
> > Thomas
> >
> > On Fri, Feb 7, 2020 at 1:54 PM Gary Yao  wrote:
> >
> > > Hi everyone,
> > > Please review and vote on the release candidate #3 for the version
> > 1.10.0,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release and binary convenience releases to
> > be
> > > deployed to dist.apache.org [2], which are signed with the key with
> > > fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag "release-1.10.0-rc3" [5],
> > > * website pull request listing the new release and adding announcement
> > > blog post [6][7].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > Yu & Gary
> > >
> > > [1]
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
> > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc3/
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1333
> > > [5] https://github.com/apache/flink/releases/tag/release-1.10.0-rc3
> > > [6] https://github.com/apache/flink-web/pull/302
> > > [7] https://github.com/apache/flink-web/pull/301
> > >
> >
>


Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-10 Thread Hequn Cheng
Hi Rong,

That's great! Looking forward to your feedback.

Thanks,
Hequn


On Tue, Feb 11, 2020 at 1:06 AM Rong Rong  wrote:

> Yes. I think the argument is fairly valid - we can always adjust the API
> in the future, in fact most of the APIs are labeled publicEvolving at this
> moment.
> I was only trying to provide the info, that the interfaces in flink-ml-api
> might change in the near future, for others when voting.
>
> In fact, I am actually always +1 on moving flink-ml-api to /opt :-)
> Regarding the Python ML API. sorry for not noticing it earlier as I
> haven't given it a deep look yet. will do very soon!
>
> --
> Rong
>
> On Sun, Feb 9, 2020 at 7:33 PM Hequn Cheng  wrote:
>
>> Hi Rong,
>>
>> Thanks a lot for joining the discussion!
>>
>> It would be great if we can have a long term plan. My intention is to
>> provide a way for users to add dependencies of Flink ML, either through the
>> opt or download page. This would be more and more critical along with the
>> improvement of the Flink ML, as you said there are multiple PRs under
>> review and I'm also going to support Python Pipeline API recently[1].
>>
>> Meanwhile, it also makes sense to include the API into the opt, so it
>> would probably not break the long term plan.
>> However, even find something wrong in the future, we can revisit this
>> easily instead of blocking the improvement for users. What do you think?
>>
>> Best,
>> Hequn
>>
>> [1]
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-Python-ML-Pipeline-API-td37291.html
>>
>> On Sat, Feb 8, 2020 at 1:57 AM Rong Rong  wrote:
>>
>>> CC @Xu Yang 
>>>
>>> Thanks for starting the discussion @Hequn Cheng  and
>>> sorry for joining the discussion late.
>>>
>>> I've mainly helped merging the code in flink-ml-api and flink-ml-lib in
>>> the past several months.
>>> IMO the flink-ml-api are an extension on top of the table API and agree
>>> that it should be treated as a part of the "core" core.
>>>
>>> However, I think given the fact that there are multiple PRs still under
>>> review [1], is it a better idea to come up with a long term plan first
>>> before make the decision to moving it to /opt now?
>>>
>>>
>>> --
>>> Rong
>>>
>>> [1]
>>> https://github.com/apache/flink/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+label%3Acomponent%3DLibrary%2FMachineLearning+
>>>
>>> On Fri, Feb 7, 2020 at 5:54 AM Hequn Cheng  wrote:
>>>
 Hi,

 @Till Rohrmann  Thanks for the great inputs. I
 agree
 with you that we should have a long term plan for this. It definitely
 deserves another discussion.
 @Jeff Zhang  Thanks for your reports and ideas. It's
 a
 good idea to improve the error messages. Do we have any JIRAs for it or
 maybe we can create one for it.

 Thank you again for your feedback and suggestions. I will go on with the
 PR. Thanks!

 Best,
 Hequn

 On Thu, Feb 6, 2020 at 11:51 PM Jeff Zhang  wrote:

 > I have another concern which may not be closely related to this
 thread.
 > Since flink doesn't include all the necessary jars, I think it is
 critical
 > for flink to display meaningful error message when any class is
 missing.
 > e.g. Here's the error message when I use kafka but miss
 > including flink-json.  To be honest, the kind of error message is
 hard to
 > understand for new users.
 >
 >
 > Reason: No factory implements
 > 'org.apache.flink.table.factories.DeserializationSchemaFactory'. The
 > following properties are requested:
 > connector.properties.bootstrap.servers=localhost:9092
 > connector.properties.group.id=testGroup
 > connector.properties.zookeeper.connect=localhost:2181
 > connector.startup-mode=earliest-offset
 connector.topic=generated.events
 > connector.type=kafka connector.version=universal format.type=json
 > schema.0.data-type=VARCHAR(2147483647) schema.0.name=status
 > schema.1.data-type=VARCHAR(2147483647) schema.1.name=direction
 > schema.2.data-type=BIGINT schema.2.name=event_ts update-mode=append
 The
 > following factories have been considered:
 > org.apache.flink.table.catalog.hive.factories.HiveCatalogFactory
 > org.apache.flink.table.module.hive.HiveModuleFactory
 > org.apache.flink.table.module.CoreModuleFactory
 > org.apache.flink.table.catalog.GenericInMemoryCatalogFactory
 > org.apache.flink.table.sources.CsvBatchTableSourceFactory
 > org.apache.flink.table.sources.CsvAppendTableSourceFactory
 > org.apache.flink.table.sinks.CsvBatchTableSinkFactory
 > org.apache.flink.table.sinks.CsvAppendTableSinkFactory
 > org.apache.flink.table.planner.delegation.BlinkPlannerFactory
 > org.apache.flink.table.planner.delegation.BlinkExecutorFactory
 > org.apache.flink.table.planner.StreamPlannerFactory
 > org.apache.flink.table.executor.StreamExecutorFactory
 >
 org.apache.flink.streaming.connectors.kafka.Kafka
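
For reference, the requested properties listed in the error above correspond to a DDL of roughly the following shape; the table name `events` is an assumption, while the schema and WITH properties are taken verbatim from the error message. With 'format.type' = 'json' the job needs flink-json on the classpath, otherwise no DeserializationSchemaFactory can be found:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class KafkaJsonDdlSketch {
    public static void main(String[] args) {
        TableEnvironment tableEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Fails with "No factory implements ...DeserializationSchemaFactory"
        // unless flink-json is on the classpath.
        tableEnv.sqlUpdate(
                "CREATE TABLE events (\n"
                        + "  status VARCHAR,\n"
                        + "  direction VARCHAR,\n"
                        + "  event_ts BIGINT\n"
                        + ") WITH (\n"
                        + "  'connector.type' = 'kafka',\n"
                        + "  'connector.version' = 'universal',\n"
                        + "  'connector.topic' = 'generated.events',\n"
                        + "  'connector.startup-mode' = 'earliest-offset',\n"
                        + "  'connector.properties.bootstrap.servers' = 'localhost:9092',\n"
                        + "  'connector.properties.group.id' = 'testGroup',\n"
                        + "  'connector.properties.zookeeper.connect' = 'localhost:2181',\n"
                        + "  'format.type' = 'json',\n"
                        + "  'update-mode' = 'append'\n"
                        + ")");
    }
}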

[jira] [Created] (FLINK-15978) Publish Dockerfiles for release 1.10.0

2020-02-10 Thread Yu Li (Jira)
Yu Li created FLINK-15978:
-

 Summary: Publish Dockerfiles for release 1.10.0
 Key: FLINK-15978
 URL: https://issues.apache.org/jira/browse/FLINK-15978
 Project: Flink
  Issue Type: Task
  Components: Release System
Affects Versions: 1.10.0
Reporter: Yu Li
Assignee: Yu Li
 Fix For: 1.10.0


Publish the Dockerfiles for 1.10.0 after the RC voting passed, to finalize the 
release process as 
[documented|https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Support scalar vectorized Python UDF in PyFlink

2020-02-10 Thread Jingsong Li
Hi Dian and Jincheng,

Thanks for your explanation. Thinking about it again, maybe most users don't
want to modify these parameters.
We all realize that "batch.size" should be a larger value, so "bundle.size"
must also be increased. Now the default value of "bundle.size" is only 1000.
I think you can update the design to provide meaningful default values for
"batch.size" and "bundle.size".

Best,
Jingsong Lee
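
(For readers following along: a minimal sketch of how the two options discussed here could be set from the Table API. "python.fn-execution.arrow.batch.size" is the name proposed in FLIP-97 and may still change, and the concrete values are only illustrations taken from this thread, not recommendations.)

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class VectorizedUdfConfigSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // "bundle.size": elements buffered before being flushed to the Python worker.
        // "arrow.batch.size" (proposed in FLIP-97): rows per Arrow batch handed to the
        // Pandas UDF. A batch does not cross bundles, so batch.size <= bundle.size.
        tEnv.getConfig().getConfiguration()
                .setInteger("python.fn-execution.bundle.size", 10000);     // illustrative value
        tEnv.getConfig().getConfiguration()
                .setInteger("python.fn-execution.arrow.batch.size", 5000); // value from Dian's test
    }
}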

On Mon, Feb 10, 2020 at 4:36 PM Dian Fu  wrote:

> Hi Jincheng, Hequn & Jingsong,
>
> Thanks a lot for your suggestions. I have created FLIP-97[1] for this
> feature.
>
> > One little suggestion: maybe it would be nice if we can add some
> performance explanation in the document? (I just very curious:))
> Thanks for the suggestion. I have updated the design doc in the
> "BackGround" section about where the performance gains could be got from.
>
> > It seems that a batch should always in a bundle. Bundle size should
> always
> bigger than batch size. (if a batch can not cross bundle).
> Can you explain this relationship to the document?
> I have updated the design doc explaining more about these two
> configurations.
>
> > In the batch world, vectorization batch size is about 1024+. What do you
> think about the default value of "batch"?
> Is there any link about where this value comes from? I have performed a
> simple test for Pandas UDF which performs the simple +1 operation. The
> performance is best when the batch size is set to 5000. I think it depends
> on the data type of each column, the functionality the Pandas UDF does,
> etc. However I agree with you that we could give a meaningful default value
> for the "batch" size which works in most scenarios.
>
> > Can we only configure one parameter and calculate another automatically?
> For example, if we just want to "pipeline", "bundle.size" is twice as much
> as "batch.size", is this work?
> I agree with Jincheng that this is not feasible. I think that giving a
> meaningful default value for the "batch.size" which works in most scenarios
> is enough. What are your thoughts?
>
> Thanks,
> Dian
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-97%3A+Support+Scalar+Vectorized+Python+UDF+in+PyFlink
>
>
> On Mon, Feb 10, 2020 at 4:25 PM jincheng sun 
> wrote:
>
> > Hi Jingsong,
> >
> > Thanks for your feedback! I would like to share my thoughts regarding the
> > follows question:
> >
> > >> - Can we only configure one parameter and calculate another
> > automatically? For example, if we just want to "pipeline", "bundle.size"
> is
> > twice as much as "batch.size", is this work?
> >
> > I don't think this works. These two configurations are used for different
> > purposes and there is no direct relationship between them and so I guess
> we
> > cannot infer a configuration from the other configuration.
> >
> > Best,
> > Jincheng
> >
> >
> > Jingsong Li  于2020年2月10日周一 下午1:53写道:
> >
> > > Thanks Dian for your reply.
> > >
> > > +1 to create a FLIP too.
> > >
> > > About "python.fn-execution.bundle.size" and
> > > "python.fn-execution.arrow.batch.size", I got what are you mean about
> > > "pipeline". I agree.
> > > It seems that a batch should always in a bundle. Bundle size should
> > always
> > > bigger than batch size. (if a batch can not cross bundle).
> > > Can you explain this relationship to the document?
> > >
> > > I think default value is a very important thing, we can discuss:
> > > - In the batch world, vectorization batch size is about 1024+. What do
> > you
> > > think about the default value of "batch"?
> > > - Can we only configure one parameter and calculate another
> > automatically?
> > > For example, if we just want to "pipeline", "bundle.size" is twice as
> > much
> > > as "batch.size", is this work?
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Mon, Feb 10, 2020 at 11:55 AM Hequn Cheng  wrote:
> > >
> > > > Hi Dian,
> > > >
> > > > Thanks a lot for bringing up the discussion!
> > > >
> > > > It is great to see the Pandas UDFs feature is going to be
> introduced. I
> > > > think this would improve the performance and also the usability of
> > > > user-defined functions (UDFs) in Python.
> > > > One little suggestion: maybe it would be nice if we can add some
> > > > performance explanation in the document? (I just very curious:))
> > > >
> > > > +1 to create a FLIP for this big enhancement.
> > > >
> > > > Best,
> > > > Hequn
> > > >
> > > > On Mon, Feb 10, 2020 at 11:15 AM jincheng sun <
> > sunjincheng...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Dian,
> > > > >
> > > > > Thanks for bring up this discussion. This is very important for the
> > > > > ecological of PyFlink. Add support Pandas greatly enriches the
> > > available
> > > > > UDF library of PyFlink and greatly improves the usability of
> PyFlink!
> > > > >
> > > > > +1 for Support scalar vectorized Python UDF.
> > > > >
> > > > > I think we should to create a FLIP for this big enhancements. :)
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Best,
> > >

Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Yu Li
Thanks for the reminder Patrick! According to the release process [1] we
will publish the Dockerfiles *after* the RC voting passed, to finalize the
release.

I have created FLINK-15978 [2] and prepared a PR [3] for it, will follow up
after we conclude our RC vote. Thanks.

Best Regards,
Yu

[1]
https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
[2] https://issues.apache.org/jira/browse/FLINK-15978
[3] https://github.com/apache/flink-docker/pull/6


On Mon, 10 Feb 2020 at 20:57, Patrick Lucas  wrote:

> Now that [FLINK-15828] Integrate docker-flink/docker-flink into Flink
> release process  is
> complete, the Dockerfiles for 1.10.0 can be published as part of the
> release process.
>
> @Gary/@Yu: please let me know if you have any questions regarding the
> workflow or its documentation.
>
> --
> Patrick
>
> On Mon, Feb 10, 2020 at 1:29 PM Benchao Li  wrote:
>
> > +1 (non-binding)
> >
> > - build from source
> > - start standalone cluster, and run some examples
> > - played with sql-client with some simple sql
> > - run tests in IDE
> > - run some sqls running in 1.9 internal version with 1.10.0-rc3, seems
> 1.10
> > behaves well.
> >
> > Xintong Song  于2020年2月10日周一 下午8:13写道:
> >
> > > +1 (non-binding)
> > >
> > > - build from source (with tests)
> > > - run nightly e2e tests
> > > - run example jobs in local/standalone/yarn setups
> > > - play around with memory configurations on local/standalone/yarn
> setups
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Mon, Feb 10, 2020 at 7:55 PM Jark Wu  wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > - build the source release with Scala 2.12 and Scala 2.11
> successfully
> > > > - checked/verified signatures and hashes
> > > > - started cluster for both Scala 2.11 and 2.12, ran examples,
> verified
> > > web
> > > > ui and log output, nothing unexpected
> > > > - started cluster and run some e2e sql queries, all of them works
> well
> > > and
> > > > the results are as expected:
> > > >   - read from kafka source, aggregate, write into mysql
> > > >   - read from kafka source with watermark defined in ddl, window
> > > aggregate,
> > > > write into mysql
> > > >   - read from kafka with computed column defined in ddl, temporal
> join
> > > with
> > > > a mysql table, write into kafka
> > > >
> > > > Cheers,
> > > > Jark
> > > >
> > > >
> > > > On Mon, 10 Feb 2020 at 19:23, Kurt Young  wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > - verified signatures and checksums
> > > > > - start local cluster, run some examples, randomly play some sql
> with
> > > sql
> > > > > client, no suspicious error/warn log found in log files
> > > > > - repeat above operation with both scala 2.11 and 2.12 binary
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > >
> > > > > On Mon, Feb 10, 2020 at 6:38 PM Yang Wang 
> > > wrote:
> > > > >
> > > > > >  +1 non-binding
> > > > > >
> > > > > >
> > > > > > - Building from source with all tests skipped
> > > > > > - Build a custom image with 1.10-rc3
> > > > > > - K8s tests
> > > > > > * Deploy a standalone session cluster on K8s and submit
> > multiple
> > > > jobs
> > > > > > * Deploy a standalone per-job cluster
> > > > > > * Deploy a native session cluster on K8s with/without HA
> > > > configured,
> > > > > > kill TM and jobs could recover successfully
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Yang
> > > > > >
> > > > > > Jingsong Li  于2020年2月10日周一 下午4:29写道:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > >
> > > > > > > +1 (non-binding) Thanks for driving this, Gary & Yu.
> > > > > > >
> > > > > > >
> > > > > > > There is an unfriendly error here: "OutOfMemoryError: Direct
> > buffer
> > > > > > memory"
> > > > > > > in FileChannelBoundedData$FileBufferReader.
> > > > > > >
> > > > > > > It forces our batch users to configure
> > > > > > > "taskmanager.memory.task.off-heap.size" in production jobs. And
> > > users
> > > > > are
> > > > > > > hard to know how much memory they need configure.
> > > > > > >
> > > > > > > Even for us developers, it is hard to say how much memory, it
> > > depends
> > > > > on
> > > > > > > tasks left over from the previous stage and the parallelism.
> > > > > > >
> > > > > > >
> > > > > > > It is not a blocker, but hope to resolve it in 1.11.
> > > > > > >
> > > > > > >
> > > > > > > - Verified signatures and checksums
> > > > > > >
> > > > > > > - Maven build from source skip tests
> > > > > > >
> > > > > > > - Verified pom files point to the 1.10.0 version
> > > > > > >
> > > > > > > - Test Hive integration and SQL client: work well
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Jingsong Lee
> > > > > > >
> > > > > > > On Mon, Feb 10, 2020 at 12:28 PM Zhu Zhu 
> > > wrote:
> > > > > > >
> > > > > > > > My bad. The missing commit info is caused by building from
> the
> > > src
> > > > > code
> > > > > > > zip
> > > > > > > >

Re: [VOTE] FLIP-55: Introduction of a Table API Java Expression DSL

2020-02-10 Thread Jingsong Li
Hi Dawid,

Thanks for driving.

- Adding $ in the Scala API looks good to me.
- Just a question: what is expected for java.lang.Object? A literal
object or an expression? So is Object just syntactic sugar for a literal?

Best,
Jingsong Lee
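
As a rough illustration of what the proposed DSL looks like (method names follow the FLIP draft and may still change during this vote; `orders` and its columns are hypothetical):

import static org.apache.flink.table.api.Expressions.$;
import static org.apache.flink.table.api.Expressions.lit;

import org.apache.flink.table.api.Table;

public class ExpressionDslSketch {
    // Sketch only: 'orders' is a hypothetical Table with columns 'user' and 'amount'.
    public static Table example(Table orders) {
        return orders
                .filter($("amount").isGreater(lit(10)))
                .select($("user"), $("amount").plus(lit(1)).as("amountPlusOne"));
    }
}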

On Mon, Feb 10, 2020 at 9:40 PM Timo Walther  wrote:

> +1 for this.
>
> It will also help in making a TableEnvironment.fromElements() possible
> and reduces technical debt. One entry point of TypeInformation less in
> the API.
>
> Regards,
> Timo
>
>
> On 10.02.20 08:31, Dawid Wysakowicz wrote:
> > Hi all,
> >
> > I wanted to resurrect the thread about introducing a Java Expression
> > DSL. Please see the updated flip page[1]. Most of the flip was concluded
> > in previous discussion thread. The major changes since then are:
> >
> > * accepting java.lang.Object in the Java DSL
> >
> > * adding $ interpolation for a column in the Scala DSL
> >
> > I think it's important to move those changes forward as it makes it
> > easier to transition to the new type system (Java parser supports only
> > the old type system stack for now) that we are working on for the past
> > releases.
> >
> > Because the previous discussion thread was rather conclusive I want to
> > start already with a vote. If you think we need another round of
> > discussion, feel free to say so.
> >
> >
> > The vote will last for at least 72 hours, following the consensus voting
> > process.
> >
> > FLIP wiki:
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-55%3A+Introduction+of+a+Table+API+Java+Expression+DSL
> >
> >
> > Discussion thread:
> >
> >
> https://lists.apache.org/thread.html/eb5e7b0579e5f1da1e9bf1ab4e4b86dba737946f0261d94d8c30521e@%3Cdev.flink.apache.org%3E
> >
> >
> >
> >
>
>

-- 
Best, Jingsong Lee


Re: [DISCUSS] Support scalar vectorized Python UDF in PyFlink

2020-02-10 Thread Dian Fu
Hi Jingsong,

You're right. I have updated the FLIP which reflects this. 

Thanks,
Dian

> 在 2020年2月11日,上午10:03,Jingsong Li  写道:
> 
> Hi Dian and Jincheng,
> 
> Thanks for your explanation. Think again. Maybe most of users don't want to
> modify this parameters.
> We all realize that "batch.size" should be a larger value, so "bundle.size"
> must also be increased. Now the default value of "bundle.size" is only 1000.
> I think you can update design to provide meaningful default value for
> "batch.size" and "bundle.size".
> 
> Best,
> Jingsong Lee
> 
> On Mon, Feb 10, 2020 at 4:36 PM Dian Fu  wrote:
> 
>> Hi Jincheng, Hequn & Jingsong,
>> 
>> Thanks a lot for your suggestions. I have created FLIP-97[1] for this
>> feature.
>> 
>>> One little suggestion: maybe it would be nice if we can add some
>> performance explanation in the document? (I just very curious:))
>> Thanks for the suggestion. I have updated the design doc in the
>> "BackGround" section about where the performance gains could be got from.
>> 
>>> It seems that a batch should always in a bundle. Bundle size should
>> always
>> bigger than batch size. (if a batch can not cross bundle).
>> Can you explain this relationship to the document?
>> I have updated the design doc explaining more about these two
>> configurations.
>> 
>>> In the batch world, vectorization batch size is about 1024+. What do you
>> think about the default value of "batch"?
>> Is there any link about where this value comes from? I have performed a
>> simple test for Pandas UDF which performs the simple +1 operation. The
>> performance is best when the batch size is set to 5000. I think it depends
>> on the data type of each column, the functionality the Pandas UDF does,
>> etc. However I agree with you that we could give a meaningful default value
>> for the "batch" size which works in most scenarios.
>> 
>>> Can we only configure one parameter and calculate another automatically?
>> For example, if we just want to "pipeline", "bundle.size" is twice as much
>> as "batch.size", is this work?
>> I agree with Jincheng that this is not feasible. I think that giving an
>> meaningful default value for the "batch.size" which works in most scenarios
>> is enough. What's your thought?
>> 
>> Thanks,
>> Dian
>> 
>> [1]
>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-97%3A+Support+Scalar+Vectorized+Python+UDF+in+PyFlink
>> 
>> 
>> On Mon, Feb 10, 2020 at 4:25 PM jincheng sun 
>> wrote:
>> 
>>> Hi Jingsong,
>>> 
>>> Thanks for your feedback! I would like to share my thoughts regarding the
>>> follows question:
>>> 
> - Can we only configure one parameter and calculate another
>>> automatically? For example, if we just want to "pipeline", "bundle.size"
>> is
>>> twice as much as "batch.size", is this work?
>>> 
>>> I don't think this works. These two configurations are used for different
>>> purposes and there is no direct relationship between them and so I guess
>> we
>>> cannot infer a configuration from the other configuration.
>>> 
>>> Best,
>>> Jincheng
>>> 
>>> 
>>> Jingsong Li  于2020年2月10日周一 下午1:53写道:
>>> 
 Thanks Dian for your reply.
 
 +1 to create a FLIP too.
 
 About "python.fn-execution.bundle.size" and
 "python.fn-execution.arrow.batch.size", I got what are you mean about
 "pipeline". I agree.
 It seems that a batch should always in a bundle. Bundle size should
>>> always
 bigger than batch size. (if a batch can not cross bundle).
 Can you explain this relationship to the document?
 
 I think default value is a very important thing, we can discuss:
 - In the batch world, vectorization batch size is about 1024+. What do
>>> you
 think about the default value of "batch"?
 - Can we only configure one parameter and calculate another
>>> automatically?
 For example, if we just want to "pipeline", "bundle.size" is twice as
>>> much
 as "batch.size", is this work?
 
 Best,
 Jingsong Lee
 
 On Mon, Feb 10, 2020 at 11:55 AM Hequn Cheng  wrote:
 
> Hi Dian,
> 
> Thanks a lot for bringing up the discussion!
> 
> It is great to see the Pandas UDFs feature is going to be
>> introduced. I
> think this would improve the performance and also the usability of
> user-defined functions (UDFs) in Python.
> One little suggestion: maybe it would be nice if we can add some
> performance explanation in the document? (I just very curious:))
> 
> +1 to create a FLIP for this big enhancement.
> 
> Best,
> Hequn
> 
> On Mon, Feb 10, 2020 at 11:15 AM jincheng sun <
>>> sunjincheng...@gmail.com>
> wrote:
> 
>> Hi Dian,
>> 
>> Thanks for bring up this discussion. This is very important for the
>> ecological of PyFlink. Add support Pandas greatly enriches the
 available
>> UDF library of PyFlink and greatly improves the usability of
>> PyFlink!
>> 
>> +1 for Support scalar vectorized Python UDF.
>

[jira] [Created] (FLINK-15979) Fix the merged count is not accurate in CountDistinctWithMerge

2020-02-10 Thread Jark Wu (Jira)
Jark Wu created FLINK-15979:
---

 Summary: Fix the merged count is not accurate in 
CountDistinctWithMerge 
 Key: FLINK-15979
 URL: https://issues.apache.org/jira/browse/FLINK-15979
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Legacy Planner
Reporter: Jark Wu


As discussed in the user ML: 
https://lists.apache.org/thread.html/rc4b06c9931656c94dc993b124da3ff00f04099e41201c64788936c24%40%3Cuser.flink.apache.org%3E.

The current implementation of 
{{org.apache.flink.table.runtime.utils.JavaUserDefinedAggFunctions.CountDistinctWithMerge#merge}}
 in the old planner is not correct, which results in a wrong merged count. 

The test 
(org.apache.flink.table.runtime.stream.table.GroupWindowITCase#testEventTimeSessionGroupWindowOverTime)
 which uses this UDAF can't expose the bug because there are no distinct values 
in the test data.  

The class {{CountDistinctWithMerge}} is only a testing implementation, so this is 
not a critical problem. The Blink planner has a correct implementation: 
https://github.com/apache/flink/blob/master/flink-table/flink-table-planner-blink/src/test/java/org/apache/flink/table/planner/plan/utils/JavaUserDefinedAggFunctions.java#L369
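
For illustration only (this is not the planner's code, and a production implementation would typically use a MapView rather than a plain set): a minimal sketch of a count-distinct UDAF whose merge is correct because the accumulator keeps the set of seen values, so merging is a set union and the count is derived from the merged set instead of being summed up.

{code:java}
import java.util.HashSet;
import java.util.Set;

import org.apache.flink.table.functions.AggregateFunction;

public class CountDistinctSketch
        extends AggregateFunction<Long, CountDistinctSketch.Acc> {

    public static class Acc {
        public Set<String> seen = new HashSet<>();
    }

    @Override
    public Acc createAccumulator() {
        return new Acc();
    }

    public void accumulate(Acc acc, String value) {
        if (value != null) {
            acc.seen.add(value);
        }
    }

    public void merge(Acc acc, Iterable<Acc> others) {
        for (Acc other : others) {
            acc.seen.addAll(other.seen); // union, then count from the merged set
        }
    }

    @Override
    public Long getValue(Acc acc) {
        return (long) acc.seen.size();
    }
}
{code}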



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15980) The notFollowedBy in the end of GroupPattern may be ignored

2020-02-10 Thread shuai.xu (Jira)
shuai.xu created FLINK-15980:


 Summary: The notFollowedBy in the end of GroupPattern may be 
ignored
 Key: FLINK-15980
 URL: https://issues.apache.org/jira/browse/FLINK-15980
 Project: Flink
  Issue Type: Bug
  Components: Library / CEP
Affects Versions: 1.9.0
Reporter: shuai.xu


If we write a Pattern like this:
Pattern group = Pattern.begin("A").notFollowedBy("B");
Pattern pattern = Pattern.begin(group).followedBy("C");
i.e., notFollowedBy is the last part of a GroupPattern.

This pattern compiles normally, but the notFollowedBy("B") does not actually take 
effect.
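
A self-contained sketch of the problem (the String events and conditions below are made up for illustration, not taken from the reporter's job); with the trailing notFollowedBy of the group being dropped, the pattern still matches "A" followed by "C" even though a "B" occurs in between:

{code:java}
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class NotFollowedByGroupSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> input = env.fromElements("A", "B", "C");

        // GroupPattern that ends with a notFollowedBy.
        Pattern<String, String> group = Pattern.<String>begin("A")
                .where(new SimpleCondition<String>() {
                    @Override
                    public boolean filter(String value) {
                        return value.equals("A");
                    }
                })
                .notFollowedBy("B")
                .where(new SimpleCondition<String>() {
                    @Override
                    public boolean filter(String value) {
                        return value.equals("B");
                    }
                });

        Pattern<String, ?> pattern = Pattern.begin(group).followedBy("C")
                .where(new SimpleCondition<String>() {
                    @Override
                    public boolean filter(String value) {
                        return value.equals("C");
                    }
                });

        // With the bug, the trailing notFollowedBy("B") is silently ignored.
        PatternStream<String> matches = CEP.pattern(input, pattern);
        // matches.select(...) and env.execute() omitted; only the pattern shape matters here.
    }
}
{code}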



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15981) Control the direct memory in FileChannelBoundedData.FileBufferReader

2020-02-10 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-15981:


 Summary: Control the direct memory in 
FileChannelBoundedData.FileBufferReader
 Key: FLINK-15981
 URL: https://issues.apache.org/jira/browse/FLINK-15981
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Affects Versions: 1.10.0
Reporter: Jingsong Lee
 Fix For: 1.11.0


Now, the default blocking BoundedData is FileChannelBoundedData. Its reader 
creates a new 64 KB direct buffer.

When parallelism is greater than 100, users need to configure 
"taskmanager.memory.task.off-heap.size" to avoid a direct memory OOM. It is hard 
to configure, and it costs a lot of memory. Considering a parallelism of 1000, we 
may need 1 GB+ for a task manager.
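
For reference, a sketch of that workaround (the key is usually set in flink-conf.yaml; the value below is purely illustrative, assuming roughly 64 KB per concurrently requested subpartition, and is not a recommendation):

{code:java}
import org.apache.flink.configuration.Configuration;

public class OffHeapWorkaroundSketch {
    public static Configuration taskOffHeapForBlockingReads() {
        // Equivalent to setting the key in flink-conf.yaml. "64m" assumes roughly
        // 1000 concurrently requested subpartitions * 64 KB and is only an example.
        Configuration conf = new Configuration();
        conf.setString("taskmanager.memory.task.off-heap.size", "64m");
        return conf;
    }
}
{code}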

This is not suitable for scenarios with few slots and large parallelism. 
Batch jobs could run little by little, but they would still consume a lot of memory.

If we provide N-Input operators, things may get worse, because the number of 
subpartitions that can be requested at the same time will grow. We have no idea 
how much memory would be needed.

Here are my rough thoughts:
 * Obtain memory from network buffers.
 * provide "The maximum number of subpartitions that can be requested at the 
same time".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Yangze Guo
+1 (non-binding)

- Build from source
- Run mesos e2e tests(including unmerged heap state backend and rocks
state backend case)


Best,
Yangze Guo

On Tue, Feb 11, 2020 at 10:08 AM Yu Li  wrote:
>
> Thanks for the reminder Patrick! According to the release process [1] we
> will publish the Dockerfiles *after* the RC voting passed, to finalize the
> release.
>
> I have created FLINK-15978 [2] and prepared a PR [3] for it, will follow up
> after we conclude our RC vote. Thanks.
>
> Best Regards,
> Yu
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
> [2] https://issues.apache.org/jira/browse/FLINK-15978
> [3] https://github.com/apache/flink-docker/pull/6
>
>
> On Mon, 10 Feb 2020 at 20:57, Patrick Lucas  wrote:
>
> > Now that [FLINK-15828] Integrate docker-flink/docker-flink into Flink
> > release process  is
> > complete, the Dockerfiles for 1.10.0 can be published as part of the
> > release process.
> >
> > @Gary/@Yu: please let me know if you have any questions regarding the
> > workflow or its documentation.
> >
> > --
> > Patrick
> >
> > On Mon, Feb 10, 2020 at 1:29 PM Benchao Li  wrote:
> >
> > > +1 (non-binding)
> > >
> > > - build from source
> > > - start standalone cluster, and run some examples
> > > - played with sql-client with some simple sql
> > > - run tests in IDE
> > > - run some sqls running in 1.9 internal version with 1.10.0-rc3, seems
> > 1.10
> > > behaves well.
> > >
> > > Xintong Song  于2020年2月10日周一 下午8:13写道:
> > >
> > > > +1 (non-binding)
> > > >
> > > > - build from source (with tests)
> > > > - run nightly e2e tests
> > > > - run example jobs in local/standalone/yarn setups
> > > > - play around with memory configurations on local/standalone/yarn
> > setups
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Mon, Feb 10, 2020 at 7:55 PM Jark Wu  wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > - build the source release with Scala 2.12 and Scala 2.11
> > successfully
> > > > > - checked/verified signatures and hashes
> > > > > - started cluster for both Scala 2.11 and 2.12, ran examples,
> > verified
> > > > web
> > > > > ui and log output, nothing unexpected
> > > > > - started cluster and run some e2e sql queries, all of them works
> > well
> > > > and
> > > > > the results are as expected:
> > > > >   - read from kafka source, aggregate, write into mysql
> > > > >   - read from kafka source with watermark defined in ddl, window
> > > > aggregate,
> > > > > write into mysql
> > > > >   - read from kafka with computed column defined in ddl, temporal
> > join
> > > > with
> > > > > a mysql table, write into kafka
> > > > >
> > > > > Cheers,
> > > > > Jark
> > > > >
> > > > >
> > > > > On Mon, 10 Feb 2020 at 19:23, Kurt Young  wrote:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > - verified signatures and checksums
> > > > > > - start local cluster, run some examples, randomly play some sql
> > with
> > > > sql
> > > > > > client, no suspicious error/warn log found in log files
> > > > > > - repeat above operation with both scala 2.11 and 2.12 binary
> > > > > >
> > > > > > Best,
> > > > > > Kurt
> > > > > >
> > > > > >
> > > > > > On Mon, Feb 10, 2020 at 6:38 PM Yang Wang 
> > > > wrote:
> > > > > >
> > > > > > >  +1 non-binding
> > > > > > >
> > > > > > >
> > > > > > > - Building from source with all tests skipped
> > > > > > > - Build a custom image with 1.10-rc3
> > > > > > > - K8s tests
> > > > > > > * Deploy a standalone session cluster on K8s and submit
> > > multiple
> > > > > jobs
> > > > > > > * Deploy a standalone per-job cluster
> > > > > > > * Deploy a native session cluster on K8s with/without HA
> > > > > configured,
> > > > > > > kill TM and jobs could recover successfully
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Yang
> > > > > > >
> > > > > > > Jingsong Li  于2020年2月10日周一 下午4:29写道:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > >
> > > > > > > > +1 (non-binding) Thanks for driving this, Gary & Yu.
> > > > > > > >
> > > > > > > >
> > > > > > > > There is an unfriendly error here: "OutOfMemoryError: Direct
> > > buffer
> > > > > > > memory"
> > > > > > > > in FileChannelBoundedData$FileBufferReader.
> > > > > > > >
> > > > > > > > It forces our batch users to configure
> > > > > > > > "taskmanager.memory.task.off-heap.size" in production jobs. And
> > > > users
> > > > > > are
> > > > > > > > hard to know how much memory they need configure.
> > > > > > > >
> > > > > > > > Even for us developers, it is hard to say how much memory, it
> > > > depends
> > > > > > on
> > > > > > > > tasks left over from the previous stage and the parallelism.
> > > > > > > >
> > > > > > > >
> > > > > > > > It is not a blocker, but hope to resolve it in 1.11.
> > > > > > > >
> > > > > > > >
> > > > > > > > - Verified signatures and checksums
> > > > > > > >
> > > > > > > > - Maven build from sou

Re: [VOTE] FLIP-55: Introduction of a Table API Java Expression DSL

2020-02-10 Thread Jark Wu
+1 for this.

I have some minor comments:
- I'm +1 to use $ in both Java and Scala API.
- I'm +1 to use lit(); Spark also provides a lit() function to create a
literal value.
- Is it possible to have `isGreater` instead of `isGreaterThan` and
`isGreaterOrEqual` instead of `isGreaterThanOrEqualTo` in BaseExpressions?

Best,
Jark

On Tue, 11 Feb 2020 at 10:21, Jingsong Li  wrote:

> Hi Dawid,
>
> Thanks for driving.
>
> - adding $ in scala api looks good to me.
> - Just a question, what should be expected to java.lang.Object? literal
> object or expression? So the Object is the grammatical sugar of literal?
>
> Best,
> Jingsong Lee
>
> On Mon, Feb 10, 2020 at 9:40 PM Timo Walther  wrote:
>
> > +1 for this.
> >
> > It will also help in making a TableEnvironment.fromElements() possible
> > and reduces technical debt. One entry point of TypeInformation less in
> > the API.
> >
> > Regards,
> > Timo
> >
> >
> > On 10.02.20 08:31, Dawid Wysakowicz wrote:
> > > Hi all,
> > >
> > > I wanted to resurrect the thread about introducing a Java Expression
> > > DSL. Please see the updated flip page[1]. Most of the flip was
> concluded
> > > in previous discussion thread. The major changes since then are:
> > >
> > > * accepting java.lang.Object in the Java DSL
> > >
> > > * adding $ interpolation for a column in the Scala DSL
> > >
> > > I think it's important to move those changes forward as it makes it
> > > easier to transition to the new type system (Java parser supports only
> > > the old type system stack for now) that we are working on for the past
> > > releases.
> > >
> > > Because the previous discussion thread was rather conclusive I want to
> > > start already with a vote. If you think we need another round of
> > > discussion, feel free to say so.
> > >
> > >
> > > The vote will last for at least 72 hours, following the consensus
> voting
> > > process.
> > >
> > > FLIP wiki:
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-55%3A+Introduction+of+a+Table+API+Java+Expression+DSL
> > >
> > >
> > > Discussion thread:
> > >
> > >
> >
> https://lists.apache.org/thread.html/eb5e7b0579e5f1da1e9bf1ab4e4b86dba737946f0261d94d8c30521e@%3Cdev.flink.apache.org%3E
> > >
> > >
> > >
> > >
> >
> >
>
> --
> Best, Jingsong Lee
>


Re: [VOTE] Release Flink Python API(PyFlink) 1.9.2 to PyPI, release candidate #1

2020-02-10 Thread Wei Zhong
Hi,

Thanks for driving this, Jincheng.

+1 (non-binding) 

- Verified signatures and checksums.
- Verified README.md and setup.py.
- Run `pip install apache-flink-1.9.2.tar.gz` in Python 2.7.15 and Python 3.7.5 
successfully.
- Start local pyflink shell in Python 2.7.15 and Python 3.7.5 via 
`pyflink-shell.sh local` and try the examples in the help message, run well and 
no exception.
- Try a word count example in IDE with Python 2.7.15 and Python 3.7.5, run well 
and no exception.

Best,
Wei


> 在 2020年2月10日,19:12,jincheng sun  写道:
> 
> Hi everyone,
> 
> Please review and vote on the release candidate #1 for the PyFlink version 
> 1.9.2, as follows:
> 
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
> 
> The complete staging area is available for your review, which includes:
> 
> * the official Apache binary convenience releases to be deployed to 
> dist.apache.org  [1], which are signed with the key 
> with fingerprint 8FEA1EE9D0048C0CCC70B7573211B0703B79EA0E [2] and built from 
> source code [3].
> 
> The vote will be open for at least 72 hours. It is adopted by majority 
> approval, with at least 3 PMC affirmative votes.
> 
> Thanks,
> Jincheng
> 
> [1] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.2-rc1/ 
> 
> [2] https://dist.apache.org/repos/dist/release/flink/KEYS 
> 
> [3] https://github.com/apache/flink/tree/release-1.9.2 
> 


[jira] [Created] (FLINK-15982) 'Quickstarts Java nightly end-to-end test' is failed on travis

2020-02-10 Thread Jark Wu (Jira)
Jark Wu created FLINK-15982:
---

 Summary: 'Quickstarts Java nightly end-to-end test' is failed on 
travis
 Key: FLINK-15982
 URL: https://issues.apache.org/jira/browse/FLINK-15982
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Jark Wu


{code:java}
==
Running 'Quickstarts Java nightly end-to-end test'
==
TEST_DATA_DIR: 
/home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/temp-test-directory-42718423491
Flink dist directory: 
/home/travis/build/apache/flink/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT
22:16:44.021 [INFO] Scanning for projects...
22:16:44.095 [INFO] 

22:16:44.095 [INFO] BUILD FAILURE
22:16:44.095 [INFO] 

22:16:44.098 [INFO] Total time: 0.095 s
22:16:44.099 [INFO] Finished at: 2020-02-10T22:16:44+00:00
22:16:44.143 [INFO] Final Memory: 5M/153M
22:16:44.143 [INFO] 

22:16:44.144 [ERROR] The goal you specified requires a project to execute but 
there is no POM in this directory 
(/home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/temp-test-directory-42718423491).
 Please verify you invoked Maven from the correct directory. -> [Help 1]
22:16:44.144 [ERROR] 
22:16:44.145 [ERROR] To see the full stack trace of the errors, re-run Maven 
with the -e switch.
22:16:44.145 [ERROR] Re-run Maven using the -X switch to enable full debug 
logging.
22:16:44.145 [ERROR] 
22:16:44.145 [ERROR] For more information about the errors and possible 
solutions, please read the following articles:
22:16:44.145 [ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MissingProjectException
/home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/test_quickstarts.sh:
 line 57: cd: flink-quickstart-java: No such file or directory
cp: cannot create regular file 
'/home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/temp-test-directory-42718423491/flink-quickstart-java/src/main/java/org/apache/flink/quickstart/Elasticsearch5SinkExample.java':
 No such file or directory
sed: can't read 
/home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/temp-test-directory-42718423491/flink-quickstart-java/src/main/java/org/apache/flink/quickstart/Elasticsearch5SinkExample.java:
 No such file or directory
awk: fatal: cannot open file `pom.xml' for reading (No such file or directory)
sed: can't read pom.xml: No such file or directory
sed: can't read pom.xml: No such file or directory
22:16:45.312 [INFO] Scanning for projects...
22:16:45.386 [INFO] 

22:16:45.386 [INFO] BUILD FAILURE
22:16:45.386 [INFO] 

22:16:45.391 [INFO] Total time: 0.097 s
22:16:45.391 [INFO] Finished at: 2020-02-10T22:16:45+00:00
22:16:45.438 [INFO] Final Memory: 5M/153M
22:16:45.438 [INFO] 

22:16:45.440 [ERROR] The goal you specified requires a project to execute but 
there is no POM in this directory 
(/home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/temp-test-directory-42718423491).
 Please verify you invoked Maven from the correct directory. -> [Help 1]
22:16:45.440 [ERROR] 
22:16:45.440 [ERROR] To see the full stack trace of the errors, re-run Maven 
with the -e switch.
22:16:45.440 [ERROR] Re-run Maven using the -X switch to enable full debug 
logging.
22:16:45.440 [ERROR] 
22:16:45.440 [ERROR] For more information about the errors and possible 
solutions, please read the following articles:
22:16:45.440 [ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MissingProjectException
/home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/test_quickstarts.sh:
 line 73: cd: target: No such file or directory
java.io.FileNotFoundException: flink-quickstart-java-0.1.jar (No such file or 
directory)
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:225)
at java.util.zip.ZipFile.<init>(ZipFile.java:155)
at java.util.zip.ZipFile.<init>(ZipFile.java:126)
at sun.tools.jar.Main.list(Main.java:1115)
at sun.tools.jar.Main.run(Main.java:293)
at sun.tools.jar.Main.main(Main.java:1288)
Success: There are no flink core classes are contained in the jar.
Failure: Since Elasticsearch5SinkExample.class and other user classes are not 
included in the jar. 
[FAIL] Test script contains errors.
{code}

Here are some instances:
- https://api.travis-ci.org/v3/job/64

Re: [VOTE] Release Flink Python API(PyFlink) 1.9.2 to PyPI, release candidate #1

2020-02-10 Thread jincheng sun
+1 (binding)

- Install the PyFlink by `pip install` [SUCCESS]
- Run word_count in both command line and IDE [SUCCESS]

Best,
Jincheng



Wei Zhong  于2020年2月11日周二 上午11:17写道:

> Hi,
>
> Thanks for driving this, Jincheng.
>
> +1 (non-binding)
>
> - Verified signatures and checksums.
> - Verified README.md and setup.py.
> - Run `pip install apache-flink-1.9.2.tar.gz` in Python 2.7.15 and Python
> 3.7.5 successfully.
> - Start local pyflink shell in Python 2.7.15 and Python 3.7.5 via
> `pyflink-shell.sh local` and try the examples in the help message, run well
> and no exception.
> - Try a word count example in IDE with Python 2.7.15 and Python 3.7.5, run
> well and no exception.
>
> Best,
> Wei
>
>
> 在 2020年2月10日,19:12,jincheng sun  写道:
>
> Hi everyone,
>
> Please review and vote on the release candidate #1 for the PyFlink version
> 1.9.2, as follows:
>
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> The complete staging area is available for your review, which includes:
>
> * the official Apache binary convenience releases to be deployed to
> dist.apache.org [1], which are signed with the key with fingerprint
> 8FEA1EE9D0048C0CCC70B7573211B0703B79EA0E [2] and built from source code [3].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Jincheng
>
> [1] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.2-rc1/
> [2] https://dist.apache.org/repos/dist/release/flink/KEYS
> [3] https://github.com/apache/flink/tree/release-1.9.2
>
>
>


Re: [VOTE] Release Flink Python API(PyFlink) 1.9.2 to PyPI, release candidate #1

2020-02-10 Thread Dian Fu
+1 (non-binding)

- Verified the signature and checksum
- Pip installed the package successfully: pip install apache-flink-1.9.2.tar.gz
- Run word count example successfully.

Regards,
Dian

> 在 2020年2月11日,上午11:44,jincheng sun  写道:
> 
> 
> +1 (binding) 
> 
> - Install the PyFlink by `pip install` [SUCCESS]
> - Run word_count in both command line and IDE [SUCCESS]
> 
> Best,
> Jincheng
> 
> 
> 
> Wei Zhong mailto:weizhong0...@gmail.com>> 
> 于2020年2月11日周二 上午11:17写道:
> Hi,
> 
> Thanks for driving this, Jincheng.
> 
> +1 (non-binding) 
> 
> - Verified signatures and checksums.
> - Verified README.md and setup.py.
> - Run `pip install apache-flink-1.9.2.tar.gz` in Python 2.7.15 and Python 
> 3.7.5 successfully.
> - Start local pyflink shell in Python 2.7.15 and Python 3.7.5 via 
> `pyflink-shell.sh local` and try the examples in the help message, run well 
> and no exception.
> - Try a word count example in IDE with Python 2.7.15 and Python 3.7.5, run 
> well and no exception.
> 
> Best,
> Wei
> 
> 
>> 在 2020年2月10日,19:12,jincheng sun > > 写道:
>> 
>> Hi everyone,
>> 
>> Please review and vote on the release candidate #1 for the PyFlink version 
>> 1.9.2, as follows:
>> 
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>> 
>> The complete staging area is available for your review, which includes:
>> 
>> * the official Apache binary convenience releases to be deployed to 
>> dist.apache.org  [1], which are signed with the key 
>> with fingerprint 8FEA1EE9D0048C0CCC70B7573211B0703B79EA0E [2] and built from 
>> source code [3].
>> 
>> The vote will be open for at least 72 hours. It is adopted by majority 
>> approval, with at least 3 PMC affirmative votes.
>> 
>> Thanks,
>> Jincheng
>> 
>> [1] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.2-rc1/ 
>> 
>> [2] https://dist.apache.org/repos/dist/release/flink/KEYS 
>> 
>> [3] https://github.com/apache/flink/tree/release-1.9.2 
>> 


[jira] [Created] (FLINK-15983) add native reader for Hive parquet files

2020-02-10 Thread Bowen Li (Jira)
Bowen Li created FLINK-15983:


 Summary: add native reader for Hive parquet files
 Key: FLINK-15983
 URL: https://issues.apache.org/jira/browse/FLINK-15983
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / Hive
Reporter: Bowen Li
Assignee: Jingsong Lee
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15984) support hive stream table sink

2020-02-10 Thread Bowen Li (Jira)
Bowen Li created FLINK-15984:


 Summary: support hive stream table sink
 Key: FLINK-15984
 URL: https://issues.apache.org/jira/browse/FLINK-15984
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / Hive
Reporter: Bowen Li
Assignee: Rui Li
 Fix For: 1.11.0


support hive stream table sink for stream processing



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15985) offload runtime params from DDL to table hints in DML/queries

2020-02-10 Thread Bowen Li (Jira)
Bowen Li created FLINK-15985:


 Summary: offload runtime params from DDL to table hints in 
DML/queries
 Key: FLINK-15985
 URL: https://issues.apache.org/jira/browse/FLINK-15985
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Bowen Li
Assignee: Danny Chen
 Fix For: 1.11.0


background:

Currently Flink DDL mixes three types of params all together:
 * External data’s metadata: defines what the data looks like (schema), where 
it is (location/url), and how it should be accessed (username/pwd)
 * Source/sink runtime params: defines how, and usually how fast, Flink 
source/sink reads/writes data, not affecting the results
 ** Kafka “sink-partitioner”
 ** Elastic “bulk-flush.interval/max-size/...”
 * Semantics params: defines aspects like how much data Flink reads/writes and 
what the result will look like
 ** Kafka “startup-mode”, “offset”
 ** Watermark, timestamp column

 

Problems of the current mix-up: Flink cannot leverage catalogs and external 
system metadata alone to run queries with all the non-metadata params involved 
in DDL. E.g. when we add a catalog for Confluent Schema Registry, the expected 
user experience should be that Flink users just configure the catalog with url 
and usr/pwd, and should be able to run queries immediately; however, that’s not 
the case right now because users still have to use DDL to define a bunch of params 
like “startup-mode”, “offset”, timestamp column, etc, along with the schema 
redundantly. We’ve heard many user complaints on this.

 

cc [~ykt836] [~lirui] [~lzljs3620320] [~jark] [~twalthr] [~dwysakowicz]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15986) support setting or changing session properties in Flink SQL

2020-02-10 Thread Bowen Li (Jira)
Bowen Li created FLINK-15986:


 Summary: support setting or changing session properties in Flink 
SQL
 Key: FLINK-15986
 URL: https://issues.apache.org/jira/browse/FLINK-15986
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API, Table SQL / Ecosystem
Reporter: Bowen Li
Assignee: Kurt Young
 Fix For: 1.11.0


As Flink SQL becomes more and more critical for users running batch jobs, 
experiments, and OLAP exploration, it is more important than ever to support setting 
and changing session properties in Flink SQL.

 

Use cases include switching SQL dialects at runtime, switching the job mode between 
"streaming" and "batch", and changing other params defined in flink-conf.yaml and 
default-sql-client.yaml.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Drop connectors for Elasticsearch 2.x and 5.x

2020-02-10 Thread Danny Chan
5.x seems to have a lot of users. Is 6.x completely compatible with 5.x?

Best,
Danny Chan
在 2020年2月10日 +0800 PM9:45,Dawid Wysakowicz ,写道:
> Hi all,
>
> As described in this https://issues.apache.org/jira/browse/FLINK-11720
> ticket our elasticsearch 5.x connector does not work out of the box on
> some systems and requires a version bump. This also happens for our e2e.
> We cannot bump the version in es 5.x connector, because 5.x connector
> shares a common class with 2.x that uses an API that was replaced in 5.2.
>
> Both versions have long been EOL: https://www.elastic.co/support/eol
>
> I suggest to drop both connectors 5.x and 2.x. If it is too much to drop
> both of them, I would strongly suggest dropping at least 2.x connector
> and update the 5.x line to a working es client module.
>
> What do you think? Should we drop both versions? Drop only the 2.x
> connector? Or keep them both?
>
> Best,
>
> Dawid
>
>


[jira] [Created] (FLINK-15987) SELECT 1.0e0 / 0.0e0 throws NumberFormatException

2020-02-10 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-15987:
---

 Summary: SELECT 1.0e0 / 0.0e0 throws NumberFormatException
 Key: FLINK-15987
 URL: https://issues.apache.org/jira/browse/FLINK-15987
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.10.0
Reporter: Caizhi Weng


{code:sql}
SELECT 1.0e0 / 0.0e0
{code}

throws the following exception

{code:java}
Caused by: java.lang.NumberFormatException: Infinite or NaN
at java.math.BigDecimal.<init>(BigDecimal.java:895)
at java.math.BigDecimal.<init>(BigDecimal.java:872)
at 
org.apache.flink.table.planner.codegen.ExpressionReducer.reduce(ExpressionReducer.scala:189)
at 
org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressionsInternal(ReduceExpressionsRule.java:695)
at 
org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressions(ReduceExpressionsRule.java:616)
at 
org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:301)
at 
org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:319)
at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:560)
at 
org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:419)
at 
org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:256)
at 
org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
at 
org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:215)
at 
org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:202)
at 
org.apache.flink.table.planner.plan.optimize.program.FlinkHepProgram.optimize(FlinkHepProgram.scala:69)
at 
org.apache.flink.table.planner.plan.optimize.program.FlinkHepRuleSetProgram.optimize(FlinkHepRuleSetProgram.scala:87)
at 
org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62)
at 
org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58)
at 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at 
scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
at 
org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:57)
at 
org.apache.flink.table.planner.plan.optimize.BatchCommonSubGraphBasedOptimizer.optimizeTree(BatchCommonSubGraphBasedOptimizer.scala:83)
at 
org.apache.flink.table.planner.plan.optimize.BatchCommonSubGraphBasedOptimizer.org$apache$flink$table$planner$plan$optimize$BatchCommonSubGraphBasedOptimizer$$optimizeBlock(BatchCommonSubGraphBasedOptimizer.scala:56)
at 
org.apache.flink.table.planner.plan.optimize.BatchCommonSubGraphBasedOptimizer$$anonfun$doOptimize$1.apply(BatchCommonSubGraphBasedOptimizer.scala:44)
at 
org.apache.flink.table.planner.plan.optimize.BatchCommonSubGraphBasedOptimizer$$anonfun$doOptimize$1.apply(BatchCommonSubGraphBasedOptimizer.scala:44)
at scala.collection.immutable.List.foreach(List.scala:392)
at 
org.apache.flink.table.planner.plan.optimize.BatchCommonSubGraphBasedOptimizer.doOptimize(BatchCommonSubGraphBasedOptimizer.scala:44)
at 
org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77)
at 
org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:248)
at 
org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:151)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:682)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.insertIntoInternal(TableEnvironmentImpl.java:355)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.insertInto(TableEnvironmentImpl.java:343)
at 
org.apache.flink.table.api.internal.TableImpl.insertInto(TableImpl.java:428)
at 
org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeQueryInternal$11(LocalExecutor.java:610)
at 
org.apache.flink.table.client.gateway.local.ExecutionContext.wrapClassLoader(ExecutionContext.java:240)
at 
org.
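// Not part of the original stack trace: the root cause is reproducible in plain Java.
// Constant folding evaluates 1.0e0 / 0.0e0 to Double.POSITIVE_INFINITY and then builds
// a BigDecimal from it, which BigDecimal's double constructor rejects:
//
//     double folded = 1.0e0 / 0.0e0;                              // Infinity
//     java.math.BigDecimal d = new java.math.BigDecimal(folded);  // NumberFormatException: Infinite or NaN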

[RESULT] [VOTE] Release 1.10.0, release candidate #3

2020-02-10 Thread Gary Yao
I'm happy to announce that we have unanimously approved this release.

There are 16 approving votes, 5 of which are binding:
* Kurt Young (binding)
* Jark Wu (binding)
* Kostas Kloudas (binding)
* Thomas Weise (binding)
* Jincheng Sun (binding)
* Aihua Li (non-binding)
* Zhu Zhu (non-binding)
* Congxian Qiu (non-binding)
* Rui Li (non-binding)
* Jingsong Li (non-binding)
* Yang Wang (non-binding)
* Piotr Nowojski (non-binding)
* Xintong Song (non-binding)
* Benchao Li (non-binding)
* Zili Chen (non-binding)
* Yangze Guo (non-binding)

There are no disapproving votes.

Thanks everyone!