Re: [VOTE] Release 1.10.0, release candidate #2

2020-02-06 Thread jincheng sun
There is a blocker issue: https://issues.apache.org/jira/browse/FLINK-15937

Best,
Jincheng



Jeff Zhang  于2020年2月6日周四 下午3:09写道:

> -1, I just found one critical issue
> https://issues.apache.org/jira/browse/FLINK-15935
> This ticket means users are unable to use watermarks in SQL if they specify
> both the flink planner and the blink planner in pom.xml
>
> <dependency>
>    <groupId>org.apache.flink</groupId>
>    <artifactId>flink-table-planner_${scala.binary.version}</artifactId>
>    <version>${project.version}</version>
> </dependency>
> <dependency>
>    <groupId>org.apache.flink</groupId>
>    <artifactId>flink-table-planner-blink_${scala.binary.version}</artifactId>
>    <version>${project.version}</version>
> </dependency>
>
>
>
> Thomas Weise  于2020年2月6日周四 上午5:16写道:
>
> > I deployed commit 81cf2f9e59259389a6549b07dcf822ec63c899a4 and can
> confirm
> > that the dataformat-cbor and checkpoint alignment metric issues are
> > resolved.
> >
> >
> > On Wed, Feb 5, 2020 at 11:26 AM Gary Yao  wrote:
> >
> > > Note that there is currently an ongoing discussion about whether
> > > FLINK-15917
> > > and FLINK-15918 should be fixed in 1.10.0 [1].
> > >
> > > [1]
> > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/flink-dev/202002.mbox/%3CCA%2B5xAo3D21-T5QysQg3XOdm%3DL9ipz3rMkA%3DqMzxraJRgfuyg2A%40mail.gmail.com%3E
> > >
> > > On Wed, Feb 5, 2020 at 8:00 PM Gary Yao  wrote:
> > >
> > > > Hi everyone,
> > > > Please review and vote on the release candidate #2 for the version
> > > 1.10.0,
> > > > as follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > >
> > > > The complete staging area is available for your review, which
> includes:
> > > > * JIRA release notes [1],
> > > > * the official Apache source release and binary convenience releases
> to
> > > be
> > > > deployed to dist.apache.org [2], which are signed with the key with
> > > > fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > * source code tag "release-1.10.0-rc2" [5],
> > > > * website pull request listing the new release and adding
> announcement
> > > > blog post [6][7].
> > > >
> > > > The vote will be open for at least 24 hours. It is adopted by
> majority
> > > > approval, with at least 3 PMC affirmative votes.
> > > >
> > > > Thanks,
> > > > Yu & Gary
> > > >
> > > > [1]
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
> > > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc2/
> > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [4]
> > > https://repository.apache.org/content/repositories/orgapacheflink-1332
> > > > [5] https://github.com/apache/flink/releases/tag/release-1.10.0-rc2
> > > > [6] https://github.com/apache/flink-web/pull/302
> > > > [7] https://github.com/apache/flink-web/pull/301
> > > >
> > >
> >
>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: [VOTE] Release 1.10.0, release candidate #2

2020-02-06 Thread Jingsong Li
Hi Jeff,


For FLINK-15935 [1],

I tend to think of it as a non-blocker, but it's really an important issue.


The problem is the class loading order: we want to load the class from the
blink-planner jar, but actually load it from the flink-planner jar.


First of all, classes are loaded in the order in which their jars appear on the classpath.


I just tried it: when a folder is put on the classpath, the resulting jar order
depends on the order of the file names.

- That is to say, our order is OK now, because
flink-table-blink_2.11-1.11-SNAPSHOT.jar sorts before
flink-table_2.11-1.11-SNAPSHOT.jar by file name.

- But once I rename flink-table_2.11-1.11-SNAPSHOT.jar to
aflink-table_2.11-1.11-SNAPSHOT.jar, an error is reported.


The classpath order appears to be influenced by how Linux lists the files; by
default, the ls command lists files in alphabetical order. [2]


The IDE seems to follow a different rule, because it loads classes directly
rather than from jars.
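
To double check this locally, here is a minimal sketch (not part of Flink) that asks
the class loader for every copy of a class file on the classpath; the first URL
printed is the one the JVM will actually use. The resource name is only a
placeholder for one of the classes that exists in both planner jars:

import java.net.URL;
import java.util.Enumeration;

public class DuplicateClassCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder: substitute a class that both planner jars contain.
        String resource = "org/apache/flink/table/SomeDuplicatedClass.class";
        Enumeration<URL> urls =
                DuplicateClassCheck.class.getClassLoader().getResources(resource);
        while (urls.hasMoreElements()) {
            // Resources come back in classpath order, so the first hit is the jar
            // the class would be loaded from.
            System.out.println(urls.nextElement());
        }
    }
}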


So the possible impact of this problem is:

- In the IDE, using watermarks with both planners on the classpath will report an error.

- A real production environment under Linux will not report errors, for the reasons above.


[1] https://issues.apache.org/jira/browse/FLINK-15935

[2]
https://linuxize.com/post/how-to-list-files-in-linux-using-the-ls-command/


Best,

Jingsong Lee

On Thu, Feb 6, 2020 at 3:09 PM Jeff Zhang  wrote:

> -1, I just found one critical issue
> https://issues.apache.org/jira/browse/FLINK-15935
> This ticket means users are unable to use watermarks in SQL if they specify
> both the flink planner and the blink planner in pom.xml
>
> <dependency>
>    <groupId>org.apache.flink</groupId>
>    <artifactId>flink-table-planner_${scala.binary.version}</artifactId>
>    <version>${project.version}</version>
> </dependency>
> <dependency>
>    <groupId>org.apache.flink</groupId>
>    <artifactId>flink-table-planner-blink_${scala.binary.version}</artifactId>
>    <version>${project.version}</version>
> </dependency>
>
>
>
> Thomas Weise  于2020年2月6日周四 上午5:16写道:
>
> > I deployed commit 81cf2f9e59259389a6549b07dcf822ec63c899a4 and can
> confirm
> > that the dataformat-cbor and checkpoint alignment metric issues are
> > resolved.
> >
> >
> > On Wed, Feb 5, 2020 at 11:26 AM Gary Yao  wrote:
> >
> > > Note that there is currently an ongoing discussion about whether
> > > FLINK-15917
> > > and FLINK-15918 should be fixed in 1.10.0 [1].
> > >
> > > [1]
> > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/flink-dev/202002.mbox/%3CCA%2B5xAo3D21-T5QysQg3XOdm%3DL9ipz3rMkA%3DqMzxraJRgfuyg2A%40mail.gmail.com%3E
> > >
> > > On Wed, Feb 5, 2020 at 8:00 PM Gary Yao  wrote:
> > >
> > > > Hi everyone,
> > > > Please review and vote on the release candidate #2 for the version
> > > 1.10.0,
> > > > as follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > >
> > > > The complete staging area is available for your review, which
> includes:
> > > > * JIRA release notes [1],
> > > > * the official Apache source release and binary convenience releases
> to
> > > be
> > > > deployed to dist.apache.org [2], which are signed with the key with
> > > > fingerprint BB137807CEFBE7DD2616556710B12A1F89C115E8 [3],
> > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > * source code tag "release-1.10.0-rc2" [5],
> > > > * website pull request listing the new release and adding
> announcement
> > > > blog post [6][7].
> > > >
> > > > The vote will be open for at least 24 hours. It is adopted by
> majority
> > > > approval, with at least 3 PMC affirmative votes.
> > > >
> > > > Thanks,
> > > > Yu & Gary
> > > >
> > > > [1]
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345845
> > > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.10.0-rc2/
> > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [4]
> > > https://repository.apache.org/content/repositories/orgapacheflink-1332
> > > > [5] https://github.com/apache/flink/releases/tag/release-1.10.0-rc2
> > > > [6] https://github.com/apache/flink-web/pull/302
> > > > [7] https://github.com/apache/flink-web/pull/301
> > > >
> > >
> >
>
>
> --
> Best Regards
>
> Jeff Zhang
>


-- 
Best, Jingsong Lee


Re: [VOTE] Release 1.10.0, release candidate #2

2020-02-06 Thread Jeff Zhang
Hi Jingsong,

Thanks for the suggestion. It works when running in the IDE, but for a
downstream project like Zeppelin, where I include the flink jars in the
classpath, it only works when I specify the jars one by one explicitly;
using * doesn't work.

e.g.

The following command, where I use * to specify the classpath, doesn't work:
jzhang 4794 1.2 6.3 8408904 1048828 s007 S 4:30PM 0:33.90
/Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
-Dfile.encoding=UTF-8
-Dlog4j.configuration=file:///Users/jzhang/github/zeppelin/conf/log4j.properties
-Dzeppelin.log.file=/Users/jzhang/github/zeppelin/logs/zeppelin-interpreter-flink-shared_process-jzhang-JeffdeMacBook-Pro.local.log
-Xms1024m -Xmx2048m -XX:MaxPermSize=512m -cp
*:/Users/jzhang/github/flink/build-target/lib/**:/Users/jzhang/github/zeppelin/interpreter/flink/*:/Users/jzhang/github/zeppelin/zeppelin-interpreter-api/target/*:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes/*::/Users/jzhang/github/zeppelin/interpreter/zeppelin-interpreter-api-0.9.0-SNAPSHOT.jar:/Users/jzhang/github/zeppelin/zeppelin-interpreter/target/classes:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes:/Users/jzhang/github/flink/build-target/opt/flink-python_2.11-1.10-SNAPSHOT.jar
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 0.0.0.0
52395 flink-shared_process :


While this command, where I specify the jars one by one explicitly in the
classpath, works:

jzhang5660 205.2  6.1  8399420 1015036 s005  R 4:43PM
0:24.82
/Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
-Dfile.encoding=UTF-8
-Dlog4j.configuration=file:///Users/jzhang/github/zeppelin/conf/log4j.properties
-Dzeppelin.log.file=/Users/jzhang/github/zeppelin/logs/zeppelin-interpreter-flink-shared_process-jzhang-JeffdeMacBook-Pro.local.log
-Xms1024m -Xmx2048m -XX:MaxPermSize=512m -cp
*:/Users/jzhang/github/flink/build-target/lib/slf4j-log4j12-1.7.15.jar:/Users/jzhang/github/flink/build-target/lib/flink-connector-hive_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/hive-exec-2.3.4.jar:/Users/jzhang/github/flink/build-target/lib/flink-shaded-hadoop-2-uber-2.7.5-7.0.jar:/Users/jzhang/github/flink/build-target/lib/log4j-1.2.17.jar:/Users/jzhang/github/flink/build-target/lib/flink-table-blink_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-table_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-dist_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-python_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-hadoop-compatibility_2.11-1.9.1.jar*:/Users/jzhang/github/zeppelin/interpreter/flink/*:/Users/jzhang/github/zeppelin/zeppelin-interpreter-api/target/*:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes/*::/Users/jzhang/github/zeppelin/interpreter/zeppelin-interpreter-api-0.9.0-SNAPSHOT.jar:/Users/jzhang/github/zeppelin/zeppelin-interpreter/target/classes:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes:/Users/jzhang/github/flink/build-target/opt/flink-python_2.11-1.10-SNAPSHOT.jar
/Users/jzhang/github/flink/build-target/opt/tmp/flink-python_2.11-1.10-SNAPSHOT.jar
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 0.0.0.0
52603 flink-shared_process :
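
To double check what the JVM does with the wildcard, here is a minimal sketch
(JDK 8 only, where the application class loader is a URLClassLoader) that prints
the classpath entries in the order they are searched; the expansion order of a
"dir/*" wildcard is not specified by the JVM, which would explain the difference
between the two commands:

import java.net.URL;
import java.net.URLClassLoader;

public class ClasspathOrder {
    public static void main(String[] args) {
        ClassLoader appLoader = ClassLoader.getSystemClassLoader();
        if (appLoader instanceof URLClassLoader) {
            // On JDK 8 this shows the expanded wildcard entries in search order.
            for (URL url : ((URLClassLoader) appLoader).getURLs()) {
                System.out.println(url);
            }
        } else {
            // On JDK 9+ the application loader is no longer a URLClassLoader;
            // fall back to the raw property, which may still contain the wildcard.
            System.out.println(System.getProperty("java.class.path"));
        }
    }
}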


Jingsong Li  于2020年2月6日周四 下午4:10写道:

> Hi Jeff,
>
>
> For FLINK-15935 [1],
>
> I try to think of it as a non blocker. But it's really an important issue.
>
>
> The problem is the class loading order. We want to load the class in the
> blink-planner.jar, but actually load the class in the flink-planner.jar.
>
>
> First of all, the order of class loading is based on the order of
> classpath.
>
>
> I just tried, the order of classpath of the folder is depends on the order
> of file names.
>
> -That is to say, our order is OK now: because
> flink-table-blink_2.11-1.11-SNAPSHOT.jar is before the name order of
> flink-table_2.11-1.11-snapshot.jar.
>
> -But once I change the name of flink-table_2.11-1.11-SNAPSHOT.jar to
> aflink-table_2.11-1.11-SNAPSHOT.jar, an error will be reported.
>
>
> The order of classpaths should be influenced by the ls of Linux. By default
> the ls command is listing the files in alphabetical order. [1]
>
>
> IDE seems to have another rule, because it is directly load classes. Not
> jars.
>
>
> So the possible impact of this problem is:
>
> -IDE with two planners to use watermark, which will report an error.
>
> -The real production environment, under Linux, will not report errors
> because of the above reasons.
>
>
> [1] https://issues.apache.org/jira/browse/FLINK-15935
>
> [2]
> https://linuxize.com/post/how-to-list-files-in-linux-using-the-ls-command/
>
>
> Best,
>
> Jingsong Lee
>
> On Thu, Feb 6, 2020 at 3:09 PM Jeff Zhang  wrote:
>
> > -1, I just found one critical issue
> > https://issues.apache.org/jira/browse/FLINK-15935
> > This ticket means user unable to use watermark in sql if he specify both
> > fli

Re: [DISCUSS] have separate Flink distributions with built-in Hive dependencies

2020-02-06 Thread Jingsong Li
Hi Stephan,

The hive/lib/ directory has many jars; that lib covers execution, the metastore, the
hive client and everything else.
What we really depend on is hive-exec.jar (hive-metastore.jar is also
required for low Hive versions).
And hive-exec.jar is an uber jar. We only want about half of its classes. Those
classes are not so clean, but it is OK to have them.

Our solution now:
- exclude hive jars from the build
- document the dependencies for 8 versions; users choose according to their hive version. [1]

Spark's solution:
- build in hive 1.2.1 dependencies to support hive 0.12.0 through 2.3.3. [2]
- its hive-exec.jar is actually hive-exec.spark.jar; Spark has modified the
hive-exec build pom to exclude unnecessary classes, including Orc and Parquet.
- build in orc and parquet dependencies to optimize performance.
- support hive 2.3.3 and above via "mvn install -Phive-2.3", which builds in
hive-exec-2.3.6.jar. It seems that since this version, hive's API has become
seriously incompatible.
Most users run hive 0.12.0 through 2.3.3, so the default Spark build is good
for most of them.

Presto's solution:
- Build in Presto's own fork of hive. [3] Shade the hive classes instead of the
thrift classes.
- Rewrite some client-related code to solve various issues.
This approach is the heaviest, but also the cleanest. It can support all hive
versions with a single build.

So I think we can do the following:

- The eight versions we now maintain are too many. I think we can move in the
direction of Presto/Spark and try to reduce the number of dependency versions.

- As you said, regarding providing fat/uber jars or a helper script: I prefer
uber jars, so users can download a single jar for their setup, just like Kafka.

[1]
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/#dependencies
[2]
https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore
[3] https://github.com/prestodb/presto-hive-apache

Best,
Jingsong Lee

On Wed, Feb 5, 2020 at 10:15 PM Stephan Ewen  wrote:

> Some thoughts about other options we have:
>
>   - Put fat/shaded jars for the common versions into "flink-shaded" and
> offer them for download on the website, similar to pre-bundles Hadoop
> versions.
>
>   - Look at the Presto code (Metastore protocol) and see if we can reuse
> that
>
>   - Have a setup helper script that takes the versions and pulls the
> required dependencies.
>
> Can you share how can a "built-in" dependency could work, if there are so
> many different conflicting versions?
>
> Thanks,
> Stephan
>
>
> On Tue, Feb 4, 2020 at 12:59 PM Rui Li  wrote:
>
>> Hi Stephan,
>>
>> As Jingsong stated, in our documentation the recommended way to add Hive
>> deps is to use exactly what users have installed. It's just we ask users
>> to
>> manually add those jars, instead of automatically find them based on env
>> variables. I prefer to keep it this way for a while, and see if there're
>> real concerns/complaints from user feedbacks.
>>
>> Please also note the Hive jars are not the only ones needed to integrate
>> with Hive, users have to make sure flink-connector-hive and Hadoop jars
>> are
>> in classpath too. So I'm afraid a single "HIVE" env variable wouldn't save
>> all the manual work for our users.
>>
>> On Tue, Feb 4, 2020 at 5:54 PM Jingsong Li 
>> wrote:
>>
>> > Hi all,
>> >
>> > For your information, we have document the dependencies detailed
>> > information [1]. I think it's a lot clearer than before, but it's worse
>> > than presto and spark (they avoid or have built-in hive dependency).
>> >
>> > I thought about Stephan's suggestion:
>> > - The hive/lib has 200+ jars, but we only need hive-exec.jar or plus two
>> > or three jars, if so many jars are introduced, maybe will there be a big
>> > conflict.
>> > - And hive/lib is not available on every machine. We need to upload so
>> > many jars.
>> > - A separate classloader maybe hard to work too, our
>> flink-connector-hive
>> > need hive jars, we may need to deal with flink-connector-hive jar
>> spacial
>> > too.
>> > CC: Rui Li
>> >
>> > I think the best system to integrate with hive is presto, which only
>> > connects hive metastore through thrift protocol. But I understand that
>> it
>> > costs a lot to rewrite the code.
>> >
>> > [1]
>> >
>> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/#dependencies
>> >
>> > Best,
>> > Jingsong Lee
>> >
>> > On Tue, Feb 4, 2020 at 1:44 AM Stephan Ewen  wrote:
>> >
>> >> We have had much trouble in the past from "too deep too custom"
>> >> integrations that everyone got out of the box, i.e., Hadoop.
>> >> Flink has has such a broad spectrum of use cases, if we have custom
>> build
>> >> for every other framework in that spectrum, we'll be in trouble.
>> >>
>> >> So I would also be -1 for custom builds.
>> >>
>> >> Couldn't we do something similar as we started doing for Hadoop? Moving
>> >> away from convenience downloads to allowing users to "export" their
>> setup
>> >> for Flink?
>> >>
>> >>   - We

Re: [VOTE] Release Flink Python API(PyFlink) 1.9.2 to PyPI, release candidate #0

2020-02-06 Thread jincheng sun
Hi Chesnay,

Thanks a lot for sharing your thoughts.

>> this is not a source release by definition, since a source release must
not contain binaries. This is a convenience binary, or possibly even a
distributed-channel appropriate version of our existing convenience binary.
A user downloading this package should know what they are downloading.

Yes, I agree it should be a binary release, as we mentioned in the discussion
thread [1].

>>We have never released a binary without a corresponding source release,
and don't really have established processes for this nor for distribution
channels other than maven.

This binary release is built from the 1.9.2 source release, and the Python
binary release package will be moved into the 1.9.2 release folder [2] at the
final stage of this Python release.

>> Technically speaking we don't require a vote, but it is something that
the PMC has to decide.

Personally I think this is an official release because we have already
integrated the release process of PyFlink into the release of Flink[3].
Besides, Spark[4] and Beam[5] also considered the PyPI package as an
official release as they have also integrated it into the official release
process. As any official release requires PMC votes according to the
bylaws[6], I think releasing PyFlink 1.9.2 to PyPI requires PMC votes.
Furthermore, it provides an opportunity for the community to verify the
release package.

>> the artifact name is not descriptive as it neither says that it is a
binary nor that it is a python/PyPi-specific release

Regarding the artifact name, it should match the project name on PyPI, so I
think it is OK.

>> Development status classifier seems incorrect as it is set to "Planning"

Regarding the development status classifier, I think you're right. I will
change it to 'Development Status :: 5 - Production/Stable' and change
[author='Flink Developers'] to [author='Apache Software Foundation'] in the
new RC.

What do you think?

Best,
Jincheng

[1]
https://lists.apache.org/thread.html/1dabcda27a584ecda59129db4188073fb8ff7100b884a7564c1c2f73%40%3Cdev.flink.apache.org%3E
[2] https://dist.apache.org/repos/dist/release/flink/flink-1.9.2/
[3]
https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release#CreatingaFlinkRelease-DeployPythonartifactstoPyPI
[4] https://spark.apache.org/release-process.html
[5] https://beam.apache.org/contribute/release-guide/
[6] https://cwiki.apache.org/confluence/display/FLINK/Flink+Bylaws


Chesnay Schepler  于2020年2月5日周三 下午10:18写道:

> -1
>
> - this is not a source release by definition, since a source release
> must not contain binaries. This is a convenience binary, or possibly
> even a distributed-channel appropriate version of our existing
> convenience binary. A user downloading this package should know what
> they are downloading.
> We have never released a binary without a corresponding source release,
> and don't really have established processes for this nor for
> distribution channels other than maven. Technically speaking we don't
> require a vote, but it is something that the PMC has to decide.
>
> - the artifact name is not descriptive as it neither says that it is a
> binary nor that it is a python/PyPi-specific release
>
> - Development status classifier seems incorrect as it is set to "Planning"
>
>
> On 05/02/2020 09:03, jincheng sun wrote:
> > Hi Wei,
> >
> > Thanks for your vote and I appreciate that you kindly help to take the
> > ticket.
> >
> >   I've assigned the JIRAs to you!
> >
> > Best,
> > Jincheng
> >
> >
> > Wei Zhong  于2020年2月5日周三 下午3:55写道:
> >
> >> Hi,
> >>
> >> Thanks for driving this, Jincheng.
> >>
> >> +1 (non-binding)
> >>
> >> - Verified signatures and checksums.
> >> - `pip install apache-flink-1.9.2.tar.gz` successfully.
> >> - Start local pyflink shell via `pyflink-shell.sh local` and try the
> >> examples in the help message, run well and no exception.
> >> - Try a word count example in IDE, run well and no exception.
> >>
> >> In addition I'm willing to take these JIRAs. Could you assign them to
> me?
> >> :)
> >>
> >> Best,
> >> Wei
> >>
> >>
> >>> 在 2020年2月5日,14:49,jincheng sun  写道:
> >>>
> >>> Hi everyone,
> >>>
> >>> Please review and vote on the release candidate #0 for the PyFlink
> >> version
> >>> 1.9.2, as follows:
> >>>
> >>> [ ] +1, Approve the release
> >>> [ ] -1, Do not approve the release (please provide specific comments)
> >>>
> >>> The complete staging area is available for your review, which includes:
> >>> * the official Apache source release and binary convenience releases to
> >> be
> >>> deployed to dist.apache.org [1], which are signed with the key with
> >>> fingerprint 8FEA1EE9D0048C0CCC70B7573211B0703B79EA0E [2],
> >>> * source code tag "release-1.9.2" [3],
> >>> * create JIRA. for add description of support 'pip install' to 1.9.x
> >>> documents[4]
> >>> * create JIRA. for add PyPI release process for subsequent version
> >> release
> >>> of 1.9.x . i.e. improve the scrip

Re: [DISCUSS] have separate Flink distributions with built-in Hive dependencies

2020-02-06 Thread Stephan Ewen
Hi Jingsong!

This sounds like you can cover a lot of versions with two pre-bundled ones
(hive 1.2.1 and hive 2.3.6).

Would it make sense to add these to flink-shaded (with proper exclusions of
unnecessary dependencies) and offer them as a download, similar to how we offer
pre-shaded Hadoop downloads?

Best,
Stephan


On Thu, Feb 6, 2020 at 10:26 AM Jingsong Li  wrote:

> Hi Stephan,
>
> The hive/lib/ has many jars, this lib is for execution, metastore, hive
> client and all things.
> What we really depend on is hive-exec.jar. (hive-metastore.jar is also
> required in the low version hive)
> And hive-exec.jar is a uber jar. We just want half classes of it. These
> half classes are not so clean, but it is OK to have them.
>
> Our solution now:
> - exclude hive jars from build
> - provide 8 versions dependencies way, user choose by his hive version.[1]
>
> Spark's solution:
> - build-in hive 1.2.1 dependencies to support hive 0.12.0 through 2.3.3.
> [2]
> - hive-exec.jar is hive-exec.spark.jar, Spark has modified the
> hive-exec build pom to exclude unnecessary classes including Orc and
> parquet.
> - build-in orc and parquet dependencies to optimizer performance.
> - support hive version 2.3.3 upper by "mvn install -Phive-2.3", to
> built-in hive-exec-2.3.6.jar. It seems that since this version, hive's API
> has been seriously incompatible.
> Most of the versions used by users are hive 0.12.0 through 2.3.3. So the
> default build of Spark is good to most of users.
>
> Presto's solution:
> - Built-in presto's hive.[3] Shade hive classes instead of thrift classes.
> - Rewrite some client related code to solve kinds of issues.
> This approach is the heaviest, but also the cleanest. It can support all
> kinds of hive versions with one build.
>
> So I think we can do:
>
> - The eight versions we now maintain are too many. I think we can move
> forward in the direction of Presto/Spark and try to reduce dependencies
> versions.
>
> - As your said, about provide fat/uber jars or helper script, I prefer
> uber jars, user can download one jar to their startup. Just like Kafka.
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/#dependencies
> [2]
> https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore
> [3] https://github.com/prestodb/presto-hive-apache
>
> Best,
> Jingsong Lee
>
> On Wed, Feb 5, 2020 at 10:15 PM Stephan Ewen  wrote:
>
>> Some thoughts about other options we have:
>>
>>   - Put fat/shaded jars for the common versions into "flink-shaded" and
>> offer them for download on the website, similar to pre-bundles Hadoop
>> versions.
>>
>>   - Look at the Presto code (Metastore protocol) and see if we can reuse
>> that
>>
>>   - Have a setup helper script that takes the versions and pulls the
>> required dependencies.
>>
>> Can you share how can a "built-in" dependency could work, if there are so
>> many different conflicting versions?
>>
>> Thanks,
>> Stephan
>>
>>
>> On Tue, Feb 4, 2020 at 12:59 PM Rui Li  wrote:
>>
>>> Hi Stephan,
>>>
>>> As Jingsong stated, in our documentation the recommended way to add Hive
>>> deps is to use exactly what users have installed. It's just we ask users
>>> to
>>> manually add those jars, instead of automatically find them based on env
>>> variables. I prefer to keep it this way for a while, and see if there're
>>> real concerns/complaints from user feedbacks.
>>>
>>> Please also note the Hive jars are not the only ones needed to integrate
>>> with Hive, users have to make sure flink-connector-hive and Hadoop jars
>>> are
>>> in classpath too. So I'm afraid a single "HIVE" env variable wouldn't
>>> save
>>> all the manual work for our users.
>>>
>>> On Tue, Feb 4, 2020 at 5:54 PM Jingsong Li 
>>> wrote:
>>>
>>> > Hi all,
>>> >
>>> > For your information, we have document the dependencies detailed
>>> > information [1]. I think it's a lot clearer than before, but it's worse
>>> > than presto and spark (they avoid or have built-in hive dependency).
>>> >
>>> > I thought about Stephan's suggestion:
>>> > - The hive/lib has 200+ jars, but we only need hive-exec.jar or plus
>>> two
>>> > or three jars, if so many jars are introduced, maybe will there be a
>>> big
>>> > conflict.
>>> > - And hive/lib is not available on every machine. We need to upload so
>>> > many jars.
>>> > - A separate classloader maybe hard to work too, our
>>> flink-connector-hive
>>> > need hive jars, we may need to deal with flink-connector-hive jar
>>> spacial
>>> > too.
>>> > CC: Rui Li
>>> >
>>> > I think the best system to integrate with hive is presto, which only
>>> > connects hive metastore through thrift protocol. But I understand that
>>> it
>>> > costs a lot to rewrite the code.
>>> >
>>> > [1]
>>> >
>>> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/#dependencies
>>> >
>>> > Best,
>>> > Jingsong Lee
>>> >
>>> > On Tue, Feb 4, 2020 at 1:44 AM Stephan Ewe

Re: [DISCUSS] have separate Flink distributions with built-in Hive dependencies

2020-02-06 Thread Jingsong Li
Hi Stephan,

Good idea. Just like Hadoop, we can have a flink-shaded-hive-uber.
Then getting started with the hive integration becomes very simple with one or
two pre-bundled versions; users just add these dependencies:
- flink-connector-hive.jar
- flink-shaded-hive-uber-<version>.jar

Some changes are needed, but I think it should work.

Another thing: can we put flink-connector-hive.jar into flink/lib? It should be
clean and bring no extra dependencies.

Best,
Jingsong Lee

On Thu, Feb 6, 2020 at 7:13 PM Stephan Ewen  wrote:

> Hi Jingsong!
>
> This sounds that with two pre-bundled versions (hive 1.2.1 and hive 2.3.6)
> you can cover a lot of versions.
>
> Would it make sense to add these to flink-shaded (with proper dependency
> exclusions of unnecessary dependencies) and offer them as a download,
> similar as we offer pre-shaded Hadoop downloads?
>
> Best,
> Stephan
>
>
> On Thu, Feb 6, 2020 at 10:26 AM Jingsong Li 
> wrote:
>
>> Hi Stephan,
>>
>> The hive/lib/ has many jars, this lib is for execution, metastore, hive
>> client and all things.
>> What we really depend on is hive-exec.jar. (hive-metastore.jar is also
>> required in the low version hive)
>> And hive-exec.jar is a uber jar. We just want half classes of it. These
>> half classes are not so clean, but it is OK to have them.
>>
>> Our solution now:
>> - exclude hive jars from build
>> - provide 8 versions dependencies way, user choose by his hive version.[1]
>>
>> Spark's solution:
>> - build-in hive 1.2.1 dependencies to support hive 0.12.0 through 2.3.3.
>> [2]
>> - hive-exec.jar is hive-exec.spark.jar, Spark has modified the
>> hive-exec build pom to exclude unnecessary classes including Orc and
>> parquet.
>> - build-in orc and parquet dependencies to optimizer performance.
>> - support hive version 2.3.3 upper by "mvn install -Phive-2.3", to
>> built-in hive-exec-2.3.6.jar. It seems that since this version, hive's API
>> has been seriously incompatible.
>> Most of the versions used by users are hive 0.12.0 through 2.3.3. So the
>> default build of Spark is good to most of users.
>>
>> Presto's solution:
>> - Built-in presto's hive.[3] Shade hive classes instead of thrift classes.
>> - Rewrite some client related code to solve kinds of issues.
>> This approach is the heaviest, but also the cleanest. It can support all
>> kinds of hive versions with one build.
>>
>> So I think we can do:
>>
>> - The eight versions we now maintain are too many. I think we can move
>> forward in the direction of Presto/Spark and try to reduce dependencies
>> versions.
>>
>> - As your said, about provide fat/uber jars or helper script, I prefer
>> uber jars, user can download one jar to their startup. Just like Kafka.
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/#dependencies
>> [2]
>> https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore
>> [3] https://github.com/prestodb/presto-hive-apache
>>
>> Best,
>> Jingsong Lee
>>
>> On Wed, Feb 5, 2020 at 10:15 PM Stephan Ewen  wrote:
>>
>>> Some thoughts about other options we have:
>>>
>>>   - Put fat/shaded jars for the common versions into "flink-shaded" and
>>> offer them for download on the website, similar to pre-bundles Hadoop
>>> versions.
>>>
>>>   - Look at the Presto code (Metastore protocol) and see if we can reuse
>>> that
>>>
>>>   - Have a setup helper script that takes the versions and pulls the
>>> required dependencies.
>>>
>>> Can you share how can a "built-in" dependency could work, if there are
>>> so many different conflicting versions?
>>>
>>> Thanks,
>>> Stephan
>>>
>>>
>>> On Tue, Feb 4, 2020 at 12:59 PM Rui Li  wrote:
>>>
 Hi Stephan,

 As Jingsong stated, in our documentation the recommended way to add Hive
 deps is to use exactly what users have installed. It's just we ask
 users to
 manually add those jars, instead of automatically find them based on env
 variables. I prefer to keep it this way for a while, and see if there're
 real concerns/complaints from user feedbacks.

 Please also note the Hive jars are not the only ones needed to integrate
 with Hive, users have to make sure flink-connector-hive and Hadoop jars
 are
 in classpath too. So I'm afraid a single "HIVE" env variable wouldn't
 save
 all the manual work for our users.

 On Tue, Feb 4, 2020 at 5:54 PM Jingsong Li 
 wrote:

 > Hi all,
 >
 > For your information, we have document the dependencies detailed
 > information [1]. I think it's a lot clearer than before, but it's
 worse
 > than presto and spark (they avoid or have built-in hive dependency).
 >
 > I thought about Stephan's suggestion:
 > - The hive/lib has 200+ jars, but we only need hive-exec.jar or plus
 two
 > or three jars, if so many jars are introduced, maybe will there be a
 big
 > conflict.
 > - And hive/lib is not available on every machine. We need to up

Re: [DISCUSS] Release flink-shaded 10.0

2020-02-06 Thread Hequn Cheng
+1. It sounds great to allow us to support zk 3.4 and 3.5.
Thanks for starting the discussion.

Best,
Hequn

On Thu, Feb 6, 2020 at 12:21 AM Till Rohrmann  wrote:

> Thanks for starting this discussion Chesnay. +1 for starting a new
> flink-shaded release.
>
> Cheers,
> Till
>
> On Wed, Feb 5, 2020 at 2:10 PM Chesnay Schepler 
> wrote:
>
> > Hello,
> >
> > I would like to kick off the next release of flink-shaded. The main
> > feature are new modules that bundle zookeeper&curator, that will allow
> > us to support zk 3.4 and 3.5 .
> >
> > Additionally we fixed an issue where slightly older dependencies than
> > intended were bundled in the flink-shaded-hadoop-2-uber jar, which was
> > flagged by security checks.
> >
> > Are there any other changes that people are interested in doing?
> >
> >
> > Regards,
> >
> > Chesnay
> >
> >
>


[jira] [Created] (FLINK-15938) Idle state not cleaned in StreamingJoinOperator and StreamingSemiAntiJoinOperator

2020-02-06 Thread Benchao Li (Jira)
Benchao Li created FLINK-15938:
--

 Summary: Idle state not cleaned in StreamingJoinOperator and 
StreamingSemiAntiJoinOperator
 Key: FLINK-15938
 URL: https://issues.apache.org/jira/browse/FLINK-15938
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.9.2, 1.10.0
Reporter: Benchao Li


{code:java}
return StateTtlConfig
    .newBuilder(Time.milliseconds(retentionTime))
    .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
    .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
    .build();
{code}
The state TTL config is constructed as in the code above for `StreamingJoinOperator` and 
`StreamingSemiAntiJoinOperator`.

However, as stated in 
[https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/state/state.html#cleanup-of-expired-state]
 , the state will be cleaned only when it is read, which means the state will not 
be cleaned unless we read it.
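
For reference, a hedged sketch (not necessarily what the fix should look like) of how 
background cleanup strategies could be enabled on top of the current config, so that 
expired state is eventually dropped even if it is never read again ({{retentionTime}} 
as in the operator code above):

{code:java}
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.time.Time;

StateTtlConfig ttlConfig = StateTtlConfig
    .newBuilder(Time.milliseconds(retentionTime))
    .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
    .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
    // drop expired entries whenever a full snapshot is taken
    .cleanupFullSnapshot()
    // heap backend: incrementally clean a few entries on every state access
    .cleanupIncrementally(10, false)
    .build();
{code}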



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.10.0, release candidate #2

2020-02-06 Thread Benchao Li
Hi all,

I found another issue [1]. I don't know if it should be a blocker, but it
does affect non-windowed joins in the blink planner.

[1] https://issues.apache.org/jira/browse/FLINK-15938

Jeff Zhang  于2020年2月6日周四 下午5:05写道:

> Hi Jingsong,
>
> Thanks for the suggestion. It works for running it in IDE, but for
> downstream project like Zeppelin where I will include flink jars in
> classpath.
> it only works when I specify the jars one by one explicitly in classpath,
> using * doesn't work.
>
> e.g.
>
> The following command where I use * to specify classpath doesn't work,
> jzhang 4794 1.2 6.3 8408904 1048828 s007 S 4:30PM 0:33.90
> /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
> -Dfile.encoding=UTF-8
>
> -Dlog4j.configuration=file:///Users/jzhang/github/zeppelin/conf/log4j.properties
>
> -Dzeppelin.log.file=/Users/jzhang/github/zeppelin/logs/zeppelin-interpreter-flink-shared_process-jzhang-JeffdeMacBook-Pro.local.log
> -Xms1024m -Xmx2048m -XX:MaxPermSize=512m -cp
>
> *:/Users/jzhang/github/flink/build-target/lib/**:/Users/jzhang/github/zeppelin/interpreter/flink/*:/Users/jzhang/github/zeppelin/zeppelin-interpreter-api/target/*:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes/*::/Users/jzhang/github/zeppelin/interpreter/zeppelin-interpreter-api-0.9.0-SNAPSHOT.jar:/Users/jzhang/github/zeppelin/zeppelin-interpreter/target/classes:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes:/Users/jzhang/github/flink/build-target/opt/flink-python_2.11-1.10-SNAPSHOT.jar
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 0.0.0.0
> 52395 flink-shared_process :
>
>
> While this command where I specify jar one by one explicitly in classpath
> works
>
> jzhang5660 205.2  6.1  8399420 1015036 s005  R 4:43PM
> 0:24.82
> /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
> -Dfile.encoding=UTF-8
>
> -Dlog4j.configuration=file:///Users/jzhang/github/zeppelin/conf/log4j.properties
>
> -Dzeppelin.log.file=/Users/jzhang/github/zeppelin/logs/zeppelin-interpreter-flink-shared_process-jzhang-JeffdeMacBook-Pro.local.log
> -Xms1024m -Xmx2048m -XX:MaxPermSize=512m -cp
>
> *:/Users/jzhang/github/flink/build-target/lib/slf4j-log4j12-1.7.15.jar:/Users/jzhang/github/flink/build-target/lib/flink-connector-hive_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/hive-exec-2.3.4.jar:/Users/jzhang/github/flink/build-target/lib/flink-shaded-hadoop-2-uber-2.7.5-7.0.jar:/Users/jzhang/github/flink/build-target/lib/log4j-1.2.17.jar:/Users/jzhang/github/flink/build-target/lib/flink-table-blink_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-table_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-dist_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-python_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-hadoop-compatibility_2.11-1.9.1.jar*:/Users/jzhang/github/zeppelin/interpreter/flink/*:/Users/jzhang/github/zeppelin/zeppelin-interpreter-api/target/*:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes/*::/Users/jzhang/github/zeppelin/interpreter/zeppelin-interpreter-api-0.9.0-SNAPSHOT.jar:/Users/jzhang/github/zeppelin/zeppelin-interpreter/target/classes:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes:/Users/jzhang/github/flink/build-target/opt/flink-python_2.11-1.10-SNAPSHOT.jar
>
> /Users/jzhang/github/flink/build-target/opt/tmp/flink-python_2.11-1.10-SNAPSHOT.jar
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 0.0.0.0
> 52603 flink-shared_process :
>
>
> Jingsong Li  于2020年2月6日周四 下午4:10写道:
>
> > Hi Jeff,
> >
> >
> > For FLINK-15935 [1],
> >
> > I try to think of it as a non blocker. But it's really an important
> issue.
> >
> >
> > The problem is the class loading order. We want to load the class in the
> > blink-planner.jar, but actually load the class in the flink-planner.jar.
> >
> >
> > First of all, the order of class loading is based on the order of
> > classpath.
> >
> >
> > I just tried, the order of classpath of the folder is depends on the
> order
> > of file names.
> >
> > -That is to say, our order is OK now: because
> > flink-table-blink_2.11-1.11-SNAPSHOT.jar is before the name order of
> > flink-table_2.11-1.11-snapshot.jar.
> >
> > -But once I change the name of flink-table_2.11-1.11-SNAPSHOT.jar to
> > aflink-table_2.11-1.11-SNAPSHOT.jar, an error will be reported.
> >
> >
> > The order of classpaths should be influenced by the ls of Linux. By
> default
> > the ls command is listing the files in alphabetical order. [1]
> >
> >
> > IDE seems to have another rule, because it is directly load classes. Not
> > jars.
> >
> >
> > So the possible impact of this problem is:
> >
> > -IDE with two planners to use watermark, which will report an error.
> >
> > -The real production environment, under Linux, will not report errors
> > because of the above reasons.
> >
> >
>

[jira] [Created] (FLINK-15939) Move runtime.clusterframework.overlays package into flink-mesos module

2020-02-06 Thread vinoyang (Jira)
vinoyang created FLINK-15939:


 Summary: Move runtime.clusterframework.overlays package into 
flink-mesos module
 Key: FLINK-15939
 URL: https://issues.apache.org/jira/browse/FLINK-15939
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration
Reporter: vinoyang


Due to historical reasons, almost none of the classes in the current overlays package 
depend on other components in the runtime module. All of these classes depend only on 
the {{MesosUtils}} class of the {{flink-mesos}} module. The amount of code in the 
runtime module is already large enough; we should move such weakly coupled classes to 
where they belong.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15940) Flink TupleSerializer and CaseClassSerializer should support serializing and deserializing NULL values

2020-02-06 Thread YufeiLiu (Jira)
YufeiLiu created FLINK-15940:


 Summary: Flink TupleSerializer and CaseClassSerializer should 
support serializing and deserializing NULL values
 Key: FLINK-15940
 URL: https://issues.apache.org/jira/browse/FLINK-15940
 Project: Flink
  Issue Type: Bug
  Components: API / Type Serialization System
Affects Versions: 1.10.0
Reporter: YufeiLiu


Copying NULL values is already supported as of 
[FLINK-14642|https://issues.apache.org/jira/browse/FLINK-14642]; serializing and 
deserializing NULL values should be supported as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-06 Thread Hequn Cheng
Hi everyone,

Thank you all for the great inputs!

I think what we probably all agree on is that we should try to make flink-dist
leaner. However, we may also need to make some compromises for the sake of user
experience, so that users don't need to download dependencies from different
places. Otherwise, we could simply move all the jars in the current opt folder
to the download page.

The lack of clear rules for guiding such compromises makes things more
complicated now. I would agree that the decisive factor for what goes into
Flink's binary distribution should be how core it is to Flink. Meanwhile, it is
reasonable to treat Flink APIs as core to Flink. Not only is this a very clear
rule that is easy to follow, but in most cases an API is significant enough to
deserve inclusion in the dist.

Given this, it might make sense to put flink-ml-api and flink-ml-lib into
the opt.
What do you think?

Best,
Hequn

On Wed, Feb 5, 2020 at 12:39 AM Chesnay Schepler  wrote:

> Around a year ago I started a discussion
> 
> on reducing the amount of jars we ship with the distribution.
>
> While there was no definitive conclusion there was a shared sentiment that
> APIs should be shipped with the distribution.
>
> On 04/02/2020 17:25, Till Rohrmann wrote:
>
> I think there is no such rule that APIs go automatically into opt/ and
> "libraries" not. The contents of opt/ have mainly grown over time w/o
> following a strict rule.
>
> I think the decisive factor for what goes into Flink's binary distribution
> should be how core it is to Flink. Of course another important
> consideration is which use cases Flink should promote "out of the box" (not
> sure whether this is actual true for content shipped in opt/ because you
> also have to move it to lib).
>
> For example, Gelly would be an example which I would rather see as an
> optional component than shipping it with every Flink binary distribution.
>
> Cheers,
> Till
>
> On Tue, Feb 4, 2020 at 11:24 AM Becket Qin  
>  wrote:
>
>
> Thanks for the suggestion, Till.
>
> I am curious about how do we usually decide when to put the jars into the
> opt folder?
>
> Technically speaking, it seems that `flink-ml-api` should be put into the
> opt directory because they are actually API instead of libraries, just like
> CEP and Table.
>
> `flink-ml-lib` seems to be on the border. On one hand, it is a library. On
> the other hand, unlike SQL formats and Hadoop whose major code are outside
> of Flink, the algorithm codes are in Flink. So `flink-ml-lib` is more like
> those of built-in SQL UDFs. So it seems fine to either put it in the opt
> folder or in the downloads page.
>
> From the user experience perspective, it might be better to have both
> `flink-ml-lib` and `flink-ml-api` in opt folder so users needn't go to two
> places for the required dependencies.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Tue, Feb 4, 2020 at 2:32 PM Hequn Cheng  
>  wrote:
>
>
> Hi Till,
>
> Thanks a lot for your suggestion. It's a good idea to offer the flink-ml
> libraries as optional dependencies on the download page which can make
>
> the
>
> dist smaller.
>
> But I also have some concerns for it, e.g., the download page now only
> includes the latest 3 releases. We may need to find ways to support more
> versions.
> On the other hand, the size of the flink-ml libraries now is very
> small(about 246K), so it would not bring much impact on the size of dist.
>
> What do you think?
>
> Best,
> Hequn
>
> On Mon, Feb 3, 2020 at 6:24 PM Till Rohrmann  
> 
>
> wrote:
>
> An alternative solution would be to offer the flink-ml libraries as
> optional dependencies on the download page. Similar to how we offer the
> different SQL formats and Hadoop releases [1].
>
> [1] https://flink.apache.org/downloads.html
>
> Cheers,
> Till
>
> On Mon, Feb 3, 2020 at 10:19 AM Hequn Cheng  
>  wrote:
>
>
> Thank you all for your feedback and suggestions!
>
> Best, Hequn
>
> On Mon, Feb 3, 2020 at 5:07 PM Becket Qin  
> 
>
> wrote:
>
> Thanks for bringing up the discussion, Hequn.
>
> +1 on adding `flink-ml-api` and `flink-ml-lib` into opt. This would
>
> make
>
> it much easier for the users to try out some simple ml tasks.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Feb 3, 2020 at 4:34 PM jincheng sun <
>
> sunjincheng...@gmail.com
>
> wrote:
>
>
> Thank you for pushing forward @Hequn Cheng  
>  !
>
> Hi  @Becket Qin   , Do you have 
> any concerns
>
> on
>
> this ?
>
> Best,
> Jincheng
>
> Hequn Cheng   于2020年2月3日周一 下午2:09写道:
>
>
> Hi everyone,
>
> Thanks for the feedback. As there are no objections, I've opened a
>
> JIRA
>
> issue(FLINK-15847[1]) to address this issue.
> The implementation details can be discussed in the issue or in the
> following PR.
>
> Best,
> Hequn
>
> [1] https://issues.apache.org/jira/browse/FLINK-15847
>
> On Wed, Jan 8, 2020 at 9:15 PM Hequn Cheng  
> 
>
> wrote:
>
> Hi Jinc

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-02-06 Thread Till Rohrmann
Hi Gyula,

technically speaking the JobGraph is sent to the Dispatcher where a
JobMaster is started to execute the JobGraph. The JobGraph comes either
from the JobSubmitHandler or the JarRunHandler. Except for creating the
ExecutionGraph from the JobGraph there is not much happening on the
Dispatcher.

If Atlas only requires to work on the JobGraph, then I believe it would be
good enough if Flink offers the utilities to analyze it. Then we would not
need to add anything to Flink itself.

For the second question, I guess it mostly depends on the requirements from
Atlas. I guess one could make it a bit easier to extract information from
the JobGraph. Maybe Aljoscha can chime in on this topic.
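
To make that a bit more concrete, here is a minimal sketch (not existing Flink code) of
what such a JobGraph-level utility could look like; it also illustrates the limitation
Gyula mentioned, because operators are already chained into job vertices at this point:

import org.apache.flink.runtime.jobgraph.JobGraph;
import org.apache.flink.runtime.jobgraph.JobVertex;

public final class JobGraphInspector {

    public static void printTopology(JobGraph jobGraph) {
        System.out.println("Job " + jobGraph.getName() + " (" + jobGraph.getJobID() + ")");
        for (JobVertex vertex : jobGraph.getVertices()) {
            // Chained operators are merged into a single vertex here, which is why
            // recovering individual source/sink UDFs from a JobGraph is cumbersome.
            System.out.println("  " + vertex.getName()
                    + " [parallelism=" + vertex.getParallelism() + "]");
        }
    }
}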

Cheers,
Till

On Wed, Feb 5, 2020 at 7:35 PM Gyula Fóra  wrote:

> @Till Rohrmann 
> You are completely right that the Atlas hook itself should not live inside
> Flink. All other hooks for the other projects are implemented as part of
> Atlas,
> and the Atlas community is ready to maintain it once we have a working
> version. The discussion is more about changes that we need in Flink (if
> any) to make it possible to implement the Atlas hook outside Flink.
>
> So in theory I agree that the hook should receive job submissions and just
> extract the metadata required by Atlas.
>
> There are 2 questions here (and my initial email gives one possible
> solution):
>
> 1. What is the component that receives the submission and runs the
> extraction logic? If you want to remove this process from Flink you could
> put something in front of the job submission rest endpoint but I think that
> would be an overkill. So I assumed that we will have to add some way of
> executing code on job submissions, hence my proposal of a job submission
> hook.
>
> 2. What information do we need to extract the atlas metadata? On job
> submissions we usually have JobGraph instances which are pretty hard to
> handle compared to a StreamGraph for instance when it comes to getting
> source/sink udfs. I am wondering if we need to make this easier somehow.
>
> Gyula
>
> On Wed, Feb 5, 2020 at 6:03 PM Taher Koitawala  wrote:
>
> > As far as I know, Atlas entries can be created with a rest call. Can we
> not
> > create an abstracted Flink operator that makes the rest call on job
> > execution/submission?
> >
> > Regards,
> > Taher Koitawala
> >
> > On Wed, Feb 5, 2020, 10:16 PM Flavio Pompermaier 
> > wrote:
> >
> > > Hi Gyula,
> > > thanks for taking care of integrating Flink with Atlas (and Egeria
> > > initiative in the end) that is IMHO the most important part of all the
> > > Hadoop ecosystem and that, unfortunately, was quite overlooked. I can
> > > confirm that the integration with Atlas/Egeria is absolutely of big
> > > interest.
> > >
> > > Il Mer 5 Feb 2020, 17:12 Till Rohrmann  ha
> > scritto:
> > >
> > > > Hi Gyula,
> > > >
> > > > thanks for starting this discussion. Before diving in the details of
> > how
> > > to
> > > > implement this feature, I wanted to ask whether it is strictly
> required
> > > > that the Atlas integration lives within Flink or not? Could it also
> > work
> > > if
> > > > you have tool which receives job submissions, extracts the required
> > > > information, forwards the job submission to Flink, monitors the
> > execution
> > > > result and finally publishes some information to Atlas (modulo some
> > other
> > > > steps which are missing in my description)? Having a different layer
> > > being
> > > > responsible for this would keep complexity out of Flink.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Wed, Feb 5, 2020 at 12:48 PM Gyula Fóra 
> wrote:
> > > >
> > > > > Hi all!
> > > > >
> > > > > We have started some preliminary work on the Flink - Atlas
> > integration
> > > at
> > > > > Cloudera. It seems that the integration will require some new hook
> > > > > interfaces at the jobgraph generation and submission phases, so I
> > > > figured I
> > > > > will open a discussion thread with my initial ideas to get some
> early
> > > > > feedback.
> > > > >
> > > > > *Minimal background*
> > > > > Very simply put Apache Atlas is a data governance framework that
> > stores
> > > > > metadata for our data and processing logic to track ownership,
> > lineage
> > > > etc.
> > > > > It is already integrated with systems like HDFS, Kafka, Hive and
> many
> > > > > others.
> > > > >
> > > > > Adding Flink integration would mean that we can track the input
> > output
> > > > data
> > > > > of our Flink jobs, their owners and how different Flink jobs are
> > > > connected
> > > > > to each other through the data they produce (lineage). This seems
> to
> > > be a
> > > > > very big deal for a lot of companies :)
> > > > >
> > > > > *Flink - Atlas integration in a nutshell*
> > > > > In order to integrate with Atlas we basically need 2 things.
> > > > >  - Flink entity definitions
> > > > >  - Flink Atlas hook
> > > > >
> > > > > The entity definition is the easy part. It is a json that contains
> > the
> > > > > object

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-02-06 Thread Jeff Zhang
Hi Gyula,

Flink 1.10 introduced JobListener, which is invoked after job submission and
completion. Maybe we can add an API on JobClient to get the info you need for
the Atlas integration.

https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/execution/JobListener.java#L46
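
For illustration, a hedged sketch of how such a listener is wired in (JobListener and
registerJobListener as in 1.10; the Atlas-specific part is only a placeholder):

import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.core.execution.JobListener;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import javax.annotation.Nullable;

public class JobListenerExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.registerJobListener(new JobListener() {
            @Override
            public void onJobSubmitted(@Nullable JobClient jobClient, @Nullable Throwable throwable) {
                if (jobClient != null) {
                    // Placeholder: this is where Atlas metadata could be collected and sent,
                    // provided enough information about sources/sinks is reachable from here.
                    System.out.println("Submitted job " + jobClient.getJobID());
                }
            }

            @Override
            public void onJobExecuted(@Nullable JobExecutionResult result, @Nullable Throwable throwable) {
                if (result != null) {
                    System.out.println("Finished job " + result.getJobID());
                }
            }
        });

        env.fromElements(1, 2, 3).print();
        env.execute("job-listener-example");
    }
}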


Gyula Fóra  于2020年2月5日周三 下午7:48写道:

> Hi all!
>
> We have started some preliminary work on the Flink - Atlas integration at
> Cloudera. It seems that the integration will require some new hook
> interfaces at the jobgraph generation and submission phases, so I figured I
> will open a discussion thread with my initial ideas to get some early
> feedback.
>
> *Minimal background*
> Very simply put Apache Atlas is a data governance framework that stores
> metadata for our data and processing logic to track ownership, lineage etc.
> It is already integrated with systems like HDFS, Kafka, Hive and many
> others.
>
> Adding Flink integration would mean that we can track the input output data
> of our Flink jobs, their owners and how different Flink jobs are connected
> to each other through the data they produce (lineage). This seems to be a
> very big deal for a lot of companies :)
>
> *Flink - Atlas integration in a nutshell*
> In order to integrate with Atlas we basically need 2 things.
>  - Flink entity definitions
>  - Flink Atlas hook
>
> The entity definition is the easy part. It is a json that contains the
> objects (entities) that we want to store for any give Flink job. As a
> starter we could have a single FlinkApplication entity that has a set of
> inputs and outputs. These inputs/outputs are other Atlas entities that are
> already defines such as Kafka topic or Hbase table.
>
> The Flink atlas hook will be the logic that creates the entity instance and
> uploads it to Atlas when we start a new Flink job. This is the part where
> we implement the core logic.
>
> *Job submission hook*
> In order to implement the Atlas hook we need a place where we can inspect
> the pipeline, create and send the metadata when the job starts. When we
> create the FlinkApplication entity we need to be able to easily determine
> the sources and sinks (and their properties) of the pipeline.
>
> Unfortunately there is no JobSubmission hook in Flink that could execute
> this logic and even if there was one there is a mismatch of abstraction
> levels needed to implement the integration.
> We could imagine a JobSubmission hook executed in the JobManager runner as
> this:
>
> void onSuccessfulSubmission(JobGraph jobGraph, Configuration
> configuration);
>
> This is nice but the JobGraph makes it super difficult to extract sources
> and UDFs to create the metadata entity. The atlas entity however could be
> easily created from the StreamGraph object (used to represent the logical
> flow) before the JobGraph is generated. To go around this limitation we
> could add a JobGraphGeneratorHook interface:
>
> void preProcess(StreamGraph streamGraph); void postProcess(JobGraph
> jobGraph);
>
> We could then generate the atlas entity in the preprocess step and add a
> jobmission hook in the postprocess step that will simply send the already
> baked in entity.
>
> *This kinda works but...*
> The approach outlined above seems to work and we have built a POC using it.
> Unfortunately it is far from nice as it exposes non-public APIs such as the
> StreamGraph. Also it feels a bit weird to have 2 hooks instead of one.
>
> It would be much nicer if we could somehow go back from JobGraph to
> StreamGraph or at least have an easy way to access source/sink UDFS.
>
> What do you think?
>
> Cheers,
> Gyula
>


-- 
Best Regards

Jeff Zhang


Re: [VOTE] Improve TableFactory to add Context

2020-02-06 Thread Timo Walther

+1

On 06.02.20 05:54, Bowen Li wrote:

+1, LGTM

On Tue, Feb 4, 2020 at 11:28 PM Jark Wu  wrote:


+1 form my side.
Thanks for driving this.

Btw, could you also attach a JIRA issue with the changes described in it,
so that users can find the issue through the mailing list in the future.

Best,
Jark

On Wed, 5 Feb 2020 at 13:38, Kurt Young  wrote:


+1 from my side.

Best,
Kurt


On Wed, Feb 5, 2020 at 10:59 AM Jingsong Li 
wrote:


Hi all,

Interface updated.
Please re-vote.

Best,
Jingsong Lee

On Tue, Feb 4, 2020 at 1:28 PM Jingsong Li 

wrote:



Hi all,

I would like to start the vote for the improve of
TableFactory, which is discussed and
reached a consensus in the discussion thread[2].

The vote will be open for at least 72 hours. I'll try to close it
unless there is an objection or not enough votes.

[1]






http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improve-TableFactory-td36647.html


Best,
Jingsong Lee




--
Best, Jingsong Lee











Re: [VOTE] Release 1.10.0, release candidate #2

2020-02-06 Thread Andrey Zagrebin
Hi Benchao,

Do you observe this issue FLINK-15938 with 1.9 or 1.10?
If with 1.9, I suggest to check with 1.10.

Thanks,
Andrey

On Thu, Feb 6, 2020 at 4:07 PM Benchao Li  wrote:

> Hi all,
>
> I found another issue[1], I don't know if it should be a blocker. But it
> does affects joins without window in blink planner.
>
> [1] https://issues.apache.org/jira/browse/FLINK-15938
>
> Jeff Zhang  于2020年2月6日周四 下午5:05写道:
>
> > Hi Jingsong,
> >
> > Thanks for the suggestion. It works for running it in IDE, but for
> > downstream project like Zeppelin where I will include flink jars in
> > classpath.
> > it only works when I specify the jars one by one explicitly in classpath,
> > using * doesn't work.
> >
> > e.g.
> >
> > The following command where I use * to specify classpath doesn't work,
> > jzhang 4794 1.2 6.3 8408904 1048828 s007 S 4:30PM 0:33.90
> > /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
> > -Dfile.encoding=UTF-8
> >
> >
> -Dlog4j.configuration=file:///Users/jzhang/github/zeppelin/conf/log4j.properties
> >
> >
> -Dzeppelin.log.file=/Users/jzhang/github/zeppelin/logs/zeppelin-interpreter-flink-shared_process-jzhang-JeffdeMacBook-Pro.local.log
> > -Xms1024m -Xmx2048m -XX:MaxPermSize=512m -cp
> >
> >
> *:/Users/jzhang/github/flink/build-target/lib/**:/Users/jzhang/github/zeppelin/interpreter/flink/*:/Users/jzhang/github/zeppelin/zeppelin-interpreter-api/target/*:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes/*::/Users/jzhang/github/zeppelin/interpreter/zeppelin-interpreter-api-0.9.0-SNAPSHOT.jar:/Users/jzhang/github/zeppelin/zeppelin-interpreter/target/classes:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes:/Users/jzhang/github/flink/build-target/opt/flink-python_2.11-1.10-SNAPSHOT.jar
> > org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 0.0.0.0
> > 52395 flink-shared_process :
> >
> >
> > While this command where I specify jar one by one explicitly in classpath
> > works
> >
> > jzhang5660 205.2  6.1  8399420 1015036 s005  R 4:43PM
> > 0:24.82
> > /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
> > -Dfile.encoding=UTF-8
> >
> >
> -Dlog4j.configuration=file:///Users/jzhang/github/zeppelin/conf/log4j.properties
> >
> >
> -Dzeppelin.log.file=/Users/jzhang/github/zeppelin/logs/zeppelin-interpreter-flink-shared_process-jzhang-JeffdeMacBook-Pro.local.log
> > -Xms1024m -Xmx2048m -XX:MaxPermSize=512m -cp
> >
> >
> *:/Users/jzhang/github/flink/build-target/lib/slf4j-log4j12-1.7.15.jar:/Users/jzhang/github/flink/build-target/lib/flink-connector-hive_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/hive-exec-2.3.4.jar:/Users/jzhang/github/flink/build-target/lib/flink-shaded-hadoop-2-uber-2.7.5-7.0.jar:/Users/jzhang/github/flink/build-target/lib/log4j-1.2.17.jar:/Users/jzhang/github/flink/build-target/lib/flink-table-blink_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-table_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-dist_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-python_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-hadoop-compatibility_2.11-1.9.1.jar*:/Users/jzhang/github/zeppelin/interpreter/flink/*:/Users/jzhang/github/zeppelin/zeppelin-interpreter-api/target/*:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes/*::/Users/jzhang/github/zeppelin/interpreter/zeppelin-interpreter-api-0.9.0-SNAPSHOT.jar:/Users/jzhang/github/zeppelin/zeppelin-interpreter/target/classes:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes:/Users/jzhang/github/flink/build-target/opt/flink-python_2.11-1.10-SNAPSHOT.jar
> >
> >
> /Users/jzhang/github/flink/build-target/opt/tmp/flink-python_2.11-1.10-SNAPSHOT.jar
> > org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 0.0.0.0
> > 52603 flink-shared_process :
> >
> >
> > Jingsong Li  于2020年2月6日周四 下午4:10写道:
> >
> > > Hi Jeff,
> > >
> > >
> > > For FLINK-15935 [1],
> > >
> > > I try to think of it as a non blocker. But it's really an important
> > issue.
> > >
> > >
> > > The problem is the class loading order. We want to load the class in
> the
> > > blink-planner.jar, but actually load the class in the
> flink-planner.jar.
> > >
> > >
> > > First of all, the order of class loading is based on the order of
> > > classpath.
> > >
> > >
> > > I just tried, the order of classpath of the folder is depends on the
> > order
> > > of file names.
> > >
> > > -That is to say, our order is OK now: because
> > > flink-table-blink_2.11-1.11-SNAPSHOT.jar is before the name order of
> > > flink-table_2.11-1.11-snapshot.jar.
> > >
> > > -But once I change the name of flink-table_2.11-1.11-SNAPSHOT.jar to
> > > aflink-table_2.11-1.11-SNAPSHOT.jar, an error will be reported.
> > >
> > >
> > > The order of classpaths should be influenced by the ls of Linux. By
> > default
> > > the ls command is listing the files in al

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-06 Thread Till Rohrmann
I would not object, given that it is rather small at the moment. However, I
also think that we should have a plan for how to handle the ever-growing Flink
ecosystem and how to make it easily accessible to our users. E.g. one far-fetched
idea could be something like a configuration script which downloads the required
components for the user. But this definitely deserves a separate discussion and
does not really belong here.

Cheers,
Till

On Thu, Feb 6, 2020 at 3:35 PM Hequn Cheng  wrote:

>
> Hi everyone,
>
> Thank you all for the great inputs!
>
> I think what we probably all agree on is that we should try to make a leaner
> flink-dist. However, we may also need to make some compromises for the sake of
> user experience, so that users don't need to download the dependencies from
> different places. Otherwise, we could just move all the jars in the current opt
> folder to the download page.
>
> The lack of clear rules for guiding such compromises makes things more
> complicated now. I would agree that the decisive factor for what goes into
> Flink's binary distribution should be how core it is to Flink. Meanwhile, it's
> better to treat a Flink API as core to Flink: not only is it a very clear rule
> that is easy to follow, but in most cases an API is also significant enough to
> deserve inclusion in the dist.
>
> Given this, it might make sense to put flink-ml-api and flink-ml-lib into
> the opt.
> What do you think?
>
> Best,
> Hequn
>
> On Wed, Feb 5, 2020 at 12:39 AM Chesnay Schepler 
> wrote:
>
>> Around a year ago I started a discussion
>> 
>> on reducing the amount of jars we ship with the distribution.
>>
>> While there was no definitive conclusion there was a shared sentiment
>> that APIs should be shipped with the distribution.
>>
>> On 04/02/2020 17:25, Till Rohrmann wrote:
>>
>> I think there is no such rule that APIs go automatically into opt/ and
>> "libraries" not. The contents of opt/ have mainly grown over time w/o
>> following a strict rule.
>>
>> I think the decisive factor for what goes into Flink's binary distribution
>> should be how core it is to Flink. Of course another important
>> consideration is which use cases Flink should promote "out of the box" (not
>> sure whether this is actual true for content shipped in opt/ because you
>> also have to move it to lib).
>>
>> For example, Gelly would be an example which I would rather see as an
>> optional component than shipping it with every Flink binary distribution.
>>
>> Cheers,
>> Till
>>
>> On Tue, Feb 4, 2020 at 11:24 AM Becket Qin  
>>  wrote:
>>
>>
>> Thanks for the suggestion, Till.
>>
>> I am curious about how do we usually decide when to put the jars into the
>> opt folder?
>>
>> Technically speaking, it seems that `flink-ml-api` should be put into the
>> opt directory because they are actually API instead of libraries, just like
>> CEP and Table.
>>
>> `flink-ml-lib` seems to be on the border. On one hand, it is a library. On
>> the other hand, unlike SQL formats and Hadoop whose major code are outside
>> of Flink, the algorithm codes are in Flink. So `flink-ml-lib` is more like
>> those of built-in SQL UDFs. So it seems fine to either put it in the opt
>> folder or in the downloads page.
>>
>> From the user experience perspective, it might be better to have both
>> `flink-ml-lib` and `flink-ml-api` in opt folder so users needn't go to two
>> places for the required dependencies.
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> On Tue, Feb 4, 2020 at 2:32 PM Hequn Cheng  
>>  wrote:
>>
>>
>> Hi Till,
>>
>> Thanks a lot for your suggestion. It's a good idea to offer the flink-ml
>> libraries as optional dependencies on the download page which can make
>>
>> the
>>
>> dist smaller.
>>
>> But I also have some concerns for it, e.g., the download page now only
>> includes the latest 3 releases. We may need to find ways to support more
>> versions.
>> On the other hand, the size of the flink-ml libraries now is very
>> small(about 246K), so it would not bring much impact on the size of dist.
>>
>> What do you think?
>>
>> Best,
>> Hequn
>>
>> On Mon, Feb 3, 2020 at 6:24 PM Till Rohrmann  
>> 
>>
>> wrote:
>>
>> An alternative solution would be to offer the flink-ml libraries as
>> optional dependencies on the download page. Similar to how we offer the
>> different SQL formats and Hadoop releases [1].
>>
>> [1] https://flink.apache.org/downloads.html
>>
>> Cheers,
>> Till
>>
>> On Mon, Feb 3, 2020 at 10:19 AM Hequn Cheng  
>>  wrote:
>>
>>
>> Thank you all for your feedback and suggestions!
>>
>> Best, Hequn
>>
>> On Mon, Feb 3, 2020 at 5:07 PM Becket Qin  
>> 
>>
>> wrote:
>>
>> Thanks for bringing up the discussion, Hequn.
>>
>> +1 on adding `flink-ml-api` and `flink-ml-lib` into opt. This would
>>
>> make
>>
>> it much easier for the users to try out some simple ml tasks.
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> On 

Re: [VOTE] Release 1.10.0, release candidate #2

2020-02-06 Thread Benchao Li
Hi Andrey,

I noticed that 1.10 has changed to enable background cleanup by default
just after I posted this email.
So it won't affect 1.10 anymore, just 1.9.x. We can move to the
Jira ticket to discuss further.

Andrey Zagrebin  于2020年2月6日周四 下午11:30写道:

> Hi Benchao,
>
> Do you observe this issue FLINK-15938 with 1.9 or 1.10?
> If with 1.9, I suggest to check with 1.10.
>
> Thanks,
> Andrey
>
> On Thu, Feb 6, 2020 at 4:07 PM Benchao Li  wrote:
>
> > Hi all,
> >
> > I found another issue[1], I don't know if it should be a blocker. But it
> > does affects joins without window in blink planner.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-15938
> >
> > Jeff Zhang  于2020年2月6日周四 下午5:05写道:
> >
> > > Hi Jingsong,
> > >
> > > Thanks for the suggestion. It works for running it in IDE, but for
> > > downstream project like Zeppelin where I will include flink jars in
> > > classpath.
> > > it only works when I specify the jars one by one explicitly in
> classpath,
> > > using * doesn't work.
> > >
> > > e.g.
> > >
> > > The following command where I use * to specify classpath doesn't work,
> > > jzhang 4794 1.2 6.3 8408904 1048828 s007 S 4:30PM 0:33.90
> > >
> /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
> > > -Dfile.encoding=UTF-8
> > >
> > >
> >
> -Dlog4j.configuration=file:///Users/jzhang/github/zeppelin/conf/log4j.properties
> > >
> > >
> >
> -Dzeppelin.log.file=/Users/jzhang/github/zeppelin/logs/zeppelin-interpreter-flink-shared_process-jzhang-JeffdeMacBook-Pro.local.log
> > > -Xms1024m -Xmx2048m -XX:MaxPermSize=512m -cp
> > >
> > >
> >
> *:/Users/jzhang/github/flink/build-target/lib/**:/Users/jzhang/github/zeppelin/interpreter/flink/*:/Users/jzhang/github/zeppelin/zeppelin-interpreter-api/target/*:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes/*::/Users/jzhang/github/zeppelin/interpreter/zeppelin-interpreter-api-0.9.0-SNAPSHOT.jar:/Users/jzhang/github/zeppelin/zeppelin-interpreter/target/classes:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes:/Users/jzhang/github/flink/build-target/opt/flink-python_2.11-1.10-SNAPSHOT.jar
> > > org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 0.0.0.0
> > > 52395 flink-shared_process :
> > >
> > >
> > > While this command where I specify jar one by one explicitly in
> classpath
> > > works
> > >
> > > jzhang5660 205.2  6.1  8399420 1015036 s005  R 4:43PM
> > > 0:24.82
> > >
> /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
> > > -Dfile.encoding=UTF-8
> > >
> > >
> >
> -Dlog4j.configuration=file:///Users/jzhang/github/zeppelin/conf/log4j.properties
> > >
> > >
> >
> -Dzeppelin.log.file=/Users/jzhang/github/zeppelin/logs/zeppelin-interpreter-flink-shared_process-jzhang-JeffdeMacBook-Pro.local.log
> > > -Xms1024m -Xmx2048m -XX:MaxPermSize=512m -cp
> > >
> > >
> >
> *:/Users/jzhang/github/flink/build-target/lib/slf4j-log4j12-1.7.15.jar:/Users/jzhang/github/flink/build-target/lib/flink-connector-hive_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/hive-exec-2.3.4.jar:/Users/jzhang/github/flink/build-target/lib/flink-shaded-hadoop-2-uber-2.7.5-7.0.jar:/Users/jzhang/github/flink/build-target/lib/log4j-1.2.17.jar:/Users/jzhang/github/flink/build-target/lib/flink-table-blink_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-table_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-dist_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-python_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-hadoop-compatibility_2.11-1.9.1.jar*:/Users/jzhang/github/zeppelin/interpreter/flink/*:/Users/jzhang/github/zeppelin/zeppelin-interpreter-api/target/*:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes/*::/Users/jzhang/github/zeppelin/interpreter/zeppelin-interpreter-api-0.9.0-SNAPSHOT.jar:/Users/jzhang/github/zeppelin/zeppelin-interpreter/target/classes:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes:/Users/jzhang/github/flink/build-target/opt/flink-python_2.11-1.10-SNAPSHOT.jar
> > >
> > >
> >
> /Users/jzhang/github/flink/build-target/opt/tmp/flink-python_2.11-1.10-SNAPSHOT.jar
> > > org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 0.0.0.0
> > > 52603 flink-shared_process :
> > >
> > >
> > > Jingsong Li  于2020年2月6日周四 下午4:10写道:
> > >
> > > > Hi Jeff,
> > > >
> > > >
> > > > For FLINK-15935 [1],
> > > >
> > > > I try to think of it as a non blocker. But it's really an important
> > > issue.
> > > >
> > > >
> > > > The problem is the class loading order. We want to load the class in
> > the
> > > > blink-planner.jar, but actually load the class in the
> > flink-planner.jar.
> > > >
> > > >
> > > > First of all, the order of class loading is based on the order of
> > > > classpath.
> > > >
> > > >
> > > > I just tried, the order of classpath of the folder is depends on the
> > > order
> > > > of fi

[jira] [Created] (FLINK-15941) ConfluentSchemaRegistryCoder should not perform HTTP requests for all requests

2020-02-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-15941:


 Summary: ConfluentSchemaRegistryCoder should not perform HTTP requests for all requests
 Key: FLINK-15941
 URL: https://issues.apache.org/jira/browse/FLINK-15941
 Project: Flink
  Issue Type: Improvement
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Reporter: Dawid Wysakowicz


ConfluentSchemaRegistryCoder should cache ids of schemas that it has already 
seen.

I think it should be as simple as changing
{code}
@Override
public void writeSchema(Schema schema, OutputStream out) throws IOException {
    try {
        int registeredId = schemaRegistryClient.register(subject, schema);
        out.write(CONFLUENT_MAGIC_BYTE);
        byte[] schemaIdBytes = ByteBuffer.allocate(4).putInt(registeredId).array();
        out.write(schemaIdBytes);
    } catch (RestClientException e) {
        throw new IOException("Could not register schema in registry", e);
    }
}
{code}

to

{code}
@Override
public void writeSchema(Schema schema, OutputStream out) throws IOException {
    try {
        int registeredId = schemaRegistryClient.getId(subject, schema);
        out.write(CONFLUENT_MAGIC_BYTE);
        byte[] schemaIdBytes = ByteBuffer.allocate(4).putInt(registeredId).array();
        out.write(schemaIdBytes);
    } catch (RestClientException e) {
        throw new IOException("Could not register schema in registry", e);
    }
}
{code}
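
For illustration only, a minimal sketch of what a local cache on top of the register call could look like (the {{schemaIdCache}} field is an assumption for this sketch, not part of the current class; Avro's {{Schema}} implements {{equals}}/{{hashCode}}, so it can serve as a map key):

{code:java}
// Sketch only: a per-coder cache so that schemas seen before skip the HTTP round trip.
private final Map<Schema, Integer> schemaIdCache = new ConcurrentHashMap<>();

@Override
public void writeSchema(Schema schema, OutputStream out) throws IOException {
    Integer registeredId = schemaIdCache.get(schema);
    if (registeredId == null) {
        try {
            registeredId = schemaRegistryClient.register(subject, schema);
        } catch (RestClientException e) {
            throw new IOException("Could not register schema in registry", e);
        }
        schemaIdCache.put(schema, registeredId);
    }
    out.write(CONFLUENT_MAGIC_BYTE);
    out.write(ByteBuffer.allocate(4).putInt(registeredId).array());
}
{code}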



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.10.0, release candidate #2

2020-02-06 Thread Andrey Zagrebin
alright, thanks for confirming this Benchao!

On Thu, Feb 6, 2020 at 6:36 PM Benchao Li  wrote:

> Hi Andrey,
>
> I noticed that 1.10 has changed to enabling background cleanup by default
> just after I posted to this email.
> So it won't affect 1.10 any more, just affect 1.9.x. We can move to the
> Jira ticket to discuss further more.
>
> Andrey Zagrebin  于2020年2月6日周四 下午11:30写道:
>
> > Hi Benchao,
> >
> > Do you observe this issue FLINK-15938 with 1.9 or 1.10?
> > If with 1.9, I suggest to check with 1.10.
> >
> > Thanks,
> > Andrey
> >
> > On Thu, Feb 6, 2020 at 4:07 PM Benchao Li  wrote:
> >
> > > Hi all,
> > >
> > > I found another issue[1], I don't know if it should be a blocker. But
> it
> > > does affects joins without window in blink planner.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-15938
> > >
> > > Jeff Zhang  于2020年2月6日周四 下午5:05写道:
> > >
> > > > Hi Jingsong,
> > > >
> > > > Thanks for the suggestion. It works for running it in IDE, but for
> > > > downstream project like Zeppelin where I will include flink jars in
> > > > classpath.
> > > > it only works when I specify the jars one by one explicitly in
> > classpath,
> > > > using * doesn't work.
> > > >
> > > > e.g.
> > > >
> > > > The following command where I use * to specify classpath doesn't
> work,
> > > > jzhang 4794 1.2 6.3 8408904 1048828 s007 S 4:30PM 0:33.90
> > > >
> > /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
> > > > -Dfile.encoding=UTF-8
> > > >
> > > >
> > >
> >
> -Dlog4j.configuration=file:///Users/jzhang/github/zeppelin/conf/log4j.properties
> > > >
> > > >
> > >
> >
> -Dzeppelin.log.file=/Users/jzhang/github/zeppelin/logs/zeppelin-interpreter-flink-shared_process-jzhang-JeffdeMacBook-Pro.local.log
> > > > -Xms1024m -Xmx2048m -XX:MaxPermSize=512m -cp
> > > >
> > > >
> > >
> >
> *:/Users/jzhang/github/flink/build-target/lib/**:/Users/jzhang/github/zeppelin/interpreter/flink/*:/Users/jzhang/github/zeppelin/zeppelin-interpreter-api/target/*:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes/*::/Users/jzhang/github/zeppelin/interpreter/zeppelin-interpreter-api-0.9.0-SNAPSHOT.jar:/Users/jzhang/github/zeppelin/zeppelin-interpreter/target/classes:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes:/Users/jzhang/github/flink/build-target/opt/flink-python_2.11-1.10-SNAPSHOT.jar
> > > > org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
> 0.0.0.0
> > > > 52395 flink-shared_process :
> > > >
> > > >
> > > > While this command where I specify jar one by one explicitly in
> > classpath
> > > > works
> > > >
> > > > jzhang5660 205.2  6.1  8399420 1015036 s005  R 4:43PM
> > > > 0:24.82
> > > >
> > /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
> > > > -Dfile.encoding=UTF-8
> > > >
> > > >
> > >
> >
> -Dlog4j.configuration=file:///Users/jzhang/github/zeppelin/conf/log4j.properties
> > > >
> > > >
> > >
> >
> -Dzeppelin.log.file=/Users/jzhang/github/zeppelin/logs/zeppelin-interpreter-flink-shared_process-jzhang-JeffdeMacBook-Pro.local.log
> > > > -Xms1024m -Xmx2048m -XX:MaxPermSize=512m -cp
> > > >
> > > >
> > >
> >
> *:/Users/jzhang/github/flink/build-target/lib/slf4j-log4j12-1.7.15.jar:/Users/jzhang/github/flink/build-target/lib/flink-connector-hive_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/hive-exec-2.3.4.jar:/Users/jzhang/github/flink/build-target/lib/flink-shaded-hadoop-2-uber-2.7.5-7.0.jar:/Users/jzhang/github/flink/build-target/lib/log4j-1.2.17.jar:/Users/jzhang/github/flink/build-target/lib/flink-table-blink_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-table_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-dist_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-python_2.11-1.10-SNAPSHOT.jar:/Users/jzhang/github/flink/build-target/lib/flink-hadoop-compatibility_2.11-1.9.1.jar*:/Users/jzhang/github/zeppelin/interpreter/flink/*:/Users/jzhang/github/zeppelin/zeppelin-interpreter-api/target/*:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes/*::/Users/jzhang/github/zeppelin/interpreter/zeppelin-interpreter-api-0.9.0-SNAPSHOT.jar:/Users/jzhang/github/zeppelin/zeppelin-interpreter/target/classes:/Users/jzhang/github/zeppelin/zeppelin-zengine/target/test-classes:/Users/jzhang/github/flink/build-target/opt/flink-python_2.11-1.10-SNAPSHOT.jar
> > > >
> > > >
> > >
> >
> /Users/jzhang/github/flink/build-target/opt/tmp/flink-python_2.11-1.10-SNAPSHOT.jar
> > > > org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
> 0.0.0.0
> > > > 52603 flink-shared_process :
> > > >
> > > >
> > > > Jingsong Li  于2020年2月6日周四 下午4:10写道:
> > > >
> > > > > Hi Jeff,
> > > > >
> > > > >
> > > > > For FLINK-15935 [1],
> > > > >
> > > > > I try to think of it as a non blocker. But it's really an important
> > > > issue.
> > > > >
> > > > >
> > > > > The problem is the class loading order. We want to load the class
>

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-06 Thread Jeff Zhang
I have another concern which may not be closely related to this thread.
Since Flink doesn't include all the necessary jars, I think it is critical
for Flink to display a meaningful error message when any class is missing,
e.g. here's the error message when I use Kafka but forget to include
flink-json. To be honest, this kind of error message is hard to understand
for new users.


Reason: No factory implements
'org.apache.flink.table.factories.DeserializationSchemaFactory'.

The following properties are requested:
connector.properties.bootstrap.servers=localhost:9092
connector.properties.group.id=testGroup
connector.properties.zookeeper.connect=localhost:2181
connector.startup-mode=earliest-offset
connector.topic=generated.events
connector.type=kafka
connector.version=universal
format.type=json
schema.0.data-type=VARCHAR(2147483647)
schema.0.name=status
schema.1.data-type=VARCHAR(2147483647)
schema.1.name=direction
schema.2.data-type=BIGINT
schema.2.name=event_ts
update-mode=append

The following factories have been considered:
org.apache.flink.table.catalog.hive.factories.HiveCatalogFactory
org.apache.flink.table.module.hive.HiveModuleFactory
org.apache.flink.table.module.CoreModuleFactory
org.apache.flink.table.catalog.GenericInMemoryCatalogFactory
org.apache.flink.table.sources.CsvBatchTableSourceFactory
org.apache.flink.table.sources.CsvAppendTableSourceFactory
org.apache.flink.table.sinks.CsvBatchTableSinkFactory
org.apache.flink.table.sinks.CsvAppendTableSinkFactory
org.apache.flink.table.planner.delegation.BlinkPlannerFactory
org.apache.flink.table.planner.delegation.BlinkExecutorFactory
org.apache.flink.table.planner.StreamPlannerFactory
org.apache.flink.table.executor.StreamExecutorFactory
org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory

at org.apache.flink.table.factories.TableFactoryService.filterByFactoryClass(TableFactoryService.java:238)
at org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryService.java:185)
at org.apache.flink.table.factories.TableFactoryService.findSingleInternal(TableFactoryService.java:143)
at org.apache.flink.table.factories.TableFactoryService.find(TableFactoryService.java:113)
at org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactoryBase.getDeserializationSchema(KafkaTableSourceSinkFactoryBase.java:277)
at org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactoryBase.createStreamTableSource(KafkaTableSourceSinkFactoryBase.java:161)
at org.apache.flink.table.factories.StreamTableSourceFactory.createTableSource(StreamTableSourceFactory.java:49)
at org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSource(TableFactoryUtil.java:53)
... 36 more



Till Rohrmann  于2020年2月6日周四 下午11:30写道:

> I would not object given that it is rather small at the moment. However, I
> also think that we should have a plan how to handle the ever growing Flink
> ecosystem and how to make it easily accessible to our users. E.g. one far
> fetched idea could be something like a configuration script which downloads
> the required components for the user. But this deserves definitely a
> separate discussion and does not really belong here.
>
> Cheers,
> Till
>
> On Thu, Feb 6, 2020 at 3:35 PM Hequn Cheng  wrote:
>
> >
> > Hi everyone,
> >
> > Thank you all for the great inputs!
> >
> > I think probably what we all agree on is we should try to make a leaner
> > flink-dist. However, we may also need to do some compromises considering
> > the user experience that users don't need to download the dependencies
> from
> > different places. Otherwise, we can move all the jars in the current opt
> > folder to the download page.
> >
> > The missing of clear rules for guiding such compromises makes things more
> > complicated now. I would agree that the decisive factor for what goes
> into
> > Flink's binary distribution should be how core it is to Flink. Meanwhile,
> > it's better to treat Flink API as a (core) core to Flink. Not only it is
> a
> > very clear rule that easy to be followed but also in most cases, API is
> > very significant and deserved to be included in the dist.
> >
> > Given this, it might make sense to put flink-ml-api and flink-ml-lib into
> > the opt.
> > What do you think?
> >
> > Best,
> > Hequn
> >
> > On Wed, Feb 5, 2020 at 12:39 AM Chesnay Schepler 
> > wrote:
> >
> >> Around a year ago I started a discussion
> >> <
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Towards-a-leaner-flink-dist-tp25615.html
> >
> >> on reducing the amount of jars we ship with the distribution.
> >>
> >> While there was no definitive conclusion there was a shared sentiment
> >> that APIs should be shipped with the distribution.
> >>
> >> On 04/02/2020 17:25, Till Rohrmann wrote:
> >>
> >> I think there is no such rule that APIs go automatically into opt/ and
> >> "libraries" not. The contents of opt/ have mainly grown over time w/o
> >> following a strict rule.
> >>
> >> I think the decisive f

[jira] [Created] (FLINK-15942) Improve logging of infinite resource profile

2020-02-06 Thread Andrey Zagrebin (Jira)
Andrey Zagrebin created FLINK-15942:
---

 Summary: Improve logging of infinite resource profile
 Key: FLINK-15942
 URL: https://issues.apache.org/jira/browse/FLINK-15942
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration, Runtime / Task
Affects Versions: 1.10.0
Reporter: Andrey Zagrebin


After we set task memory and CPU to infinity in FLINK-15763, it spoiled the 
logs:
{code:java}
00:23:49,442 INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot TaskSlot(index:0, state:ACTIVE, resource profile: ResourceProfile{cpuCores=44942328371557892500., taskHeapMemory=2097152.000tb (2305843009213693951 bytes), taskOffHeapMemory=2097152.000tb (2305843009213693951 bytes), managedMemory=20.000mb (20971520 bytes), networkMemory=16.000mb (16777216 bytes)}, allocationId: 349dacfbf1ac4d0b44a2d11e1976d264, jobId: 689a0cf24b40f16b6f45157f78754c46).
{code}
We should treat infinity as a special case and print it accordingly.
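
For illustration only, a minimal sketch of the kind of special-casing meant here (the helper name and the sentinel handling are assumptions, not the actual Flink code):

{code:java}
// Sketch: render a sentinel "maximum" byte value as "infinite" instead of the raw number.
static String formatBytes(long bytes, long infiniteSentinel) {
    if (bytes >= infiniteSentinel) {
        return "infinite";
    }
    return String.format("%.3f mb (%d bytes)", bytes / (1024.0 * 1024.0), bytes);
}
{code}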



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15943) Rowtime field name cannot be the same as the JSON field

2020-02-06 Thread gkgkgk (Jira)
gkgkgk created FLINK-15943:
--

 Summary: Rowtime field name cannot be the same as the JSON field
 Key: FLINK-15943
 URL: https://issues.apache.org/jira/browse/FLINK-15943
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API, Table SQL / Planner
Affects Versions: 1.9.0
Reporter: gkgkgk


Run the following SQL:

-- sql start 
--source
CREATE TABLE dwd_user_log (
  id VARCHAR,
  ctime TIMESTAMP,
  sessionId VARCHAR,
  pageId VARCHAR,
  eventId VARCHAR,
  deviceId Decimal
) WITH (
  'connector.type' = 'kafka',
  'connector.version' = 'universal',
  'connector.topic' = 'dev_dwd_user_log_02',
  'connector.startup-mode' = 'latest-offset',
  'connector.properties.0.key' = 'zookeeper.connect',
  'connector.properties.0.value' = 'node01:2181',
  'connector.properties.1.key' = 'bootstrap.servers',
  'connector.properties.1.value' = 'node01:9092',
  'connector.properties.2.key' = 'group.id',
  'connector.properties.2.value' = 'dev-group',
  'update-mode' = 'append',
  'format.type' = 'json',
  -- 'format.derive-schema' = 'true',
  'format.json-schema' = '{
"type": "object",
"properties": {
  "id": {
  "type": "string"
  },
  "ctime": {
  "type": "string",
  "format": "date-time"
  },
  "pageId": {
  "type": "string"
  },
  "eventId": {
  "type": "string"
  },
  "sessionId": {
  "type": "string"
  },
  "deviceId": {
  "type": "number"
  }
}
  }',
  'schema.1.rowtime.timestamps.type' = 'from-field',
  'schema.1.rowtime.timestamps.from' = 'ctime',
  'schema.1.rowtime.watermarks.type' = 'periodic-bounded',
  'schema.1.rowtime.watermarks.delay' = '1'
);  


-- sink
-- sink for pv
CREATE TABLE dws_pv (
windowStart TIMESTAMP,
  windowEnd TIMESTAMP,
  pageId VARCHAR,
  viewCount BIGINT
) WITH (
  'connector.type' = 'kafka',
  'connector.version' = 'universal',
  'connector.topic' = 'dev_dws_pvuv_02',
  'connector.startup-mode' = 'latest-offset',
  'connector.properties.0.key' = 'zookeeper.connect',
  'connector.properties.0.value' = 'node01:2181',
  'connector.properties.1.key' = 'bootstrap.servers',
  'connector.properties.1.value' = 'node01:9092',
  'connector.properties.2.key' = 'group.id',
  'connector.properties.2.value' = 'dev-group',
  'update-mode' = 'append',
  'format.type' = 'json',
  'format.derive-schema' = 'true'
);

-- pv
INSERT INTO dws_pv
SELECT
  TUMBLE_START(ctime, INTERVAL '20' SECOND)  AS windowStart,
  TUMBLE_END(ctime, INTERVAL '20' SECOND)  AS windowEnd,
  pageId,
  COUNT(deviceId) AS viewCount
FROM dwd_user_log
GROUP BY TUMBLE(ctime, INTERVAL '20' SECOND),pageId;
-- sql end
And hit the following error:
{code:java}
Exception in thread "main" org.apache.flink.table.api.ValidationException: Field 'ctime' could not be resolved by the field mapping.
    at org.apache.flink.table.planner.sources.TableSourceUtil$.org$apache$flink$table$planner$sources$TableSourceUtil$$resolveInputField(TableSourceUtil.scala:357)
    at org.apache.flink.table.planner.sources.TableSourceUtil$$anonfun$org$apache$flink$table$planner$sources$TableSourceUtil$$resolveInputFields$1.apply(TableSourceUtil.scala:388)
    at org.apache.flink.table.planner.sources.TableSourceUtil$$anonfun$org$apache$flink$table$planner$sources$TableSourceUtil$$resolveInputFields$1.apply(TableSourceUtil.scala:388)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at org.apache.flink.table.planner.sources.TableSourceUtil$.org$apache$flink$table$planner$sources$TableSourceUtil$$resolveInputFields(TableSourceUtil.scala:388)
    at org.apache.flink.table.planner.sources.TableSourceUtil$$anonfun$getRowtimeExtractionExpression$1.apply(TableSourceUtil.scala:275)
    at org.apache.flink.table.planner.sources.TableSourceUtil$$anonfun$getRowtimeExtractionExpression$1.apply(TableSourceUtil.scala:270)
    at scala.Option.map(Option.scala:146)
    at org.apache.flink.table.planner.sources.TableSourceUtil$.getRowtimeExtractionExpression(TableSourceUtil.scala:270)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlanInternal(StreamExecTableSourceScan.scala:117)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlanInternal(StreamExecTableSourceScan.scala:55)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-02-06 Thread Gyula Fóra
Hi Jeff & Till!

Thanks for the feedback, this is exactly the discussion I was looking for.
The JobListener looks very promising if we can expose the JobGraph somehow
(correct me if I am wrong but it is not accessible at the moment).

I did not know about this feature; that's why I added my JobSubmission hook,
which was pretty similar but only exposed the JobGraph. In general I like
the listener better and I would not like to add anything extra if we can
avoid it.

Actually, the bigger part of the integration work that will need more
changes in Flink concerns the accessibility of sources/sinks from
the JobGraph and their specific properties. For instance, at the moment the
Kafka sources and sinks do not expose anything publicly, such as topics,
Kafka configs, etc. The same goes for other data connectors that we need to
integrate in the long run. I guess there will be a separate thread on this
once we iron out the initial integration points :)

I will try to play around with the JobListener interface tomorrow and see
if I can extend it to meet our needs.
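
Roughly the shape I have in mind, as a sketch only (imports omitted; the atlasHook call is just a placeholder, not an existing API):

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.registerJobListener(new JobListener() {

    @Override
    public void onJobSubmitted(JobClient jobClient, Throwable throwable) {
        if (jobClient != null) {
            // placeholder: ship the pre-built Atlas entity, keyed by the job id
            atlasHook.reportSubmitted(jobClient.getJobID());
        }
    }

    @Override
    public void onJobExecuted(JobExecutionResult jobExecutionResult, Throwable throwable) {
        // not needed for the metadata use case
    }
});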

Cheers,
Gyula

On Thu, Feb 6, 2020 at 4:08 PM Jeff Zhang  wrote:

> Hi Gyula,
>
> Flink 1.10 introduced JobListener which is invoked after job submission and
> finished.  May we can add api on JobClient to get what info you needed for
> altas integration.
>
>
> https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/execution/JobListener.java#L46
>
>
> Gyula Fóra  于2020年2月5日周三 下午7:48写道:
>
> > Hi all!
> >
> > We have started some preliminary work on the Flink - Atlas integration at
> > Cloudera. It seems that the integration will require some new hook
> > interfaces at the jobgraph generation and submission phases, so I
> figured I
> > will open a discussion thread with my initial ideas to get some early
> > feedback.
> >
> > *Minimal background*
> > Very simply put Apache Atlas is a data governance framework that stores
> > metadata for our data and processing logic to track ownership, lineage
> etc.
> > It is already integrated with systems like HDFS, Kafka, Hive and many
> > others.
> >
> > Adding Flink integration would mean that we can track the input output
> data
> > of our Flink jobs, their owners and how different Flink jobs are
> connected
> > to each other through the data they produce (lineage). This seems to be a
> > very big deal for a lot of companies :)
> >
> > *Flink - Atlas integration in a nutshell*
> > In order to integrate with Atlas we basically need 2 things.
> >  - Flink entity definitions
> >  - Flink Atlas hook
> >
> > The entity definition is the easy part. It is a json that contains the
> > objects (entities) that we want to store for any give Flink job. As a
> > starter we could have a single FlinkApplication entity that has a set of
> > inputs and outputs. These inputs/outputs are other Atlas entities that
> are
> > already defines such as Kafka topic or Hbase table.
> >
> > The Flink atlas hook will be the logic that creates the entity instance
> and
> > uploads it to Atlas when we start a new Flink job. This is the part where
> > we implement the core logic.
> >
> > *Job submission hook*
> > In order to implement the Atlas hook we need a place where we can inspect
> > the pipeline, create and send the metadata when the job starts. When we
> > create the FlinkApplication entity we need to be able to easily determine
> > the sources and sinks (and their properties) of the pipeline.
> >
> > Unfortunately there is no JobSubmission hook in Flink that could execute
> > this logic and even if there was one there is a mismatch of abstraction
> > levels needed to implement the integration.
> > We could imagine a JobSubmission hook executed in the JobManager runner
> as
> > this:
> >
> > void onSuccessfulSubmission(JobGraph jobGraph, Configuration
> > configuration);
> >
> > This is nice but the JobGraph makes it super difficult to extract sources
> > and UDFs to create the metadata entity. The atlas entity however could be
> > easily created from the StreamGraph object (used to represent the logical
> > flow) before the JobGraph is generated. To go around this limitation we
> > could add a JobGraphGeneratorHook interface:
> >
> > void preProcess(StreamGraph streamGraph); void postProcess(JobGraph
> > jobGraph);
> >
> > We could then generate the atlas entity in the preprocess step and add a
> > job submission hook in the postprocess step that will simply send the already
> > baked in entity.
> >
> > *This kinda works but...*
> > The approach outlined above seems to work and we have built a POC using
> it.
> > Unfortunately it is far from nice as it exposes non-public APIs such as
> the
> > StreamGraph. Also it feels a bit weird to have 2 hooks instead of one.
> >
> > It would be much nicer if we could somehow go back from JobGraph to
> > StreamGraph or at least have an easy way to access source/sink UDFS.
> >
> > What do you think?
> >
> > Cheers,
> > Gyula
> >
>
>
> --
> Best Regards
>
> Jeff Zhang
>


[jira] [Created] (FLINK-15944) Resolve the potential class conflict problem when depending on both planners

2020-02-06 Thread Jark Wu (Jira)
Jark Wu created FLINK-15944:
---

 Summary: Resolve the potential class conflict problem when depending on both planners
 Key: FLINK-15944
 URL: https://issues.apache.org/jira/browse/FLINK-15944
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Legacy Planner, Table SQL / Planner
Reporter: Jark Wu


FLINK-15935 raised the potential class conflict problem that occurs when a 
project/application depends on the old planner and the blink planner at the same time. 
Currently, Calcite classes (copied to fix Calcite bugs) and 
{{PlannerExpressionParserImpl}} exist under the same class path in both planners 
and may lead to problems when the copies differ.

Currently, we keep these classes in sync in both planners manually. However, 
this is not safe and is error-prone. We should figure out a solution for this (we 
can't remove the old planner in the near future).

A viable solution is to have a {{flink-table-planner-common}} module that keeps the 
classes commonly used by both planners.
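
As a side note for anyone debugging this locally, one plain-JDK way to check which jar a conflicting class was actually loaded from is sketched below (the class name is only an example and should be adjusted to the class in question):

{code:java}
public class WhichJar {
    public static void main(String[] args) throws Exception {
        // Adjust to the class suspected of conflicting between the two planner jars.
        String conflictingClass = "org.apache.flink.table.expressions.PlannerExpressionParserImpl";
        Class<?> clazz = Class.forName(conflictingClass);
        // Prints the jar (code source) the class was actually loaded from.
        System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());
    }
}
{code}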



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Does removing deprecated interfaces needs another FLIP

2020-02-06 Thread Kurt Young
Hi dev,

Currently I want to remove some already deprecated methods from
TableEnvironment which are annotated with @PublicEvolving. I also created
a discussion thread [1] on both the dev and user mailing lists to gather
feedback on that, but I didn't find any matching rule in the Flink bylaws [2] to
follow. This is definitely an API-breaking change, but we already
voted for it back in the FLIP which deprecated these methods.

I'm not sure how to proceed for now. Looks like I have 2 choices:

1. If no one raises any objections in the discussion thread within, say, 72 hours, I
will create a JIRA issue and start working on it.
2. Since this is an API-breaking change, I open another FLIP stating
that I want to remove these deprecated methods. This seems a little
redundant with the first FLIP, which deprecated the methods.

What do you think?

Best,
Kurt

[1]
https://lists.apache.org/thread.html/r98af66feb531ce9e6b94914e44391609cad857e16ea84db5357c1980%40%3Cdev.flink.apache.org%3E
[2] https://cwiki.apache.org/confluence/display/FLINK/Flink+Bylaws


[jira] [Created] (FLINK-15945) Remove MULTIPLEX_FLINK_STATE config from Stateful Functions

2020-02-06 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-15945:
---

 Summary: Remove MULTIPLEX_FLINK_STATE config from Stateful 
Functions
 Key: FLINK-15945
 URL: https://issues.apache.org/jira/browse/FLINK-15945
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Affects Versions: statefun-1.1
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai


Currently, Stateful Functions support a {{MULTIPLEX_FLINK_STATE}} configuration 
that, when enabled, multiplexes registered state of all function types as a 
single {{MapState}}.

This is mostly to allow feasible use of the RocksDB state backend with Stateful 
Functions, since without multiplexing state, each {{PersistedValue}} of a 
single function type would require a separate column family in RocksDB and 
64 MB of memtable memory.

As of now, Stateful Functions allow enabling or disabling state multiplexing 
regardless of the state backend being used. This actually does not make sense, 
considering that:

* If the heap backend is used, there is no reason to multiplex state, especially 
since multiplexing has its own problems, such as lack of support for state 
schema evolution.
* If the RocksDB backend is used, multiplexing state is highly encouraged anyway.

Therefore, we propose to remove the {{MULTIPLEX_FLINK_STATE}} configuration and 
simply tie the behaviour of multiplexing state to the state backend being used, 
i.e. multiplex only when the RocksDB state backend is used.
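
For context, multiplexing here conceptually means backing many logical persisted values with one Flink {{MapState}} keyed by the state name, roughly like the sketch below (to be read as code inside a keyed RichFunction; types, names and the serialize call are illustrative, not the actual Stateful Functions internals):

{code:java}
// Illustrative only: one MapState backs many logical "persisted values",
// so RocksDB needs a single column family instead of one per value.
MapStateDescriptor<String, byte[]> descriptor =
        new MapStateDescriptor<>("multiplexed-state", String.class, byte[].class);
MapState<String, byte[]> multiplexed = getRuntimeContext().getMapState(descriptor);

multiplexed.put("seen-count", serialize(count));   // serialize(...) is a placeholder
byte[] raw = multiplexed.get("seen-count");
{code}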



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15946) Task manager Kubernetes pods take a long time to terminate

2020-02-06 Thread Andrey Zagrebin (Jira)
Andrey Zagrebin created FLINK-15946:
---

 Summary: Task manager Kubernetes pods take a long time to terminate
 Key: FLINK-15946
 URL: https://issues.apache.org/jira/browse/FLINK-15946
 Project: Flink
  Issue Type: Bug
  Components: Deployment / Kubernetes, Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Andrey Zagrebin


The problem is initially described in this [ML 
thread|https://mail-archives.apache.org/mod_mbox/flink-user/202002.mbox/browser].

We should investigate whether, and if so why, the TM pod killing/shutdown is 
delayed by attempts to reconnect to the terminated JM.

cc [~fly_in_gis]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)