[jira] [Created] (FLINK-14165) HadoopRecoverableWriter does not support viewfs scheme

2019-09-23 Thread haoyuwen (Jira)
haoyuwen created FLINK-14165:


 Summary: HadoopRecoverableWriter does not support viewfs scheme
 Key: FLINK-14165
 URL: https://issues.apache.org/jira/browse/FLINK-14165
 Project: Flink
  Issue Type: Bug
  Components: Connectors / FileSystem
Affects Versions: 1.9.0
Reporter: haoyuwen


HadoopRecoverableWriter restricts the file system scheme to hdfs, so it cannot be used with viewfs.
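
For illustration, here is a minimal sketch of the kind of scheme precondition at
issue and a possible relaxation. This is simplified, not the actual code in
HadoopRecoverableWriter; the class and method names are invented.

    import org.apache.hadoop.fs.FileSystem;

    // Simplified sketch of the scheme check this issue is about.
    final class SchemeCheck {

        // Roughly the current behavior: only the "hdfs" scheme passes, so
        // viewfs:// paths are rejected even though viewfs merely federates
        // HDFS namespaces behind one logical scheme.
        static void checkSupportedScheme(FileSystem fs) {
            String scheme = fs.getUri().getScheme();
            if (!"hdfs".equalsIgnoreCase(scheme)) {
                throw new UnsupportedOperationException(
                        "Recoverable writers on Hadoop are only supported for HDFS");
            }
        }

        // A possible fix: also accept "viewfs", which resolves to HDFS mounts.
        static void checkSupportedSchemeRelaxed(FileSystem fs) {
            String scheme = fs.getUri().getScheme();
            if (!"hdfs".equalsIgnoreCase(scheme) && !"viewfs".equalsIgnoreCase(scheme)) {
                throw new UnsupportedOperationException(
                        "Recoverable writers on Hadoop support only HDFS and viewfs");
            }
        }
    }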



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14166) Reuse cache from previous history server run

2019-09-23 Thread David Moravek (Jira)
David Moravek created FLINK-14166:
-

 Summary: Reuse cache from previous history server run
 Key: FLINK-14166
 URL: https://issues.apache.org/jira/browse/FLINK-14166
 Project: Flink
  Issue Type: Improvement
Reporter: David Moravek


Currently the history server is not able to reuse the cache from a previous run, 
even when `historyserver.web.tmpdir` is set. It could simply "warm up" the set of 
cached job IDs from previously parsed jobs.

https://github.com/apache/flink/blob/master/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/history/HistoryServerArchiveFetcher.java#L129

This should be configurable, so it does not break backward compatibility.
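
A minimal sketch of what the warm-up could look like, assuming the job
directories left under `historyserver.web.tmpdir` are named by job ID (the
class and method names below are illustrative, not Flink's actual API):

    import java.io.File;
    import java.util.HashSet;
    import java.util.Set;

    final class HistoryServerCacheWarmup {

        // Seed the "already fetched" set from a previous run's directories so
        // those archives are not downloaded and parsed again.
        static Set<String> warmUpCachedJobIds(File webJobDir) {
            Set<String> cachedJobIds = new HashSet<>();
            File[] jobDirs = webJobDir.listFiles(File::isDirectory);
            if (jobDirs != null) {
                for (File jobDir : jobDirs) {
                    cachedJobIds.add(jobDir.getName());
                }
            }
            return cachedJobIds;
        }
    }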



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14167) Move python-related scripts in flink-dist to flink-python

2019-09-23 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-14167:
-

 Summary: Move python-related scripts in flink-dist to flink-python
 Key: FLINK-14167
 URL: https://issues.apache.org/jira/browse/FLINK-14167
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: Wei Zhong


Currently, the scripts "pyflink-gateway-server.sh", "pyflink-shell.sh" and 
"pyflink-udf-runner.sh" are stored in the flink-dist module. The modules 
flink-scala-shell and flink-sql-client now store their scripts in their own 
module directories instead of flink-dist. It would be better to move the 
flink-python related scripts from flink-dist to flink-python as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-23 Thread Stephan Ewen
Okay, I see your point, Becket.

Let us then prominently link the Pulsar connector from the Flink connector
docs, so that users can find it easily.

As soon as FLIP-27 is done, we will reach out to the Pulsar folks to contribute a
new connector.

On Mon, Sep 23, 2019 at 3:11 AM Becket Qin  wrote:

> Hi Stephan,
>
> I have no doubt about the value of adding Pulsar connector to Flink repo.
> My concern is about how exactly we are going to do it.
>
> > As mentioned before, I believe that we can handle connectors more
> > pragmatically and less strictly than the core of Flink, if it helps
> > unlocking users.
>
> I can see the benefit of being less strict for the initial connector code
> adoption. However, I don't think we should be less strict on the
> maintenance commitment once the code is in Flink repo. It only makes sense
> to check in something and ask users to use it if we plan to maintain it.
>
> If I understand correctly, the current plan so far is the following:
> 1. release 1.10
>- Check in Pulsar connector on old interface and label it as beta
> version.
>- encourage users to try it and report bugs.
> 2. release 1.11
>- Check in Pulsar connector on new interface (a.k.a new Pulsar
> connector) and label it as beta version
>- Deprecate the old Pulsar connector
>- Fix bugs reported on old Pulsar connector from release 1.10
>- Ask users to migrate from old Pulsar connector to new Pulsar connector
> 3. release 1.12
>- Announce end of support for old Pulsar connector and remove the code
>- Fix bugs reported on new Pulsar connector.
>
> If this is the plan, it seems neither Flink nor the users trying the old
> Pulsar connector will benefit from this experimental old Pulsar connector,
> because whatever feedback we get or bugs we fix on the old Pulsar
> connector are immediately thrown away in one or two releases.
>
> If we check in the old Pulsar connector right now, the only option I see is
> to maintain it for a while (e.g. a year or more). IMO, the immediate
> deprecation and code removal hurts the users much more than asking them to
> wait for another release. I personally think that we can avoid this
> maintenance burden by going directly to the new Pulsar connector,
> especially given that users can still use the connector even if it is
> not in the Flink repo. That said, I am OK with maintaining both old and new
> Pulsar connector if we believe that having the Pulsar connector available
> right now in Flink repo is more important.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Sun, Sep 22, 2019 at 9:10 PM Stephan Ewen  wrote:
>
> > My assumption is the same as Sijie's: the connector would be either part of
> > Flink or part of the streamnative repo. No double maintenance.
> >
> > I feel this discussion is very much caught in problems that are all
> > solvable if we want to solve them.
> > Maybe we can think about what our goal for the users and the communities is?
> >
> >   - Do we want to help build a relationship between the Pulsar and Flink
> > open source communities?
> >   - Will users find a connector in the streamnative repository?
> >   - Will users trust a connector that is not part of Flink as much?
> >
> > And then decide what is best according to the overall goals there.
> > As mentioned before, I believe that we can handle connectors more
> > pragmatically and less strictly than the core of Flink, if it helps
> > unlocking users.
> >
> > Best,
> > Stephan
> >
> >
> >
> > On Fri, Sep 20, 2019 at 2:10 PM Sijie Guo  wrote:
> >
> > > Thanks Becket.
> > >
> > > I think it is better for the Flink community to judge the benefits of
> > doing
> > > this. I was trying to provide some views from outsiders.
> > >
> > > Thanks,
> > > Sijie
> > >
> > > On Fri, Sep 20, 2019 at 10:25 AM Becket Qin 
> > wrote:
> > >
> > > > Hi Sijie,
> > > >
> > > > Yes, we will have to support existing old connectors and new
> connectors
> > > in
> > > > parallel for a while. We have to take that maintenance overhead
> because
> > > > existing connectors have been used by the users for a long time. I
> > > > guess it may take at least a year for us to fully remove the old
> > > > connectors.
> > > >
> > > > Process wise, we can do the same for Pulsar connector. But I am not
> > sure
> > > if
> > > > we want to have the same burden on Pulsar connector, and I would like
> > to
> > > > understand the benefit of doing that.
> > > >
> > > > For users, the benefit of having the old Pulsar connector checked in
> > > seems
> > > > limited because 1) that code base will be immediately deprecated in
> the
> > > > next release in 3-4 months; 2) users can always use it even if it is
> > > > not in the Flink code base. Admittedly it is not as convenient as
> > > > having it in the Flink code base, but it doesn't seem like a big hurdle
> > > > either. And after 3-4 months, users can just use the new connector in
> > > > the Flink repo.
> > > >
> > > > For Flink developers, the old connector code base is not something
> that
> > > we

Re: [DISCUSS] FLIP-63: Rework table partition support

2019-09-23 Thread JingsongLee
Thanks for your discussion on the Google document.
I have addressed the comments, added a FileSystem connector chapter, and 
introduced a code prototype for the file system connector to unify the Flink 
file system and Hive connectors.

Looking forward to your feedback. Thank you.

Best,
Jingsong Lee
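
For readers skimming the thread: the static and dynamic partition inserts
discussed in the FLIP would look roughly like the sketch below. This follows
the proposed Hive-style syntax only; the table and columns are made up, and the
syntax was not yet implemented at the time.

    // A sketch of the proposed Hive-style partition insert syntax (table and
    // columns invented; not yet supported at the time of this thread).
    public final class PartitionInsertExamples {

        public static void main(String[] args) {
            // Static partition insert: all partition columns fixed up front.
            String staticInsert =
                    "INSERT INTO sales PARTITION (dt = '2019-09-23', country = 'US') "
                            + "SELECT user_id, amount FROM orders";

            // Dynamic partition insert: 'country' is derived from each row.
            String dynamicInsert =
                    "INSERT INTO sales PARTITION (dt = '2019-09-23') "
                            + "SELECT user_id, amount, country FROM orders";

            System.out.println(staticInsert);
            System.out.println(dynamicInsert);
        }
    }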


--
From:JingsongLee 
Send Time: Wed, Sep 18, 2019 09:45
To:Kurt Young ; dev 
Subject:Re: [DISCUSS] FLIP-63: Rework table partition support

Thanks for your reply and google doc comments. It has been discussed
 for two weeks now. I will start a vote thread.

Best,
Jingsong Lee


--
From:Kurt Young 
Send Time: Mon, Sep 16, 2019 15:55
To:dev 
Cc:JingsongLee 
Subject:Re: [DISCUSS] FLIP-63: Rework table partition support

+1 to this feature, I left some comments on google doc.

Another comment: I think we should reorganize the content
when you convert this to a cwiki page. I will have some offline discussion 
with you.

Since this feature seems to be a fairly big effort, I suggest we settle
down the design doc ASAP and start the vote process.

Best,
Kurt


On Thu, Sep 12, 2019 at 12:43 PM Biao Liu  wrote:
Hi Jingsong,

 Thanks for explaining. It looks cool!

 Thanks,
 Biao /'bɪ.aʊ/



 On Wed, 11 Sep 2019 at 11:37, JingsongLee 
 wrote:

 > Hi Biao, thanks for your feedback:
 >
 > Actually, the source partition at runtime is similar to a split,
 > which concerns data reading, parallelism and fault tolerance, all of
 > which are runtime concepts.
 > A table partition, in contrast, is only a virtual concept. Users are more
 > likely to choose which partition to read and which partition to write, and
 > they can manage their partitions.
 > One relates to the physical implementation, the other to a logical
 > concept.
 > So I think they are two completely different things.
 >
 > About [2]: the main problem is how to write data to a catalog file
 > system in streaming mode; it is a general problem and has little to do with
 > partitions.
 >
 > [2]
 > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html
 >
 > Best,
 > Jingsong Lee
 >
 >
 > --
 > From:Biao Liu 
 > Send Time: Tue, Sep 10, 2019 14:57
 > To:dev ; JingsongLee 
 > Subject:Re: [DISCUSS] FLIP-63: Rework table partition support
 >
 > Hi Jingsong,
 >
 > Thank you for bringing this discussion. Since I don't have much experience
 > of Flink table/SQL, I'll ask some questions from runtime or engine
 > perspective.
 >
 > > ... where we describe how to support partitions in Flink and how to
 > > integrate with Hive partitions.
 >
 > FLIP-27 [1] introduces the "partition" concept officially. The changes of
 > FLIP-27 are not only about the source interface but also about the whole
 > infrastructure.
 > Have you thought about how to integrate your proposal with these changes?
 > Or do you just want to support "partition" in the table layer, with no
 > requirements on the underlying infrastructure?
 >
 > I have seen a discussion [2] that seems to be a requirement on the
 > infrastructure to support your proposal. So I have some concerns that there
 > might be some conflicts between this proposal and FLIP-27.
 >
 > 1.
 > https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
 > 2.
 > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html
 >
 > Thanks,
 > Biao /'bɪ.aʊ/
 >
 >
 >
 > On Fri, 6 Sep 2019 at 13:22, JingsongLee 
 > wrote:
 > Hi everyone, thank you for your comments. The mail subject was updated
 >  and streaming-related concepts were added.
 >
 >  We would like to start a discussion thread on "FLIP-63: Rework table
 >  partition support" (Design doc: [1]), where we describe how to support
 >  partitions in Flink and how to integrate with Hive partitions.
 >
 >  This FLIP addresses:
 > - Introduce the whole story of partition support.
 > - Introduce and discuss the DDL of partition support.
 > - Introduce static and dynamic partition insert.
 > - Introduce partition pruning.
 > - Introduce the dynamic partition implementation.
 > - Introduce FileFormatSink to deal with streaming exactly-once and
 >   partition-related logic.
 >
 >  Details can be seen in the design document.
 >  Looking forward to your feedbacks. Thank you.
 >
 >  [1]
 > https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=sharing
 >
 >  Best,
 >  Jingsong Lee
 >
 >



Re: Confluence permission for FLIP creation

2019-09-23 Thread Fabian Hueske
Hi Forward,

I gave you the permissions.

Best, Fabian

On Sun, Sep 22, 2019 at 13:03, Forward Xu <
forwardxu...@gmail.com> wrote:

> Hi devs,
>
> I'd like to create a page about the Support SQL 2016-2017 JSON functions in
> Flink SQL
> <
> https://docs.google.com/document/d/1JfaFYIFOAY8P2pFhOYNCQ9RTzwF4l85_bnTvImOLKMk/edit?ts=5d84d314#heading=h.76mb88ca6yjp
> >
> FLIP. Could you grant me Confluence permission for FLIP creation?
>
> My Confluence ID is forwardxu.
>
> Best,
> Forward.
>


[jira] [Created] (FLINK-14168) Remove unused BootstrapTools#generateTaskManagerConfiguration

2019-09-23 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14168:
---

 Summary: Remove unused 
BootstrapTools#generateTaskManagerConfiguration
 Key: FLINK-14168
 URL: https://issues.apache.org/jira/browse/FLINK-14168
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Zhu Zhu
 Fix For: 1.10.0


{{BootstrapTools#generateTaskManagerConfiguration}} is not used anymore, yet 
it adds a {{scala.concurrent.duration.FiniteDuration}} dependency to 
{{BootstrapTools}}.
I think we can remove it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14169) Cleanup expired jobs from history server

2019-09-23 Thread David Moravek (Jira)
David Moravek created FLINK-14169:
-

 Summary: Cleanup expired jobs from history server
 Key: FLINK-14169
 URL: https://issues.apache.org/jira/browse/FLINK-14169
 Project: Flink
  Issue Type: Improvement
Reporter: David Moravek


Clean up jobs that are no longer in the history refresh locations during 
JobArchiveFetcher::run.

https://github.com/apache/flink/blob/master/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/history/HistoryServerArchiveFetcher.java#L138
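
A sketch of what the cleanup could look like at the end of a fetch run. The
names are assumptions for illustration, not Flink's actual API: any locally
cached job that no longer appears in the refresh locations has expired
upstream, so its web directory can be removed.

    import java.io.File;
    import java.util.Iterator;
    import java.util.Set;

    final class ExpiredJobCleanup {

        static void cleanupExpiredJobs(
                Set<String> jobsSeenThisRun, Set<String> cachedJobIds, File webJobDir) {
            Iterator<String> it = cachedJobIds.iterator();
            while (it.hasNext()) {
                String jobId = it.next();
                if (!jobsSeenThisRun.contains(jobId)) {
                    // The archive disappeared upstream: drop it locally too.
                    it.remove();
                    deleteRecursively(new File(webJobDir, jobId));
                }
            }
        }

        private static void deleteRecursively(File file) {
            File[] children = file.listFiles();
            if (children != null) {
                for (File child : children) {
                    deleteRecursively(child);
                }
            }
            file.delete();
        }
    }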



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14170) Support Hadoop < 2.7 with StreamingFileSink.BulkFormatBuilder

2019-09-23 Thread Bhagavan (Jira)
Bhagavan created FLINK-14170:


 Summary: Support Hadoop < 2.7 with 
StreamingFileSink.BulkFormatBuilder
 Key: FLINK-14170
 URL: https://issues.apache.org/jira/browse/FLINK-14170
 Project: Flink
  Issue Type: Improvement
  Components: API / DataSet
Affects Versions: 1.9.0, 1.8.2, 1.8.1, 1.8.0
Reporter: Bhagavan


Currently, StreamingFileSink is supported only with Hadoop >= 2.7, irrespective 
of the row/bulk format builder. This restriction exists because truncate is not 
supported in Hadoop < 2.7.

However, BulkFormatBuilder does not use the truncate method to restore the 
file, so restricting StreamingFileSink.BulkFormatBuilder to Hadoop >= 2.7 is 
not necessary.

The requested improvement is to remove the precondition in 
HadoopRecoverableWriter and allow BulkFormatBuilder (Parquet) to be used with 
Hadoop 2.6 (most enterprises are still on CDH 5.x).
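
A sketch of the distinction the issue draws. This is simplified and the version
check is a stand-in, not Flink's actual code: only row-format writers resume an
in-progress part file via truncate, while bulk formats such as Parquet roll to
a new part file on every checkpoint and never truncate.

    final class TruncatePrecondition {

        static void check(boolean writerNeedsTruncate, String hadoopVersion) {
            if (writerNeedsTruncate && !isAtLeast(hadoopVersion, 2, 7)) {
                throw new UnsupportedOperationException(
                        "Row-format recoverable writers need Hadoop >= 2.7 (truncate)");
            }
            // Bulk formats (StreamingFileSink.BulkFormatBuilder) would pass
            // unconditionally under the proposed change.
        }

        private static boolean isAtLeast(String version, int major, int minor) {
            String[] parts = version.split("\\.");
            int maj = Integer.parseInt(parts[0]);
            int min = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
            return maj > major || (maj == major && min >= minor);
        }
    }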



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-23 Thread Fabian Hueske
+1 for CREATE TEMPORARY SYSTEM FUNCTION xxx

Cheers, Fabian
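
For concreteness, the agreed syntax would read as follows in use. This is a
sketch of the proposal only: the DDL was not implemented at the time, and the
function name and class are invented.

    // The proposed session-scoped DDL from this thread (not yet implemented).
    public final class TemporarySystemFunctionDdl {

        public static void main(String[] args) {
            // Registers a session-scoped function that shadows any built-in
            // function of the same name during ambiguous resolution.
            String create =
                    "CREATE TEMPORARY SYSTEM FUNCTION my_upper AS 'com.example.MyUpper'";

            // Dropping it restores the original built-in function, which is
            // hard to express with an ALTER-based syntax.
            String drop = "DROP TEMPORARY SYSTEM FUNCTION my_upper";

            System.out.println(create);
            System.out.println(drop);
        }
    }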

On Sat, Sep 21, 2019 at 06:58, Bowen Li wrote:

> "SYSTEM" sounds good to me. FYI, this FLIP only impacts low level of the
> SQL function stack and won't actually involve any DDL, thus I will just
> document the decision and we should keep it in mind when it's time to
> implement the DDLs.
>
> I'm in the process of updating the FLIP to reflect changes required for
> option #2, will send a new version for review soon.
>
>
>
> On Fri, Sep 20, 2019 at 4:02 PM Dawid Wysakowicz 
> wrote:
>
> > I also like the 'System' keyword. I think we can assume we reached
> > consensus on this topic.
> >
> > On Sat, 21 Sep 2019, 06:37 Xuefu Z,  wrote:
> >
> > > +1 for using the keyword "SYSTEM". Thanks to Timo for chiming in!
> > >
> > > --Xuefu
> > >
> > > On Fri, Sep 20, 2019 at 3:28 PM Timo Walther 
> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Sorry for the late reply. I also give +1 for option #2. Thus, I
> guess
> > > > we have a clear winner.
> > > >
> > > > I would also like to find a better keyword/syntax for this statement.
> > > > Esp. the BUILTIN keyword can confuse people, because it could be
> > written
> > > > as BUILTIN, BUILDIN, BUILT_IN, or BUILD_IN. And we would need to
> > > > introduce a new reserved keyword in the parser which affects also
> > > > non-DDL queries. How about:
> > > >
> > > > CREATE TEMPORARY SYSTEM FUNCTION xxx
> > > >
> > > > The SYSTEM keyword is already a reserved keyword and in FLIP-66 we
> are
> > > > discussing prefixing some of the functions with a SYSTEM_ prefix like
> > > > SYSTEM_WATERMARK. Also SQL defines syntax like "FOR SYSTEM_TIME AS
> OF".
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > > Timo
> > > >
> > > >
> > > > On 20.09.19 05:45, Bowen Li wrote:
> > > > > Another reason I prefer "CREATE TEMPORARY BUILTIN FUNCTION" over
> > "ALTER
> > > > > BUILTIN FUNCTION xxx TEMPORARILY" is - what if users want to drop
> the
> > > > > temporary built-in function in the same session? With the former
> one,
> > > > they
> > > > > can run something like "DROP TEMPORARY BUILTIN FUNCTION"; With the
> > > latter
> > > > > one, I'm not sure how users can "restore" the original builtin
> > function
> > > > > easily from an "altered" function without introducing further
> > > nonstandard
> > > > > SQL syntax.
> > > > >
> > > > > Also please pardon me as I realized using net may not be a good
> > idea...
> > > > I'm
> > > > > trying to fit this vote into cases listed in Flink Bylaw [1].
> > > > >
> > > > > > From the following result, the majority seems to be #2 too, as it
> has
> > > the
> > > > > most approval so far and doesn't have strong "-1".
> > > > >
> > > > > #1:3 (+1), 1 (0), 4(-1)
> > > > > #2:4(0), 3 (+1), 1(+0.5)
> > > > > * Dawid -1/0 depending on keyword
> > > > > #3:2(+1), 3(-1), 3(0)
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > > > >
> > > > > On Thu, Sep 19, 2019 at 10:30 AM Bowen Li 
> > wrote:
> > > > >
> > > > >> Hi,
> > > > >>
> > > > >> Thanks everyone for your votes. I summarized the result as
> > following:
> > > > >>
> > > > >> #1:3 (+1), 1 (0), 4(-1) - net: -1
> > > > >> #2:4(0), 2 (+1), 1(+0.5)  - net: +2.5
> > > > >>  Dawid -1/0 depending on keyword
> > > > >> #3:2(+1), 3(-1), 3(0)   - net: -1
> > > > >>
> > > > >> Given the result, I'd like to change my vote for #2 from 0 to +1,
> to
> > > > make
> > > > >> it a stronger case with net +3.5. So the votes so far are:
> > > > >>
> > > > >> #1:3 (+1), 1 (0), 4(-1) - net: -1
> > > > >> #2:4(0), 3 (+1), 1(+0.5)  - net: +3.5
> > > > >>  Dawid -1/0 depending on keyword
> > > > >> #3:2(+1), 3(-1), 3(0)   - net: -1
> > > > >>
> > > > >> What do you think? Do you think we can conclude with this result?
> Or
> > > > would
> > > > >> you like to take it as a formal FLIP vote with 3 days voting
> period?
> > > > >>
> > > > >> BTW, I'd prefer "CREATE TEMPORARY BUILTIN FUNCTION" over "ALTER
> > > BUILTIN
> > > > >> FUNCTION xxx TEMPORARILY" because
> > > > >> 1. the syntax is more consistent with "CREATE FUNCTION" and
> "CREATE
> > > > >> TEMPORARY FUNCTION"
> > > > >> 2. "ALTER BUILTIN FUNCTION xxx TEMPORARILY" implies it alters a
> > > built-in
> > > > >> function but it actually doesn't, the logic only creates a temp
> > > function
> > > > >> with higher priority than that built-in function in ambiguous
> > > resolution
> > > > >> order; and it would behave inconsistently with "ALTER FUNCTION".
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Thu, Sep 19, 2019 at 2:58 AM Fabian Hueske 
> > > > wrote:
> > > > >>
> > > > >>> I agree, it's very similar from the implementation point of view
> > and
> > > > the
> > > > >>> implications.
> > > > >>>
> > > > >>> IMO, the difference is mostly on the mental model for the user.
> > > > >>> Instead of having a special class of temporary functions that
> have
> > > > >>> 

Re: Per Key Grained Watermark Support

2019-09-23 Thread Congxian Qiu
Hi
There was a discussion about this issue [1]. As that discussion concluded,
this is not supported out of the box by Flink at the moment, so I think you
can try a keyed process function as Lasse said.

[1]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Watermark-for-each-key-td27485.html#a27516
Best,
Congxian
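
A minimal sketch of the suggested workaround, assuming events are (key,
timestamp) pairs and a 5 second out-of-orderness bound (both are illustrative
choices, not part of any Flink API):

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    // Tracks a per-key "watermark" (max timestamp seen minus the allowed
    // out-of-orderness) in keyed state, so lateness is judged per key rather
    // than per partition.
    public class PerKeyWatermark
            extends KeyedProcessFunction<String, Tuple2<String, Long>, Tuple2<String, Long>> {

        private static final long OUT_OF_ORDERNESS_MS = 5_000L;
        private transient ValueState<Long> keyWatermark;

        @Override
        public void open(Configuration parameters) {
            keyWatermark = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("key-watermark", Long.class));
        }

        @Override
        public void processElement(
                Tuple2<String, Long> event,
                Context ctx,
                Collector<Tuple2<String, Long>> out) throws Exception {
            Long wm = keyWatermark.value();
            if (wm != null && event.f1 < wm) {
                // Late for THIS key; could be routed to a side output instead.
                return;
            }
            long advanced = event.f1 - OUT_OF_ORDERNESS_MS;
            if (wm == null || advanced > wm) {
                keyWatermark.update(advanced);
            }
            // Firing of per-key windows would be driven by this state, e.g.
            // via processing-time timers, since event-time timers still follow
            // the operator-wide watermark.
            out.collect(event);
        }
    }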


Lasse Nedergaard wrote on Mon, Sep 23, 2019 at 12:42 PM:

> Hi Jiayi
>
> We have faced the same challenge as we deal with IoT units, and they do not
> necessarily share the same timestamp. A watermark per key would be a perfect
> match here. We tried to work around it by handling late events as a special
> case with side outputs, but that isn't the perfect solution.
> My conclusion is to skip watermarks, create a keyed process function,
> and handle the time for each key myself.
>
> Med venlig hilsen / Best regards
> Lasse Nedergaard
>
>
> On Sep 23, 2019 at 06:16, 廖嘉逸 wrote:
>
> Hi all,
>
> Currently, watermarks can only be supported at the task level (or partition
> level), which means that the data belonging to a faster key has to share
> the same watermark with the data belonging to a slower key in the same
> key group of a KeyedStream. This leads to two problems:
>
>
> 1. Latency. For example, every key has its own window state, but the windows
> can only be triggered after the window's end time is exceeded by the
> watermark, which is usually determined by the data belonging to the slowest
> key. (The same applies to CepOperator and other operators that use the
> watermark to fire results.)
>
> 2. State size. Because the faster key delays firing its result, it
> has to store more redundant state that could have been pruned earlier.
>
>
> However, since the watermark was introduced a long time ago and was not
> designed to be more fine-grained in the first place, I find that it's
> very hard to solve this problem without a big change. I wonder if anyone
> in the community has some successful experience with this, or maybe
> there is a shortcut? If not, I can try to draft a design if the community
> needs one.
>
>
>
> Best Regards,
>
> Jiayi Liao
>
>
>
>
>


Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-23 Thread Jark Wu
"SYSTEM" sounds good to me too.

Best,
Jark

On Mon, 23 Sep 2019 at 19:04, Fabian Hueske  wrote:

> +1 for CREATE TEMPORARY SYSTEM FUNCTION xxx
>
> Cheers, Fabian
>
> On Sat, Sep 21, 2019 at 06:58, Bowen Li wrote:
>
> > "SYSTEM" sounds good to me. FYI, this FLIP only impacts low level of the
> > SQL function stack and won't actually involve any DDL, thus I will just
> > document the decision and we should keep it in mind when it's time to
> > implement the DDLs.
> >
> > I'm in the process of updating the FLIP to reflect changes required for
> > option #2, will send a new version for review soon.
> >
> >
> >
> > On Fri, Sep 20, 2019 at 4:02 PM Dawid Wysakowicz  >
> > wrote:
> >
> > > I also like the 'System' keyword. I think we can assume we reached
> > > consensus on this topic.
> > >
> > > On Sat, 21 Sep 2019, 06:37 Xuefu Z,  wrote:
> > >
> > > > +1 for using the keyword "SYSTEM". Thanks to Timo for chiming in!
> > > >
> > > > --Xuefu
> > > >
> > > > On Fri, Sep 20, 2019 at 3:28 PM Timo Walther 
> > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > Sorry for the late reply. I also give +1 for option #2. Thus, I
> > guess
> > > > > we have a clear winner.
> > > > >
> > > > > I would also like to find a better keyword/syntax for this
> statement.
> > > > > Esp. the BUILTIN keyword can confuse people, because it could be
> > > written
> > > > > as BUILTIN, BUILDIN, BUILT_IN, or BUILD_IN. And we would need to
> > > > > introduce a new reserved keyword in the parser which affects also
> > > > > non-DDL queries. How about:
> > > > >
> > > > > CREATE TEMPORARY SYSTEM FUNCTION xxx
> > > > >
> > > > > The SYSTEM keyword is already a reserved keyword and in FLIP-66 we
> > are
> > > > > discussing prefixing some of the functions with a SYSTEM_ prefix
> like
> > > > > SYSTEM_WATERMARK. Also SQL defines syntax like "FOR SYSTEM_TIME AS
> > OF".
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Thanks,
> > > > > Timo
> > > > >
> > > > >
> > > > > On 20.09.19 05:45, Bowen Li wrote:
> > > > > > Another reason I prefer "CREATE TEMPORARY BUILTIN FUNCTION" over
> > > "ALTER
> > > > > > BUILTIN FUNCTION xxx TEMPORARILY" is - what if users want to drop
> > the
> > > > > > temporary built-in function in the same session? With the former
> > one,
> > > > > they
> > > > > > can run something like "DROP TEMPORARY BUILTIN FUNCTION"; With
> the
> > > > latter
> > > > > > one, I'm not sure how users can "restore" the original builtin
> > > function
> > > > > > easily from an "altered" function without introducing further
> > > > nonstandard
> > > > > > SQL syntax.
> > > > > >
> > > > > > Also please pardon me as I realized using net may not be a good
> > > idea...
> > > > > I'm
> > > > > > trying to fit this vote into cases listed in Flink Bylaw [1].
> > > > > >
> > > > > > > From the following result, the majority seems to be #2 too, as it
> > has
> > > > the
> > > > > > most approval so far and doesn't have strong "-1".
> > > > > >
> > > > > > #1:3 (+1), 1 (0), 4(-1)
> > > > > > #2:4(0), 3 (+1), 1(+0.5)
> > > > > > * Dawid -1/0 depending on keyword
> > > > > > #3:2(+1), 3(-1), 3(0)
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > > > > >
> > > > > > On Thu, Sep 19, 2019 at 10:30 AM Bowen Li 
> > > wrote:
> > > > > >
> > > > > >> Hi,
> > > > > >>
> > > > > >> Thanks everyone for your votes. I summarized the result as
> > > following:
> > > > > >>
> > > > > >> #1:3 (+1), 1 (0), 4(-1) - net: -1
> > > > > >> #2:4(0), 2 (+1), 1(+0.5)  - net: +2.5
> > > > > >>  Dawid -1/0 depending on keyword
> > > > > >> #3:2(+1), 3(-1), 3(0)   - net: -1
> > > > > >>
> > > > > >> Given the result, I'd like to change my vote for #2 from 0 to
> +1,
> > to
> > > > > make
> > > > > >> it a stronger case with net +3.5. So the votes so far are:
> > > > > >>
> > > > > >> #1:3 (+1), 1 (0), 4(-1) - net: -1
> > > > > >> #2:4(0), 3 (+1), 1(+0.5)  - net: +3.5
> > > > > >>  Dawid -1/0 depending on keyword
> > > > > >> #3:2(+1), 3(-1), 3(0)   - net: -1
> > > > > >>
> > > > > >> What do you think? Do you think we can conclude with this
> result?
> > Or
> > > > > would
> > > > > >> you like to take it as a formal FLIP vote with 3 days voting
> > period?
> > > > > >>
> > > > > >> BTW, I'd prefer "CREATE TEMPORARY BUILTIN FUNCTION" over "ALTER
> > > > BUILTIN
> > > > > >> FUNCTION xxx TEMPORARILY" because
> > > > > >> 1. the syntax is more consistent with "CREATE FUNCTION" and
> > "CREATE
> > > > > >> TEMPORARY FUNCTION"
> > > > > >> 2. "ALTER BUILTIN FUNCTION xxx TEMPORARILY" implies it alters a
> > > > built-in
> > > > > >> function but it actually doesn't, the logic only creates a temp
> > > > function
> > > > > >> with higher priority than that built-in function in ambiguous
> > > > resolution
> > > > > >> order; and it would behave inconsistently with "ALTER FUNCTION".
> > > > > >>
> > > > > >>
> > > > 

[jira] [Created] (FLINK-14171) Add 'SHOW USER FUNCTIONS' support in SQL CLI

2019-09-23 Thread Canbin Zheng (Jira)
Canbin Zheng created FLINK-14171:


 Summary: Add 'SHOW USER FUNCTIONS' support in SQL CLI
 Key: FLINK-14171
 URL: https://issues.apache.org/jira/browse/FLINK-14171
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Client
Affects Versions: 1.9.0
Reporter: Canbin Zheng
 Fix For: 1.10.0


Currently, *listUserDefinedFunctions* is already supported in the Executor. I 
think we can add 'SHOW USER FUNCTIONS' support to make it an end-to-end 
functionality.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Add ARM CI build to Flink (information-only)

2019-09-23 Thread Till Rohrmann
This sounds good Xiyuan. I'd also be in favour of running the ARM builds
regularly as cron jobs and once we see that they are stable we could run
them for every master commit. Hence, I'd say let's fix the above mentioned
problems and then set the nightly cron job up.

Cheers,
Till

On Fri, Sep 20, 2019 at 8:57 AM Xiyuan Wang 
wrote:

> Sure, we can run the daily ARM job as Travis CI nightly jobs first. Once
> it's stable enough, we can consider adding it to per-PR runs.
>
> BTW, I tested flink-end-to-end-test on ARM in the last few days. Matching
> Travis, all 7 scenarios were tested:
>
> 1. split_checkpoints.sh
> 2. split_sticky.sh
> 3. split_ha.sh
> 4. split_heavy.sh
> 5. split_misc_hadoopfree.sh
> 6. split_misc.sh
> 7. split_container.sh
>
> The 1st-6th scenarios work well with some hacking and bug fixing
> locally:
> 1. frocksdb doesn't have an official ARM release, so I built and installed
> it locally for ARM.
>   https://issues.apache.org/jira/browse/FLINK-13598
> 2. Prometheus has an ARM release but the test always downloads the x86
> version. Downloading the correct version fixes the issue.
>   https://issues.apache.org/jira/browse/FLINK-14086
> 3. Elasticsearch 6.0+ enables the X-Pack machine learning feature by
> default, but this feature doesn't support ARM, so Elasticsearch 6.0+ failed
> to start on ARM. Setting `xpack.ml.enabled: false` fixes this issue.
>   https://issues.apache.org/jira/browse/FLINK-14126
>
> The 7th scenario, for containers, failed because:
> 1. docker-compose doesn't have an official ARM package. Using `apt install
> docker-compose` solves the problem.
> 2. minikube doesn't support the ARM arch. Using kubeadm for the K8s
> installation solves the problem.
>
> Fixing the problems mentioned above is not hard. So I think we can add the
> Flink build, unit tests and e2e tests as nightly jobs now.
>
> Any idea?
>
> Thanks.
>
Stephan Ewen wrote on Thu, Sep 19, 2019 at 5:44 PM:
>
> > My gut feeling is that having a CI that only runs on a specific command
> > will not help too much.
> >
> > What about going with nightly builds then? We could set up the ARM CI the
> > same way as the Travis CI nightly builds (cron builds). They report build
> > failures to "bui...@flink.apache.org".
> > Maybe Chesnay or Jark could help with what needs to be done to post to
> that
> > mailing list?
> >
> > A requirement would be that the builds are stable from the ARM
> > perspective, meaning that there are no failures at the moment caused by
> > ARM-specific issues.
> >
> > What do the others think?
> >
> >
> > On Tue, Sep 3, 2019 at 4:40 AM Xiyuan Wang 
> > wrote:
> >
> > > The ARM CI trigger has been changed to the `github comment` way only. It
> > > means that a PR won't start the ARM test unless a comment `check_arm` is
> > > added, like what I did in the PR [1].
> > >
> > > A POC for the Flink nightly end-to-end test job has been created as
> > > well [2]. I'll improve it then.
> > >
> > > Any feedback or question?
> > >
> > >
> > > [1]: https://github.com/apache/flink/pull/9416
> > >  https://github.com/apache/flink/pull/9416#issuecomment-527268203
> > > [2]: https://github.com/theopenlab/openlab-zuul-jobs/pull/631
> > >
> > >
> > > Thanks
> > >
Xiyuan Wang wrote on Mon, Aug 26, 2019 at 7:41 PM:
> > >
> > > > Before ARM CI is ready, I can close the CI test for each PR and let
> it
> > > > only be triggered by PR comment.  It's quite easy for OpenLab to do
> > this.
> > > >
> > > > OpenLab has many job pipelines [1]. Now I use the `check` pipeline in
> > > > https://github.com/apache/flink/pull/9416. The job trigger contains
> > > > github_action and github_comment [2]. I can create a new pipeline for
> > > > Flink whose trigger only contains github_comment, like:
> > > >
> > > > trigger:
> > > >   github:
> > > >  - event: pull_request
> > > >action: comment
> > > >comment: (?i)^\s*recheck_arm_build\s*$
> > > >
> > > > That way the ARM job will not be run for every PR; it will only run
> > > > for PRs that have a `recheck_arm_build` comment.
> > > >
> > > > Then once ARM CI is ready, I can add it back.
> > > >
> > > >
> > > > Nightly tests can be added as well, of course. There is a kind of job
> > > > in OpenLab called a `periodic job`. We can use it for Flink's daily
> > > > nightly tests. If any error occurs, the report can be sent to
> > > > bui...@flink.apache.org as well.
> > > >
> > > > [1]:
> > > >
> > >
> >
> https://github.com/theopenlab/openlab-zuul-jobs/blob/master/zuul.d/pipelines.yaml
> > > > [2]:
> > > >
> > >
> >
> https://github.com/theopenlab/openlab-zuul-jobs/blob/master/zuul.d/pipelines.yaml#L10-L19
> > > >
> > > >> Stephan Ewen wrote on Mon, Aug 26, 2019 at 6:13 PM:
> > > >
> > > >> Adding CI builds for ARM only makes sense when we actually take them
> > > >> into account as "blocking a merge", otherwise there is no point in
> > > >> having them. So we would need to be prepared to do that.
> > > >>
> > > >> The cases where something runs in UNIX/x64 but fails on AR

Re: java8 lambdas and exceptions lead to compile error

2019-09-23 Thread Till Rohrmann
If there is such a check, then I'd say let's enable it for the moment.

Cheers,
Till
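
For reference, a self-contained illustration of the pattern and the explicit
type witness workaround the thread refers to. The helper method is invented
for the example; the point is only that pinning the type argument avoids the
inference that misbehaves on affected javac builds such as 1.8.0_77.

    import java.util.Optional;
    import java.util.function.Supplier;

    public class ExplicitTypeWitness {

        // A generic method whose thrown exception type X must be inferred.
        static <T, X extends Throwable> T getOrThrow(Optional<T> value, Supplier<X> error) throws X {
            if (value.isPresent()) {
                return value.get();
            }
            throw error.get();
        }

        public static void main(String[] args) {
            Optional<String> maybe = Optional.of("hello");

            // On affected javac builds, the inferred form can fail with
            // "unreported exception X; must be caught or declared to be thrown":
            // String s = getOrThrow(maybe, () -> new IllegalStateException("missing"));

            // Workaround: pin the type arguments explicitly so nothing is inferred.
            String s = ExplicitTypeWitness.<String, IllegalStateException>getOrThrow(
                    maybe, () -> new IllegalStateException("missing"));
            System.out.println(s);
        }
    }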

On Fri, Sep 20, 2019 at 1:50 PM zz  wrote:

> Thanks for the reply. Adding some context/comment is very necessary, but I am
> not sure where to add it to remind others to avoid similar mistakes. Would it
> be better to add a corresponding checkstyle rule in checkstyle.xml? We could
> remove that rule when we upgrade to a new Java version. That way other
> committers can avoid similar problems.
>
> Till Rohrmann wrote on Thu, Sep 19, 2019 at 3:37 PM:
>
> > Hi,
> >
> > if there is an easy way to make it also work with Java 1.8.0_77 I guess
> we
> > could change it. That way we would make the life of our users easier.
> >
> > The solution proposed by JDK-8054569 seems quite simple. The only
> > downside I see is that it could easily fall victim to a future
> > refactoring/cleanup if we don't add some context/comment on why the
> > explicit type has been introduced. Alternatively, we could state on the
> > website which Java version you need to build Flink.
> >
> > Cheers,
> > Till
> >
> > On Thu, Sep 19, 2019 at 8:53 AM zz  wrote:
> >
> > > Hey all,
> > > Recently I used Flink for secondary development. When compiling Flink
> > > master (up-to-date) with Java 1.8.0_77, I got the following errors:
> > >
> > > compile (default-compile) on project flink-table-api-java: Compilation
> > > failure
> > >
> > >
> > >
> > > /home/*/zzsmdfj/sflink/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/factories/CalculatedTableFactory.java:[90,53] unreported exception X; must be caught or declared to be thrown
> > > at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> > > (MojoExecutor.java:213)
> > > at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> > > (MojoExecutor.java:154)
> > > at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> > > (MojoExecutor.java:146)
> > > at
> > > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject
> > > (LifecycleModuleBuilder.java:117)
> > > at
> > > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject
> > > (LifecycleModuleBuilder.java:81)
> > > at
> > >
> > >
> >
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
> > > (SingleThreadedBuilder.java:51)
> > > at org.apache.maven.lifecycle.internal.LifecycleStarter.execute
> > > (LifecycleStarter.java:128)
> > > at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:309)
> > > at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:194)
> > > at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:107)
> > > at org.apache.maven.cli.MavenCli.execute (MavenCli.java:955)
> > > at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
> > > at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
> > > at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
> > > at sun.reflect.NativeMethodAccessorImpl.invoke
> > > (NativeMethodAccessorImpl.java:62)
> > > at sun.reflect.DelegatingMethodAccessorImpl.invoke
> > > (DelegatingMethodAccessorImpl.java:43)
> > > at java.lang.reflect.Method.invoke (Method.java:498)
> > > at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced
> > > (Launcher.java:289)
> > > at org.codehaus.plexus.classworlds.launcher.Launcher.launch
> > > (Launcher.java:229)
> > > at
> org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode
> > > (Launcher.java:415)
> > > at org.codehaus.plexus.classworlds.launcher.Launcher.main
> > > (Launcher.java:356)
> > > Caused by:
> org.apache.maven.plugin.compiler.CompilationFailureException:
> > > Compilation failure
> > >
> > > If I use Java 1.8.0_102 to compile, the build succeeds. It may be a case
> > > of bug JDK-8054569.
> > >
> > > Is that a problem, and what should I do about it? Any comments would be
> > > appreciated.
> > >
> > > issue:https://issues.apache.org/jira/browse/FLINK-14093
> > >
> >
>


Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-23 Thread Jark Wu
Hi,

Thanks Fabian for your reply. I agree with your point that the
histogram-based case needs the function to be stateful, which is not
supported currently, nor in this design.
Maybe we can support stateful scalar functions, similar to
TableAggregateFunction. We can further discuss how to support this in the
future. I added this limitation to the "Complex Watermark Strategies" section.

Btw, I also updated how the planner automatically applies the watermark
assigner, at the end of the "Implementation" section [1].
This avoids every TableSource having to extend DefinedProctimeAttribute to
carry time attribute information.

If there is no objection, I would like to update the cwiki FLIP page and
start a new voting process in the next few days.

Best,
Jark

[1]:
https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit#heading=h.qx7j56dotywd
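
For concreteness, a full table definition using the proposed syntax summarized
in the quoted thread below (a sketch only: the DDL follows this proposal and
was not yet supported in Flink 1.9, and the table, fields and connector
property are invented):

    public final class WatermarkDdlExample {

        public static void main(String[] args) {
            String ddl =
                    "CREATE TABLE user_actions (\n"
                            + "  user_id BIGINT,\n"
                            + "  action STRING,\n"
                            + "  ts TIMESTAMP(3),\n"
                            // rowtime attribute: 5 seconds bounded out-of-orderness
                            + "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND,\n"
                            // proctime attribute via a computed column
                            + "  proc AS SYSTEM_PROCTIME()\n"
                            + ") WITH (\n"
                            + "  'connector.type' = 'kafka'\n"
                            + ")";
            System.out.println(ddl);
        }
    }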


On Fri, 20 Sep 2019 at 22:18, Fabian Hueske  wrote:

> Hi Jark,
>
> Thanks for the summary!
> I like the proposal!
>
> It makes it very clear that an event time attribute is an existing column
> on which watermark metadata is defined whereas a processing time attribute
> is a computed field.
>
> I have one comment regarding the section on "Complex Watermark Strategies".
> The proposal says that you can also use a scalar function.
> I don't think that a "text book" scalar function would be sufficient for
> more advanced strategies.
> For example a histogram-based approach would need to remember the values of
> the last x records.
> The interface of a scalar function would still work for that, but it would
> be a stateful function (which would not be OK for a scalar function).
> I don't think it's a problem, but wanted to mention it here.
>
> Best, Fabian
>
> On Thu, Sep 19, 2019 at 18:05, Jark Wu wrote:
>
> > Hi everyone,
> >
> > Thanks all for the valuable suggestions and feedbacks so far.
> > Before starting the vote, I would like to summarize the proposed DDL
> syntax
> > in the mailing list.
> >
> > ## Rowtime Attribute (Watermark Syntax)
> >
> > CREATE TABLE table_name (
> >   WATERMARK FOR <rowtime_field> AS <watermark_strategy_expression>
> > ) WITH (
> >   ...
> > )
> >
> > It marks an existing field <rowtime_field> as the rowtime attribute, and the
> > watermark is generated by the expression <watermark_strategy_expression>.
> > <watermark_strategy_expression> can be an arbitrary expression which
> > returns a nullable BIGINT or TIMESTAMP as the watermark value.
> >
> > For common cases, users can use the following expressions to define a
> > strategy.
> > 1. Bounded Out of Orderness, the strategy can be "rowtimeField - INTERVAL
> > 'string' timeUnit".
> > 2. Preserve Watermark From Source, the strategy can be
> > "SYSTEM_WATERMARK()".
> >
> > ## Proctime Attribute
> >
> > CREATE TABLE table_name (
> >   ...
> >   proc AS SYSTEM_PROCTIME()
> > ) WITH (
> >   ...
> > )
> >
> > It uses the computed column syntax to add an additional column with
> > proctime attribute. Here SYSTEM_PROCTIME() is a built-in function.
> >
> > For more details and the implementations, please refer to the design doc:
> >
> >
> https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit?ts=5d822dba
> >
> > Feel free to leave your further feedbacks!
> >
> > Thanks,
> > Jark
> >
> > On Thu, 19 Sep 2019 at 11:23, Kurt Young  wrote:
> >
> > > +1 to start vote process.
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Thu, Sep 19, 2019 at 10:54 AM Jark Wu  wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Thanks all for joining the discussion in the doc[1].
> > > > It seems that the discussion is converged and there is a consensus on
> > the
> > > > current FLIP document.
> > > > If there is no objection, I would like to convert it into cwiki FLIP
> > page
> > > > and start voting process.
> > > >
> > > > For more details, please refer to the design doc (it is slightly
> > changed
> > > > since the initial proposal).
> > > >
> > > > Thanks,
> > > > Jark
> > > >
> > > > [1]:
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit?ts=5d8258cd
> > > >
> > > > On Mon, 16 Sep 2019 at 16:12, Kurt Young  wrote:
> > > >
> > > > > After some review and discussion in the google document, I think
> it's
> > > > time
> > > > > to
> > > > > convert this design to a cwiki flip page and start voting process.
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > >
> > > > > On Mon, Sep 9, 2019 at 7:46 PM Jark Wu  wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Thanks all for so much feedbacks received in the doc so far.
> > > > > > I saw a general agreement on using computed column to support
> > > proctime
> > > > > > attribute and extract timestamps.
> > > > > > So we will prepare a computed column FLIP and share in the dev ML
> > > soon.
> > > > > >
> > > > > > Feel free to leave more comments!
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, 6 Sep 2019 at 13:50, Dian Fu 
> > wrote:
> > > > > >
> > > > > > > Hi Jark,
> > > > > > >
> > > > > > > Thanks for brin

Re: [FLINK-14076] non-deserializable root cause in DeclineCheckpoint

2019-09-23 Thread Till Rohrmann
Hi Jeffrey,

thanks for reporting this issue and starting a discussion on how to solve this
problem. I've pulled in Piotr, who is working on the checkpointing part of
Flink.

If a user-generated exception can get reported, then we need to make sure
that it is properly handled. Approach 2 would be easier if we are OK with
not having direct access to the root cause (without cumbersomely deserializing
user-defined exceptions). Approach 1 would make the fact that the decline
checkpoint message might contain a user-defined exception more explicit.
However, if we don't use the concrete exception except for reporting, then
approach 1 should be good enough.

Cheers,
Till

On Sat, Sep 21, 2019 at 5:49 AM Jeffrey Martin 
wrote:

> To be clear -- I'm happy to make a PR for either option below. (Either is
> <10 lines diff.) It's just that the contributor guidelines said to get
> consensus first and then only make a PR once I'm assigned to do the work.
>
> On Fri, Sep 20, 2019 at 12:23 PM Jeffrey Martin <
> jeffrey.martin...@gmail.com>
> wrote:
>
> > (possible dupe; I wasn't subscribed before and the previous message
> didn't
> > seem to go through)
> >
> > I'm on Flink v1.9 with the Kafka connector and a standalone JM.
> > If FlinkKafkaProducer fails while checkpointing, it throws a
> > KafkaException which gets wrapped in a CheckpointException which is sent
> to
> > the JM as a DeclineCheckpoint. KafkaException isn't on the JM default
> > classpath, so the JM throws a fairly cryptic ClassNotFoundException. The
> > details of the KafkaException wind up suppressed so it's impossible to
> > figure out what actually went wrong.
> >
> > I can think of two fixes that would prevent this from occurring in the
> > Kafka or other connectors in the future:
> > 1. DeclineCheckpoint should always send a SerializedThrowable to the JM
> > rather than allowing CheckpointExceptions with non-deserializable root
> > causes to slip through
> > 2. CheckpointException should always capture its wrapped exception as a
> > SerializedThrowable (i.e., use 'super(new SerializedThrowable(cause))'
> > rather than 'super(cause)').
> >
> > Thoughts?
> >
>
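
A minimal sketch of option 2. SerializedThrowable is an existing Flink utility
that keeps the original exception's class name and stack trace as strings; the
wrapper class below is an illustration, not the actual CheckpointException.

    import org.apache.flink.util.SerializedThrowable;

    // A checkpoint exception that never carries a user-defined cause directly,
    // so the JM can always deserialize it even when the cause's class (e.g. a
    // KafkaException) is not on the JobManager's classpath.
    public class SafeCheckpointException extends Exception {

        public SafeCheckpointException(String message, Throwable cause) {
            super(message, new SerializedThrowable(cause));
        }
    }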


Re: [DISCUSS] Releasing Flink 1.9.1

2019-09-23 Thread Till Rohrmann
+1 for the 1.9.1 release and for Jark being the RM. I'll help with the
review of FLINK-14010.

Cheers,
Till

On Mon, Sep 23, 2019 at 8:32 AM Debasish Ghosh 
wrote:

> I hope https://issues.apache.org/jira/browse/FLINK-12501 will also be part
> of 1.9.1 ..
>
> regards.
>
> On Mon, Sep 23, 2019 at 11:39 AM Jeff Zhang  wrote:
>
> > FLINK-13708 is also very critical IMO. This would cause an invalid Flink
> > job (doubled output).
> >
> > https://issues.apache.org/jira/browse/FLINK-13708
> >
> > > Jark Wu wrote on Mon, Sep 23, 2019 at 2:03 PM:
> >
> > > Hi everyone,
> > >
> > > It has already been a month since we released Flink 1.9.0.
> > > We already have many important bug fixes from which our users can
> benefit
> > > in the release-1.9 branch (83 resolved issues).
> > > Therefore, I propose to create the next bug fix release for Flink 1.9.
> > >
> > > Most notable fixes are:
> > >
> > > - [FLINK-13526] When switching to a non-existing catalog or database in
> > the
> > > SQL Client the client crashes.
> > > - [FLINK-13568] It is not possible to create a table with a "STRING"
> data
> > > type via the SQL DDL.
> > > - [FLINK-13941] Prevent data-loss by not cleaning up small part files
> > from
> > > S3.
> > > - [FLINK-13490][jdbc] If one column value is null when reading JDBC,
> the
> > > following values will all be null.
> > > - [FLINK-14107][kinesis] When using event time alignment with the
> > > Kinesis Consumer, the consumer might deadlock in one corner case.
> > >
> > > Furthermore, I would like the following critical issues to be merged
> > before
> > > 1.9.1 release:
> > >
> > > - [FLINK-14118] Reduce the unnecessary flushing when there is no data
> > > available for flush which can save 20% ~ 40% CPU. (reviewing)
> > > - [FLINK-13386] Fix a couple of issues that have already been filed
> > > against the new dashboard. (PR is created, needs review)
> > > - [FLINK-14010][yarn] The Flink YARN cluster can get into an
> > > inconsistent state in some cases, where leadership for the JobManager,
> > > ResourceManager and Dispatcher components is split between two master
> > > processes. (PR is created, needs review)
> > >
> > > I would volunteer as release manager and kick off the release process
> > > once the blocker issues have been merged. What do you think?
> > >
> > > If there are any other blocker issues that need to be fixed in 1.9.1,
> > > please let me know.
> > >
> > > Cheers,
> > > Jark
> > >
> >
> >
> > --
> > Best Regards
> >
> > Jeff Zhang
> >
>
>
> --
> Debasish Ghosh
> http://manning.com/ghosh2
> http://manning.com/ghosh
>
> Twttr: @debasishg
> Blog: http://debasishg.blogspot.com
> Code: http://github.com/debasishg
>


Re: Per Key Grained Watermark Support

2019-09-23 Thread bupt_ljy
Hi Congxian,
Thanks, but by doing that we will lose some features like output of late 
data.


 Original Message 
Sender: Congxian Qiu
Recipient: Lasse Nedergaard
Cc: 廖嘉逸; u...@flink.apache.org; 
dev@flink.apache.org
Date: Monday, Sep 23, 2019 19:56
Subject: Re: Per Key Grained Watermark Support


Hi
There was a discussion about this issue [1]. As that discussion concluded, this 
is not supported out of the box by Flink at the moment, so I think you can try 
a keyed process function as Lasse said.


[1] 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Watermark-for-each-key-td27485.html#a27516

Best,
Congxian




Lasse Nedergaard wrote on Mon, Sep 23, 2019 at 12:42 PM:

Hi Jiayi


We have faced the same challenge as we deal with IoT units, and they do not 
necessarily share the same timestamp. A watermark per key would be a perfect 
match here. We tried to work around it by handling late events as a special 
case with side outputs, but that isn't the perfect solution.
My conclusion is to skip watermarks, create a keyed process function, and 
handle the time for each key myself.


Med venlig hilsen / Best regards
Lasse Nedergaard



On Sep 23, 2019 at 06:16, 廖嘉逸 wrote:


Hi all,
Currently, watermarks can only be supported at the task level (or partition 
level), which means that the data belonging to a faster key has to share the 
same watermark with the data belonging to a slower key in the same key group 
of a KeyedStream. This leads to two problems:


1. Latency. For example, every key has its own window state, but the windows 
can only be triggered after the window's end time is exceeded by the watermark, 
which is usually determined by the data belonging to the slowest key. (The same 
applies to CepOperator and other operators that use the watermark to fire 
results.)
2. State size. Because the faster key delays firing its result, it has to 
store more redundant state that could have been pruned earlier.


However, since the watermark was introduced a long time ago and was not 
designed to be more fine-grained in the first place, I find that it's very hard 
to solve this problem without a big change. I wonder if anyone in the community 
has some successful experience with this, or maybe there is a shortcut? If not, 
I can try to draft a design if the community needs one.




Best Regards,
Jiayi Liao

Re: [FLINK-14076] non-deserializable root cause in DeclineCheckpoint

2019-09-23 Thread Piotr Nowojski
Hi,

I guess the TaskManager should have logged the original exception somewhere 
(I’m not saying that we shouldn’t solve this, just making sure that the basics 
are covered), so you should already be able to deduce the reason for the 
failure, right?

I think that option 2 would not only be easier, but cleaner.
`CheckpointException` is a Flink class, and there might be reasons for the 
CheckpointCoordinator to access it, while there should be no reason for the JM 
code to ever touch anything below it (like a cause of the `CheckpointException` 
originating from user code).

Piotrek

> On 23 Sep 2019, at 14:56, Till Rohrmann  wrote:
> 
> Hi Jeffrey,
> 
> thanks for reporting this issue and starting a discussion on how to solve this
> problem. I've pulled in Piotr, who is working on the checkpointing part of
> Flink.
> 
> If a user-generated exception can get reported, then we need to make sure
> that it is properly handled. Approach 2 would be easier if we are OK with
> not having direct access to the root cause (without cumbersomely deserializing
> user-defined exceptions). Approach 1 would make the fact that the decline
> checkpoint message might contain a user-defined exception more explicit.
> However, if we don't use the concrete exception except for reporting, then
> approach 1 should be good enough.
> 
> Cheers,
> Till
> 
> On Sat, Sep 21, 2019 at 5:49 AM Jeffrey Martin 
> wrote:
> 
>> To be clear -- I'm happy to make a PR for either option below. (Either is
>> <10 lines diff.) It's just that the contributor guidelines said to get
>> consensus first and then only make a PR once I'm assigned to do the work.
>> 
>> On Fri, Sep 20, 2019 at 12:23 PM Jeffrey Martin <
>> jeffrey.martin...@gmail.com>
>> wrote:
>> 
>>> (possible dupe; I wasn't subscribed before and the previous message
>> didn't
>>> seem to go through)
>>> 
>>> I'm on Flink v1.9 with the Kafka connector and a standalone JM.
>>> If FlinkKafkaProducer fails while checkpointing, it throws a
>>> KafkaException which gets wrapped in a CheckpointException which is sent
>> to
>>> the JM as a DeclineCheckpoint. KafkaException isn't on the JM default
>>> classpath, so the JM throws a fairly cryptic ClassNotFoundException. The
>>> details of the KafkaException wind up suppressed so it's impossible to
>>> figure out what actually went wrong.
>>> 
>>> I can think of two fixes that would prevent this from occurring in the
>>> Kafka or other connectors in the future:
>>> 1. DeclineCheckpoint should always send a SerializedThrowable to the JM
>>> rather than allowing CheckpointExceptions with non-deserializable root
>>> causes to slip through
>>> 2. CheckpointException should always capture its wrapped exception as a
>>> SerializedThrowable (i.e., use 'super(new SerializedThrowable(cause))'
>>> rather than 'super(cause)').
>>> 
>>> Thoughts?
>>> 
>> 



Re: [VOTE] FLIP-56: Dynamic Slot Allocation

2019-09-23 Thread Xintong Song
@Till @Andrey

According to the comments, I just updated the FLIP document [1], with the
following changes:

   - Remove SlotID (in the section Protocol Changes)
   - Updated the implementation steps to reduce separated code paths. As far as
   I can see at the moment, we do not need the feature option. We can add it
   if we later find it necessary during implementation.


Thank you~

Xintong Song


[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
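
For intuition, a sketch of the TM-side bookkeeping the FLIP describes. This is
simplified and not Flink's actual classes: instead of pre-cut static slots, the
TaskExecutor tracks its total resource and carves a slot of the requested
profile out of whatever is still free.

    // Simplified dynamic slot bookkeeping (illustrative only).
    public final class DynamicSlotBookkeeping {

        private double freeCpu;
        private long freeMemoryBytes;

        public DynamicSlotBookkeeping(double totalCpu, long totalMemoryBytes) {
            this.freeCpu = totalCpu;
            this.freeMemoryBytes = totalMemoryBytes;
        }

        // Tries to carve out a slot with the requested profile; returns false
        // if the remaining free resources cannot accommodate it.
        public synchronized boolean allocate(double cpu, long memoryBytes) {
            if (cpu > freeCpu || memoryBytes > freeMemoryBytes) {
                return false;
            }
            freeCpu -= cpu;
            freeMemoryBytes -= memoryBytes;
            return true;
        }

        // Returns a slot's resources to the free pool when it is released.
        public synchronized void free(double cpu, long memoryBytes) {
            freeCpu += cpu;
            freeMemoryBytes += memoryBytes;
        }
    }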

On Fri, Sep 20, 2019 at 11:01 AM Xintong Song  wrote:

> I'm not sure if I understand the implementation plan you suggested
> correctly. To my understanding, it seems that all the steps except for step
> 5 have to happen in strict order.
>
>- Profiles to be used in step 2 are reported with step 1.
>- SlotProfile in TaskExecutorGateway#requestSlot in step 3 comes from
>profiles used in step 2.
>- Only if RM request slots from TM with profiles (step 3), would TM be
>able to do the proper bookkeeping (step 4)
>- Step 5 can be done as long as we have step 2.
>- Step 6 relies on both step 4  and step 5, for proper bookkeepings on
>both TM and RM sides before enabling non-default profiles.
>
> That means we can only work on the steps in the following order.
> 1-2-3-4-6
>\-5-/
>
> What I'm trying to achieve with the current plan is to have most of the
> implementation steps parallelized, as follows, so that Andrey and I can
> work concurrently without blocking each other too much.
> 1-2-3-4
>\5-6-7
>
>
> I also agree that it would be good not to add too much separate code. I
> would suggest leaving that decision to implementation time. E.g., if by
> the time we do the TM-side bookkeeping, the RM side has already implemented
> requesting slots with profiles, then we do not need to separate the code
> paths.
>
>
> To that end, I think it makes sense to adjust step 5-7 to first use
> default slot resource profiles for all the bookkeepings, and replace it
> with the requested profiles at the end.
>
>
> What do you think?
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Thu, Sep 19, 2019 at 7:59 PM Till Rohrmann 
> wrote:
>
>> I think besides of point 1. and 3. there are no dependencies between the
>> RM
>> and TM side changes. Also, I'm not sure whether it makes sense to split
>> the
>> slot manager changes up into the proposed steps 5, 6 and 7.
>>
>> I would highly recommend to not add too much duplicate logic/separate code
>> paths because it just adds blind spots which are probably not as well
>> tested as the old code paths.
>>
>> Cheers,
>> Till
>>
>> On Thu, Sep 19, 2019 at 11:58 AM Xintong Song 
>> wrote:
>>
>> > Thanks for the comments, Till.
>> >
>> > - Agree on removing SlotID.
>> >
>> > - Regarding the implementation plan, it is true that we can possibly
>> reduce
>> > codes separated by the feature option. But I think to do that we need to
>> > introduce more dependencies between implementation steps. With the
>> current
>> > plan, we can easily separate steps on the RM side and the TM side, and
>> > start concurrently working on them after quickly updating the
>> interfaces in
>> > between. The feature will come alive when the steps on both RM/TM sides
>> are
>> > finished. Since we are planning to have two persons (Andrey and I)
>> working
>> > on this FLIP, I think the current plan is probably more convenient.
>> >
>> > Thank you~
>> >
>> > Xintong Song
>> >
>> >
>> >
>> > On Thu, Sep 19, 2019 at 5:09 PM Till Rohrmann 
>> > wrote:
>> >
>> > > Hi Xintong,
>> > >
>> > > thanks for starting the vote. The general plan looks good. Hence +1
>> from
>> > my
>> > > side. I still have some minor comments one could think about:
>> > >
>> > > * As we no longer have predetermined slots on the TaskExecutor, I
>> think
>> > we
>> > > can get rid of the SlotID. Instead, an allocated slot will be
>> identified
>> > by
>> > > the AllocationID and the TaskManager's ResourceID in order to
>> > differentiate
>> > > duplicate registrations.
>> > > * For the implementation plan, I believe there is only one tiny part
>> on
>> > the
>> > > SlotManager for which we need a separate code path/feature flag which
>> is
>> > > how we find a matching slot. Everything else should be possible to
>> > > implement in a way that it works with dynamic and static slot
>> allocation:
>> > > 1. Let TMs register with default slot profile at RM
>> > > 2. Change SlotManager to use reported slot profiles instead of
>> > > pre-calculated profiles
>> > > 3. Replace SlotID with SlotProfile in TaskExecutorGateway#requestSlot
>> > > 4. Extend TM to support dynamic slot allocation (aka proper
>> bookkeeping)
>> > > (can happen concurrently to any of steps 2-3)
>> > > 5. Add bookkeeping to SlotManager (for pending TMs and registered TMs)
>> > but
>> > > still only use default slot profiles for matching with slot requests
>> > > 6. Allow to match slot requests with reported resources instead of
>> > default
>> > > slot profiles (h

Re: [VOTE] FLIP-56: Dynamic Slot Allocation

2019-09-23 Thread Till Rohrmann
Thanks for updating the FLIP. It looks good to me.

+1 (binding)

Cheers,
Till

On Mon, Sep 23, 2019 at 4:12 PM Xintong Song  wrote:

> @Till @Andrey
>
> According to the comments, I just updated the FLIP document [1], with the
> following changes:
>
>- Remove SlotID (in the section Protocol Changes)
>- Updated implementation steps to reduce separated code paths. As far as
>I can see at the moment, we do not need the feature option. We can add
> it
>if later we find it necessary in the implementation.
>
>
> Thank you~
>
> Xintong Song
>
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
>
> On Fri, Sep 20, 2019 at 11:01 AM Xintong Song 
> wrote:
>
> > I'm not sure if I understand the implementation plan you suggested
> > correctly. To my understanding, it seems that all the steps except for
> step
> > 5 have to happen in strict order.
> >
> >- Profiles to be used in step 2 is reported with step 1.
> >- SlotProfile in TaskExecutorGateway#requestSlot in step 3 comes from
> >profiles used in step 2.
> >- Only if RM request slots from TM with profiles (step 3), would TM be
> >able to do the proper bookkeeping (step 4)
> >- Step 5 can be done as long as we have step 2.
> >- Step 6 relies on both steps 4 and 5, for proper bookkeeping on
> >both TM and RM sides before enabling non-default profiles.
> >
> > That means we can only work on the steps in the following order.
> > 1-2-3-4-6
> >\-5-/
> >
> > What I'm trying to achieve with the current plan, is to have most of the
> > implementation steps paralleled, as the following. So that Andrey and I
> can
> > work concurrently without blocking each other too much.
> > 1-2-3-4
> >\5-6-7
> >
> >
> > I also agree that it would be good to not add too much separate code. I
> > would suggest leaving that decision to implementation time. E.g., if by
> > the time we do the TM side bookkeeping, the RM side has already
> implemented
> > requesting slots with profiles, then we do not need to separate the code
> > paths.
> >
> >
> > To that end, I think it makes sense to adjust steps 5-7 to first use
> > default slot resource profiles for all the bookkeeping, and replace them
> > with the requested profiles at the end.
> >
> >
> > What do you think?
> >
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Thu, Sep 19, 2019 at 7:59 PM Till Rohrmann 
> > wrote:
> >
> >> I think besides points 1 and 3 there are no dependencies between the
> >> RM
> >> and TM side changes. Also, I'm not sure whether it makes sense to split
> >> the
> >> slot manager changes up into the proposed steps 5, 6 and 7.
> >>
> >> I would highly recommend to not add too much duplicate logic/separate
> code
> >> paths because it just adds blind spots which are probably not as well
> >> tested as the old code paths.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Thu, Sep 19, 2019 at 11:58 AM Xintong Song 
> >> wrote:
> >>
> >> > Thanks for the comments, Till.
> >> >
> >> > - Agree on removing SlotID.
> >> >
> >> > - Regarding the implementation plan, it is true that we can possibly
> >> reduce
> >> > code separated by the feature option. But I think to do that we need
> to
> >> > introduce more dependencies between implementation steps. With the
> >> current
> >> > plan, we can easily separate steps on the RM side and the TM side, and
> >> > start concurrently working on them after quickly updating the
> >> interfaces in
> >> > between. The feature will come alive when the steps on both RM/TM
> sides
> >> are
> >> > finished. Since we are planning to have two persons (Andrey and I)
> >> working
> >> > on this FLIP, I think the current plan is probably more convenient.
> >> >
> >> > Thank you~
> >> >
> >> > Xintong Song
> >> >
> >> >
> >> >
> >> > On Thu, Sep 19, 2019 at 5:09 PM Till Rohrmann 
> >> > wrote:
> >> >
> >> > > Hi Xintong,
> >> > >
> >> > > thanks for starting the vote. The general plan looks good. Hence +1
> >> from
> >> > my
> >> > > side. I still have some minor comments one could think about:
> >> > >
> >> > > * As we no longer have predetermined slots on the TaskExecutor, I
> >> think
> >> > we
> >> > > can get rid of the SlotID. Instead, an allocated slot will be
> >> identified
> >> > by
> >> > > the AllocationID and the TaskManager's ResourceID in order to
> >> > differentiate
> >> > > duplicate registrations.
> >> > > * For the implementation plan, I believe there is only one tiny part
> >> on
> >> > the
> >> > > SlotManager for which we need a separate code path/feature flag
> which
> >> is
> >> > > how we find a matching slot. Everything else should be possible to
> >> > > implement in a way that it works with dynamic and static slot
> >> allocation:
> >> > > 1. Let TMs register with default slot profile at RM
> >> > > 2. Change SlotManager to use reported slot profiles instead of
> >> > > pre-calculated profiles
> >> > > 3. Replace SlotID with SlotProfile in
> TaskExecutorGa

[jira] [Created] (FLINK-14172) Implement KubeClient with official Java client library for kubernetes

2019-09-23 Thread Yang Wang (Jira)
Yang Wang created FLINK-14172:
-

 Summary: Implement KubeClient with official Java client library 
for kubernetes
 Key: FLINK-14172
 URL: https://issues.apache.org/jira/browse/FLINK-14172
 Project: Flink
  Issue Type: New Feature
Reporter: Yang Wang


The official Java client library for Kubernetes is becoming more and more active. The 
new features (such as leader election) and some client implementations (informer, 
lister, cache) are better. So we should use the official Java client for 
Kubernetes in Flink.

https://github.com/kubernetes-client/java
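
A minimal bootstrap sketch, assuming the package layout of the client-java releases current at the time (later releases moved these classes under io.kubernetes.client.openapi):

{code:java}
import io.kubernetes.client.ApiClient;
import io.kubernetes.client.Configuration;
import io.kubernetes.client.apis.CoreV1Api;
import io.kubernetes.client.util.Config;

public class OfficialClientBootstrap {
    public static void main(String[] args) throws Exception {
        // Picks up kubeconfig or in-cluster service account config as appropriate.
        ApiClient client = Config.defaultClient();
        Configuration.setDefaultApiClient(client);
        // Typed APIs such as CoreV1Api can then be used for pods, services, etc.
        CoreV1Api core = new CoreV1Api();
    }
}
{code}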



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14173) ANSI-style JOIN with Temporal Table Function fails

2019-09-23 Thread Jira
Benoît Paris created FLINK-14173:


 Summary: ANSI-style JOIN with Temporal Table Function fails
 Key: FLINK-14173
 URL: https://issues.apache.org/jira/browse/FLINK-14173
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Legacy Planner, Table SQL / Planner
Affects Versions: 1.9.0
 Environment: Java 1.8, Scala 2.11, Flink 1.9 (pom.xml file attached)
Reporter: Benoît Paris
 Attachments: flink-test-temporal-tables-1.9.zip

The planner fails to generate a plan for ANSI-style joins with Temporal Table 
Functions. The Blink planner throws a "Missing conversion is 
LogicalTableFunctionScan[convention: NONE -> LOGICAL]" message (and some very 
fancy graphviz stuff). The old planner fails with "This exception indicates that 
the query uses an unsupported SQL feature."

This fails:
{code:java}
 SELECT 
   o_amount * r_amount AS amount 
 FROM Orders 
 JOIN LATERAL TABLE (Rates(o_proctime)) 
   ON r_currency = o_currency {code}
This works:
{code:java}
 SELECT 
   o_amount * r_amount AS amount 
 FROM Orders 
, LATERAL TABLE (Rates(o_proctime)) 
 WHERE r_currency = o_currency{code}
Reproduction with the attached Java and pom.xml files. Also included: stack 
traces for both Blink and the old planner.

I think this is a regression. I remember using ANSI-style joins with a temporal 
table function successfully in 1.8.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14174) Don't swallow exception when rethrowing type mismatches with side outputs

2019-09-23 Thread John Lonergan (Jira)
John Lonergan created FLINK-14174:
-

 Summary: Don't swallow exception when rethrowing type mismatches 
with side outputs
 Key: FLINK-14174
 URL: https://issues.apache.org/jira/browse/FLINK-14174
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Affects Versions: 1.9.0, 1.8.1
Reporter: John Lonergan


The change made by https://github.com/apache/flink/pull/4663/files swallows the 
original exception.

Whilst I am in favour of adding additional helpful tips (which was the purpose 
of FLINK-4663), I don't agree with throwing away or masking causal exceptions.

IMHO the correct approach is to pass the helpful hint as the first arg to "new 
ExceptionInChainedOperatorException(msg, ex)" and the original ClassCastException 
as the cause.

I.e. change this: 
https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/OperatorChain.java#L672
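
A minimal sketch of the suggested change (hypothetical method and message text; only the exception handling matters here):

{code:java}
private <X> void pushToOperator(StreamRecord<X> record) {
    try {
        // May throw ClassCastException if an OutputTag's type does not match.
        operator.processElement(record);
    } catch (ClassCastException e) {
        // Keep the helpful hint, but pass the original exception as the cause
        // instead of swallowing it.
        String hint = "Could not forward element to next operator. "
                + "If side outputs are used, check that the OutputTag matches the record type.";
        throw new ExceptionInChainedOperatorException(hint, e);
    } catch (Exception e) {
        throw new ExceptionInChainedOperatorException(e);
    }
}
{code}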




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[VOTE] FLIP-68: Extend Core Table System with Modular Plugins

2019-09-23 Thread Bowen Li
Hi all,

I'd like to start a vote for FLIP-68 [1], since there's no more concern in
the discussion thread [2]

The vote will be open for a minimum of 3 days, till 5:30pm UTC, Sep 26.

Thanks,
Bowen

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Modular+Plugins
[2] https://www.mail-archive.com/dev@flink.apache.org/msg29894.html


[VOTE] FLIP-57: Rework FunctionCatalog

2019-09-23 Thread Bowen Li
Hi all,

I'd like to start a voting thread for FLIP-57 [1], which we've reached
consensus in [2].

This vote will be open for a minimum of 3 days, till 6:30pm UTC, Sep 26.

Thanks,
Bowen

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-57%3A+Rework+FunctionCatalog
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-td32291.html#a32613


Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-23 Thread Bowen Li
Thanks all for your input!

I've updated FLIP-57 accordingly. To summarize the changes:

   - introduced the new concept of "temporary system functions", which have no
   namespace and override built-in functions
   - repositioned "temporary functions" to be those with namespaces that
   override catalog functions
   - updated FunctionCatalog APIs
   - redefined the ambiguous function resolution order to be:

   1. temporary system functions
   2. built-in functions
   3. temporary functions, of the current catalog/db
   4. catalog functions, in the current catalog/db
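
As a rough illustration of this order (hypothetical pseudo-Java; the actual
FunctionCatalog API is defined in the FLIP, and the lookup helper names below
are made up):

    Optional<FunctionDefinition> resolveFunction(String name) {
        Optional<FunctionDefinition> result = lookupTemporarySystemFunction(name);
        if (!result.isPresent()) {
            result = lookupBuiltinFunction(name);
        }
        if (!result.isPresent()) {
            result = lookupTemporaryFunction(currentCatalog, currentDatabase, name);
        }
        if (!result.isPresent()) {
            result = lookupCatalogFunction(currentCatalog, currentDatabase, name);
        }
        return result;
    }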

Since we've reached consensus on several most critical pieces of the FLIP,
I've started a separate voting thread on it.

Cheers,
Bowen


Re: [DISCUSS] FLIP-63: Rework table partition support

2019-09-23 Thread Bowen Li
Hi Jingsong,

Thanks for driving this effort!

Besides a few further comments on Catalog APIs that I just left, it LGTM.

Not sure why, but the voting thread in gmail shows in the same thread as
the discussion's. After addressing all the comments, could you start a new,
separate thread to let other people be aware of it?

Thanks,
Bowen

On Mon, Sep 23, 2019 at 1:25 AM JingsongLee 
wrote:

>  Thanks for your discussion on the Google document.
> Comments have been addressed; a FileSystem connector chapter was added, and a
> code prototype for the file system connector was introduced to unify the Flink
> file system and Hive connectors.
>
> Looking forward to your feedbacks. Thank you.
>
> Best,
> Jingsong Lee
>
>
> --
> From:JingsongLee 
> Send Time:2019年9月18日(星期三) 09:45
> To:Kurt Young ; dev 
> Subject:Re: [DISCUSS] FLIP-63: Rework table partition support
>
> Thanks for your reply and google doc comments. It has been discussed
>  for two weeks now. I will start a vote thread.
>
> Best,
> Jingsong Lee
>
>
> --
> From:Kurt Young 
> Send Time:2019年9月16日(星期一) 15:55
> To:dev 
> Cc:JingsongLee 
> Subject:Re: [DISCUSS] FLIP-63: Rework table partition support
>
> +1 to this feature, I left some comments on google doc.
>
> Another comment is I think we should do some reorganize about the content
> when you converting this to a cwiki page. I will have some offline
> discussion
> with you.
>
> Since this feature seems to be a fairly big efforts, so I suggest we can
> settle
> down the design doc ASAP and start vote process.
> Best,
> Kurt
>
>
> On Thu, Sep 12, 2019 at 12:43 PM Biao Liu  wrote:
> Hi Jingsong,
>
>  Thanks for explaining. It looks cool!
>
>  Thanks,
>  Biao /'bɪ.aʊ/
>
>
>
>  On Wed, 11 Sep 2019 at 11:37, JingsongLee  .invalid>
>  wrote:
>
>  > Hi Biao, thanks for your feedback:
>  >
>  > Actually, the source partition at runtime is similar to a split,
>  > which concerns data reading, parallelism and fault tolerance, all the
>  > runtime concepts.
>  > While table partition is only a virtual concept. Users are more likely
> to
>  > choose which partition to read and which partition to write. Users can
>  > manage their partitions.
>  > One is physical implementation correlation, the other is logical concept
>  > correlation.
>  > So I think they are two completely different things.
>  >
>  > About [2], The main problem is that how to write data to a catalog file
>  > system in stream mode, it is a general problem and has little to do with
>  > partition.
>  >
>  > [2]
>  >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html
>  >
>  > Best,
>  > Jingsong Lee
>  >
>  >
>  > --
>  > From:Biao Liu 
>  > Send Time:2019年9月10日(星期二) 14:57
>  > To:dev ; JingsongLee 
>  > Subject:Re: [DISCUSS] FLIP-63: Rework table partition support
>  >
>  > Hi Jingsong,
>  >
>  > Thank you for bringing this discussion. Since I don't have much
> experience
>  > of Flink table/SQL, I'll ask some questions from runtime or engine
>  > perspective.
>  >
>  > > ... where we describe how to partition support in flink and how to
>  > integrate to hive partition.
>  >
>  > FLIP-27 [1] introduces "partition" concept officially. The changes of
>  > FLIP-27 are not only about source interface but also about the whole
>  > infrastructure.
>  > Have you ever thought how to integrate your proposal with these changes?
>  > Or you just want to support "partition" in table layer, there will be no
>  > requirement of underlying infrastructure?
>  >
>  > I have seen a discussion [2] that seems be a requirement of
> infrastructure
>  > to support your proposal. So I have some concerns there might be some
>  > conflicts between this proposal and FLIP-27.
>  >
>  > 1.
>  >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
>  > 2.
>  >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html
>  >
>  > Thanks,
>  > Biao /'bɪ.aʊ/
>  >
>  >
>  >
>  > On Fri, 6 Sep 2019 at 13:22, JingsongLee  .invalid>
>  > wrote:
>  > Hi everyone, thank you for your comments. Mail name was updated
>  >  and streaming-related concepts were added.
>  >
>  >  We would like to start a discussion thread on "FLIP-63: Rework table
>  >  partition support"(Design doc: [1]), where we describe how to partition
>  >  support in flink and how to integrate to hive partition.
>  >
>  >  This FLIP addresses:
>  > - Introduce whole story about partition support.
>  > - Introduce and discuss DDL of partition support.
>  > - Introduce static and dynamic partition insert.
>  > - Introduce partition pruning
>  > - Introduce dynamic partition implementation
>  > - Introduce FileFormatSink to deal with s

Re: [DISCUSS] FLIP 69 - Flink SQL DDL Enhancement

2019-09-23 Thread Bowen Li
Hi Terry,

Thanks for driving the effort! I left some comments in the doc.

AFAIU, the biggest motivation is to support DDLs in the SQL parser so that both
Table API and SQL CLI can share the stack, even though the SQL CLI already
supports some commands itself. However, I don't see details on how the SQL CLI
would migrate to and depend on the SQL parser, and how Table API and SQL CLI would
actually share the SQL parser. I'm not sure yet how much work that will take,
so I just want to double-check that you left them out because you estimate them
to be trivial?


On Mon, Sep 16, 2019 at 1:46 AM Terry Wang  wrote:

> Hi everyone,
>
> In Flink 1.9, we have introduced some awesome features such as complete
> catalog support[1] and SQL DDL support[2]. These features have been a
> critical integration for Flink to be able to manage data and metadata like
> a classic RDBMS and make it easier for developers to construct their
> real-time/offline warehouse or something similar based on Flink.
>
> But there is still a lack of support in Flink SQL DDL for managing
> metadata and data like a classic RDBMS, e.g. `alter table rename` and so on.
>
> So I’d like to kick off a discussion on enhancing Flink SQL DDLs:
>
> https://docs.google.com/document/d/1mhZmx1h2ecfL0x8OzYD1n-nVRn4yE7pwk4jGed4k7kc/edit?usp=sharing
> <
> https://docs.google.com/document/d/1mhZmx1h2ecfL0x8OzYD1n-nVRn4yE7pwk4jGed4k7kc/edit?usp=sharing
> >
>
> In short, it:
> - Add Catalog DDL enhancement support: show catalogs / describe
> catalog / use catalog
> - Add Database DDL enhancement support: show databases / create
> database / drop database / alter database
> - Add Table DDL enhancement support: show tables / describe
> table / alter table
> - Add Function DDL enhancement support: show functions / create
> function / drop function
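>
> As a quick sketch of what using these could look like (hypothetical: the
> statements are illustrative, and sqlUpdate is the TableEnvironment entry
> point as of 1.9):
>
>     TableEnvironment tEnv = TableEnvironment.create(
>             EnvironmentSettings.newInstance().build());
>     tEnv.sqlUpdate("USE CATALOG my_catalog");
>     tEnv.sqlUpdate("CREATE DATABASE IF NOT EXISTS db1");
>     tEnv.sqlUpdate("ALTER TABLE orders RENAME TO orders_v2");
>     tEnv.sqlUpdate("DROP FUNCTION my_udf");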
>
> Looking forward to your opinions.
>
> Best,
> Terry Wang
>
>
>
> [1]: https://issues.apache.org/jira/browse/FLINK-11275
> [2]: https://issues.apache.org/jira/browse/FLINK-10232
>  


[jira] [Created] (FLINK-14175) Upgrade KPL version in flink-connector-kinesis to fix application OOM

2019-09-23 Thread Abhilasha Seth (Jira)
Abhilasha Seth created FLINK-14175:
--

 Summary: Upgrade KPL version in flink-connector-kinesis to fix 
application OOM
 Key: FLINK-14175
 URL: https://issues.apache.org/jira/browse/FLINK-14175
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kinesis
Affects Versions: 1.9.0, 1.8.2, 1.8.1, 1.8.0, 1.7.2, 1.6.4, 1.6.3
Reporter: Abhilasha Seth


The [KPL 
version|https://github.com/apache/flink/blob/release-1.9/flink-connectors/flink-connector-kinesis/pom.xml#L38]
 currently in use (0.12.9) has a thread leak bug that causes applications to 
run out of memory after frequent restarts:

KPL Issue - [https://github.com/awslabs/amazon-kinesis-producer/issues/224]

Fix - [https://github.com/awslabs/amazon-kinesis-producer/pull/225/files]

Upgrading KPL to 0.12.10 or higher is necessary to avoid this issue.

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[COMMITTER] repo locked due to synchronization issues

2019-09-23 Thread Bowen Li
Hi committers,

Recently I've run into a repo issue multiple times on different days. When I
try to push a commit to master, git reports the following error:

```
remote: This repository has been locked due to synchronization issues:
remote:  - /x1/gitbox/broken/flink.txt exists due to a previous error, and
prevents pushes.
remote: This could either be a benign issue, or the repositories could be
out of sync.
remote: Please contact us...@infra.apache.org to have infrastructure
resolve the issue.
remote:
To https://gitbox.apache.org/repos/asf/flink.git
 ! [remote rejected]   master -> master (pre-receive hook declined)
error: failed to push some refs to '
https://gitbox.apache.org/repos/asf/flink.git'
```

This is quite a new issue that didn't appear until two or three weeks ago. I
researched it online with no luck. I also reported it to ASF INFRA [1], but
their suggested solution doesn't work.

The issue however usually goes away the next morning in PST, so I assume
someone from a different timezone in Asia or Europe fixes it somehow? Has
anyone run into it before? How did you fix it?

Thanks,
Bowen

[1] https://issues.apache.org/jira/projects/INFRA/issues/INFRA-18992


Re: [SURVEY] How many people are using customized RestartStrategy(s)

2019-09-23 Thread Steven Wu
When we set up an alert like "fullRestarts > 1" for some rolling window, we
want to use a counter. If it is a Gauge, "fullRestarts" will never go below 1
after the first full restart, so the alert condition will always be true after
the first job restart. If we can apply a derivative to the Gauge value, I guess
the alert can probably work. I can explore whether that is an option.

Yeah. Understood that "fullRestart" won't increment when fine grained
recovery happens. I think the "task_failures" counter already exists in Flink.



On Sun, Sep 22, 2019 at 7:59 PM Zhu Zhu  wrote:

> Steven,
>
> Thanks for the information. If we can determine this is a common issue, we
> can solve it in Flink core.
> To get to that state, I have two questions which need your help:
> 1. Why is gauge not good for alerting? The metric "fullRestart" is a
> Gauge. Does the metric reporter you use report Counter and
> Gauge to external services in different ways? Or anything else can be
> different due to the metric type?
> 2. Is the "number of restarts" what you actually need, rather than
> the "fullRestart" count? If so, I believe we will have such a counter
> metric in 1.10, since the previous "fullRestart" metric value is not the
> number of restarts when fine grained recovery (feature added in 1.9.0) is enabled.
> "fullRestart" reveals how many times the entire job graph has been
> restarted. If fine grained recovery (feature added in 1.9.0) is enabled, the graph
> would not be restarted when task failures happen and the "fullRestart"
> value will not increment in such cases.
>
> I'd appreciate if you can help with these questions and we can make better
> decisions for Flink.
>
> Thanks,
> Zhu Zhu
>
> Steven Wu  于2019年9月22日周日 上午3:31写道:
>
>> Zhu Zhu,
>>
>> Flink fullRestart metric is a Gauge, which is not good for alerting on.
>> We publish an equivalent Counter metric for alerting purpose.
>>
>> Thanks,
>> Steven
>>
>> On Thu, Sep 19, 2019 at 7:45 PM Zhu Zhu  wrote:
>>
>>> Thanks Steven for the feedback!
>>> Could you share more information about the metrics you add in you
>>> customized restart strategy?
>>>
>>> Thanks,
>>> Zhu Zhu
>>>
>>> Steven Wu  于2019年9月20日周五 上午7:11写道:
>>>
 We do use config like "restart-strategy:
 org.foobar.MyRestartStrategyFactoryFactory". Mainly to add additional
 metrics beyond the Flink-provided ones.

 On Thu, Sep 19, 2019 at 4:50 AM Zhu Zhu  wrote:

> Thanks everyone for the input.
>
> The RestartStrategy customization is not recognized as a public
> interface as it is not explicitly documented.
> As it is not used from the feedbacks of this survey, I'll conclude
> that we do not need to support customized RestartStrategy for the new
> scheduler in Flink 1.10
>
> Other usages are still supported, including all the strategies and
> configuring ways described in
> https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/task_failure_recovery.html#restart-strategies
> .
>
> Feel free to share in this thread if you has any concern for it.
>
> Thanks,
> Zhu Zhu
>
> Zhu Zhu  于2019年9月12日周四 下午10:33写道:
>
>> Thanks Oytun for the reply!
>>
>> Sorry for not have stated it clearly. When saying "customized
>> RestartStrategy", we mean that users implement an
>> *org.apache.flink.runtime.executiongraph.restart.RestartStrategy* by
>> themselves and use it by configuring like "restart-strategy:
>> org.foobar.MyRestartStrategyFactoryFactory".
>>
>> The usage of restart strategies you mentioned will keep working with
>> the new scheduler.
>>
>> Thanks,
>> Zhu Zhu
>>
>> Oytun Tez  于2019年9月12日周四 下午10:05写道:
>>
>>> Hi Zhu,
>>>
>>> We are using custom restart strategy like this:
>>>
>>> environment.setRestartStrategy(failureRateRestart(2,
>>> Time.minutes(1), Time.minutes(10)));
>>>
>>>
>>> ---
>>> Oytun Tez
>>>
>>> *M O T A W O R D*
>>> The World's Fastest Human Translation Platform.
>>> oy...@motaword.com — www.motaword.com
>>>
>>>
>>> On Thu, Sep 12, 2019 at 7:11 AM Zhu Zhu  wrote:
>>>
 Hi everyone,

 I wanted to reach out to you and ask how many of you are using a
 customized RestartStrategy[1] in production jobs.

 We are currently developing the new Flink scheduler[2] which
 interacts with restart strategies in a different way. We have to 
 re-design
 the interfaces for the new restart strategies (so called
 RestartBackoffTimeStrategy). Existing customized RestartStrategy will 
 not
 work any more with the new scheduler.

 We want to know whether we should keep the way
 to customized RestartBackoffTimeStrategy so that existing customized
 RestartStrategy can be migrated.

 I'd appreciate if you can share the status if you are
 using customized RestartStrategy. That

Re: [DISCUSS] FLIP-48: Pluggable Intermediate Result Storage

2019-09-23 Thread Becket Qin
Hi Stephan,

In terms of the performance concern, please see my understanding below.

## Breaking the pipeline v.s. adding a sink.
If two operators are initially chained, they will belong to the same stage
in the DAG and the same task, therefore the main processing path will just
have one task without serde in the middle. I was trying to see the overhead
of adding a new sink or breaking the pipeline.

  - Adding a new sink introduces serialization cost, and potentially
network IO if the sink writes to a remote storage instead of local file
system.
  - Breaking the pipeline introduces a new stage, a new task, additional
serialization / deserialization cost and potential network IO.

Therefore I thought that adding a new sink will have better performance
than breaking the pipeline because it has lower cost in general.
Please let me know if I missed something.

The above scenarios assume that users want to cache the result in the
middle of an operator chain, but not at the shuffle boundary. If the cache
is at the shuffle boundary, it would duplicate the records unless the
pluggable shuffle service is also the pluggable intermediate result storage
at the same time. In that case, there will be just one copy of the records,
but could be read by either the pluggable shuffle service or the pluggable
intermediate result storage.

## Reading a subset of records
You are right. Any additional indexing / compression / columnizing of the
raw intermediate result introduces overhead. So it only makes sense if the
saving is greater than the overhead. One such example is iteration. In that
case, the cached intermediate results may be read an undefined number of times
and the initial overhead of columnizing would be worthwhile.


In general, I am with you that this could be put in an external library. It
is achievable if we only address the cross-session intermediate result
sharing. However, an external library is not sufficient to provide
optimized in-session intermediate result sharing. This is mainly because
when the job exits, RM needs to clean up the intermediate results. So
basically we are choosing between the following two options.

Option 1: in-session sharing is only served by shuffle service, no special
performance optimization.
Option 2: In-session sharing is served by shuffle service by default,
performance optimization can be provided by pluggable intermediate result
storage.

It would be helpful for us to first agree on whether we want to have
performance optimization for in-session intermediate result sharing. If
not, option 1 is good enough. Otherwise, we would need something pluggable
for the in-session intermediate result.

Thoughts?

Thanks,

Jiangjie (Becket) Qin




On Sun, Sep 22, 2019 at 8:44 PM Stephan Ewen  wrote:

> ## About the improvements you mentioned in (1)
>
>   - I am not sure that this helps to improve performance by avoiding to
> break the pipeline.
> Attaching an additional sink, would in virtually any case add even more
> overhead than the pipeline breaking.
> What is your reasoning why it would be faster, all in all?
>
>   - About reading only a subset of the records:
>  - If this is about reading the data once or twice, then
> columnarizing/indexing/compressing the data is more expensive than just
> reading it twice more.
>  - This means turning the mechanism into something like materialized
> view matching, rather than result caching. That should happen in different
> parts of the stack (view matching needs schema, semantics, etc.). I am not
> sure mixing both is even a good idea.
>
>
> ## The way I see the trade-offs are:
>
> Pro in core Flink:
>   - Small improvement to API experience, compared to a library
>
> Contra in core Flink:
>   - added API complexity, maintenance and evolution overhead
>   - not clear what impacts mixing materialized view matching and result
> caching has on the system architecture
>   - Not yet a frequent use case, possibly a frequent use case in the
> future.
>   - Starting as a library allows for merging into the core later when this
> use case becomes major and experience improvement proves big.
>
> Unclear
>   - is breaking the pipeline by introducing a blocking intermediate result
> really worse than duplicating the data into an additional sink?
>
>
> ==> Especially because so we can still make it part of Flink later once the
> use case and approach are a bit more fleshed out, this looks like a strong
> case for starting with a library approach here.
>
> Best,
> Stephan
>
>
>
> On Thu, Sep 19, 2019 at 2:41 AM Becket Qin  wrote:
>
> > Hi Stephan,
> >
> > Sorry for the belated reply. You are right that the functionality
> proposed
> > in this FLIP can be implemented out of the Flink core as an ecosystem
> > project.
> >
> > The main motivation of this FLIP is two folds:
> >
> > 1. Improve the performance of intermediate result sharing in the same
> > session.
> > Using the internal shuffle service to store cached result has two
> potential
> 

Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-23 Thread Becket Qin
Thanks, Stephan.

Sounds good to me. We can still try our best to get the new Pulsar connector into
Flink 1.10. In case we do not have time to do that, we will prominently
link the Pulsar connector from the Flink connector docs.

Thanks,

Jiangjie (Becket) Qin

On Mon, Sep 23, 2019 at 4:11 PM Stephan Ewen  wrote:

> Okay, I see your point, Becket.
>
> Then let us prominently link the Pulsar connector from the Flink connector
> docs then, so that users can find it easily.
>
> As soon as FLIP 27 is done, we reach out the Pulsar folks to contribute a
> new connector.
>
> On Mon, Sep 23, 2019 at 3:11 AM Becket Qin  wrote:
>
> > Hi Stephan,
> >
> > I have no doubt about the value of adding Pulsar connector to Flink repo.
> > My concern is about how exactly we are going to do it.
> >
> > As mentioned before, I believe that we can handle connectors more
> > > pragmatically and less strict than the core of Flink, if it helps
> > unlocking
> > > users.
> >
> > I can see the benefit of being less strict for the initial connector
> code
> > adoption. However, I don't think we should be less strict on the
> > maintenance commitment once the code is in Flink repo. It only makes
> sense
> > to check in something and ask users to use if we plan to maintain it.
> >
> > If I understand correctly, the current plan so far is following:
> > 1. release 1.10
> >- Check in Pulsar connector on old interface and label it as beta
> > version.
> >- encourage users to try it and report bugs.
> > 2. release 1.11
> >- Check in Pulsar connector on new interface (a.k.a new Pulsar
> > connector) and label it as beta version
> >- Deprecate the old Pulsar connector
> >- Fix bugs reported on old Pulsar connector from release 1.10
> >- Ask users to migrate from old Pulsar connector to new Pulsar
> connector
> > 3. release 1.12
> >- Announce end of support for old Pulsar connector and remove the code
> >- Fix bugs reported on new Pulsar connector.
> >
> > If this is the plan, it seems neither Flink nor the users trying the old
> > Pulsar connector will benefit from this experimental old Pulsar
> connector,
> > because whatever feedbacks we got or bugs we fix on the old Pulsar
> > connector are immediately thrown away in one or two releases.
> >
> > If we check in the old Pulsar connector right now, the only option I see
> is
> > to maintain it for a while (e.g. a year or more). IMO, the immediate
> > deprecation and code removal hurts the users much more than asking them
> to
> > wait for another release. I personally think that we can avoid this
> > maintenance burden by going directly to the new Pulsar connector,
> > especially given that users can still use the connector even if they are
> > not in Flink repo. That said, I am OK with maintaining both old and new
> > Pulsar connector if we believe that having the Pulsar connector available
> > right now in Flink repo is more important.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Sun, Sep 22, 2019 at 9:10 PM Stephan Ewen  wrote:
> >
> > > My assumption is as Sijie's, that once the connector is either part of
> > > Flink, or part of the streamnative repo. No double maintenance.
> > >
> > > I feel this discussion is very much caught in problems that are all
> > > solvable if we want to solve them.
> > > Maybe we can think what our goal for users and the communities is?
> > >
> > >   - Do we want to help build a relationship between the Pulsar and
> Flink
> > > open source communities?
> > >   - Will users find a connector in the streamnative repository?
> > >   - Will users trust a connector that is not part of Flink as much?
> > >
> > > And then decide what is best according to the overall goals there.
> > > As mentioned before, I believe that we can handle connectors more
> > > pragmatically and less strict than the core of Flink, if it helps
> > unlocking
> > > users.
> > >
> > > Best,
> > > Stephan
> > >
> > >
> > >
> > > On Fri, Sep 20, 2019 at 2:10 PM Sijie Guo  wrote:
> > >
> > > > Thanks Becket.
> > > >
> > > > I think it is better for the Flink community to judge the benefits of
> > > doing
> > > > this. I was trying to provide some views from outsiders.
> > > >
> > > > Thanks,
> > > > Sijie
> > > >
> > > > On Fri, Sep 20, 2019 at 10:25 AM Becket Qin 
> > > wrote:
> > > >
> > > > > Hi Sijie,
> > > > >
> > > > > Yes, we will have to support existing old connectors and new
> > connectors
> > > > in
> > > > > parallel for a while. We have to take that maintenance overhead
> > because
> > > > > existing connectors have been used by the users for a long time. I
> > > guess
> > > > It
> > > > > may take at least a year for us to fully remove the old connectors.
> > > > >
> > > > > Process wise, we can do the same for Pulsar connector. But I am not
> > > sure
> > > > if
> > > > > we want to have the same burden on Pulsar connector, and I would
> like
> > > to
> > > > > understand the benefit of doing that.
> > > > >
> > > > > For users, the benefit o

[jira] [Created] (FLINK-14176) add taskmanager link in vertex‘s page of taskmanager

2019-09-23 Thread lining (Jira)
lining created FLINK-14176:
--

 Summary: add taskmanager link in vertex‘s page of taskmanager
 Key: FLINK-14176
 URL: https://issues.apache.org/jira/browse/FLINK-14176
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: lining


Add the taskmanager's link to the vertex's page, so the user can go to 
the taskmanager's page.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14177) Bump Curator to 3.5.5

2019-09-23 Thread lamber-ken (Jira)
lamber-ken created FLINK-14177:
--

 Summary: Bump Curator to 3.5.5
 Key: FLINK-14177
 URL: https://issues.apache.org/jira/browse/FLINK-14177
 Project: Flink
  Issue Type: Improvement
Reporter: lamber-ken






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-63: Rework table partition support

2019-09-23 Thread JingsongLee
Thanks for your review. I will send another vote thread from my Apache email.

Best,
Jingsong Lee 


--
From:Bowen Li 
Send Time:2019年9月24日(星期二) 03:06
To:JingsongLee 
Cc:dev 
Subject:Re: [DISCUSS] FLIP-63: Rework table partition support

Hi Jingsong,

Thanks for driving this effort!

Besides a few further comments on Catalog APIs that I just left, it LGTM.

Not sure why, but the voting thread in gmail shows in the same thread as
the discussion's. After addressing all the comments, could you start a new,
separate thread to let other people be aware of it?

Thanks,
Bowen

On Mon, Sep 23, 2019 at 1:25 AM JingsongLee 
wrote:

>  Thanks for your discussion on the Google document.
> Comments have been addressed; a FileSystem connector chapter was added, and a
> code prototype for the file system connector was introduced to unify the Flink
> file system and Hive connectors.
>
> Looking forward to your feedbacks. Thank you.
>
> Best,
> Jingsong Lee
>
>
> --
> From:JingsongLee 
> Send Time:2019年9月18日(星期三) 09:45
> To:Kurt Young ; dev 
> Subject:Re: [DISCUSS] FLIP-63: Rework table partition support
>
> Thanks for your reply and google doc comments. It has been discussed
>  for two weeks now. I will start a vote thread.
>
> Best,
> Jingsong Lee
>
>
> --
> From:Kurt Young 
> Send Time:2019年9月16日(星期一) 15:55
> To:dev 
> Cc:JingsongLee 
> Subject:Re: [DISCUSS] FLIP-63: Rework table partition support
>
> +1 to this feature, I left some comments on google doc.
>
> Another comment is I think we should do some reorganize about the content
> when you converting this to a cwiki page. I will have some offline
> discussion
> with you.
>
> Since this feature seems to be a fairly big efforts, so I suggest we can
> settle
> down the design doc ASAP and start vote process.
> Best,
> Kurt
>
>
> On Thu, Sep 12, 2019 at 12:43 PM Biao Liu  wrote:
> Hi Jingsong,
>
>  Thanks for explaining. It looks cool!
>
>  Thanks,
>  Biao /'bɪ.aʊ/
>
>
>
>  On Wed, 11 Sep 2019 at 11:37, JingsongLee  .invalid>
>  wrote:
>
>  > Hi Biao, thanks for your feedback:
>  >
>  > Actually, the source partition at runtime is similar to a split,
>  > which concerns data reading, parallelism and fault tolerance, all the
>  > runtime concepts.
>  > While table partition is only a virtual concept. Users are more likely
> to
>  > choose which partition to read and which partition to write. Users can
>  > manage their partitions.
>  > One is physical implementation correlation, the other is logical concept
>  > correlation.
>  > So I think they are two completely different things.
>  >
>  > About [2], The main problem is that how to write data to a catalog file
>  > system in stream mode, it is a general problem and has little to do with
>  > partition.
>  >
>  > [2]
>  >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html
>  >
>  > Best,
>  > Jingsong Lee
>  >
>  >
>  > --
>  > From:Biao Liu 
>  > Send Time:2019年9月10日(星期二) 14:57
>  > To:dev ; JingsongLee 
>  > Subject:Re: [DISCUSS] FLIP-63: Rework table partition support
>  >
>  > Hi Jingsong,
>  >
>  > Thank you for bringing this discussion. Since I don't have much
> experience
>  > of Flink table/SQL, I'll ask some questions from runtime or engine
>  > perspective.
>  >
>  > > ... where we describe how to partition support in flink and how to
>  > integrate to hive partition.
>  >
>  > FLIP-27 [1] introduces "partition" concept officially. The changes of
>  > FLIP-27 are not only about source interface but also about the whole
>  > infrastructure.
>  > Have you ever thought how to integrate your proposal with these changes?
>  > Or you just want to support "partition" in table layer, there will be no
>  > requirement of underlying infrastructure?
>  >
>  > I have seen a discussion [2] that seems be a requirement of
> infrastructure
>  > to support your proposal. So I have some concerns there might be some
>  > conflicts between this proposal and FLIP-27.
>  >
>  > 1.
>  >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
>  > 2.
>  >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html
>  >
>  > Thanks,
>  > Biao /'bɪ.aʊ/
>  >
>  >
>  >
>  > On Fri, 6 Sep 2019 at 13:22, JingsongLee  .invalid>
>  > wrote:
>  > Hi everyone, thank you for your comments. Mail name was updated
>  >  and streaming-related concepts were added.
>  >
>  >  We would like to start a discussion thread on "FLIP-63: Rework table
>  >  partition support"(Design doc: [1]), where we describe how to partition
>  >  support in flink and how to integrate to hive partition.
>  >
>  >  This FLIP addresses:
>  > - Intr

[VOTE] FLIP-63: Rework table partition support

2019-09-23 Thread Jingsong Lee
Hi Flink devs, after another round of discussion.

I would like to re-start the voting for FLIP-63
Rework table partition support.

FLIP wiki:


https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support

Discussion thread:


http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-63-Rework-table-partition-support-td32770.html

Google Doc:

https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=sharing

Thanks,

Best,
Jingsong Lee


[jira] [Created] (FLINK-14178) maven-shade-plugin 3.2.1 doesn't work on ARM for Flink

2019-09-23 Thread wangxiyuan (Jira)
wangxiyuan created FLINK-14178:
--

 Summary: maven-shade-plugin 3.2.1 doesn't work on ARM for Flink
 Key: FLINK-14178
 URL: https://issues.apache.org/jira/browse/FLINK-14178
 Project: Flink
  Issue Type: Sub-task
  Components: Build System
Affects Versions: 2.0.0
Reporter: wangxiyuan
 Fix For: 2.0.0


Recently, maven-shade-plugin was bumped from 3.0.0 to 3.2.1 by this 
[commit|https://github.com/apache/flink/commit/e7216eebc846a69272c21375af0f4db8009c2e3e].
However, in my local test on ARM, the Flink build process gets stuck. 
After debugging, I found there is an infinite loop.

Downgrading maven-shade-plugin to 3.1.0 solves this problem.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-57: Rework FunctionCatalog

2019-09-23 Thread Kurt Young
+1

Best,
Kurt


On Tue, Sep 24, 2019 at 2:30 AM Bowen Li  wrote:

> Hi all,
>
> I'd like to start a voting thread for FLIP-57 [1], which we've reached
> consensus in [2].
>
> This vote will be open for a minimum of 3 days, till 6:30pm UTC, Sep 26.
>
> Thanks,
> Bowen
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-57%3A+Rework+FunctionCatalog
> [2]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-td32291.html#a32613
>


Re: [DISCUSS] Releasing Flink 1.9.1

2019-09-23 Thread Kurt Young
+1 for the 1.9.1 release and for Jark being the RM.
Thanks Jark for the volunteering.

Best,
Kurt


On Mon, Sep 23, 2019 at 9:17 PM Till Rohrmann  wrote:

> +1 for the 1.9.1 release and for Jark being the RM. I'll help with the
> review of FLINK-14010.
>
> Cheers,
> Till
>
> On Mon, Sep 23, 2019 at 8:32 AM Debasish Ghosh 
> wrote:
>
> > I hope https://issues.apache.org/jira/browse/FLINK-12501 will also be
> part
> > of 1.9.1 ..
> >
> > regards.
> >
> > On Mon, Sep 23, 2019 at 11:39 AM Jeff Zhang  wrote:
> >
> > > FLINK-13708 is also very critical IMO. This would cause invalid flink
> job
> > > (doubled output)
> > >
> > > https://issues.apache.org/jira/browse/FLINK-13708
> > >
> > > Jark Wu  于2019年9月23日周一 下午2:03写道:
> > >
> > > > Hi everyone,
> > > >
> > > > It has already been a month since we released Flink 1.9.0.
> > > > We already have many important bug fixes from which our users can
> > benefit
> > > > in the release-1.9 branch (83 resolved issues).
> > > > Therefore, I propose to create the next bug fix release for Flink
> 1.9.
> > > >
> > > > Most notable fixes are:
> > > >
> > > > - [FLINK-13526] When switching to a non existing catalog or database
> in
> > > the
> > > > SQL Client the client crashes.
> > > > - [FLINK-13568] It is not possible to create a table with a "STRING"
> > data
> > > > type via the SQL DDL.
> > > > - [FLINK-13941] Prevent data-loss by not cleaning up small part files
> > > from
> > > > S3.
> > > > - [FLINK-13490][jdbc] If one column value is null when reading JDBC,
> > the
> > > > following values will all be null.
> > > > - [FLINK-14107][kinesis] When using event time alignment with the
> > > Kinsesis
> > > > Consumer the consumer might deadlock in one corner case.
> > > >
> > > > Furthermore, I would like the following critical issues to be merged
> > > before
> > > > 1.9.1 release:
> > > >
> > > > - [FLINK-14118] Reduce the unnecessary flushing when there is no data
> > > > available for flush which can save 20% ~ 40% CPU. (reviewing)
> > > > - [FLINK-13386] Fix A couple of issues with the new dashboard have
> > > already
> > > > been filed. (PR is created, need review)
> > > > - [FLINK-14010][yarn] The Flink YARN cluster can get into an
> > inconsistent
> > > > state in some cases, where
> > > > leaderhship for JobManager, ResourceManager and Dispatcher components
> > is
> > > > split between two master processes. (PR is created, need review)
> > > >
> > > > I would volunteer as release manager and kick off the release process
> > > once
> > > > blocker issues has been merged. What do you think?
> > > >
> > > > If there is any other blocker issues need to be fixed in 1.9.1,
> please
> > > let
> > > > me know.
> > > >
> > > > Cheers,
> > > > Jark
> > > >
> > >
> > >
> > > --
> > > Best Regards
> > >
> > > Jeff Zhang
> > >
> >
> >
> > --
> > Debasish Ghosh
> > http://manning.com/ghosh2
> > http://manning.com/ghosh
> >
> > Twttr: @debasishg
> > Blog: http://debasishg.blogspot.com
> > Code: http://github.com/debasishg
> >
>


Re: [SURVEY] How many people are using customized RestartStrategy(s)

2019-09-23 Thread Zhu Zhu
Steven,

In my mind, a Flink counter only stores its accumulated count and reports
that value. Are you using an external counter directly?
Maybe Flink Meter/MeterView is what you need? It stores the count and
calculates the rate. And it will report its "count" as well as "rate" to
external metric services.
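
For example, a minimal sketch (assuming a rich function's metric group; the
metric names are illustrative):

    import org.apache.flink.metrics.Counter;
    import org.apache.flink.metrics.Meter;
    import org.apache.flink.metrics.MeterView;

    // Inside a RichFunction's open():
    Counter restarts = getRuntimeContext().getMetricGroup().counter("myRestarts");
    Meter restartRate = getRuntimeContext().getMetricGroup()
            .meter("myRestartRate", new MeterView(restarts, 60));
    // restartRate.markEvent() increments the underlying count; reporters then
    // see both the accumulated "count" and the "rate" over a 60s window.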

The counter "task_failures" only works if the individual failover strategy
is enabled. However, it is not a public interface and its use is not
suggested, as fine grained recovery (region failover) now supersedes it.
I've opened a ticket[1] to add a metric to show failovers that respects
fine grained recovery.

[1] https://issues.apache.org/jira/browse/FLINK-14164

Thanks,
Zhu Zhu

Steven Wu  于2019年9月24日周二 上午6:41写道:

>
> When we set up an alert like "fullRestarts > 1" for some rolling window, we
> want to use a counter. If it is a Gauge, "fullRestarts" will never go below 1
> after the first full restart, so the alert condition will always be true after
> the first job restart. If we can apply a derivative to the Gauge value, I guess
> the alert can probably work. I can explore whether that is an option.
>
> Yeah. Understood that "fullRestart" won't increment when fine grained
> recovery happens. I think the "task_failures" counter already exists in Flink.
>
>
>
> On Sun, Sep 22, 2019 at 7:59 PM Zhu Zhu  wrote:
>
>> Steven,
>>
>> Thanks for the information. If we can determine this a common issue, we
>> can solve it in Flink core.
>> To get to that state, I have two questions which need your help:
>> 1. Why is gauge not good for alerting? The metric "fullRestart" is a
>> Gauge. Does the metric reporter you use report Counter and
>> Gauge to external services in different ways? Or anything else can be
>> different due to the metric type?
>> 2. Is the "number of restarts" what you actually need, rather than
>> the "fullRestart" count? If so, I believe we will have such a counter
>> metric in 1.10, since the previous "fullRestart" metric value is not the
>> number of restarts when fine grained recovery (feature added in 1.9.0) is enabled.
>> "fullRestart" reveals how many times the entire job graph has been
>> restarted. If fine grained recovery (feature added in 1.9.0) is enabled, the graph
>> would not be restarted when task failures happen and the "fullRestart"
>> value will not increment in such cases.
>>
>> I'd appreciate if you can help with these questions and we can make
>> better decisions for Flink.
>>
>> Thanks,
>> Zhu Zhu
>>
>> Steven Wu  于2019年9月22日周日 上午3:31写道:
>>
>>> Zhu Zhu,
>>>
>>> Flink fullRestart metric is a Gauge, which is not good for alerting on.
>>> We publish an equivalent Counter metric for alerting purpose.
>>>
>>> Thanks,
>>> Steven
>>>
>>> On Thu, Sep 19, 2019 at 7:45 PM Zhu Zhu  wrote:
>>>
 Thanks Steven for the feedback!
 Could you share more information about the metrics you add in you
 customized restart strategy?

 Thanks,
 Zhu Zhu

 Steven Wu  于2019年9月20日周五 上午7:11写道:

> We do use config like "restart-strategy:
> org.foobar.MyRestartStrategyFactoryFactory". Mainly to add additional
> metrics beyond the Flink-provided ones.
>
> On Thu, Sep 19, 2019 at 4:50 AM Zhu Zhu  wrote:
>
>> Thanks everyone for the input.
>>
>> The RestartStrategy customization is not recognized as a public
>> interface as it is not explicitly documented.
>> As it is not used from the feedbacks of this survey, I'll conclude
>> that we do not need to support customized RestartStrategy for the new
>> scheduler in Flink 1.10
>>
>> Other usages are still supported, including all the strategies and
>> configuring ways described in
>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/task_failure_recovery.html#restart-strategies
>> .
>>
>> Feel free to share in this thread if you has any concern for it.
>>
>> Thanks,
>> Zhu Zhu
>>
>> Zhu Zhu  于2019年9月12日周四 下午10:33写道:
>>
>>> Thanks Oytun for the reply!
>>>
>>> Sorry for not have stated it clearly. When saying "customized
>>> RestartStrategy", we mean that users implement an
>>> *org.apache.flink.runtime.executiongraph.restart.RestartStrategy*
>>> by themselves and use it by configuring like "restart-strategy:
>>> org.foobar.MyRestartStrategyFactoryFactory".
>>>
>>> The usage of restart strategies you mentioned will keep working with
>>> the new scheduler.
>>>
>>> Thanks,
>>> Zhu Zhu
>>>
>>> Oytun Tez  于2019年9月12日周四 下午10:05写道:
>>>
 Hi Zhu,

 We are using custom restart strategy like this:

 environment.setRestartStrategy(failureRateRestart(2,
 Time.minutes(1), Time.minutes(10)));


 ---
 Oytun Tez

 *M O T A W O R D*
 The World's Fastest Human Translation Platform.
 oy...@motaword.com — www.motaword.com


 On Thu, Sep 12, 2019 at 7:11 AM Zhu Zhu  wrot

Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-23 Thread Kurt Young
For a histogram-based watermark strategy, one possible solution is that we
still use the stateless scalar function and keep the stateful objects directly
in the function. By doing that we will lose some information after the job
gets restarted, but I think it might be acceptable because histogram-based is
an approximate algorithm after all (see the sketch below).
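
A minimal sketch of that idea (assuming the watermark expression calls a
user-defined ScalarFunction; the fixed 5 second bound stands in for a
histogram-derived delay):

    import org.apache.flink.table.functions.ScalarFunction;

    public class ApproxWatermark extends ScalarFunction {
        // Kept directly in the function; rebuilt from scratch after a restart.
        private transient long maxTimestamp;

        public Long eval(Long rowtime) {
            if (rowtime == null) {
                return null;
            }
            maxTimestamp = Math.max(maxTimestamp, rowtime);
            return maxTimestamp - 5000L;
        }
    }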

But I agree we will run into trouble if we want to have some accurate
watermark computation logic. In this case, I would suggest creating a
dedicated upstream job that does the watermark calculation and saves the value
into a field. Then in the current job, we can just reference the calculated
field and specify it as this job's watermark.

Best,
Kurt


On Mon, Sep 23, 2019 at 8:49 PM Jark Wu  wrote:

> Hi,
>
> Thanks Fabian for your reply. I agree with your point that the
> histogram-based case needs the function to be stateful, which is not
> supported currently nor in this design.
> Maybe we can support stateful scalar function like TableAggregateFunction.
> We can further discuss how to support this in the future.
> I added this limitation in the "Complex Watermark Strategies" section.
>
> Btw, I also updated how to automatically apply the watermark assigner by
> the planner at the end of the "Implementation" section [1].
> This avoids every TableSource having to extend DefinedProctimeAttribute to
> carry time attribute information.
>
> If there is no objection, I would like to update the cwiki FLIP page and
> start a new voting process in the next few days.
>
> Best,
> Jark
>
> [1]:
>
> https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit#heading=h.qx7j56dotywd
>
>
> On Fri, 20 Sep 2019 at 22:18, Fabian Hueske  wrote:
>
> > Hi Jark,
> >
> > Thanks for the summary!
> > I like the proposal!
> >
> > It makes it very clear that an event time attribute is an existing column
> > on which watermark metadata is defined whereas a processing time
> attribute
> > is a computed field.
> >
> > I have one comment regarding the section on "Complex Watermark
> Strategies".
> > The proposal says that you can also use a scalar function.
> > I don't think that a "text book" scalar function would be sufficient for
> > more advanced strategies.
> > For example a histogram-based approach would need to remember the values
> of
> > the last x records.
> > The interface of a scalar function would still work for that, but it
> would
> > be a stateful function (which would not be OK for a scalar function).
> > I don't think it's a problem, but wanted to mention it here.
> >
> > Best, Fabian
> >
> > Am Do., 19. Sept. 2019 um 18:05 Uhr schrieb Jark Wu :
> >
> > > Hi everyone,
> > >
> > > Thanks all for the valuable suggestions and feedbacks so far.
> > > Before starting the vote, I would like to summarize the proposed DDL
> > syntax
> > > in the mailing list.
> > >
> > > ## Rowtime Attribute (Watermark Syntax)
> > >
> > > CREATE TABLE table_name (
> > >   WATERMARK FOR  AS 
> > > ) WITH (
> > >   ...
> > > )
> > >
> > > It marks an existing field  as the rowtime attribute, and
> the
> > > watermark is generated by the expression
> .
> > >  can be arbitrary expression which
> > returns a
> > > nullable BIGINT or TIMESTAMP as the watermark value.
> > >
> > > For common cases, users can use the following expressions to define a
> > > strategy.
> > > 1. Bounded Out of Orderness, the strategy can be "rowtimeField -
> INTERVAL
> > > 'string' timeUnit".
> > > 2. Preserve Watermark From Source, the strategy can be
> > > "SYSTEM_WATERMARK()".
> > >
> > > ## Proctime Attribute
> > >
> > > CREATE TABLE table_name (
> > >   ...
> > >   proc AS SYSTEM_PROCTIME()
> > > ) WITH (
> > >   ...
> > > )
> > >
> > > It uses the computed column syntax to add an additional column with
> > > proctime attribute. Here SYSTEM_PROCTIME() is a built-in function.
> > >
> > > For more details and the implementations, please refer to the design
> doc:
> > >
> > >
> >
> https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit?ts=5d822dba
> > >
> > > Feel free to leave your further feedbacks!
> > >
> > > Thanks,
> > > Jark
> > >
> > > On Thu, 19 Sep 2019 at 11:23, Kurt Young  wrote:
> > >
> > > > +1 to start vote process.
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > >
> > > > On Thu, Sep 19, 2019 at 10:54 AM Jark Wu  wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > Thanks all for joining the discussion in the doc[1].
> > > > > It seems that the discussion is converged and there is a consensus
> on
> > > the
> > > > > current FLIP document.
> > > > > If there is no objection, I would like to convert it into cwiki
> FLIP
> > > page
> > > > > and start voting process.
> > > > >
> > > > > For more details, please refer to the design doc (it is slightly
> > > changed
> > > > > since the initial proposal).
> > > > >
> > > > > Thanks,
> > > > > Jark
> > > > >
> > > > > [1]:
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsm

[jira] [Created] (FLINK-14179) Wrong description of SqlCommand.SHOW_FUNCTIONS

2019-09-23 Thread Canbin Zheng (Jira)
Canbin Zheng created FLINK-14179:


 Summary: Wrong description of SqlCommand.SHOW_FUNCTIONS
 Key: FLINK-14179
 URL: https://issues.apache.org/jira/browse/FLINK-14179
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.9.0
Reporter: Canbin Zheng
 Fix For: 1.10.0
 Attachments: image-2019-09-24-10-59-26-286.png

Currently '*SHOW FUNCTIONS*' lists not only user-defined functions but also 
system-defined ones. The description *'Shows all registered 
user-defined functions.'* does not correctly describe this functionality. I think we 
can change the description to '*Shows all system-defined and user-defined 
functions.*'

 

!image-2019-09-24-10-59-26-286.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Confluence permission for FLIP creation

2019-09-23 Thread Danny Chan
Hi all,

As communicated in an email thread, I'm proposing to support computed columns in 
Flink SQL. I have a draft design doc that I'd like to convert to a FLIP. 
Thus, it would be great if anyone could grant me write access to 
Confluence. My Confluence ID is danny0405.

It would be nice if any of you can help on this.

Best,
Danny Chan


Re: [VOTE] FLIP-63: Rework table partition support

2019-09-23 Thread Kurt Young
Looks like the wiki is not aligned with the latest Google doc. Could
you update it first?

Best,
Kurt


On Tue, Sep 24, 2019 at 10:19 AM Jingsong Lee 
wrote:

> Hi Flink devs, after another round of discussion.
>
> I would like to re-start the voting for FLIP-63
> Rework table partition support.
>
> FLIP wiki:
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
>
> Discussion thread:
> <
> https://lists.apache.org/thread.html/65078bad6e047578d502e1e5d92026f13fd9648725f5b74ed330@%3Cdev.flink.apache.org%3E
> >
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-63-Rework-table-partition-support-td32770.html
>
> Google Doc:
> <
> https://docs.google.com/document/d/1yFDyquMo_-VZ59vyhaMshpPtg7p87b9IYdAtMXv5XmM/edit?usp=sharing
> >
>
> https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=sharing
>
> Thanks,
>
> Best,
> Jingsong Lee
>


Re: [VOTE] FLIP-63: Rework table partition support

2019-09-23 Thread Jingsong Lee
Thank you for your reminder.
Updated.

Best,
Jingsong Lee

On Tue, Sep 24, 2019 at 11:36 AM Kurt Young  wrote:

> Looks like the wiki is not aligned with the latest Google doc, could
> you update it first?
>
> Best,
> Kurt
>
>
> On Tue, Sep 24, 2019 at 10:19 AM Jingsong Lee 
> wrote:
>
> > Hi Flink devs, after another round of discussion.
> >
> > I would like to re-start the voting for FLIP-63
> > Rework table partition support.
> >
> > FLIP wiki:
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> >
> > Discussion thread:
> > <
> >
> https://lists.apache.org/thread.html/65078bad6e047578d502e1e5d92026f13fd9648725f5b74ed330@%3Cdev.flink.apache.org%3E
> > >
> >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-63-Rework-table-partition-support-td32770.html
> >
> > Google Doc:
> > <
> >
> https://docs.google.com/document/d/1yFDyquMo_-VZ59vyhaMshpPtg7p87b9IYdAtMXv5XmM/edit?usp=sharing
> > >
> >
> >
> https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=sharing
> >
> > Thanks,
> >
> > Best,
> > Jingsong Lee
> >
>


-- 
Best, Jingsong Lee


[jira] [Created] (FLINK-14180) Enable config of maximum capacity of FileArchivedExecutionGraphStore.

2019-09-23 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-14180:
---

 Summary: Enable config of maximum capacity of 
FileArchivedExecutionGraphStore.
 Key: FLINK-14180
 URL: https://issues.apache.org/jira/browse/FLINK-14180
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Reporter: Yingjie Cao
 Fix For: 1.10.0


Currently, a Flink session cluster uses FileArchivedExecutionGraphStore to keep 
finished jobs for historical requests. The FileArchivedExecutionGraphStore purges 
archived ExecutionGraphs only based on an expiration time. In a session cluster 
that runs many batch jobs, it is hard to configure jobstore.expiration-time: if 
it is too short, the historical information may already have been deleted when 
the user wants to check it; if it is too long, the web front end may respond 
very slowly once the number of finished jobs grows too large. We should add a 
new config option for the maximum capacity of the 
FileArchivedExecutionGraphStore, which Guava Cache supports well. Then we can 
set the expiration time to a relatively long value and set the maximum capacity 
to a value that keeps the web UI responsive.
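
As a minimal sketch of the idea (not Flink's actual implementation; the key
type and values are illustrative only), Guava's CacheBuilder can combine
time-based expiry with a size cap:

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

public class BoundedJobStoreSketch {
    public static void main(String[] args) {
        long expirationSeconds = 3600; // illustrative, not a Flink default
        long maximumCapacity = 1000;   // the proposed new upper bound

        // Entries are evicted when they expire or when the size cap is
        // reached, whichever happens first.
        Cache<String, String> jobDetailsCache = CacheBuilder.newBuilder()
                .expireAfterWrite(expirationSeconds, TimeUnit.SECONDS)
                .maximumSize(maximumCapacity)
                .build();

        jobDetailsCache.put("jobId-1", "path/to/archived/execution-graph");
        System.out.println(jobDetailsCache.getIfPresent("jobId-1"));
    }
}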



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Releasing Flink 1.9.1

2019-09-23 Thread Dian Fu
+1 for the 1.9.1 release and Jark being the RM. Thanks Jark for kicking off this 
release and for volunteering. 

Regards,
Dian

> On Sep 24, 2019, at 10:45 AM, Kurt Young  wrote:
> 
> +1 for the 1.9.1 release and for Jark being the RM.
> Thanks Jark for volunteering.
> 
> Best,
> Kurt
> 
> 
> On Mon, Sep 23, 2019 at 9:17 PM Till Rohrmann  wrote:
> 
>> +1 for the 1.9.1 release and for Jark being the RM. I'll help with the
>> review of FLINK-14010.
>> 
>> Cheers,
>> Till
>> 
>> On Mon, Sep 23, 2019 at 8:32 AM Debasish Ghosh 
>> wrote:
>> 
>>> I hope https://issues.apache.org/jira/browse/FLINK-12501 will also be
>> part
>>> of 1.9.1 ..
>>> 
>>> regards.
>>> 
>>> On Mon, Sep 23, 2019 at 11:39 AM Jeff Zhang  wrote:
>>> 
 FLINK-13708 is also very critical IMO. This would cause invalid flink
>> job
 (doubled output)
 
 https://issues.apache.org/jira/browse/FLINK-13708
 
 Jark Wu  于2019年9月23日周一 下午2:03写道:
 
> Hi everyone,
> 
> It has already been a month since we released Flink 1.9.0.
> We already have many important bug fixes from which our users can
>>> benefit
> in the release-1.9 branch (83 resolved issues).
> Therefore, I propose to create the next bug fix release for Flink
>> 1.9.
> 
> Most notable fixes are:
> 
> - [FLINK-13526] When switching to a non-existing catalog or database
>> in
 the
> SQL Client the client crashes.
> - [FLINK-13568] It is not possible to create a table with a "STRING"
>>> data
> type via the SQL DDL.
> - [FLINK-13941] Prevent data-loss by not cleaning up small part files
 from
> S3.
> - [FLINK-13490][jdbc] If one column value is null when reading JDBC,
>>> the
> following values will all be null.
> - [FLINK-14107][kinesis] When using event time alignment with the
 Kinesis
> Consumer the consumer might deadlock in one corner case.
> 
> Furthermore, I would like the following critical issues to be merged
 before
> 1.9.1 release:
> 
> - [FLINK-14118] Reduce the unnecessary flushing when there is no data
> available for flush which can save 20% ~ 40% CPU. (reviewing)
> - [FLINK-13386] A couple of issues with the new dashboard have
 already
> been filed. (PR is created, need review)
> - [FLINK-14010][yarn] The Flink YARN cluster can get into an
>>> inconsistent
> state in some cases, where
> leadership for JobManager, ResourceManager and Dispatcher components
>>> is
> split between two master processes. (PR is created, need review)
> 
> I would volunteer as release manager and kick off the release process
 once
> blocker issues has been merged. What do you think?
> 
> If there is any other blocker issues need to be fixed in 1.9.1,
>> please
 let
> me know.
> 
> Cheers,
> Jark
> 
 
 
 --
 Best Regards
 
 Jeff Zhang
 
>>> 
>>> 
>>> --
>>> Debasish Ghosh
>>> http://manning.com/ghosh2
>>> http://manning.com/ghosh
>>> 
>>> Twttr: @debasishg
>>> Blog: http://debasishg.blogspot.com
>>> Code: http://github.com/debasishg
>>> 
>> 



[jira] [Created] (FLINK-14181) TableEnvironment explain may cause StreamGraph to be inconsistent

2019-09-23 Thread jianlong miao (Jira)
jianlong miao created FLINK-14181:
-

 Summary: TableEnvironment explain may cause StreamGraph to be 
inconsistent
 Key: FLINK-14181
 URL: https://issues.apache.org/jira/browse/FLINK-14181
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Reporter: jianlong miao


Currently, TableEnvironment.explain() uses the StreamExecutionEnvironment to 
generate a StreamGraph, which calls execEnv::addOperator to register the 
translated transformations. 
If further operations are applied to the same table after the explain() call 
and it is written to a sink again, the transformations are added a second 
time, leading to duplicated job nodes in the resulting job graph.
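
A minimal sketch of the reported scenario, assuming a trivial in-memory source
(the exact calls that trigger the duplication may differ; this only illustrates
the explain-then-sink pattern):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class ExplainDuplicationSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        Table t = tEnv.fromDataStream(env.fromElements(1, 2, 3), "v");

        // explain() internally generates a StreamGraph; per this report, the
        // translated transformations stay registered on the execution env.
        System.out.println(tEnv.explain(t));

        // Writing the same table to a sink afterwards adds the transformations
        // again, so the final job graph contains duplicated nodes.
        tEnv.toAppendStream(t, Row.class).print();
        env.execute("explain-duplication-sketch");
    }
}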



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Releasing Flink 1.9.1

2019-09-23 Thread Jark Wu
Thanks Till for reviewing FLINK-14010.

Hi Jeff, I think it makes sense to merge FLINK-13708 before the release (PR
has been reviewed).

Hi Debasish, FLINK-12501 has already been merged in 1.10.0. I'm fine to
cherry-pick it to 1.9 if we
have a consensus this issue could be viewed as a bug. We can continue the
discussion in the JIRA.

Best,
Jark


On Tue, 24 Sep 2019 at 13:39, Dian Fu  wrote:

> +1 for the 1.9.1 release and Jark being the RM. Thanks Jark for kicking off
> this release and for volunteering.
>
> Regards,
> Dian
>
> > On Sep 24, 2019, at 10:45 AM, Kurt Young  wrote:
> >
> > +1 for the 1.9.1 release and for Jark being the RM.
> > Thanks Jark for volunteering.
> >
> > Best,
> > Kurt
> >
> >
> > On Mon, Sep 23, 2019 at 9:17 PM Till Rohrmann 
> wrote:
> >
> >> +1 for the 1.9.1 release and for Jark being the RM. I'll help with the
> >> review of FLINK-14010.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Mon, Sep 23, 2019 at 8:32 AM Debasish Ghosh <
> ghosh.debas...@gmail.com>
> >> wrote:
> >>
> >>> I hope https://issues.apache.org/jira/browse/FLINK-12501 will also be
> >> part
> >>> of 1.9.1 ..
> >>>
> >>> regards.
> >>>
> >>> On Mon, Sep 23, 2019 at 11:39 AM Jeff Zhang  wrote:
> >>>
>  FLINK-13708 is also very critical IMO. This would cause invalid flink
> >> job
>  (doubled output)
> 
>  https://issues.apache.org/jira/browse/FLINK-13708
> 
>  Jark Wu  于2019年9月23日周一 下午2:03写道:
> 
> > Hi everyone,
> >
> > It has already been a month since we released Flink 1.9.0.
> > We already have many important bug fixes from which our users can
> >>> benefit
> > in the release-1.9 branch (83 resolved issues).
> > Therefore, I propose to create the next bug fix release for Flink
> >> 1.9.
> >
> > Most notable fixes are:
> >
> > - [FLINK-13526] When switching to a non-existing catalog or database
> >> in
>  the
> > SQL Client the client crashes.
> > - [FLINK-13568] It is not possible to create a table with a "STRING"
> >>> data
> > type via the SQL DDL.
> > - [FLINK-13941] Prevent data-loss by not cleaning up small part files
>  from
> > S3.
> > - [FLINK-13490][jdbc] If one column value is null when reading JDBC,
> >>> the
> > following values will all be null.
> > - [FLINK-14107][kinesis] When using event time alignment with the
>  Kinesis
> > Consumer the consumer might deadlock in one corner case.
> >
> > Furthermore, I would like the following critical issues to be merged
>  before
> > 1.9.1 release:
> >
> > - [FLINK-14118] Reduce the unnecessary flushing when there is no data
> > available for flush which can save 20% ~ 40% CPU. (reviewing)
> > - [FLINK-13386] A couple of issues with the new dashboard have
>  already
> > been filed. (PR is created, need review)
> > - [FLINK-14010][yarn] The Flink YARN cluster can get into an
> >>> inconsistent
> > state in some cases, where
> > leadership for JobManager, ResourceManager and Dispatcher components
> >>> is
> > split between two master processes. (PR is created, need review)
> >
> > I would volunteer as release manager and kick off the release process
>  once
> > blocker issues has been merged. What do you think?
> >
> > If there is any other blocker issues need to be fixed in 1.9.1,
> >> please
>  let
> > me know.
> >
> > Cheers,
> > Jark
> >
> 
> 
>  --
>  Best Regards
> 
>  Jeff Zhang
> 
> >>>
> >>>
> >>> --
> >>> Debasish Ghosh
> >>> http://manning.com/ghosh2
> >>> http://manning.com/ghosh
> >>>
> >>> Twttr: @debasishg
> >>> Blog: http://debasishg.blogspot.com
> >>> Code: http://github.com/debasishg
> >>>
> >>
>
>


Re: [DISCUSS] Releasing Flink 1.9.1

2019-09-23 Thread Terry Wang
+1 for the 1.9.1 release and for Jark being the RM.
Thanks Jark for driving on this.

Best,
Terry Wang



> On Sep 24, 2019, at 2:19 PM, Jark Wu  wrote:
> 
> Thanks Till for reviewing FLINK-14010.
> 
> Hi Jeff, I think it makes sense to merge FLINK-13708 before the release (PR
> has been reviewed).
> 
> Hi Debasish, FLINK-12501 has already been merged in 1.10.0. I'm fine to
> cherry-pick it to 1.9 if we
> have a consensus this issue could be viewed as a bug. We can continue the
> discussion in the JIRA.
> 
> Best,
> Jark
> 
> 
> On Tue, 24 Sep 2019 at 13:39, Dian Fu  wrote:
> 
>> +1 for the 1.9.1 release and Jark being the RM. Thanks Jark for kicking off
>> this release and for volunteering.
>> 
>> Regards,
>> Dian
>> 
>>> On Sep 24, 2019, at 10:45 AM, Kurt Young  wrote:
>>> 
>>> +1 for the 1.9.1 release and for Jark being the RM.
>>> Thanks Jark for volunteering.
>>> 
>>> Best,
>>> Kurt
>>> 
>>> 
>>> On Mon, Sep 23, 2019 at 9:17 PM Till Rohrmann 
>> wrote:
>>> 
 +1 for the 1.9.1 release and for Jark being the RM. I'll help with the
 review of FLINK-14010.
 
 Cheers,
 Till
 
 On Mon, Sep 23, 2019 at 8:32 AM Debasish Ghosh <
>> ghosh.debas...@gmail.com>
 wrote:
 
> I hope https://issues.apache.org/jira/browse/FLINK-12501 will also be
 part
> of 1.9.1 ..
> 
> regards.
> 
> On Mon, Sep 23, 2019 at 11:39 AM Jeff Zhang  wrote:
> 
>> FLINK-13708 is also very critical IMO. This would cause invalid flink
 job
>> (doubled output)
>> 
>> https://issues.apache.org/jira/browse/FLINK-13708
>> 
>> Jark Wu  于2019年9月23日周一 下午2:03写道:
>> 
>>> Hi everyone,
>>> 
>>> It has already been a month since we released Flink 1.9.0.
>>> We already have many important bug fixes from which our users can
> benefit
>>> in the release-1.9 branch (83 resolved issues).
>>> Therefore, I propose to create the next bug fix release for Flink
 1.9.
>>> 
>>> Most notable fixes are:
>>> 
>>> - [FLINK-13526] When switching to a non-existing catalog or database
 in
>> the
>>> SQL Client the client crashes.
>>> - [FLINK-13568] It is not possible to create a table with a "STRING"
> data
>>> type via the SQL DDL.
>>> - [FLINK-13941] Prevent data-loss by not cleaning up small part files
>> from
>>> S3.
>>> - [FLINK-13490][jdbc] If one column value is null when reading JDBC,
> the
>>> following values will all be null.
>>> - [FLINK-14107][kinesis] When using event time alignment with the
>> Kinesis
>>> Consumer the consumer might deadlock in one corner case.
>>> 
>>> Furthermore, I would like the following critical issues to be merged
>> before
>>> 1.9.1 release:
>>> 
>>> - [FLINK-14118] Reduce the unnecessary flushing when there is no data
>>> available for flush which can save 20% ~ 40% CPU. (reviewing)
>>> - [FLINK-13386] A couple of issues with the new dashboard have
>> already
>>> been filed. (PR is created, need review)
>>> - [FLINK-14010][yarn] The Flink YARN cluster can get into an
> inconsistent
>>> state in some cases, where
>>> leadership for JobManager, ResourceManager and Dispatcher components
> is
>>> split between two master processes. (PR is created, need review)
>>> 
>>> I would volunteer as release manager and kick off the release process
>> once
>>> blocker issues has been merged. What do you think?
>>> 
>>> If there is any other blocker issues need to be fixed in 1.9.1,
 please
>> let
>>> me know.
>>> 
>>> Cheers,
>>> Jark
>>> 
>> 
>> 
>> --
>> Best Regards
>> 
>> Jeff Zhang
>> 
> 
> 
> --
> Debasish Ghosh
> http://manning.com/ghosh2
> http://manning.com/ghosh
> 
> Twttr: @debasishg
> Blog: http://debasishg.blogspot.com
> Code: http://github.com/debasishg
> 
 
>> 
>>