Re: [ANNOUNCE] Rong Rong becomes a Flink committer

2019-07-12 Thread Biao Liu
Congrats, Rong!


Hequn Cheng wrote on Fri, Jul 12, 2019 at 1:09 PM:

> Congratulations Rong!
>
> Best, Hequn
>
> On Fri, Jul 12, 2019 at 12:19 PM Jeff Zhang  wrote:
>
>> Congrats, Rong!
>>
>>
>> vino yang wrote on Fri, Jul 12, 2019 at 10:08 AM:
>>
>>> congratulations Rong Rong!
>>>
>>> Fabian Hueske wrote on Thu, Jul 11, 2019 at 10:25 PM:
>>>
 Hi everyone,

 I'm very happy to announce that Rong Rong accepted the offer of the
 Flink PMC to become a committer of the Flink project.

 Rong has been contributing to Flink for many years, mainly working on
 SQL and Yarn security features. He's also frequently helping out on the
 user@f.a.o mailing lists.

 Congratulations Rong!

 Best, Fabian
 (on behalf of the Flink PMC)

>>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>


Re: [DISCUSS] Drop stale class Program

2019-07-19 Thread Biao Liu
Hi Zili,

Thank you for bringing up this discussion.

My gut feeling is +1 for dropping it.
Usually it takes some time to deprecate a public (actually it's
`PublicEvolving`) API. Ideally it should be marked as `Deprecated` first,
then removed in some later version.

I'm not sure how big the burden is to make it compatible with the enhanced
client API. If it's a critical blocker, I support dropping it radically in
1.10. Of course a survey is necessary, and the result of the survey should
be acceptable.
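As a side note, a minimal, self-contained sketch of the two-step deprecation path mentioned above, applied to the `Program` interface quoted further down in this thread. `Plan` here is a stand-in for the Flink-internal class, not the real `o.a.f.api.common.Plan`:

```java
// Stand-in for the Flink-internal Plan class (assumption, for illustration only).
class Plan {}

// Step 1 of the deprecation path: mark the stale interface as @Deprecated
// for at least one release cycle before removing it.
@Deprecated
interface Program {
    Plan getPlan(String... args);
}

public class DeprecationSketch {
    public static void main(String[] args) {
        // Implementing the deprecated interface still compiles (with a
        // deprecation warning), so users get time to migrate before removal.
        Program legacy = params -> new Plan();
        System.out.println(legacy.getPlan().getClass().getSimpleName());
    }
}
```

The point of the intermediate `@Deprecated` step is that existing user code keeps compiling while signalling the upcoming removal.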



Zili Chen wrote on Fri, Jul 19, 2019 at 1:44 PM:

> Hi devs,
>
> Participating in the thread "Flink client api enhancement"[1], I just
> noticed that inside the submission codepath of Flink we always have a
> branch dealing with the case that the main class of a user program is a
> subclass of o.a.f.api.common.Program, which is defined as
>
> @PublicEvolving
> public interface Program {
>   Plan getPlan(String... args);
> }
>
> This class, as a user-facing interface, asks users to implement #getPlan
> which returns an almost Flink-internal class. FLINK-10862[2] pointed out
> this confusion.
>
> Since our codebase contains quite a bit of code handling this stale class,
> and it also obstructs the effort of enhancing the Flink client API,
> I'd like to propose dropping it. Or we can start a survey on the user list
> to see if any user depends on this class.
>
> best,
> tison.
>
> [1]
>
> https://lists.apache.org/thread.html/ce99cba4a10b9dc40eb729d39910f315ae41d80ec74f09a356c73938@%3Cdev.flink.apache.org%3E
> [2] https://issues.apache.org/jira/browse/FLINK-10862
>


Re: [DISCUSS] Drop stale class Program

2019-07-19 Thread Biao Liu
To Flavio: good point about the integration suggestion.

I think it should be considered in the "Flink client api enhancement"
discussion. But the outdated API should be deprecated somehow.
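For reference, here is a generified, self-contained version of the job-description interface Flavio sketches below. The mailing-list archive stripped the generic type parameters, so `List<ClusterJobParameter>` and `Set<String>` are assumptions, and the demo frontend usage is illustrative only:

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Reconstructed with assumed generics; not the original code verbatim.
class ClusterJobParameter {
    String paramName;
    String paramType = "string";
    String paramDesc;
    String paramDefaultValue;
    Set<String> choices;
    boolean mandatory;
}

interface FlinkJob {
    String getDescription();
    List<ClusterJobParameter> getParameters();
    boolean isStreamingOrBatch();
}

public class JobDescriptorDemo {
    public static void main(String[] args) {
        FlinkJob job = new FlinkJob() {
            public String getDescription() { return "Hypothetical word-count job"; }
            public List<ClusterJobParameter> getParameters() {
                ClusterJobParameter topic = new ClusterJobParameter();
                topic.paramName = "topic";
                topic.paramDesc = "Message queue topic to read from";
                topic.mandatory = true;
                return Collections.singletonList(topic);
            }
            public boolean isStreamingOrBatch() { return true; }
        };
        // A frontend (e.g. fed by the REST services) could render this
        // metadata before launching the job.
        System.out.println(job.getDescription()
                + " / params: " + job.getParameters().size());
    }
}
```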

Flavio Pompermaier wrote on Fri, Jul 19, 2019 at 4:21 PM:

> In my experience a basic "official" (but optional) program description
> would be very useful indeed (in order to ease the integration with other
> frameworks).
>
> Of course it should be extended and integrated with the REST services and
> the Web UI (when defined) in order to be useful.
> It makes it easy to show the user what a job does and which parameters it
> requires (optional or mandatory), with a proper help description.
> Indeed, when we write a Flink job we implement the following interface:
>
> public interface FlinkJob {
>   String getDescription();
>   List<ClusterJobParameter> getParameters();
>   boolean isStreamingOrBatch();
> }
>
> public class ClusterJobParameter {
>
>   private String paramName;
>   private String paramType = "string";
>   private String paramDesc;
>   private String paramDefaultValue;
>   private Set<String> choices;
>   private boolean mandatory;
> }
>
> This really helps to launch a Flink job from a frontend (if the REST
> services return that information).
>
> Best,
> Flavio
>
> On Fri, Jul 19, 2019 at 9:57 AM Biao Liu  wrote:
>
> > Hi Zili,
> >
> > Thank you for bringing up this discussion.
> >
> > My gut feeling is +1 for dropping it.
> > Usually it takes some time to deprecate a public (actually it's
> > `PublicEvolving`) API. Ideally it should be marked as `Deprecated` first,
> > then removed in some later version.
> >
> > I'm not sure how big the burden is to make it compatible with the
> > enhanced client API. If it's a critical blocker, I support dropping it
> > radically in 1.10. Of course a survey is necessary, and the result of
> > the survey should be acceptable.
> >
> >
> >
> > Zili Chen wrote on Fri, Jul 19, 2019 at 1:44 PM:
> >
> > > Hi devs,
> > >
> > > Participating in the thread "Flink client api enhancement"[1], I just
> > > noticed that inside the submission codepath of Flink we always have a
> > > branch dealing with the case that the main class of a user program is
> > > a subclass of o.a.f.api.common.Program, which is defined as
> > >
> > > @PublicEvolving
> > > public interface Program {
> > >   Plan getPlan(String... args);
> > > }
> > >
> > > This class, as a user-facing interface, asks users to implement
> > > #getPlan which returns an almost Flink-internal class. FLINK-10862[2]
> > > pointed out this confusion.
> > >
> > > Since our codebase contains quite a bit of code handling this stale
> > > class, and it also obstructs the effort of enhancing the Flink client
> > > API, I'd like to propose dropping it. Or we can start a survey on the
> > > user list to see if any user depends on this class.
> > >
> > > best,
> > > tison.
> > >
> > > [1]
> > >
> > >
> >
> https://lists.apache.org/thread.html/ce99cba4a10b9dc40eb729d39910f315ae41d80ec74f09a356c73938@%3Cdev.flink.apache.org%3E
> > > [2] https://issues.apache.org/jira/browse/FLINK-10862
> > >
> >
>


Re: Request for permission as a contributor

2019-07-19 Thread Biao Liu
Hi,

There is no need for contribution permission anymore, see [1].

1.
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-JIRA-permissions-changed-Only-committers-can-assign-somebody-to-a-ticket-td30695.html

蒋正贵 wrote on Fri, Jul 19, 2019 at 5:21 PM:

> Hi Guys,
>
> I want to contribute to Apache Flink. Would you please give me the
> permission as a contributor?
> My JIRA ID is zhenggui.
>
>
> best regards!


Re: [DISCUSS] Setup a bui...@flink.apache.org mailing list for travis builds

2019-07-22 Thread Biao Liu
+1, makes sense to me.
The mailing list seems to be a more "community" way.

Timo Walther wrote on Mon, Jul 22, 2019 at 4:06 PM:

> +1 sounds good to inform people about instabilities or other issues
>
> Regards,
> Timo
>
>
> On 22.07.19 at 09:58, Haibo Sun wrote:
> > +1. Sounds good. Letting the PR creators know the build results of the
> > master branch can help to determine more quickly whether Travis failures
> > of their PR are an actual failure or due to the instability of a test
> > case. By the way, if PR creators could abort their own Travis builds, I
> > think it would improve the efficient use of Travis resources (of course,
> > this discussion does not deal with that issue).
> >
> >
> > Best,
> > Haibo
> > At 2019-07-22 12:36:31, "Yun Tang"  wrote:
> >> +1 sounds good, more people could be encouraged to get involved in
> >> paying attention to failing commits.
> >>
> >> Best
> >> Yun Tang
> >> 
> >> From: Becket Qin 
> >> Sent: Monday, July 22, 2019 9:44
> >> To: dev 
> >> Subject: Re: [DISCUSS] Setup a bui...@flink.apache.org mailing list
> for travis builds
> >>
> >> +1. Sounds a good idea to me.
> >>
> >> On Sat, Jul 20, 2019 at 7:07 PM Dian Fu  wrote:
> >>
> >>> Thanks Jark for the proposal, sounds reasonable to me. +1. This ML
> >>> could be used for all the build notifications including master and
> >>> CRON jobs.
> >>>
>  On Jul 20, 2019, at 2:55 PM, Xu Forward wrote:
> 
>  +1 ,Thanks jark for the proposal.
>  Best
>  Forward
> 
> Jark Wu wrote on Sat, Jul 20, 2019 at 12:14 PM:
> 
> > Hi all,
> >
> > As far as I know, currently, email notifications of Travis builds for
> > the master branch are sent to the commit author when a build was just
> > broken or still is broken. And there are no email notifications for
> > CRON builds.
> >
> > Recently, we have been suffering from compile errors for scala-2.12 and
> > java-9, which are only run in CRON jobs. So I'm figuring out a way to
> > get notifications of CRON builds (or all builds) to quickly fix compile
> > errors and failed cron tests.
> >
> > After reaching out to @Chesnay Schepler  (thanks
> >>> for
> > the helping), I know that we are using a Slack channel to receive all
> > failed build notifications. However, IMO, email notification might be a
> > better way than a Slack channel to encourage more people to pay
> > attention to the builds.
> >
> > So I'm here to propose setting up a bui...@flink.apache.org mailing
> > list for receiving build notifications. I also found that Beam has such
> > a mailing list too[1]. After we have such a mailing list, we can
> > integrate it with Travis according to the Travis doc[2].
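For illustration, a hypothetical `.travis.yml` fragment based on the Travis notification doc linked above ([2]); the exact keys and values should be verified against that doc before use:

```yaml
# Hypothetical .travis.yml fragment (an assumption, not Flink's actual config)
notifications:
  email:
    recipients:
      - builds@flink.apache.org
    on_success: change   # mail only when the build status changes
    on_failure: always   # mail on every failed build, including CRON jobs
```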
> >
> > What do you think? Do we need a formal vote for this?
> >
> > Best and thanks,
> > Jark
> >
> > [1]: https://beam.apache.org/community/contact-us/
> > [2]:
> >
> >
> >>>
> https://docs.travis-ci.com/user/notifications/#configuring-email-notifications
> > <
> >
> >>>
> https://docs.travis-ci.com/user/notifications/#configuring-email-notifications
> > <
> >
> >>>
> https://docs.travis-ci.com/user/notifications/#configuring-email-notifications
> >>>
>
>


Re: [ANNOUNCE] Zhijiang Wang has been added as a committer to the Flink project

2019-07-22 Thread Biao Liu
Congrats Zhijiang. Well deserved!

SHI Xiaogang wrote on Tue, Jul 23, 2019 at 08:35:

> Congratulations Zhijiang!
>
> Regards,
> Xiaogang
>
> Guowei Ma wrote on Tue, Jul 23, 2019 at 8:08 AM:
>
> > Congratulations Zhijiang
> >
> > Sent from my iPhone
> >
> > > On Jul 23, 2019, at 12:55 AM, Xuefu Z wrote:
> > >
> > > Congratulations, Zhijiang!
> > >
> > >> On Mon, Jul 22, 2019 at 7:42 AM Bo WANG 
> > wrote:
> > >>
> > >> Congratulations Zhijiang!
> > >>
> > >>
> > >> Best,
> > >>
> > >> Bo WANG
> > >>
> > >>
> > >> On Mon, Jul 22, 2019 at 10:12 PM Robert Metzger 
> > >> wrote:
> > >>
> > >>> Hey all,
> > >>>
> > >>> We've added another committer to the Flink project: Zhijiang Wang.
> > >>>
> > >>> Congratulations Zhijiang!
> > >>>
> > >>> Best,
> > >>> Robert
> > >>> (on behalf of the Flink PMC)
> > >>>
> > >>
> > >
> > >
> > > --
> > > Xuefu Zhang
> > >
> > > "In Honey We Trust!"
> >
>


Re: [ANNOUNCE] Kete Young is now part of the Flink PMC

2019-07-23 Thread Biao Liu
Congrats Kurt!
Well deserved!

Danny Chan wrote on Tue, Jul 23, 2019 at 6:01 PM:

> Congratulations Kurt, Well deserved.
>
> Best,
> Danny Chan
> On Jul 23, 2019 at 5:24 PM +0800, dev@flink.apache.org wrote:
> >
> > Congratulations Kurt, Well deserved.
>


Re: [DISCUSS] Allow at-most-once delivery in case of failures

2019-07-23 Thread Biao Liu
Hi Stephan & Xiaogang,

It's great to see this discussion active again!

It makes sense to me to do some private optimization and trials through a
plugin. I understand that the community cannot satisfy everyone and every
requirement due to limited resources. The pluggable strategy is a good way
to compromise. In that way, it might also help improve the pluggable
strategy itself, since there might be some reasonable requirements coming
from the plugins.

Regarding the "at-most-once" or "best-effort" semantics, I think it is
worth going further since we have heard these requirements several times.
However, I think we need more investigation into implementing it based on
the pluggable shuffle service and scheduler (or some more components?). There
might be a public discussion when we are ready. I hope it happens soon.
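As a concrete illustration of the "at-most-once" source idea raised later in this thread (disable checkpointing and start sources from the latest records), here is a hedged sketch. The property keys are standard Kafka consumer settings; the Flink wiring itself is omitted and purely illustrative, and the broker address is a placeholder:

```java
import java.util.Properties;

public class AtMostOnceSourceConfig {
    public static Properties kafkaProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092"); // placeholder address
        props.setProperty("group.id", "at-most-once-job");
        // Only read records posted after the consumer (re)starts:
        props.setProperty("auto.offset.reset", "latest");
        // With Flink checkpointing disabled, auto-committed offsets mean
        // records consumed before a failure are never replayed -> at-most-once.
        props.setProperty("enable.auto.commit", "true");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(kafkaProps().getProperty("auto.offset.reset"));
    }
}
```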


On Wed, Jul 24, 2019 at 9:43 AM SHI Xiaogang  wrote:

> Hi Stephan,
>
> I agree with you that the implementation of "at-most-once" or
> "best-effort" recovery will benefit from the pluggable shuffle service
> and pluggable scheduler. Actually we made some attempts in our private
> repository and it turns out that it requires quite a lot of work to
> implement this with the existing network stack. We can start the work on
> this when the pluggable shuffle service and pluggable scheduler are ready.
>
> The suggestion of external implementation is a very good idea. That way, we
> can implement both "at-most-once" and "best-effort" guarantees as different
> checkpoint/failover strategies. If so, i think we should focus on the
> components that are changed in different strategies. These components may
> include a pluggable checkpoint barrier handler and a pluggable failover
> strategy. We can list these components and discuss implementation details
> then.
>
> What do you think, Biao Liu and Zhu Zhu?
>
> Regards,
> Xiaogang
>
>
> Stephan Ewen wrote on Wed, Jul 24, 2019 at 1:31 AM:
>
> > Hi all!
> >
> > This is an interesting discussion for sure.
> >
> > Concerning user requests for changes to the modes, I also hear the
> > following quite often:
> >   - reduce the expensiveness of checkpoint alignment (unaligned
> > checkpoints) to make checkpoints fast/stable under high backpressure
> >   - more fine-grained failover while maintaining exactly-once (even if
> > costly)
> >
> > Having also "at most once" to the mix is quite a long list of big changes
> > to the system.
> >
> > My feeling is that on such a core system, the community can not push all
> > these efforts at the same time, especially because they touch overlapping
> > areas of the system and need the same committers involved.
> >
> > On the other hand, the pluggable shuffle service and pluggable scheduler
> > could make it possible to have an external implementation of that.
> >   - of a network stack that supports "reconnects" of failed tasks with
> > continuing tasks
> >   - a scheduling strategy that restarts tasks individually even in
> > pipelined regions
> >
> > I think contributors/committers could implements this separate from the
> > Flink core. The feature would be trial-run it through the community
> > packages. If it gains a lot of traction, the community could decide to
> put
> > in the effort to merge this into the core.
> >
> > Best,
> > Stephan
> >
> >
> > On Tue, Jun 11, 2019 at 2:10 PM SHI Xiaogang 
> > wrote:
> >
> > > Hi All,
> > >
> > > It definitely requires a massive effort to allow at-most-once delivery
> > > in Flink. But as the feature is urgently demanded by many Flink users,
> > > I think every effort we make is worthwhile. Actually, the inability to
> > > support at-most-once delivery has become a major obstacle for Storm
> > > users to turn to Flink. It's undesirable for us to run different stream
> > > processing systems for different scenarios.
> > >
> > > I agree with Zhu Zhu that the guarantee we provide is the very first
> > > thing to be discussed. Recovering with checkpoints will lead to
> > > duplicated records, thus breaking the guarantee of at-most-once delivery.
> > >
> > > A method to achieve the at-most-once guarantee is to completely disable
> > > checkpointing and let sources only read those records posted after they
> > > start. The method requires sources to allow the configuration to read
> > > the latest records, which luckily is supported by many message queues
> > > including Kafka. As Flink relies on sources' ability to roll back to

Re: [DISCUSS] FLIP-27: Refactor Source Interface

2019-07-25 Thread Biao Liu
Hi devs,

Since 1.9 is nearly released, I think we could get back to FLIP-27. I
believe it should be included in 1.10.

There are so many things mentioned in the document of FLIP-27 [1] that I
think we'd better discuss them separately. However, the wiki is not a good
place to discuss. I wrote a Google doc about the SplitReader API which
fills in some details missing from the document. [2]

1.
https://cwiki.apache.org/confluence/display/FLINK/FLIP-27:+Refactor+Source+Interface
2.
https://docs.google.com/document/d/1R1s_89T4S3CZwq7Tf31DciaMCqZwrLHGZFqPASu66oE/edit?usp=sharing
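As background, a rough, hypothetical sketch of the enumerator/reader separation that FLIP-27 proposes. The names and signatures below are illustrative assumptions, NOT the actual interfaces from the FLIP or the Google doc:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy split descriptor (assumption, for illustration only).
class Split {
    final String id;
    Split(String id) { this.id = id; }
}

// Discovers and hands out splits; could run on the JobManager or in a task.
interface SplitEnumerator {
    List<Split> discoverSplits();
}

// Reads records from one assigned split; runs in parallel source tasks.
interface SplitReader {
    List<String> read(Split split);
}

public class Flip27Sketch {
    public static void main(String[] args) {
        SplitEnumerator enumerator =
                () -> Arrays.asList(new Split("p-0"), new Split("p-1"));
        SplitReader reader =
                split -> Collections.singletonList("record-from-" + split.id);

        int records = 0;
        for (Split s : enumerator.discoverSplits()) {
            records += reader.read(s).size();
        }
        System.out.println(records); // one record per split in this toy example
    }
}
```

The key design point under discussion is exactly this separation: split discovery/assignment (enumerator) is decoupled from split consumption (reader), so they can be placed and scaled independently.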

CC Stephan, Aljoscha, Piotrek, Becket


On Thu, Mar 28, 2019 at 4:38 PM Biao Liu  wrote:

> Hi Steven,
> Thank you for the feedback. Please take a look at the document FLIP-27
> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface>
>  which
> is updated recently. A lot of details of enumerator were added in this
> document. I think it would help.
>
> Steven Wu wrote on Thu, Mar 28, 2019 at 12:52 PM:
>
>> This proposal mentioned that SplitEnumerator might run on the JobManager
>> or
>> in a single task on a TaskManager.
>>
>> if the enumerator is a single task on a taskmanager, then the job DAG
>> can never be embarrassingly parallel anymore. That will nullify the
>> leverage of fine-grained recovery for embarrassingly parallel jobs.
>>
>> It's not clear to me what's the implication of running enumerator on the
>> jobmanager. So I will leave that out for now.
>>
>> On Mon, Jan 28, 2019 at 3:05 AM Biao Liu  wrote:
>>
>> > Hi Stephan & Piotrek,
>> >
>> > Thank you for feedback.
>> >
>> > It seems that there are a lot of things to do in the community. I am
>> > just afraid that this discussion may be forgotten since there are so
>> > many proposals recently.
>> > Anyway, wish to see the split topics soon :)
>> >
>> > Piotr Nowojski wrote on Thu, Jan 24, 2019 at 8:21 PM:
>> >
>> > > Hi Biao!
>> > >
>> > > This discussion was stalled because of preparations for the open
>> sourcing
>> > > & merging Blink. I think before creating the tickets we should split
>> this
>> > > discussion into topics/areas outlined by Stephan and create Flips for
>> > that.
>> > >
>> > > I think there is no chance for this to be completed in couple of
>> > remaining
>> > > weeks/1 month before 1.8 feature freeze, however it would be good to
>> aim
>> > > with those changes for 1.9.
>> > >
>> > > Piotrek
>> > >
>> > > > On 20 Jan 2019, at 16:08, Biao Liu  wrote:
>> > > >
>> > > > Hi community,
>> > > > The summary of Stephan makes a lot of sense to me. It is much
>> > > > clearer indeed after splitting the complex topic into small ones.
>> > > > I was wondering is there any detail plan for next step? If not, I
>> would
>> > > > like to push this thing forward by creating some JIRA issues.
>> > > > Another question is that should version 1.8 include these features?
>> > > >
>> > > > Stephan Ewen wrote on Sat, Dec 1, 2018 at 4:20 AM:
>> > > >
>> > > >> Thanks everyone for the lively discussion. Let me try to summarize
>> > > where I
>> > > >> see convergence in the discussion and open issues.
>> > > >> I'll try to group this by design aspect of the source. Please let
>> me
>> > > know
>> > > >> if I got things wrong or missed something crucial here.
>> > > >>
>> > > >> For issues 1-3, if the below reflects the state of the discussion,
>> I
>> > > would
>> > > >> try and update the FLIP in the next days.
>> > > >> For the remaining ones we need more discussion.
>> > > >>
>> > > >> I would suggest forking each of these aspects into a separate
>> > > >> mail thread, or we will lose sight of the individual aspects.
>> > > >>
>> > > >> *(1) Separation of Split Enumerator and Split Reader*
>> > > >>
>> > > >>  - All seem to agree this is a good thing
>> > > >>  - Split Enumerator could in the end live on JobManager (and assign
>> > > splits
>> > > >> via RPC) or in a task (and assign splits via data streams)
>> > > >>  - this discussion is orthogonal and should come later, when the
>> > > interface
>

Re: [DISCUSS] FLIP-27: Refactor Source Interface

2019-07-26 Thread Biao Liu
Hi Stephan,

Thank you for the feedback!
I will take a look at your branch before the public discussion.


On Fri, Jul 26, 2019 at 12:01 AM Stephan Ewen  wrote:

> Hi Biao!
>
> Thanks for reviving this. I would like to join this discussion, but am
> quite occupied with the 1.9 release, so can we maybe pause this discussion
> for a week or so?
>
> In the meantime I can share some suggestion based on prior experiments:
>
> How to do watermarks / timestamp extractors in a simpler and more flexible
> way. I think that part is quite promising and should be part of the new
> source interface.
>
> https://github.com/StephanEwen/flink/tree/source_interface/flink-core/src/main/java/org/apache/flink/api/common/eventtime
>
>
> https://github.com/StephanEwen/flink/blob/source_interface/flink-core/src/main/java/org/apache/flink/api/common/src/SourceOutput.java
>
>
>
> Some experiments on how to build the source reader and its library for
> common threading/split patterns:
>
> https://github.com/StephanEwen/flink/tree/source_interface/flink-core/src/main/java/org/apache/flink/api/common/src
>
>
> Best,
> Stephan
>
>
> On Thu, Jul 25, 2019 at 10:03 AM Biao Liu  wrote:
>
>> Hi devs,
>>
>> Since 1.9 is nearly released, I think we could get back to FLIP-27. I
>> believe it should be included in 1.10.
>>
>> There are so many things mentioned in the document of FLIP-27 [1] that I
>> think we'd better discuss them separately. However, the wiki is not a
>> good place to discuss. I wrote a Google doc about the SplitReader API
>> which fills in some details missing from the document. [2]
>>
>> 1.
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27:+Refactor+Source+Interface
>> 2.
>> https://docs.google.com/document/d/1R1s_89T4S3CZwq7Tf31DciaMCqZwrLHGZFqPASu66oE/edit?usp=sharing
>>
>> CC Stephan, Aljoscha, Piotrek, Becket
>>
>>
>> On Thu, Mar 28, 2019 at 4:38 PM Biao Liu  wrote:
>>
>>> Hi Steven,
>>> Thank you for the feedback. Please take a look at the document FLIP-27
>>> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface>
>>>  which
>>> is updated recently. A lot of details of enumerator were added in this
>>> document. I think it would help.
>>>
>>> Steven Wu wrote on Thu, Mar 28, 2019 at 12:52 PM:
>>>
>>>> This proposal mentioned that SplitEnumerator might run on the
>>>> JobManager or
>>>> in a single task on a TaskManager.
>>>>
>>>> if the enumerator is a single task on a taskmanager, then the job DAG
>>>> can never be embarrassingly parallel anymore. That will nullify the
>>>> leverage of fine-grained recovery for embarrassingly parallel jobs.
>>>>
>>>> It's not clear to me what's the implication of running enumerator on the
>>>> jobmanager. So I will leave that out for now.
>>>>
>>>> On Mon, Jan 28, 2019 at 3:05 AM Biao Liu  wrote:
>>>>
>>>> > Hi Stephan & Piotrek,
>>>> >
>>>> > Thank you for feedback.
>>>> >
>>>> > It seems that there are a lot of things to do in the community. I am
>>>> > just afraid that this discussion may be forgotten since there are so
>>>> > many proposals recently.
>>>> > Anyway, wish to see the split topics soon :)
>>>> >
>>>> > Piotr Nowojski wrote on Thu, Jan 24, 2019 at 8:21 PM:
>>>> >
>>>> > > Hi Biao!
>>>> > >
>>>> > > This discussion was stalled because of preparations for the open
>>>> sourcing
>>>> > > & merging Blink. I think before creating the tickets we should
>>>> split this
>>>> > > discussion into topics/areas outlined by Stephan and create Flips
>>>> for
>>>> > that.
>>>> > >
>>>> > > I think there is no chance for this to be completed in couple of
>>>> > remaining
>>>> > > weeks/1 month before 1.8 feature freeze, however it would be good
>>>> to aim
>>>> > > with those changes for 1.9.
>>>> > >
>>>> > > Piotrek
>>>> > >
>>>> > > > On 20 Jan 2019, at 16:08, Biao Liu  wrote:
>>>> > > >
>>>> > > > Hi community,
>>>> > > > The summary of Stephan makes a lot sense to me. It is much clearer
>>>> > indeed
>>>> > > > after splitting the complex topic into small ones.

Re: [DISCUSS] Removing the flink-mapr-fs module

2019-07-29 Thread Biao Liu
Hi Aljoscha,

Does it mean the MapRFileSystem is no longer supported since 1.9.0?

On Mon, Jul 29, 2019 at 5:19 PM Ufuk Celebi  wrote:

> +1
>
>
> On Mon, Jul 29, 2019 at 11:06 AM Jeff Zhang  wrote:
>
> > +1 to remove it.
> >
> > > Aljoscha Krettek wrote on Mon, Jul 29, 2019 at 5:01 PM:
> >
> > > Hi,
> > >
> > > Because of recent problems in the dependencies of that module [1] I
> would
> > > suggest that we remove it. If people are using it, they can use the one
> > > from Flink 1.8.
> > >
> > > What do you think about it? It would a) solve the dependency problem
> and
> > > b) make our build a tiny smidgen more lightweight.
> > >
> > > Aljoscha
> > >
> > > [1]
> > >
> >
> https://lists.apache.org/thread.html/16c16db8f4b94dc47e638f059fd53be936d5da423376a8c1092eaad1@%3Cdev.flink.apache.org%3E
> >
> >
> >
> > --
> > Best Regards
> >
> > Jeff Zhang
> >
>


Re: [DISCUSS] Removing the flink-mapr-fs module

2019-07-29 Thread Biao Liu
+1 for removing it.

Actually I have encountered this issue several times. I thought it might be
blocked by the firewall of China :(

BTW, I think this change should be mentioned in the release notes.


On Mon, Jul 29, 2019 at 5:37 PM Aljoscha Krettek 
wrote:

> If we remove it, that would mean it’s not supported in Flink 1.9.0, yes.
> Or we only remove it in Flink 1.10.0.
>
> Aljoscha
>
> > On 29. Jul 2019, at 11:35, Biao Liu  wrote:
> >
> > Hi Aljoscha,
> >
> > Does it mean the MapRFileSystem is no longer supported since 1.9.0?
> >
> > On Mon, Jul 29, 2019 at 5:19 PM Ufuk Celebi  wrote:
> >
> >> +1
> >>
> >>
> >> On Mon, Jul 29, 2019 at 11:06 AM Jeff Zhang  wrote:
> >>
> >>> +1 to remove it.
> >>>
> >>> Aljoscha Krettek wrote on Mon, Jul 29, 2019 at 5:01 PM:
> >>>
> >>>> Hi,
> >>>>
> >>>> Because of recent problems in the dependencies of that module [1] I
> >> would
> >>>> suggest that we remove it. If people are using it, they can use the
> one
> >>>> from Flink 1.8.
> >>>>
> >>>> What do you think about it? It would a) solve the dependency problem
> >> and
> >>>> b) make our build a tiny smidgen more lightweight.
> >>>>
> >>>> Aljoscha
> >>>>
> >>>> [1]
> >>>>
> >>>
> >>
> https://lists.apache.org/thread.html/16c16db8f4b94dc47e638f059fd53be936d5da423376a8c1092eaad1@%3Cdev.flink.apache.org%3E
> >>>
> >>>
> >>>
> >>> --
> >>> Best Regards
> >>>
> >>> Jeff Zhang
> >>>
> >>
>
>


Re: [ANNOUNCE] Progress updates for Apache Flink 1.9.0 release

2019-07-29 Thread Biao Liu
Hi Gordon,

Thanks for updating progress.

Currently I'm working on FLINK-9900. I need a committer to assign the
ticket to me.

Tzu-Li (Gordon) Tai wrote on Tue, Jul 30, 2019 at 13:01:

> Hi all,
>
> There are quite a few instabilities in our builds right now (master +
> release-1.9), some of which are directly related, or suspected to be
> related, to the 1.9 release.
>
> I'll categorize the instabilities into ones which we were already tracking
> in the 1.9 Burndown Kanban board [1] prior to this email, and ones which
> seem to be new or were not monitored, so that we can draw additional
> attention to them:
>
> *Instabilities that were already being tracked*
>
> - FLINK-13242: StandaloneResourceManagerTest.testStartupPeriod fails on
> Travis [2]
> A fix for this is coming with FLINK-13408 (Schedule
> StandaloneResourceManager.setFailUnfulfillableRequest whenever the
> leadership is acquired) [3]
>
> *Newly discovered instabilities that we should also start monitoring*
>
> - FLINK-13484: ConnectedComponents E2E fails with
> ResourceNotAvailableException [4]
> - FLINK-13487:
> TaskExecutorPartitionLifecycleTest.testPartitionReleaseAfterReleaseCall
> failed on Travis [5]. FLINK-13476 (Partitions not being properly released
> on cancel) could be the cause [6].
> - FLINK-13488: flink-python fails to build on Travis due to Python 3.3
> install failure [7]
> - FLINK-13489: Heavy deployment E2E fails quite consistently on Travis with
> TM heartbeat timeout [8]
> - FLINK-9900:
> ZooKeeperHighAvailabilityITCase.testRestoreBehaviourWithFaultyStateHandles
> deadlocks [9]
> - FLINK-13377: Streaming SQL E2E fails on Travis with mismatching outputs
> (could just be that the SQL query tested on Travis is nondeterministic) [10]
>
> Cheers,
> Gordon
>
> [1]
>
> https://issues.apache.org/jira/secure/RapidBoard.jspa?projectKey=FLINK&rapidView=328
>
> [2]  https://issues.apache.org/jira/browse/FLINK-13242
> [3]  https://issues.apache.org/jira/browse/FLINK-13408
> [4]  https://issues.apache.org/jira/browse/FLINK-13484
> [5]  https://issues.apache.org/jira/browse/FLINK-13487
> [6]  https://issues.apache.org/jira/browse/FLINK-13476
> [7]  https://issues.apache.org/jira/browse/FLINK-13488
> [8]  https://issues.apache.org/jira/browse/FLINK-13489
> [9]  https://issues.apache.org/jira/browse/FLINK-9900
> [10] https://issues.apache.org/jira/browse/FLINK-13377
>
> On Sun, Jul 28, 2019 at 6:14 AM zhijiang  .invalid>
> wrote:
>
> > Hi Gordon,
> >
> > Thanks for the updates on the current progress.
> > In addition, it might be better to also cover the fix of the network
> > resource leak in JIRA ticket [1], which I think will be merged soon.
> >
> > [1] FLINK-13245: This fixes the leak of releasing reader/view with
> > partition in network stack.
> >
> > Best,
> > Zhijiang
> > --
> > From:Tzu-Li (Gordon) Tai 
> > Send Time: Sat, Jul 27, 2019 10:41
> > To:dev 
> > Subject:Re: [ANNOUNCE] Progress updates for Apache Flink 1.9.0 release
> >
> > Hi all,
> >
> > It's been a while since our last update for the release testing of 1.9.0,
> > so I want to bring attention to the current status of the release.
> >
> > We are approaching RC1 soon, waiting on the following specific last
> ongoing
> > threads to be closed:
> > - FLINK-13241: This fixes a problem where when using YARN, slot
> allocation
> > requests may be ignored [1]
> > - FLINK-13371: Potential partitions resource leak in case of producer
> > restarts [2]
> > - FLINK-13350: Distinguish between temporary tables and persisted tables
> > [3]. Strictly speaking this would be a new feature, but there was a
> > discussion here [4] to include a workaround for now in 1.9.0, and a
> proper
> > solution later on in 1.10.x.
> > - FLINK-12858: Potential distributed deadlock in case of synchronous
> > savepoint failure [5]
> >
> > The above is the critical path for moving forward with an RC1 for
> official
> > voting.
> > All of them have PRs already, and are currently being reviewed or close
> to
> > being merged.
> >
> > Cheers,
> > Gordon
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-13241
> > [2] https://issues.apache.org/jira/browse/FLINK-13371
> > [3] https://issues.apache.org/jira/browse/FLINK-13350
> > [4]
> >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-temporary-tables-in-SQL-API-td30831.html
> > [5] https://issues.apache.org/jira/browse/FLINK-12858
> >
> > On Tue, Jul 16, 2019 at 5:26 AM Tzu-Li (Gordon) Tai  >
> > wrote:
> >
> > > Update: RC0 for 1.9.0 has been created. Please see [1] for the preview
> > > source / binary releases and Maven artifacts.
> > >
> > > Cheers,
> > > Gordon
> > >
> > > [1]
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/PREVIEW-Apache-Flink-1-9-0-release-candidate-0-td30583.html
> > >
> > > On Mon, Jul 15, 2019 at 6:39 PM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org
> > >
> > > wrote:
> > >
> > >> Hi Flink devs,
> > >>
> > >> As previously announced by Kur

Re: [VOTE] Publish the PyFlink into PyPI

2019-08-01 Thread Biao Liu
Thanks Jincheng for working on this.

+1 (non-binding)

Thanks,
Biao /'bɪ.aʊ/



On Thu, Aug 1, 2019 at 8:55 PM Jark Wu  wrote:

> +1 (non-binding)
>
> Cheers,
> Jark
>
> On Thu, 1 Aug 2019 at 17:45, Yu Li  wrote:
>
> > +1 (non-binding)
> >
> > Thanks for driving this!
> >
> > Best Regards,
> > Yu
> >
> >
> > On Thu, 1 Aug 2019 at 11:41, Till Rohrmann  wrote:
> >
> > > +1
> > >
> > > Cheers,
> > > Till
> > >
> > > On Thu, Aug 1, 2019 at 10:39 AM vino yang 
> wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > > Jeff Zhang wrote on Thu, Aug 1, 2019 at 4:33 PM:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > > Stephan Ewen wrote on Thu, Aug 1, 2019 at 4:29 PM:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > On Thu, Aug 1, 2019 at 9:52 AM Dian Fu 
> > > wrote:
> > > > > >
> > > > > > > Hi Jincheng,
> > > > > > >
> > > > > > > Thanks a lot for driving this.
> > > > > > > +1 (non-binding).
> > > > > > >
> > > > > > > Regards,
> > > > > > > Dian
> > > > > > >
> > > > > > > > On Aug 1, 2019, at 3:24 PM, jincheng sun wrote:
> 写道:
> > > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > Publishing PyFlink to PyPI is very important for our users.
> > > > > > > > Please vote on the following proposal:
> > > > > > > >
> > > > > > > > 1. Create a PyPI project for the Apache Flink Python API,
> > > > > > > > named "apache-flink".
> > > > > > > > 2. Release one binary with the default Scala version, same
> > > > > > > > as the Flink default config.
> > > > > > > > 3. Create an account named "pyflink" as owner (only the PMC
> > > > > > > > can manage it). The PMC can add accounts for the Release
> > > > > > > > Manager, but the Release Manager cannot delete releases.
> > > > > > > >
> > > > > > > > [ ] +1, Approve the proposal.
> > > > > > > > [ ] -1, Disapprove the proposal, because ...
> > > > > > > >
> > > > > > > > The vote will be open for at least 72 hours. It is adopted
> by a
> > > > > simple
> > > > > > > > majority with a minimum of three positive votes.
> > > > > > > >
> > > > > > > > See discussion threads for more details [1].
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Jincheng
> > > > > > > >
> > > > > > > > [1]
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Publish-the-PyFlink-into-PyPI-td30095.html
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best Regards
> > > > >
> > > > > Jeff Zhang
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS][CODE STYLE] Breaking long function argument lists and chained method calls

2019-08-02 Thread Biao Liu
Hi Andrey,

Thank you for bringing us this discussion.

I would like to make some details clear. Correct me if I am wrong.

The guide draft [1] says the line length is limited to 100 characters. From
my understanding, this discussion suggests that if a line exceeds 100
characters (in both Scala and Java), we should break it into a new line (or
lines).

*Question 1*: If a line does not exceed 100 characters, should we still
break the chained calls into lines? Currently chained calls are always
broken into lines even when the line is not too long. Is this just a
suggestion or a hard limitation?
I prefer a limitation which must be respected: we should always break
chained calls no matter how long the line is.

For chained method calls, each new line should start with the dot.

*Question 2:* Should the indent of the new line be 1 tab or 2 tabs?
Currently both styles exist in the codebase. This rule should also be
applied to function arguments.
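To make the two questions concrete, here is a small runnable sketch (my own
illustration, not code from the guide draft) of the style I would argue
for: always break chained calls, start each continuation line with the dot,
and indent the continuation by one level:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ChainedCallStyle {
    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "c");

        // Each chained call on its own line, the line starting with the dot,
        // the continuation indented by one level relative to the receiver.
        List<String> upper = values
            .stream()
            .map(String::toUpperCase)
            .collect(Collectors.toList());

        System.out.println(upper); // prints [A, B, C]
    }
}
```

The same breaking rule would then apply whether the one-line form fits in
100 characters or not.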

BTW, big +1 to the options from Chesnay. We should make auto-formatting
possible in our project.

1.
https://docs.google.com/document/d/1owKfK1DwXA-w6qnx3R7t2D_o0BsFkkukGlRhvl3XXjQ/edit#

Thanks,
Biao /'bɪ.aʊ/



On Fri, Aug 2, 2019 at 9:20 AM SHI Xiaogang  wrote:

> Hi Andrey,
>
> Thanks for bringing this. Personally, I prefer to the following style which
> (1) puts the right parenthese in the next line
> (2) a new line for each exception if exceptions can not be put in the same
> line
>
> That way, parentheses are aligned in a similar way to braces and exceptions
> can be well aligned.
>
> public void func(
>     int arg1,
>     int arg2,
>     ...
> ) throws E1, E2, E3 {
>     ...
> }
>
> or
>
> public void func(
>     int arg1,
>     int arg2,
>     ...
> ) throws
>     E1,
>     E2,
>     E3 {
>     ...
> }
>
> Regards,
> Xiaogang
>
> Andrey Zagrebin  于2019年8月1日周四 下午11:19写道:
>
> > Hi all,
> >
> > This is one more small suggestion for the recent thread about code style
> > guide in Flink [1].
> >
> > We already have a note about using a new line for each chained call in
> > Scala, e.g. either:
> >
> > values.stream().map(...).collect(...);
> >
> > or
> >
> > values
> >     .stream()
> >     .map(...)
> >     .collect(...)
> >
> > if it would result in a too long line by keeping all chained calls in one
> > line.
> >
> > The suggestion is to have it for Java as well and add the same rule for a
> > long list of function arguments. So it is either:
> >
> > public void func(int arg1, int arg2, ...) throws E1, E2, E3 {
> >     ...
> > }
> >
> > or
> >
> > public void func(
> >     int arg1,
> >     int arg2,
> >     ...) throws E1, E2, E3 {
> >     ...
> > }
> >
> > but thrown exceptions stay on the same last line.
> >
> > Please, feel free to share you thoughts.
> >
> > Best,
> > Andrey
> >
> > [1]
> >
> >
> http://mail-archives.apache.org/mod_mbox/flink-dev/201906.mbox/%3ced91df4b-7cab-4547-a430-85bc710fd...@apache.org%3E
> >
>


Re: [DISCUSS][CODE STYLE] Usage of Java Optional

2019-08-02 Thread Biao Liu
Hi Andrey,

Thanks for working on this.

+1 it's clear and acceptable for me.

To Qi,

IMO the most performance-critical code is the "per record" code path. We
should definitely avoid Optional there. As for your concern, that is a "per
buffer" code path, which seems acceptable with Optional.

Just one more question: are there any other code paths which are also
critical? I think we'd better note that clearly.
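To illustrate the distinction I mean (a hypothetical sketch, not the actual
SingleInputGate code — the class and method names here are made up):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class NullableVsOptional {
    private final Map<String, String> cache = new HashMap<>();

    // Public API method: returning Optional forces callers to handle absence.
    public Optional<String> lookup(String key) {
        return Optional.ofNullable(cache.get(key));
    }

    // Performance-critical "per record" path: plain nullable return,
    // documented by an annotation instead of allocating an Optional per call.
    /* @Nullable */ String lookupFast(String key) {
        return cache.get(key);
    }

    public static void main(String[] args) {
        NullableVsOptional store = new NullableVsOptional();
        store.cache.put("k", "v");
        System.out.println(store.lookup("k").orElse("absent"));       // prints v
        System.out.println(store.lookup("missing").orElse("absent")); // prints absent
    }
}
```

On a hot path the Optional wrapper is one extra allocation per call, which
is exactly the overhead the guideline tries to avoid.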

Thanks,
Biao /'bɪ.aʊ/



On Fri, Aug 2, 2019 at 11:08 AM qi luo  wrote:

> Agree that using Optional will improve code robustness. However we’re
> hesitating to use Optional in data intensive operations.
>
> For example, SingleInputGate is already creating Optional for every
> BufferOrEvent in getNextBufferOrEvent(). How much performance gain would we
> get if it’s replaced by null check?
>
> Regards,
> Qi
>
> > On Aug 1, 2019, at 11:00 PM, Andrey Zagrebin 
> wrote:
> >
> > Hi all,
> >
> > This is the next follow up discussion about suggestions for the recent
> > thread about code style guide in Flink [1].
> >
> > In general, one could argue that any variable, which is nullable, can be
> > replaced by wrapping it with Optional to explicitly show that it can be
> > null. Examples are:
> >
> >   - returned values to force user to check not null
> >   - optional function arguments, e.g. with implicit default values
> >   - even class fields as e.g. optional config options with implicit
> >   default values
> >
> >
> > At the same time, we also have @Nullable annotation to express this
> > intention.
> >
> > Also, when the class Optional was introduced, Oracle posted a guideline
> > about its usage [2]. Basically, it suggests to use it mostly in APIs for
> > returned values to inform and force users to check the returned value
> > instead of returning null and avoid NullPointerException.
> >
> > Wrapping with Optional also comes with the performance overhead.
> >
> > Following the Oracle's guide in general, the suggestion is:
> >
> >   - Avoid using Optional in any performance critical code
> >   - Use Optional only to return nullable values in the API/public methods
> >   unless it is performance critical then rather use @Nullable
> >   - Passing an Optional argument to a method can be allowed if it is
> >   within a private helper method and simplifies the code, example is in
> [3]
> >   - Optional should not be used for class fields
> >
> >
> > Please, feel free to share you thoughts.
> >
> > Best,
> > Andrey
> >
> > [1]
> >
> http://mail-archives.apache.org/mod_mbox/flink-dev/201906.mbox/%3ced91df4b-7cab-4547-a430-85bc710fd...@apache.org%3E
> > [2]
> >
> https://www.oracle.com/technetwork/articles/java/java8-optional-2175753.html
> > [3]
> >
> https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroFactory.java#L95
>
>


Re: [DISCUSS][CODE STYLE] Usage of Java Optional

2019-08-02 Thread Biao Liu
Hi Jark & Zili,

I thought it meant "Optional should not be used for class fields". However,
now I'm a bit confused about the edited version.

Anyway +1 to "Optional should not be used for class fields"
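For the class-field case, the pattern I have in mind (a hypothetical
example, not Flink code) is to keep the field itself plain and nullable —
Optional is not Serializable and adds per-instance overhead as a field —
and to wrap with Optional only at the API boundary:

```java
import java.util.Optional;

public class JobConfig {
    // The field stays plain and nullable (conceptually @Nullable);
    // Optional should not be used as a field type.
    private String savepointPath;

    public void setSavepointPath(String path) {
        this.savepointPath = path;
    }

    // Wrap with Optional only on the getter, where callers must handle absence.
    public Optional<String> getSavepointPath() {
        return Optional.ofNullable(savepointPath);
    }

    public static void main(String[] args) {
        JobConfig config = new JobConfig();
        System.out.println(config.getSavepointPath().isPresent()); // prints false
        config.setSavepointPath("/tmp/savepoint-1");
        System.out.println(config.getSavepointPath().get());       // prints /tmp/savepoint-1
    }
}
```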

Thanks,
Biao /'bɪ.aʊ/



On Fri, Aug 2, 2019 at 5:00 PM Zili Chen  wrote:

> Hi Jark,
>
> Follow your opinion, for class field, we can make
> use of @Nullable/@Nonnull annotation or Flink's
> SerializableOptional. It would be sufficient.
>
> Best,
> tison.
>
>
> Jark Wu  于2019年8月2日周五 下午4:57写道:
>
> > Hi Andrey,
> >
> > I have some concern on point (3) "even class fields as e.g. optional
> config
> > options with implicit default values".
> >
> > Regarding to the Oracle's guide (4) "Optional should not be used for
> class
> > fields".
> > And IntelliJ IDEA also report warnings if a class field is Optional,
> > because Optional is not serializable.
> >
> >
> > Do we allow Optional as class field only if the class is not serializable
> > or forbid this totally?
> >
> > Thanks,
> > Jark
> >
> > On Fri, 2 Aug 2019 at 16:30, Biao Liu  wrote:
> >
> > > Hi Andrey,
> > >
> > > Thanks for working on this.
> > >
> > > +1 it's clear and acceptable for me.
> > >
> > > To Qi,
> > >
> > > IMO the most performance critical codes are "per record" code path. We
> > > should definitely avoid Optional there. For your concern, it's "per
> > buffer"
> > > code path which seems to be acceptable with Optional.
> > >
> > > Just one more question, is there any other code paths which are also
> > > critical? I think we'd better note that clearly.
> > >
> > > Thanks,
> > > Biao /'bɪ.aʊ/
> > >
> > >
> > >
> > > On Fri, Aug 2, 2019 at 11:08 AM qi luo  wrote:
> > >
> > > > Agree that using Optional will improve code robustness. However we’re
> > > > hesitating to use Optional in data intensive operations.
> > > >
> > > > For example, SingleInputGate is already creating Optional for every
> > > > BufferOrEvent in getNextBufferOrEvent(). How much performance gain
> > would
> > > we
> > > > get if it’s replaced by null check?
> > > >
> > > > Regards,
> > > > Qi
> > > >
> > > > > On Aug 1, 2019, at 11:00 PM, Andrey Zagrebin  >
> > > > wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > This is the next follow up discussion about suggestions for the
> > recent
> > > > > thread about code style guide in Flink [1].
> > > > >
> > > > > In general, one could argue that any variable, which is nullable,
> can
> > > be
> > > > > replaced by wrapping it with Optional to explicitly show that it
> can
> > be
> > > > > null. Examples are:
> > > > >
> > > > >   - returned values to force user to check not null
> > > > >   - optional function arguments, e.g. with implicit default values
> > > > >   - even class fields as e.g. optional config options with implicit
> > > > >   default values
> > > > >
> > > > >
> > > > > At the same time, we also have @Nullable annotation to express this
> > > > > intention.
> > > > >
> > > > > Also, when the class Optional was introduced, Oracle posted a
> > guideline
> > > > > about its usage [2]. Basically, it suggests to use it mostly in
> APIs
> > > for
> > > > > returned values to inform and force users to check the returned
> value
> > > > > instead of returning null and avoid NullPointerException.
> > > > >
> > > > > Wrapping with Optional also comes with the performance overhead.
> > > > >
> > > > > Following the Oracle's guide in general, the suggestion is:
> > > > >
> > > > >   - Avoid using Optional in any performance critical code
> > > > >   - Use Optional only to return nullable values in the API/public
> > > methods
> > > > >   unless it is performance critical then rather use @Nullable
> > > > >   - Passing an Optional argument to a method can be allowed if it
> is
> > > > >   within a private helper method and simplifies the code, example
> is
> > in
> > > > [3]
> > > > >   - Optional should not be used for class fields
> > > > >
> > > > >
> > > > > Please, feel free to share you thoughts.
> > > > >
> > > > > Best,
> > > > > Andrey
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/flink-dev/201906.mbox/%3ced91df4b-7cab-4547-a430-85bc710fd...@apache.org%3E
> > > > > [2]
> > > > >
> > > >
> > >
> >
> https://www.oracle.com/technetwork/articles/java/java8-optional-2175753.html
> > > > > [3]
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroFactory.java#L95
> > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] Hequn becomes a Flink committer

2019-08-07 Thread Biao Liu
Congrats Hequn!

Thanks,
Biao /'bɪ.aʊ/



On Wed, Aug 7, 2019 at 6:00 PM Zhu Zhu  wrote:

> Congratulations to Hequn!
>
> Thanks,
> Zhu Zhu
>
> Zili Chen  于2019年8月7日周三 下午5:16写道:
>
>> Congrats Hequn!
>>
>> Best,
>> tison.
>>
>>
>> Jeff Zhang  于2019年8月7日周三 下午5:14写道:
>>
>>> Congrats Hequn!
>>>
>>> Paul Lam  于2019年8月7日周三 下午5:08写道:
>>>
 Congrats Hequn! Well deserved!

 Best,
 Paul Lam

 在 2019年8月7日,16:28,jincheng sun  写道:

 Hi everyone,

 I'm very happy to announce that Hequn accepted the offer of the Flink
 PMC to become a committer of the Flink project.

 Hequn has been contributing to Flink for many years, mainly working on
 SQL/Table API features. He's also frequently helping out on the user
 mailing lists and helping check/vote the release.

 Congratulations Hequn!

 Best, Jincheng
 (on behalf of the Flink PMC)



>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>


Re: [DISCUSS] Repository split

2019-08-09 Thread Biao Liu
Hi folks,

Thanks for bringing this discussion Chesnay.

+1 for the motivation. Waiting a long time for a Travis build is a really
bad experience.

WRT the solution, personally I agree with Dawid/David.

IMO the biggest benefit of splitting the repository is reducing build time,
but I think that could be achieved without splitting the repository. That
would be the best solution for me.

And there are several pain points I really care about if we split the
repository.

1. Most of our users are developers. The non-developer users probably do
not care about the code structure at all; they might use the released
binaries directly. For developers, multiple repositories are not friendly
for reading, building or testing the code. I think it would be a big
regression.

2. It's definitely a nightmare to work across repositories. As Piotr said,
it should be a rare case. However, Jark raised a good example: debugging a
sub-repository IT case. Imagine the scenario: I'm debugging an unstable
Kafka IT case, and I need to add some logs in the runtime components to
find some clues. What should I do? I would have to locally install the main
Flink project every time after adding logs. And it's easy to make mistakes
when switching between repositories again and again.

To sum up, at least for now I agree with Dawid that we should go toward
splitting the CI builds, not the repository.

Thanks,
Biao /'bɪ.aʊ/



On Fri, Aug 9, 2019 at 12:55 AM Jark Wu  wrote:

> Hi,
>
> First of all, I agree with Dawid and David's point.
>
> I will share some experience on the repository split. We have been through
> it for Alibaba Blink, which is the most worthwhile project to learn from I
> think.
> We split Blink project into "blink-connectors" and "blink", but we didn't
> get much benefit for the development process. On the contrary, it slowed
> down development sometimes.
> We have suffered from the following issues after split as far as I can see:
>
> 1. Unstable build and test:
> The interface or behavior changes in the underlying (e.g. core, table) will
> lead to build fail and tests fail in the connectors repo. AFAIK, table api
> are still under heavy evolution.
> This will make connectors repo more unstable and makes us busy to fix the
> build problems and tests problems **after-commit**.
> First, it's not easy to locate which commit of main repo lead to the
> connectors repo fail (we have over 70+ commits every day in flink master
> now and it is growing).
> Second, when 2 or 3 build/test problems happened at one time, it's hard to
> fix the problem because we can't make the build/test pass in separate
> hotfix pull requests.
>
> 2. Debug difficulty:
> As modules are separate in different repositories, if we want to debug a
> Kafka IT case,
> we may need to debug some code in flink runtime or verify whether the
> runtime code change
> can fix the Kafka case. However, it will be more complex because they are
> not in one project.
>
> IMO, this actually slows down the development process.
>
> --
>
> In my understanding, the issues we want to solve with the split include:
> 1) long build/testing time
> 2) unstable tests
> 3) increasing number of PRs
>
> Ad. 1 I think we have several ways to reduce the build/testing time. As
> Dawid said, we can trigger corresponding CI in a single repository (without
> to run all the tests).
> An easy way might be to analyse the pom.xml files to see which modules
> depend on the changed module. And one thing we can do right now is
> skipping all the
> tests for documentation changes.
>
> Ad. 2 I can't see how unstable connectors tests can be fixed more quickly
> after moved to a separate repositories. As far as I can tell, this problem
> might be more significant.
>
> Ad. 3 I also doubt how repository split could help with this. I think this
> will give the sub-repositories less exposure and bahir-flink[1] is an
> example (only 3 commits in the last 2 months).
>
> At the end, from my point of view,
>   1) if we want to reduce build/testing time, we can start a new thread to
> collect ideas from community. We can try some approaches to see if they can
> solve most of the problems.
>   2) if we want to split repository, we need to be cautious enough to the
> potential development slow down we might meet.
>
> Regards,
> Jark
>
> [1]: https://github.com/apache/bahir-flink/graphs/commit-activity
>
>
>
>
> On Fri, 9 Aug 2019 at 00:26, Till Rohrmann  wrote:
>
> > I pretty much agree with your points Dav/wid. Some problems which we want
> > to solve with a respository split are clearly caused by the existing
> build
> > system (no incremental builds, not enough flexibility to only build a
> > subset of modules). Given that a repository split would be a major
> > endeavour with a lot of uncertainties, changing Flink's build system
> might
> > actually be simpler.
> >
> > In the past I tried to build Flink with Gradle because it better supports
> > incremental builds. Unfortunately, I never got it really off the grounds
> > because of too little t

Re: [VOTE] Flink Project Bylaws

2019-08-13 Thread Biao Liu
+1 (non-binding)

Thanks for pushing this!

Thanks,
Biao /'bɪ.aʊ/



On Wed, 14 Aug 2019 at 09:37, Jark Wu  wrote:

> +1 (non-binding)
>
> Best,
> Jark
>
> On Wed, 14 Aug 2019 at 09:22, Kurt Young  wrote:
>
> > +1 (binding)
> >
> > Best,
> > Kurt
> >
> >
> > On Wed, Aug 14, 2019 at 1:34 AM Yun Tang  wrote:
> >
> > > +1 (non-binding)
> > >
> > > But I have a minor question about "code change" action, for those
> > > "[hotfix]" github pull requests [1], the dev mailing list would not be
> > > notified currently. I think we should change the description of this
> > action.
> > >
> > >
> > > [1]
> > >
> >
> https://flink.apache.org/contributing/contribute-code.html#code-contribution-process
> > >
> > > Best
> > > Yun Tang
> > > 
> > > From: JingsongLee 
> > > Sent: Tuesday, August 13, 2019 23:56
> > > To: dev 
> > > Subject: Re: [VOTE] Flink Project Bylaws
> > >
> > > +1 (non-binding)
> > > Thanks Becket.
> > > I've learned a lot from current bylaws.
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > >
> > > --
> > > From:Yu Li 
> > > Send Time:2019年8月13日(星期二) 17:48
> > > To:dev 
> > > Subject:Re: [VOTE] Flink Project Bylaws
> > >
> > > +1 (non-binding)
> > >
> > > Thanks for the efforts Becket!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Tue, 13 Aug 2019 at 16:09, Xintong Song 
> > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Tue, Aug 13, 2019 at 1:48 PM Robert Metzger 
> > > > wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > On Tue, Aug 13, 2019 at 1:47 PM Becket Qin 
> > > wrote:
> > > > >
> > > > > > Thanks everyone for voting.
> > > > > >
> > > > > > For those who have already voted, just want to bring this up to
> > your
> > > > > > attention that there is a minor clarification to the bylaws wiki
> > this
> > > > > > morning. The change is in bold format below:
> > > > > >
> > > > > > one +1 from a committer followed by a Lazy approval (not counting
> > the
> > > > > vote
> > > > > > > of the contributor), moving to lazy majority if a -1 is
> received.
> > > > > > >
> > > > > >
> > > > > >
> > > > > > Note that this implies that committers can +1 their own commits
> and
> > > > merge
> > > > > > > right away. *However, the committe**rs should use their best
> > > > judgement
> > > > > to
> > > > > > > respect the components expertise and ongoing development plan.*
> > > > > >
> > > > > >
> > > > > > This addition does not really change anything the bylaws meant to
> > > set.
> > > > It
> > > > > > is simply a clarification. If anyone who have casted the vote
> > > objects,
> > > > > > please feel free to withdraw the vote.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > >
> > > > > > On Tue, Aug 13, 2019 at 1:29 PM Piotr Nowojski <
> > pi...@ververica.com>
> > > > > > wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > > On 13 Aug 2019, at 13:22, vino yang 
> > > wrote:
> > > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > Tzu-Li (Gordon) Tai  于2019年8月13日周二
> > > 下午6:32写道:
> > > > > > > >
> > > > > > > >> +1
> > > > > > > >>
> > > > > > > >> On Tue, Aug 13, 2019, 12:31 PM Hequn Cheng <
> > > chenghe...@gmail.com>
> > > > > > > wrote:
> > > > > > > >>
> > > > > > > >>> +1 (non-binding)
> > > > > > > >>>
> > > > > > > >>> Thanks a lot for driving this! Good job. @Becket Qin <
> > > > > > > >> becket@gmail.com
> > > > > > > 
> > > > > > > >>>
> > > > > > > >>> Best, Hequn
> > > > > > > >>>
> > > > > > > >>> On Tue, Aug 13, 2019 at 6:26 PM Stephan Ewen <
> > se...@apache.org
> > > >
> > > > > > wrote:
> > > > > > > >>>
> > > > > > >  +1
> > > > > > > 
> > > > > > >  On Tue, Aug 13, 2019 at 12:22 PM Maximilian Michels <
> > > > > m...@apache.org
> > > > > > >
> > > > > > >  wrote:
> > > > > > > 
> > > > > > > > +1 It's good that we formalize this.
> > > > > > > >
> > > > > > > > On 13.08.19 10:41, Fabian Hueske wrote:
> > > > > > > >> +1 for the proposed bylaws.
> > > > > > > >> Thanks for pushing this Becket!
> > > > > > > >>
> > > > > > > >> Cheers, Fabian
> > > > > > > >>
> > > > > > > >> Am Mo., 12. Aug. 2019 um 16:31 Uhr schrieb Robert
> Metzger
> > <
> > > > > > > >> rmetz...@apache.org>:
> > > > > > > >>
> > > > > > > >>> I changed the permissions of the page.
> > > > > > > >>>
> > > > > > > >>> On Mon, Aug 12, 2019 at 4:21 PM Till Rohrmann <
> > > > > > > >>> trohrm...@apache.org>
> > > > > > > >>> wrote:
> > > > > > > >>>
> > > > > > >  +1 for the proposal. Thanks a lot for driving this
> > > > discussion
> > > > > > >  Becket!
> > > > > > > 
> > > > > > >  Cheers,
> > > > > > >  Till
> > > > > > > 
> > > > > > >  On Mon, Aug 12, 2019 at 3:02 PM Becket Qin <
> > > > > > > >> becket@gmai

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Biao Liu
Congrats!

Thanks,
Biao /'bɪ.aʊ/



On Thu, 15 Aug 2019 at 10:03, Jark Wu  wrote:

> Congratulations Andrey!
>
>
> Cheers,
> Jark
>
> On Thu, 15 Aug 2019 at 00:57, jincheng sun 
> wrote:
>
> > Congrats Andrey! Very happy to have you onboard :)
> >
> > Best, Jincheng
> >
> > Yu Li  于2019年8月15日周四 上午12:06写道:
> >
> > > Congratulations Andrey! Well deserved!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Wed, 14 Aug 2019 at 17:55, Aleksey Pak 
> wrote:
> > >
> > > > Congratulations, Andrey!
> > > >
> > > > On Wed, Aug 14, 2019 at 4:53 PM Markos Sfikas  >
> > > > wrote:
> > > >
> > > > > Congrats Andrey!
> > > > >
> > > > > On Wed, 14 Aug 2019 at 16:47, Becket Qin 
> > wrote:
> > > > >
> > > > > > Congratulations, Andrey!
> > > > > >
> > > > > > On Wed, Aug 14, 2019 at 4:35 PM Thomas Weise 
> > wrote:
> > > > > >
> > > > > > > Congrats!
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Aug 14, 2019, 7:12 AM Robert Metzger <
> > rmetz...@apache.org>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Congratulations! Very happy to have you onboard :)
> > > > > > > >
> > > > > > > > On Wed, Aug 14, 2019 at 4:06 PM Kostas Kloudas <
> > > kklou...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Congratulations Andrey!
> > > > > > > > > Well deserved!
> > > > > > > > >
> > > > > > > > > Kostas
> > > > > > > > >
> > > > > > > > > On Wed, Aug 14, 2019 at 4:04 PM Yun Tang  >
> > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Congratulations Andrey.
> > > > > > > > > >
> > > > > > > > > > Best
> > > > > > > > > > Yun Tang
> > > > > > > > > > 
> > > > > > > > > > From: Xintong Song 
> > > > > > > > > > Sent: Wednesday, August 14, 2019 21:40
> > > > > > > > > > To: Oytun Tez 
> > > > > > > > > > Cc: Zili Chen ; Till Rohrmann <
> > > > > > > > > trohrm...@apache.org>; dev ; user <
> > > > > > > > > u...@flink.apache.org>
> > > > > > > > > > Subject: Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink
> > > > committer
> > > > > > > > > >
> > > > > > > > > > Congratulations Andery~!
> > > > > > > > > >
> > > > > > > > > > Thank you~
> > > > > > > > > >
> > > > > > > > > > Xintong Song
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez <
> > > oy...@motaword.com>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Congratulations Andrey!
> > > > > > > > > >
> > > > > > > > > > I am glad the Flink committer team is growing at such a
> > pace!
> > > > > > > > > >
> > > > > > > > > > ---
> > > > > > > > > > Oytun Tez
> > > > > > > > > >
> > > > > > > > > > M O T A W O R D
> > > > > > > > > > The World's Fastest Human Translation Platform.
> > > > > > > > > > oy...@motaword.com — www.motaword.com
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Aug 14, 2019 at 9:29 AM Zili Chen <
> > > > wander4...@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Congratulations Andrey!
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > tison.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Till Rohrmann  于2019年8月14日周三
> > 下午9:26写道:
> > > > > > > > > >
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > I'm very happy to announce that Andrey Zagrebin accepted
> > the
> > > > > offer
> > > > > > of
> > > > > > > > > the Flink PMC to become a committer of the Flink project.
> > > > > > > > > >
> > > > > > > > > > Andrey has been an active community member for more than
> 15
> > > > > months.
> > > > > > > He
> > > > > > > > > has helped shaping numerous features such as State TTL,
> > > FRocksDB
> > > > > > > release,
> > > > > > > > > Shuffle service abstraction, FLIP-1, result partition
> > > management
> > > > > and
> > > > > > > > > various fixes/improvements. He's also frequently helping
> out
> > on
> > > > the
> > > > > > > > > user@f.a.o mailing lists.
> > > > > > > > > >
> > > > > > > > > > Congratulations Andrey!
> > > > > > > > > >
> > > > > > > > > > Best, Till
> > > > > > > > > > (on behalf of the Flink PMC)
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Markos Sfikas
> > > > > +49 (0) 15759630002
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] FLIP-52: Remove legacy Program interface.

2019-08-15 Thread Biao Liu
+1

Thanks Kostas for pushing this.

Thanks,
Biao /'bɪ.aʊ/



On Thu, 15 Aug 2019 at 16:03, Kostas Kloudas  wrote:

> Thanks a lot for the quick response!
> I will consider the Flink Accepted and will start working on it.
>
> Cheers,
> Kostas
>
> On Thu, Aug 15, 2019 at 5:29 AM SHI Xiaogang 
> wrote:
> >
> > +1
> >
> > Glad that programming with flink becomes simpler and easier.
> >
> > Regards,
> > Xiaogang
> >
> > Aljoscha Krettek  于2019年8月14日周三 下午11:31写道:
> >
> > > +1 (for the same reasons I posted on the other thread)
> > >
> > > > On 14. Aug 2019, at 15:03, Zili Chen  wrote:
> > > >
> > > > +1
> > > >
> > > > It could be regarded as part of Flink client api refactor.
> > > > Removal of stale code paths helps reason refactor.
> > > >
> > > > There is one thing worth attention that in this thread[1] Thomas
> > > > suggests an interface with a method return JobGraph based on the
> > > > fact that REST API and in per job mode actually extracts the JobGraph
> > > > from user program and submit it instead of running user program and
> > > > submission happens inside the program in session scenario.
> > > >
> > > > Such an interface would be like
> > > >
> > > > interface Program {
> > > >  JobGraph getJobGraph(args);
> > > > }
> > > >
> > > > Anyway, the discussion above could be continued in that thread.
> > > > Current Program is a legacy class that quite less useful than it
> should
> > > be.
> > > >
> > > > Best,
> > > > tison.
> > > >
> > > > [1]
> > > >
> > >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/REST-API-JarRunHandler-More-flexibility-for-launching-jobs-td31026.html#a31168
> > > >
> > > >
> > > > Stephan Ewen  于2019年8月14日周三 下午7:50写道:
> > > >
> > > >> +1
> > > >>
> > > >> the "main" method is the overwhelming default. getting rid of "two
> ways
> > > to
> > > >> do things" is a good idea.
> > > >>
> > > >> On Wed, Aug 14, 2019 at 1:42 PM Kostas Kloudas 
> > > wrote:
> > > >>
> > > >>> Hi all,
> > > >>>
> > > >>> As discussed in [1] , the Program interface seems to be outdated
> and
> > > >>> there seems to be
> > > >>> no objection to remove it.
> > > >>>
> > > >>> Given that this interface is PublicEvolving, its removal should
> pass
> > > >>> through a FLIP and
> > > >>> this discussion and the associated FLIP are exactly for that
> purpose.
> > > >>>
> > > >>> Please let me know what you think and if it is ok to proceed with
> its
> > > >>> removal.
> > > >>>
> > > >>> Cheers,
> > > >>> Kostas
> > > >>>
> > > >>> link to FLIP-52 :
> > > >>>
> > > >>
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=125308637
> > > >>>
> > > >>> [1]
> > > >>>
> > > >>
> > >
> https://lists.apache.org/x/thread.html/7ffc9936a384b891dbcf0a481d26c6d13b2125607c200577780d1e18@%3Cdev.flink.apache.org%3E
> > > >>>
> > > >>
> > >
> > >
>


Re: CiBot Update

2019-08-22 Thread Biao Liu
Thanks Chesnay a lot,

I love this feature!

Thanks,
Biao /'bɪ.aʊ/



On Thu, 22 Aug 2019 at 20:55, Hequn Cheng  wrote:

> Cool, thanks Chesnay a lot for the improvement!
>
> Best, Hequn
>
> On Thu, Aug 22, 2019 at 5:02 PM Zhu Zhu  wrote:
>
> > Thanks Chesnay for the CI improvement!
> > It is very helpful.
> >
> > Thanks,
> > Zhu Zhu
> >
> > zhijiang  于2019年8月22日周四 下午4:18写道:
> >
> > > It is really very convenient now. Valuable work, Chesnay!
> > >
> > > Best,
> > > Zhijiang
> > > --
> > > From:Till Rohrmann 
> > > Send Time:2019年8月22日(星期四) 10:13
> > > To:dev 
> > > Subject:Re: CiBot Update
> > >
> > > Thanks for the continuous work on the CiBot Chesnay!
> > >
> > > Cheers,
> > > Till
> > >
> > > On Thu, Aug 22, 2019 at 9:47 AM Jark Wu  wrote:
> > >
> > > > Great work! Thanks Chesnay!
> > > >
> > > >
> > > >
> > > > On Thu, 22 Aug 2019 at 15:42, Xintong Song 
> > > wrote:
> > > >
> > > > > The re-triggering travis feature is so convenient. Thanks Chesnay~!
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Aug 22, 2019 at 9:26 AM Stephan Ewen 
> > wrote:
> > > > >
> > > > > > Nice, thanks!
> > > > > >
> > > > > > On Thu, Aug 22, 2019 at 3:59 AM Zili Chen 
> > > > wrote:
> > > > > >
> > > > > > > Thanks for your announcement. Nice work!
> > > > > > >
> > > > > > > Best,
> > > > > > > tison.
> > > > > > >
> > > > > > >
> > > > > > > vino yang  于2019年8月22日周四 上午8:14写道:
> > > > > > >
> > > > > > > > +1 for "@flinkbot run travis", it is very convenient.
> > > > > > > >
> > > > > > > > Chesnay Schepler  于2019年8月21日周三
> 下午9:12写道:
> > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > this is an update on recent changes to the CI bot.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > The bot now cancels builds if a new commit was added to a
> PR,
> > > and
> > > > > > > > > cancels all builds if the PR was closed.
> > > > > > > > > (This was implemented a while ago; I'm just mentioning it
> > again
> > > > for
> > > > > > > > > discoverability)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Additionally, starting today you can now re-trigger a
> Travis
> > > run
> > > > by
> > > > > > > > > writing a comment "@flinkbot run travis"; this means you no
> > > > longer
> > > > > > have
> > > > > > > > > to commit an empty commit or do other shenanigans to get
> > > another
> > > > > > build
> > > > > > > > > running.
> > > > > > > > > Note that this will /not/ work if the PR was re-opened,
> until
> > > at
> > > > > > least
> > > > > > > 1
> > > > > > > > > new build was triggered by a push.
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> >
>


Build failure on flink-python

2019-09-02 Thread Biao Liu
Hi guys,

I just found that I can't pass the Travis build due to some errors in the
flink-python module [1].
I'm sure my PR has nothing to do with flink-python, and a lot of other
builds are failing on these errors as well.

I have rebased onto the master branch and tried several times, but it
doesn't work.

Could somebody who is familiar with the flink-python module take a look at
this failure?

1. https://travis-ci.com/flink-ci/flink/jobs/229989235

Thanks,
Biao /'bɪ.aʊ/


Re: Build failure on flink-python

2019-09-02 Thread Biao Liu
There are already some Jira tickets opened for this failure [1] [2].
Sorry, I didn't notice them earlier.

1. https://issues.apache.org/jira/browse/FLINK-13906
2. https://issues.apache.org/jira/browse/FLINK-13932

Thanks,
Biao /'bɪ.aʊ/



On Mon, 2 Sep 2019 at 16:24, Biao Liu  wrote:

> Hi guys,
>
> I just found I can't pass the Travis build due to some errors in
> flink-python module [1].
> I'm sure my PR has nothing related with flink-python. And there are also a
> lot of builds are failing on these errors.
>
> I have rebased master branch and tried several times. But it doesn't work.
>
> Could somebody who is familiar with flink-python module take a look at
> this failure?
>
> 1. https://travis-ci.com/flink-ci/flink/jobs/229989235
>
> Thanks,
> Biao /'bɪ.aʊ/
>
>


Re: Build failure on flink-python

2019-09-02 Thread Biao Liu
Hi Hequn,

Glad to hear that! Thanks a lot.

Thanks,
Biao /'bɪ.aʊ/



On Mon, 2 Sep 2019 at 17:28, Hequn Cheng  wrote:

> Hi Biao,
>
> Thanks a lot for reporting the problem. The fix has been merged into the
> master just now. You can rebase to the master and try again.
>
> Thanks to @Wei Zhong for the fix.
>
> Best, Hequn
>
> On Mon, Sep 2, 2019 at 4:41 PM Biao Liu  wrote:
>
> > There are already some Jira tickets opened for this failure [1] [2].
> > Sorry I didn't recognize them.
> >
> > 1. https://issues.apache.org/jira/browse/FLINK-13906
> > 2. https://issues.apache.org/jira/browse/FLINK-13932
> >
> > Thanks,
> > Biao /'bɪ.aʊ/
> >
> >
> >
> > On Mon, 2 Sep 2019 at 16:24, Biao Liu  wrote:
> >
> > > Hi guys,
> > >
> > > I just found I can't pass the Travis build due to some errors in
> > > flink-python module [1].
> > > I'm sure my PR has nothing related with flink-python. And there are
> also
> > a
> > > lot of builds are failing on these errors.
> > >
> > > I have rebased master branch and tried several times. But it doesn't
> > work.
> > >
> > > Could somebody who is familiar with flink-python module take a look at
> > > this failure?
> > >
> > > 1. https://travis-ci.com/flink-ci/flink/jobs/229989235
> > >
> > > Thanks,
> > > Biao /'bɪ.aʊ/
> > >
> > >
> >
>


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-09 Thread Biao Liu
Congrats, Kostas!

Thanks,
Biao /'bɪ.aʊ/



On Mon, 9 Sep 2019 at 16:07, JingsongLee 
wrote:

> Congrats, Kostas! Well deserved.
>
> Best,
> Jingsong Lee
>
>
> --
> From:Kostas Kloudas 
> Send Time:2019年9月9日(星期一) 15:50
> To:dev ; Yun Gao 
> Subject:Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC
>
> Thanks a lot everyone for the warm welcome!
>
> Cheers,
> Kostas
>
> On Mon, Sep 9, 2019 at 4:54 AM Yun Gao 
> wrote:
> >
> >   Congratulations, Kostas!
> >
> >  Best,
> >  Yun‍‍‍
> >
> >
> >
> >
> > --
> > From:Becket Qin 
> > Send Time:2019 Sep. 9 (Mon.) 10:47
> > To:dev 
> > Subject:Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC
> >
> > Congrats, Kostas!
> >
> > On Sun, Sep 8, 2019 at 11:48 PM myasuka  wrote:
> >
> > > Congratulations Kostas!
> > >
> > > Best
> > > Yun Tang
> > >
> > >
> > >
> > > --
> > > Sent from:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > >
> >
>
>


Re: [DISCUSS] FLIP-63: Rework table partition support

2019-09-09 Thread Biao Liu
Hi Jingsong,

Thank you for bringing up this discussion. Since I don't have much
experience with Flink table/SQL, I'll ask some questions from a runtime or
engine perspective.

> ... where we describe how to partition support in flink and how to
integrate to hive partition.

FLIP-27 [1] officially introduces the "partition" concept. The changes in
FLIP-27 are not only about the source interface but also about the whole
infrastructure.
Have you thought about how to integrate your proposal with these changes? Or
do you just want to support "partition" in the table layer, with no
requirements on the underlying infrastructure?

I have seen a discussion [2] that seems to describe an infrastructure
requirement for supporting your proposal. So I have some concerns that there
might be conflicts between this proposal and FLIP-27.

1.
https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
2.
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html

Thanks,
Biao /'bɪ.aʊ/



On Fri, 6 Sep 2019 at 13:22, JingsongLee 
wrote:

> Hi everyone, thank you for your comments. Mail name was updated
> and streaming-related concepts were added.
>
> We would like to start a discussion thread on "FLIP-63: Rework table
> partition support"(Design doc: [1]), where we describe how to partition
> support in flink and how to integrate to hive partition.
>
> This FLIP addresses:
>- Introduce whole story about partition support.
>- Introduce and discuss DDL of partition support.
>- Introduce static and dynamic partition insert.
>- Introduce partition pruning
>- Introduce dynamic partition implementation
>- Introduce FileFormatSink to deal with streaming exactly-once and
>  partition-related logic.
>
> Details can be seen in the design document.
> Looking forward to your feedbacks. Thank you.
>
> [1]
> https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=sharing
>
> Best,
> Jingsong Lee
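To make the "partition pruning" item in the FLIP list above a bit more concrete, here is a minimal, self-contained Java sketch. This is not the FLIP's actual implementation; the class and method names are made up for illustration. The idea is that a partition is identified by its spec (partition column -> value), and pruning filters the list of partition specs with a predicate derived from the query's WHERE clause, before any partition data is read.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class PartitionPruning {

    // A partition spec maps partition columns to values,
    // e.g. {dt=2019-09-01, region=eu}.
    // Pruning keeps only the partitions that can contain matching rows,
    // so non-matching partitions are never scanned.
    public static List<Map<String, String>> prune(
            List<Map<String, String>> partitions,
            Predicate<Map<String, String>> partitionFilter) {
        return partitions.stream()
                .filter(partitionFilter)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, String>> partitions = Arrays.asList(
                Map.of("dt", "2019-09-01", "region", "eu"),
                Map.of("dt", "2019-09-01", "region", "us"),
                Map.of("dt", "2019-09-02", "region", "eu"));

        // WHERE dt = '2019-09-01': only the first two partitions survive.
        List<Map<String, String>> pruned =
                prune(partitions, p -> "2019-09-01".equals(p.get("dt")));
        System.out.println(pruned.size()); // prints 2
    }
}
```

The real design additionally has to push the filter from the planner into the catalog/source, but the pruning step itself is just this kind of metadata-only filtering.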


Re: [ANNOUNCE] Zili Chen becomes a Flink committer

2019-09-11 Thread Biao Liu
Congrats Zili!

Thanks,
Biao /'bɪ.aʊ/



On Wed, 11 Sep 2019 at 18:43, Oytun Tez  wrote:

> Congratulations!
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Wed, Sep 11, 2019 at 6:36 AM bupt_ljy  wrote:
>
>> Congratulations!
>>
>>
>> Best,
>>
>> Jiayi Liao
>>
>>  Original Message
>> *Sender:* Till Rohrmann
>> *Recipient:* dev; user
>> *Date:* Wednesday, Sep 11, 2019 17:22
>> *Subject:* [ANNOUNCE] Zili Chen becomes a Flink committer
>>
>> Hi everyone,
>>
>> I'm very happy to announce that Zili Chen (some of you might also know
>> him as Tison Kun) accepted the offer of the Flink PMC to become a committer
>> of the Flink project.
>>
>> Zili Chen has been an active community member for almost 16 months now.
>> He helped pushing the Flip-6 effort over the finish line, ported a lot of
>> legacy code tests, removed a good part of the legacy code, contributed
>> numerous fixes, is involved in the Flink's client API refactoring, drives
>> the refactoring of Flink's HighAvailabilityServices and much more. Zili
>> Chen also helped the community by PR reviews, reporting Flink issues,
>> answering user mails and being very active on the dev mailing list.
>>
>> Congratulations Zili Chen!
>>
>> Best, Till
>> (on behalf of the Flink PMC)
>>
>


Re: [DISCUSS] Features for Apache Flink 1.10

2019-09-11 Thread Biao Liu
Thanks Gary for kicking off the discussion.

+1 for the feature freeze time. Also thanks to Gary and Yu Li for
volunteering as the release managers.

BTW, I'm working on the refactoring of `CheckpointCoordinator` [1]. It would
be great if it could be included in 1.10.
1. https://issues.apache.org/jira/browse/FLINK-13698

Thanks,
Biao /'bɪ.aʊ/



On Wed, 11 Sep 2019 at 18:33, Aljoscha Krettek  wrote:

> Hi,
>
> Thanks for putting together the list! And I’m +1 for the suggested
> release timeline and also for Gary and Yu as the release managers.
>
> Best,
> Aljoscha
>
> On 9 Sep 2019, at 7:39, Yu Li wrote:
>
> > Hi Xuefu,
> >
> > If I understand it correctly, the data type support work should be
> > included
> > in the "Table API improvements->Finish type system" part, please check
> > it
> > and let us know if anything missing there. Thanks.
> >
> > Best Regards,
> > Yu
> >
> >
> > On Mon, 9 Sep 2019 at 11:14, Xuefu Z  wrote:
> >
> >> Looking at feature list, I don't see an item for complete the data
> >> type
> >> support. Specifically, high precision timestamp is needed to Hive
> >> integration, as it's so common. Missing it would damage the
> >> completeness of
> >> our Hive effort.
> >>
> >> Thanks,
> >> Xuefu
> >>
> >> On Sat, Sep 7, 2019 at 7:06 PM Xintong Song 
> >> wrote:
> >>
> >>> Thanks Gray and Yu for compiling the feature list and kicking off
> >>> this
> >>> discussion.
> >>>
> >>> +1 for Gary and Yu being the release managers for Flink 1.10.
> >>>
> >>> Thank you~
> >>>
> >>> Xintong Song
> >>>
> >>>
> >>>
> >>> On Sat, Sep 7, 2019 at 4:58 PM Till Rohrmann 
> >> wrote:
> >>>
>  Thanks for compiling the list of 1.10 efforts for the community
>  Gary. I
>  think this helps a lot to better understand what the community is
> >>> currently
>  working on.
> 
>  Thanks for volunteering as the release managers for the next major
>  release. +1 for Gary and Yu being the RMs for Flink 1.10.
> 
>  Cheers,
>  Till
> 
>  On Sat, Sep 7, 2019 at 7:26 AM Zhu Zhu  wrote:
> 
> > Thanks Gary for kicking off this discussion.
> > Really appreciate that you and Yu offer to help to manage 1.10
> >> release.
> >
> > +1 for Gary and Yu as release managers.
> >
> > Thanks,
> > Zhu Zhu
> >
> > Dian Fu  于2019年9月7日周六
> > 下午12:26写道:
> >
> >> Hi Gary,
> >>
> >> Thanks for kicking off the release schedule of 1.10. +1 for you
> >> and
> >>> Yu
>  Li
> >> as the release manager.
> >>
> >> The feature freeze/release time sounds reasonable.
> >>
> >> Thanks,
> >> Dian
> >>
> >>> 在 2019年9月7日,上午11:30,Jark Wu 
> >>> 写道:
> >>>
> >>> Thanks Gary for kicking off the discussion for 1.10 release.
> >>>
> >>> +1 for Gary and Yu as release managers. Thank you for you
> >>> effort.
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>>
>  在 2019年9月7日,00:52,zhijiang
>  
> >>> 写道:
> 
>  Hi Gary,
> 
>  Thanks for kicking off the features for next release 1.10.  I
>  am
>  very
> >> supportive of you and Yu Li to be the relaese managers.
> 
>  Just mention another two improvements which want to be covered
> >> in
> >> FLINK-1.10 and I already confirmed with Piotr to reach an
> >> agreement
> > before.
> 
>  1. Data serialize and copy only once for broadcast partition
> >> [1]:
> >>> It
> >> would improve the throughput performance greatly in broadcast
> >> mode
> >>> and
> > was
> >> actually proposed in Flink-1.8. Most of works already done before
> >> and
> > only
> >> left the last critical jira/PR. It will not take much efforts to
> >> make
>  it
> >> ready.
> 
>  2. Let Netty use Flink's buffers directly in credit-based mode
> >>> [2] :
> > It
> >> could avoid memory copy from netty stack to flink managed network
>  buffer.
> >> The obvious benefit is decreasing the direct memory overhead
> >> greatly
> >>> in
> >> large-scale jobs. I also heard of some user cases encounter
> >> direct
> >>> OOM
> >> caused by netty memory overhead. Actually this improvment was
> >>> proposed
>  by
> >> nico in FLINK-1.7 and always no time to focus then. Yun Gao
> >> already
> >> submitted a PR half an year ago but have not been reviewed yet. I
> >>> could
> >> help review the deign and PR codes to make it ready.
> 
>  And you could make these two items as lowest priority if
> >> possible.
> 
>  [1] https://issues.apache.org/jira/browse/FLINK-10745
>  [2] https://issues.apache.org/jira/browse/FLINK-10742
> 
>  Best,
>  Zhijiang
> 
> >> --
>  From:Gary Yao 
>  Send Time:2019年9月6日(星期五) 17:06
>  To:dev 
>  Cc:carp84 
>  Su

Re: [DISCUSS] FLIP-63: Rework table partition support

2019-09-11 Thread Biao Liu
Hi Jingsong,

Thanks for explaining. It looks cool!

Thanks,
Biao /'bɪ.aʊ/



On Wed, 11 Sep 2019 at 11:37, JingsongLee 
wrote:

> Hi biao, thanks for your feedbacks:
>
> Actually, the source partition at runtime is similar to a split, which
> concerns data reading, parallelism and fault tolerance: all of them
> runtime concepts.
> A table partition, by contrast, is only a virtual concept. Users are more
> likely to choose which partition to read and which partition to write, and
> they can manage their partitions themselves.
> One is tied to the physical implementation, the other is a logical
> concept.
> So I think they are two completely different things.
>
> About [2], the main problem is how to write data to a catalog file system
> in stream mode; it is a general problem and has little to do with
> partitions.
>
> [2]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html
>
> Best,
> Jingsong Lee
>
>
> --
> From:Biao Liu 
> Send Time:2019年9月10日(星期二) 14:57
> To:dev ; JingsongLee 
> Subject:Re: [DISCUSS] FLIP-63: Rework table partition support
>
> Hi Jingsong,
>
> Thank you for bringing this discussion. Since I don't have much experience
> of Flink table/SQL, I'll ask some questions from runtime or engine
> perspective.
>
> > ... where we describe how to partition support in flink and how to
> integrate to hive partition.
>
> FLIP-27 [1] introduces "partition" concept officially. The changes of
> FLIP-27 are not only about source interface but also about the whole
> infrastructure.
> Have you ever thought how to integrate your proposal with these changes?
> Or you just want to support "partition" in table layer, there will be no
> requirement of underlying infrastructure?
>
> I have seen a discussion [2] that seems be a requirement of infrastructure
> to support your proposal. So I have some concerns there might be some
> conflicts between this proposal and FLIP-27.
>
> 1.
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> 2.
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html
>
> Thanks,
> Biao /'bɪ.aʊ/
>
>
>
> On Fri, 6 Sep 2019 at 13:22, JingsongLee 
> wrote:
> Hi everyone, thank you for your comments. Mail name was updated
>  and streaming-related concepts were added.
>
>  We would like to start a discussion thread on "FLIP-63: Rework table
>  partition support"(Design doc: [1]), where we describe how to partition
>  support in flink and how to integrate to hive partition.
>
>  This FLIP addresses:
> - Introduce whole story about partition support.
> - Introduce and discuss DDL of partition support.
> - Introduce static and dynamic partition insert.
> - Introduce partition pruning
> - Introduce dynamic partition implementation
> - Introduce FileFormatSink to deal with streaming exactly-once and
>   partition-related logic.
>
>  Details can be seen in the design document.
>  Looking forward to your feedbacks. Thank you.
>
>  [1]
> https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=sharing
>
>  Best,
>  Jingsong Lee
>
>


Re: [ANNOUNCE] New Committers and PMC member

2020-04-01 Thread Biao Liu
Congrats!

Thanks,
Biao /'bɪ.aʊ/



On Thu, 2 Apr 2020 at 11:59, Yuan Mei  wrote:

> Congrats :-)
>
> On Wed, Apr 1, 2020 at 4:52 PM Stephan Ewen  wrote:
>
> > Hi all!
> >
> > Happy to announce that over the last few weeks, several people in the
> > community joined in new roles:
> >
> >   - Konstantin Knauf joined as a committer. You may know him, for
> example,
> > from the weekly community updates.
> >
> >   - Dawid Wysakowicz joined the PMC. Dawid is one of the main developers
> on
> > the Table API.
> >
> >   - Zhijiang Wang joined the PMC. Zhijiang is a veteran of Flink's
> network
> > / data shuffle system.
> >
> > A warm welcome to your new roles in the Flink project!
> >
> > Best,
> > Stephan
> >
>


Re: Have trouble on running flink

2019-09-24 Thread Biao Liu
Hi Russell,

I don't think `BackendBuildingException` is the root cause. In your case,
this exception appears while the task is being cancelled.

Have you checked the log of the YARN node manager? There should be an exit
code for the container; the container may even have been killed by the YARN
node manager.

BTW, I think we should discuss this on the flink-user mailing list, not the
dev mailing list. I will forward this mail there.

Thanks,
Biao /'bɪ.aʊ/



On Tue, 24 Sep 2019 at 19:19, Russell Bie  wrote:

> Hi Flink team,
>
> I am trying to submit flink job (version 1.8.2) with RocksDB backend to my
> own yarn cluster (hadoop version 2.6.0-cdh5.7.3), the job always failed
> after running for a few hours with the connection loss of some
> taskmanagers. Here<
> https://stackoverflow.com/questions/58046847/ioexception-when-taskmanager-restored-from-rocksdb-state-in-hdfs>
> is the question details on the stackoverflow. I am just wondering if you
> could provide some advice on this issue?
>
> Thanks,
> Russell
>
>


[SURVEY] How do you use ExternallyInducedSource or WithMasterCheckpointHook

2019-10-09 Thread Biao Liu
Hi everyone,

I would like to reach out to users who use or are interested in the
`ExternallyInducedSource` or `WithMasterCheckpointHook` interfaces.

The background of this survey is that I'm doing some rework of
`CheckpointCoordinator`. I encountered a problem: the semantics of
`MasterTriggerRestoreHook#triggerCheckpoint` [1] might be a bit confusing.
It looks like an asynchronous invocation (the returned value is a future),
but based on the Javadoc, the implementation could be synchronous (maybe
blocking) or asynchronous. That's OK for now. However, it makes the
optimization more complicated, since synchronous and asynchronous
invocations have to be handled at the same time [2].

I want to make the semantics clearer. Here are some options.
1. Keep this method as it is, but emphasize in the Javadoc and release note
that it should be a non-blocking operation; any heavy IO operation should be
executed asynchronously through the given executor. Otherwise there might be
a performance issue. This way, compatibility is kept.
2. Change the signature of this method. There would be no executor or
completable future in the method; it could block for a while, and we would
execute it in an executor outside. This also makes things easier; however,
it breaks compatibility.

At this moment, I personally intend to choose option 1.
If you depend on these interfaces, please let me know your opinion. Any
feedback is welcome!

[1]
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.java
[2] https://issues.apache.org/jira/browse/FLINK-14344
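To illustrate the confusion, here is a small self-contained Java sketch. The class and method names are hypothetical stand-ins, not Flink's actual code: both variants satisfy a "returns a CompletableFuture" signature, but only the first returns to the caller before the heavy work finishes, which is what option 1 would ask implementations to do.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HookSemanticsDemo {

    // Option-1 style: heavy work is offloaded to the given executor, so the
    // caller gets the (not yet completed) future back immediately.
    static CompletableFuture<String> triggerNonBlocking(Executor executor) {
        return CompletableFuture.supplyAsync(() -> {
            simulateHeavyIo();
            return "state-handle";
        }, executor);
    }

    // Also legal under the current contract, but problematic: the heavy work
    // runs in the caller's thread and the returned future is already
    // completed, so the "asynchronous-looking" call actually blocks.
    static CompletableFuture<String> triggerBlocking() {
        simulateHeavyIo();
        return CompletableFuture.completedFuture("state-handle");
    }

    static void simulateHeavyIo() {
        try {
            Thread.sleep(200); // stand-in for a slow external call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        long start = System.nanoTime();
        CompletableFuture<String> async = triggerNonBlocking(pool);
        long returnMillis = (System.nanoTime() - start) / 1_000_000;

        // The non-blocking variant returns well before the IO finishes.
        System.out.println("returned after ~" + returnMillis + " ms");
        System.out.println(async.join());

        // The blocking variant only returns after the IO is already done.
        System.out.println(triggerBlocking().join());
        pool.shutdown();
    }
}
```

The caller (here, the coordinator) cannot tell the two apart from the signature alone, which is exactly why option 1 needs the non-blocking expectation spelled out in the Javadoc.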

Thanks,
Biao /'bɪ.aʊ/


Re: [SURVEY] Dropping non Credit-based Flow Control

2019-10-10 Thread Biao Liu
Thanks for starting this survey, Piotr.

We have benefited from credit-based flow control a lot, and I can't figure
out a reason to use the non credit-based model.
I think we have kept the older code paths long enough (1.5 -> 1.9). They are
a big burden to maintain, especially since there is a lot of duplicated code
between the credit-based and non credit-based models.

So +1 to do the cleanup.

Thanks,
Biao /'bɪ.aʊ/



On Thu, 10 Oct 2019 at 11:15, zhijiang 
wrote:

> Thanks for bringing this survey Piotr.
>
> Actually I was also trying to dropping the non credit-based code path from
> release-1.9, and now I think it is the proper time to do it motivated by
> [3].
> The credit-based mode is as default from Flink 1.5 and it has been
> verified to be stable and reliable in many versions. In Alibaba we are
> always using the default credit-based mode in all products.
> It can reduce much overhead of maintaining non credit-based code path, so
> +1 from my side to drop it.
>
> Best,
> Zhijiang
> --
> From:Piotr Nowojski 
> Send Time:2019年10月2日(星期三) 17:01
> To:dev 
> Subject:[SURVEY] Dropping non Credit-based Flow Control
>
> Hi,
>
> In Flink 1.5 we have introduced Credit-based Flow Control [1] in the
> network stack. Back then we were aware about potential downsides of it [2]
> and we decided to keep the old model in the code base ( configurable by
> setting  `taskmanager.network.credit-model: false` ). Now, that we are
> about to modify internals of the network stack again [3], it might be a
> good time to clean up the code and remove the older code paths.
>
> Is anyone still using the non default non Credit-based model (
> `taskmanager.network.credit-model: false`)? If so, why?
>
> Piotrek
>
> [1] https://flink.apache.org/2019/06/05/flink-network-stack.html <
> https://flink.apache.org/2019/06/05/flink-network-stack.html>
> [2]
> https://flink.apache.org/2019/06/05/flink-network-stack.html#what-do-we-gain-where-is-the-catch
> <
> https://flink.apache.org/2019/06/05/flink-network-stack.html#what-do-we-gain-where-is-the-catch
> >
> [3]
> https://lists.apache.org/thread.html/a2b58b7b2b24b9bd4814b2aa51d2fc44b08a919eddbb5b1256be5b6a@%3Cdev.flink.apache.org%3E
> <
> https://lists.apache.org/thread.html/a2b58b7b2b24b9bd4814b2aa51d2fc44b08a919eddbb5b1256be5b6a@%3Cdev.flink.apache.org%3E
> >
>
>


Re: [VOTE] FLIP-74: Flink JobClient API

2019-10-11 Thread Biao Liu
+1 (non-binding), glad to have this improvement!

Thanks,
Biao /'bɪ.aʊ/



On Fri, 11 Oct 2019 at 14:44, Jeff Zhang  wrote:

> +1, overall design make sense to me
>
> SHI Xiaogang  于2019年10月11日周五 上午11:15写道:
>
> > +1. The interface looks fine to me.
> >
> > Regards,
> > Xiaogang
> >
> > Zili Chen  于2019年10月9日周三 下午2:36写道:
> >
> > > Given the ongoing FlinkForward Berlin event, I'm going to extend
> > > this vote thread with a bit of period, said until Oct. 11th(Friday).
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Zili Chen  于2019年10月7日周一 下午4:15写道:
> > >
> > > > Hi all,
> > > >
> > > > I would like to start the vote for FLIP-74[1], which is discussed and
> > > > reached a consensus in the discussion thread[2].
> > > >
> > > > The vote will be open util Oct. 9th(72h starting on Oct.7th), unless
> > > > there is an objection or not  enough votes.
> > > >
> > > > Best,
> > > > tison.
> > > >
> > > > [1]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-74%3A+Flink+JobClient+API
> > > > [2]
> > > >
> > >
> >
> https://lists.apache.org/x/thread.html/b2e22a45aeb94a8d06b50c4de078f7b23d9ff08b8226918a1a903768@%3Cdev.flink.apache.org%3E
> > > >
> > >
> >
>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: [PROPOSAL] Contribute Stateful Functions to Apache Flink

2019-10-12 Thread Biao Liu
Hi Stephan,

+1 for having Stateful Functions in Flink.

Before discussing which repository it should belong to, I was wondering
whether we have reached an agreement on "splitting the flink repository" as
Piotr mentioned. It seems that discussion simply went no further.
It's OK for me to add it to the core repository; after all, almost
everything is in the core repository now. But if we decide to split the core
repository someday, I would tend to create a separate repository for
Stateful Functions. It might be a good time to take the first step of
splitting.

Thanks,
Biao /'bɪ.aʊ/



On Sat, 12 Oct 2019 at 19:31, Yu Li  wrote:

> Hi Stephan,
>
> Big +1 for adding stateful functions to Flink. I believe a lot of user
> would be interested to try this out and I could imagine how this could
> contribute to reduce the TCO for business requiring both streaming
> processing and stateful functions.
>
> And my 2 cents is to put it into flink core repository since I could see a
> tight connection between this library and flink state.
>
> Best Regards,
> Yu
>
>
> On Sat, 12 Oct 2019 at 17:31, jincheng sun 
> wrote:
>
>> Hi Stephan,
>>
>> bit +1 for adding this great features to Apache Flink.
>>
>> Regarding where we should place it, put it into Flink core repository or
>> create a separate repository? I prefer put it into main repository and
>> looking forward the more detail discussion for this decision.
>>
>> Best,
>> Jincheng
>>
>>
>> Jingsong Li  于2019年10月12日周六 上午11:32写道:
>>
>>> Hi Stephan,
>>>
>>> big +1 for this contribution. It provides another user interface that is
>>> easy to use and popular at this time. these functions, It's hard for users
>>> to write in SQL/TableApi, while using DataStream is too complex. (We've
>>> done some stateFun kind jobs using DataStream before). With statefun, it is
>>> very easy.
>>>
>>> I think it's also a good opportunity to exercise Flink's core
>>> capabilities. I looked at stateful-functions-flink briefly, it is very
>>> interesting. I think there are many other things Flink can improve. So I
>>> think it's a better thing to put it into Flink, and the improvement for it
>>> will be more natural in the future.
>>>
>>> Best,
>>> Jingsong Lee
>>>
>>> On Fri, Oct 11, 2019 at 7:33 PM Dawid Wysakowicz 
>>> wrote:
>>>
 Hi Stephan,

 I think this is a nice library, but what I like more about it is that
 it suggests exploring different use-cases. I think it definitely makes
 sense for the Flink community to explore more lightweight applications that
 reuses resources. Therefore I definitely think it is a good idea for Flink
 community to accept this contribution and help maintaining it.

 Personally I'd prefer to have it in a separate repository. There were a
 few discussions before where different people were suggesting to extract
 connectors and other libraries to separate repositories. Moreover I think
 it could serve as an example for the Flink ecosystem website[1]. This could
 be the first project in there and give a good impression that the community
 sees potential in the ecosystem website.

 Lastly, I'm wondering if this should go through PMC vote according to
 our bylaws[2]. In the end the suggestion is to adopt an existing code base
 as is. It also proposes a new programs concept that could result in a shift
 of priorities for the community in a long run.

 Best,

 Dawid

 [1]
 http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Create-a-Flink-ecosystem-website-td27519.html

 [2] https://cwiki.apache.org/confluence/display/FLINK/Flink+Bylaws
 On 11/10/2019 13:12, Till Rohrmann wrote:

 Hi Stephan,

 +1 for adding stateful functions to Flink. I believe the new set of
 applications this feature will unlock will be super interesting for new and
 existing Flink users alike.

 One reason for not including it in the main repository would to not
 being bound to Flink's release cadence. This would allow to release faster
 and more often. However, I believe that having it eventually in Flink's
 main repository would be beneficial in the long run.

 Cheers,
 Till

 On Fri, Oct 11, 2019 at 12:56 PM Trevor Grant 
 wrote:

> +1 non-binding on contribution.
>
> Separate repo, or feature branch to start maybe? I just feel like in
> the beginning this thing is going to have lots of breaking changes that
> maybe aren't going to fit well with tests / other "v1+" release code. Just
> my .02.
>
>
>
> On Fri, Oct 11, 2019 at 4:38 AM Stephan Ewen  wrote:
>
>> Dear Flink Community!
>>
>> Some of you probably heard it already: On Tuesday, at Flink Forward
>> Berlin, we announced **Stateful Functions**.
>>
>> Stateful Functions is a library on Flink to implement general purpose
>> applications. It is built around stateful functions (who woul

Re: [VOTE] Accept Stateful Functions into Apache Flink

2019-10-21 Thread Biao Liu
+1 (non-binding)

Thanks,
Biao /'bɪ.aʊ/



On Tue, 22 Oct 2019 at 10:26, Jark Wu  wrote:

> +1 (non-binding)
>
> Best,
> Jark
>
> On Tue, 22 Oct 2019 at 09:38, Hequn Cheng  wrote:
>
> > +1 (non-binding)
> >
> > Best, Hequn
> >
> > On Tue, Oct 22, 2019 at 9:21 AM Dian Fu  wrote:
> >
> > > +1 (non-binding)
> > >
> > > Regards,
> > > Dian
> > >
> > > > 在 2019年10月22日,上午9:10,Kurt Young  写道:
> > > >
> > > > +1 (binding)
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > >
> > > > On Tue, Oct 22, 2019 at 12:56 AM Fabian Hueske 
> > > wrote:
> > > >
> > > >> +1 (binding)
> > > >>
> > > >> Am Mo., 21. Okt. 2019 um 16:18 Uhr schrieb Thomas Weise <
> > t...@apache.org
> > > >:
> > > >>
> > > >>> +1 (binding)
> > > >>>
> > > >>>
> > > >>> On Mon, Oct 21, 2019 at 7:10 AM Timo Walther 
> > > wrote:
> > > >>>
> > >  +1 (binding)
> > > 
> > >  Thanks,
> > >  Timo
> > > 
> > > 
> > >  On 21.10.19 15:59, Till Rohrmann wrote:
> > > > +1 (binding)
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Mon, Oct 21, 2019 at 12:13 PM Robert Metzger <
> > rmetz...@apache.org
> > > >>>
> > >  wrote:
> > > >
> > > >> +1 (binding)
> > > >>
> > > >> On Mon, Oct 21, 2019 at 12:06 PM Stephan Ewen  >
> > > >>> wrote:
> > > >>
> > > >>> This is the official vote whether to accept the Stateful
> > Functions
> > > >>> code
> > > >>> contribution to Apache Flink.
> > > >>>
> > > >>> The current Stateful Functions code, documentation, and website
> > can
> > > >>> be
> > > >>> found here:
> > > >>> https://statefun.io/
> > > >>> https://github.com/ververica/stateful-functions
> > > >>>
> > > >>> This vote should capture whether the Apache Flink community is
> > >  interested
> > > >>> in accepting, maintaining, and evolving Stateful Functions.
> > > >>>
> > > >>> Reiterating my original motivation, I believe that this project
> > is
> > > >> a
> > > >> great
> > > >>> match for Apache Flink, because it helps Flink to grow the
> > > >> community
> > > >> into a
> > > >>> new set of use cases. We see current users interested in such
> use
> > >  cases,
> > > >>> but they are not well supported by Flink as it currently is.
> > > >>>
> > > >>> I also personally commit to put time into making sure this
> > > >> integrates
> > > >> well
> > > >>> with Flink and that we grow contributors and committers to
> > maintain
> > >  this
> > > >>> new component well.
> > > >>>
> > > >>> This is a "Adoption of a new Codebase" vote as per the Flink
> > bylaws
> > >  [1].
> > > >>> Only PMC votes are binding. The vote will be open at least 6
> days
> > > >>> (excluding weekends), meaning until Tuesday Oct.29th 12:00 UTC,
> > or
> > >  until
> > > >> we
> > > >>> achieve the 2/3rd majority.
> > > >>>
> > > >>> Happy voting!
> > > >>>
> > > >>> Best,
> > > >>> Stephan
> > > >>>
> > > >>> [1]
> > > >>>
> > > >>
> > > 
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > > 
> > > 
> > > 
> > > >>>
> > > >>
> > >
> > >
> >
>


Re: [ANNOUNCE] Becket Qin joins the Flink PMC

2019-10-28 Thread Biao Liu
Congrats Becket!

Thanks,
Biao /'bɪ.aʊ/



On Tue, 29 Oct 2019 at 12:07, jincheng sun  wrote:

> Congratulations Becket.
> Best,
> Jincheng
>
> Rui Li  于2019年10月29日周二 上午11:37写道:
>
> > Congrats Becket!
> >
> > On Tue, Oct 29, 2019 at 11:20 AM Leonard Xu  wrote:
> >
> > > Congratulations!  Becket.
> > >
> > > Best,
> > > Leonard Xu
> > >
> > > > On 2019年10月29日, at 上午11:00, Zhenghua Gao  wrote:
> > > >
> > > > Congratulations, Becket!
> > > >
> > > > *Best Regards,*
> > > > *Zhenghua Gao*
> > > >
> > > >
> > > > On Tue, Oct 29, 2019 at 10:34 AM Yun Gao
>  > >
> > > > wrote:
> > > >
> > > >> Congratulations Becket!
> > > >>
> > > >> Best,
> > > >> Yun
> > > >>
> > > >>
> > > >> --
> > > >> From:Jingsong Li 
> > > >> Send Time:2019 Oct. 29 (Tue.) 10:23
> > > >> To:dev 
> > > >> Subject:Re: [ANNOUNCE] Becket Qin joins the Flink PMC
> > > >>
> > > >> Congratulations Becket!
> > > >>
> > > >> Best,
> > > >> Jingsong Lee
> > > >>
> > > >> On Tue, Oct 29, 2019 at 10:18 AM Terry Wang 
> > wrote:
> > > >>
> > > >>> Congratulations, Becket!
> > > >>>
> > > >>> Best,
> > > >>> Terry Wang
> > > >>>
> > > >>>
> > > >>>
> > >  2019年10月29日 10:12,OpenInx  写道:
> > > 
> > >  Congratulations Becket!
> > > 
> > >  On Tue, Oct 29, 2019 at 10:06 AM Zili Chen 
> > > >> wrote:
> > > 
> > > > Congratulations Becket!
> > > >
> > > > Best,
> > > > tison.
> > > >
> > > >
> > > > Congxian Qiu  于2019年10月29日周二 上午9:53写道:
> > > >
> > > >> Congratulations Becket!
> > > >>
> > > >> Best,
> > > >> Congxian
> > > >>
> > > >>
> > > >> Wei Zhong  于2019年10月29日周二 上午9:42写道:
> > > >>
> > > >>> Congratulations Becket!
> > > >>>
> > > >>> Best,
> > > >>> Wei
> > > >>>
> > >  在 2019年10月29日,09:36,Paul Lam  写道:
> > > 
> > >  Congrats Becket!
> > > 
> > >  Best,
> > >  Paul Lam
> > > 
> > > > 在 2019年10月29日,02:18,Xingcan Cui  写道:
> > > >
> > > > Congratulations, Becket!
> > > >
> > > > Best,
> > > > Xingcan
> > > >
> > > >> On Oct 28, 2019, at 1:23 PM, Xuefu Z 
> > wrote:
> > > >>
> > > >> Congratulations, Becket!
> > > >>
> > > >> On Mon, Oct 28, 2019 at 10:08 AM Zhu Zhu  >
> > > > wrote:
> > > >>
> > > >>> Congratulations Becket!
> > > >>>
> > > >>> Thanks,
> > > >>> Zhu Zhu
> > > >>>
> > > >>> Peter Huang  于2019年10月29日周二
> > > >> 上午1:01写道:
> > > >>>
> > >  Congratulations Becket Qin!
> > > 
> > > 
> > >  Best Regards
> > >  Peter Huang
> > > 
> > >  On Mon, Oct 28, 2019 at 9:19 AM Rong Rong <
> > > walter...@gmail.com
> > > >>>
> > > >>> wrote:
> > > 
> > > > Congratulations Becket!!
> > > >
> > > > --
> > > > Rong
> > > >
> > > > On Mon, Oct 28, 2019, 7:53 AM Jark Wu 
> > > >> wrote:
> > > >
> > > >> Congratulations Becket!
> > > >>
> > > >> Best,
> > > >> Jark
> > > >>
> > > >> On Mon, 28 Oct 2019 at 20:26, Benchao Li <
> > > >> libenc...@gmail.com>
> > > >>> wrote:
> > > >>
> > > >>> Congratulations Becket.
> > > >>>
> > > >>> Dian Fu  于2019年10月28日周一
> 下午7:22写道:
> > > >>>
> > >  Congrats, Becket.
> > > 
> > > > 在 2019年10月28日,下午6:07,Fabian Hueske <
> fhue...@gmail.com>
> > > >> 写道:
> > > >
> > > > Hi everyone,
> > > >
> > > > I'm happy to announce that Becket Qin has joined the
> > > Flink
> > > >> PMC.
> > > > Let's congratulate and welcome Becket as a new member
> > of
> > > >> the
> > >  Flink
> > > >> PMC!
> > > >
> > > > Cheers,
> > > > Fabian
> > > 
> > > 
> > > >>>
> > > >>> --
> > > >>>
> > > >>> Benchao Li
> > > >>> School of Electronics Engineering and Computer Science,
> > > >> Peking
> > > > University
> > > >>> Tel:+86-15650713730
> > > >>> Email: libenc...@gmail.com; libenc...@pku.edu.cn
> > > >>>
> > > >>
> > > >
> > > 
> > > >>>
> > > >>
> > > >>
> > > >> --
> > > >> Xuefu Zhang
> > > >>
> > > >> "In Honey We Trust!"
> > > >
> > > 
> > > >>>
> > > >>>
> > > >>
> > > >
> > > >>>
> > > >>>
> > > >>
> > > >> --
> > > >> Best, Jingsong Lee
> > > >>
> > > >>
> > >
> > >
> >
> > --

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

2019-11-04 Thread Biao Liu
Thanks Yu for bringing up this topic.

+1 for this proposal. Glad to have e2e performance testing.

It seems this proposal is split into several stages. Is there a more
detailed plan?

Thanks,
Biao /'bɪ.aʊ/



On Mon, 4 Nov 2019 at 19:54, Congxian Qiu  wrote:

> +1 for this idea.
>
> Currently, we have the micro benchmark for flink, which can help us find
> the regressions. And I think the e2e jobs performance testing can also help
> us to cover more scenarios.
>
> Best,
> Congxian
>
>
> Jingsong Li  于2019年11月4日周一 下午5:37写道:
>
> > +1 for the idea. Thanks Yu for driving this.
> > Just curious about that can we collect the metrics about Job scheduling
> and
> > task launch. the speed of this part is also important.
> > We can add tests for watch it too.
> >
> > Look forward to more batch test support.
> >
> > Best,
> > Jingsong Lee
> >
> > On Mon, Nov 4, 2019 at 10:00 AM OpenInx  wrote:
> >
> > > > The test cases are written in java and scripts in python. We propose
> a
> > > separate directory/module in parallel with flink-end-to-end-tests, with
> > the
> > > > name of flink-end-to-end-perf-tests.
> > >
> > > Glad to see that the newly introduced e2e test will be written in Java.
> > > because  I'm re-working on the existed e2e tests suites from BASH
> scripts
> > > to Java test cases so that we can support more external system , such
> as
> > > running the testing job on yarn+flink, docker+flink, standalone+flink,
> > > distributed kafka cluster etc.
> > > BTW, I think the perf e2e test suites will also need to be designed as
> > > supporting running on both standalone env and distributed env. will be
> > > helpful
> > > for developing & evaluating the perf.
> > > Thanks.
> > >
> > > On Mon, Nov 4, 2019 at 9:31 AM aihua li  wrote:
> > >
> > > > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > > > statebackend.
> > > > I think there should be some special scenarios to test checkpoint and
> > > > statebackend, which will be discussed and added in the release-1.11
> > > >
> > > > > 在 2019年11月2日,上午12:13,Yun Tang  写道:
> > > > >
> > > > > By the way, do you think it's worthy to add a checkpoint mode which
> > > just
> > > > disable checkpoint to run end-to-end jobs? And when will stage2 and
> > > stage3
> > > > be discussed in more details?
> > > >
> > > >
> > >
> >
> >
> > --
> > Best, Jingsong Lee
> >
>


Re: Checkstyle-IDEA plugin For Java

2019-11-04 Thread Biao Liu
Hi yanjun,

I have just checked the CheckStyle-IDEA plugin (latest version 5.33.1) [1].
8.14 is available in my environment (macOS).

I have no idea why it's unavailable in your IDE. Here are just some guesses.
1. There are several similar style-checking plugins. Have you installed the
correct one?
2. Did you install the latest version of CheckStyle-IDEA?
3. 8.14 is not the version of the plugin itself. It's an internal version
inside the plugin. You can find it in "Preferences" -> "Other settings" ->
"Checkstyle".

Please let me know if none of these guesses is right.

[1] https://plugins.jetbrains.com/plugin/1065-checkstyle-idea/

Thanks,
Biao /'bɪ.aʊ/



On Fri, 1 Nov 2019 at 10:21, tison  wrote:

> If you can import checkstyle rules file, the version of checkstyle plugin
> is not very important.
>
> We don't use nightly feature IIRC.
>
> Best,
> tison.
>
>
> yanjun qiu  于2019年10月31日周四 下午7:22写道:
>
> > Hi Community,
> > I want to contribute code to Flink and I have followed the IDE set up
> > guide as below:
> >
> >
> https://ci.apache.org/projects/flink/flink-docs-stable/flinkDev/ide_setup.html#checkstyle-for-java
> > <
> >
> https://ci.apache.org/projects/flink/flink-docs-stable/flinkDev/ide_setup.html#checkstyle-for-java
> > >
> >
> > I have installed Checkstyle-IDEA plugin in my Intellj IDEA 2019.2, but I
> > found that it didn’t have Checkstyle Version 8.14.
> >
> > I want to know which IDEA version or Checkstyle Version should be
> install.
> >
> > Regards,
> > Bruce
>


Re: [ANNOUNCE] Jark Wu is now part of the Flink PMC

2019-11-08 Thread Biao Liu
Congratulations Jark!

Thanks,
Biao /'bɪ.aʊ/



On Fri, 8 Nov 2019 at 19:55, Wei Zhong  wrote:

> Congratulations Jark!
>
> Best,
> Wei
>
>
> > 在 2019年11月8日,19:36,Zhijiang  写道:
> >
> > Congratulations Jark!
> >
> > Best,
> > Zhijiang
> >
> >
> > --
> > From:bupt_ljy 
> > Send Time:2019 Nov. 8 (Fri.) 19:21
> > To:dev ; dev@flink.apache.org Yun Gao <
> yungao...@aliyun.com>
> > Subject:Re: [ANNOUNCE] Jark Wu is now part of the Flink PMC
> >
> > Congratulations Jark!
> >
> >
> > Best,
> > Jiayi Liao
> >
> >
> > Original Message
> > Sender: Yun Gao
> > Recipient: dev
> > Date: Friday, Nov 8, 2019 18:37
> > Subject: Re: [ANNOUNCE] Jark Wu is now part of the Flink PMC
> >
> >
> > Congratulations Jark!
> >
> > Best,
> > Yun
> > --
> > From:wenlong.lwl 
> > Send Time:2019 Nov. 8 (Fri.) 18:31
> > To:dev 
> > Subject:Re: [ANNOUNCE] Jark Wu is now part of the Flink PMC
> >
> > Congratulations Jark, well deserved!
> >
> > Best,
> > Wenlong Lyu
> >
> > On Fri, 8 Nov 2019 at 18:22, tison  wrote:
> >
> > > Congrats Jark!
> > >
> > > Best,
> > > tison.
> > >
> > > Jingsong Li <jingsongl...@gmail.com> 于2019年11月8日周五 下午6:08写道:
> > >
> > > > Congratulations to Jark.
> > > > Jark has really contributed a lot to the table layer with a long time.
> > > > Well deserved.
> > > >
> > > > Best,
> > > > Jingsong Lee
> > > >
> > > > On Fri, Nov 8, 2019 at 6:05 PM Yu Li  wrote:
> > > >
> > > > > Congratulations Jark! Well deserved!
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > > > On Fri, 8 Nov 2019 at 17:55, OpenInx  wrote:
> > > > >
> > > > > > Congrats Jark ! Well deserve.
> > > > > >
> > > > > > On Fri, Nov 8, 2019 at 5:53 PM Paul Lam  wrote:
> > > > > >
> > > > > > > Congrats Jark!
> > > > > > >
> > > > > > > Best,
> > > > > > > Paul Lam
> > > > > > >
> > > > > > > > 在 2019年11月8日,17:51,jincheng sun  写道:
> > > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > On behalf of the Flink PMC, I'm happy to announce that Jark Wu is now
> > > > > > > > part of the Apache Flink Project Management Committee (PMC).
> > > > > > > >
> > > > > > > > Jark has been a committer since February 2017. He has been very active
> > > > > > > > on Flink's Table API / SQL component, as well as frequently helping
> > > > > > > > manage/verify/vote releases. He has been writing many blogs about Flink,
> > > > > > > > also driving the translation work of Flink website and documentation. He
> > > > > > > > is very active in China community as he gives talks about Flink at many
> > > > > > > > events in China.
> > > > > > > >
> > > > > > > > Congratulations & Welcome Jark!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jincheng (on behalf of the Flink PMC)
>
>


Re: Error:java: 无效的标记: --add-exports=java.base/sun.net.util=ALL-UNNAMED

2019-11-13 Thread Biao Liu
Hi,

I have encountered the same issue when setting up a dev environment.

It seems that my IntelliJ (2019.2.1) unexpectedly activates the java11
profile of Maven, which doesn't match the Java compiler (JDK 8). I'm not
sure why it happened silently.

So for me, the solution is "IntelliJ" -> "View" -> "Tool Windows" ->
"Maven" -> "Profiles" -> uncheck "java11" -> reimport the Maven project.

Thanks,
Biao /'bɪ.aʊ/



On Mon, 4 Nov 2019 at 18:01, OpenInx  wrote:

> Hi
> I met the same problem before. After some digging,  I find that the idea
> will detect the JDK version
> and choose whether to use the jdk11 option to run the flink maven building.
> if you are in jdk11 env,  then
> it will add the option --add-exports when maven building in IDEA.
>
> For my case,  I was in IntelliJIdea2019.2 which depends on the jdk11, and
> once I re-import the flink
> modules then the IDEA will add the --add-exports flag even if  I removed
> all the flags in .idea/compile.xml
> explicitly.  I noticed that the Intellij's JDK affected the flink maven
> building, so I turned to use the Intellij with JDK8
> bundled,  then the problem was gone.
>
> You can verify it, and if  it's really the same. can just replace your IDEA
> with the pkg suffix with "with bundled JBR 8" in
> here [1].
> Say if you are using MacOS, then should download the package "2019.2.4 for
> macOS with bundled JBR 8 (dmg)"
>
> Hope it works for you
> Thanks.
>
> [1]. https://www.jetbrains.com/idea/download/other.html
>
>
> On Mon, Nov 4, 2019 at 5:44 PM Till Rohrmann  wrote:
>
> > Try to reimport that maven project. This should resolve this issue.
> >
> > Cheers,
> > Till
> >
> > On Mon, Nov 4, 2019 at 10:34 AM 刘建刚  wrote:
> >
> > >   Hi, I am using flink 1.9 in idea. But when I run a unit test in
> > idea.
> > > The idea reports the following error:"Error:java: 无效的标记:
> > > --add-exports=java.base/sun.net.util=ALL-UNNAMED".
> > >   Everything is ok when I use flink 1.6. I am using jdk 1.8. Is it
> > > related to the java version?
> > >
> >
>


Re: [ANNOUNCE] Zhu Zhu becomes a Flink committer

2019-12-15 Thread Biao Liu
Congrats Zhu Zhu!

Thanks,
Biao /'bɪ.aʊ/



On Mon, 16 Dec 2019 at 10:23, Congxian Qiu  wrote:

> Congrats, Zhu Zhu!
>
> Best,
> Congxian
>
>
> aihua li  于2019年12月16日周一 上午10:16写道:
>
> > Congratulations, zhuzhu!
> >
> > > 在 2019年12月16日,上午10:04,Jingsong Li  写道:
> > >
> > > Congratulations Zhu Zhu!
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Mon, Dec 16, 2019 at 10:01 AM Yang Wang 
> > wrote:
> > >
> > >> Congratulations, Zhu Zhu!
> > >>
> > >> wenlong.lwl  于2019年12月16日周一 上午9:56写道:
> > >>
> > >>> Congratulations, Zhu Zhu!
> > >>>
> > >>> On Mon, 16 Dec 2019 at 09:14, Leonard Xu  wrote:
> > >>>
> >  Congratulations, Zhu Zhu ! !
> > 
> >  Best,
> >  Leonard Xu
> > 
> > > On Dec 16, 2019, at 07:53, Becket Qin 
> wrote:
> > >
> > > Congrats, Zhu Zhu!
> > >
> > > On Sun, Dec 15, 2019 at 10:26 PM Dian Fu 
> > >>> wrote:
> > >
> > >> Congrats Zhu Zhu!
> > >>
> > >>> 在 2019年12月15日,下午6:23,Zhu Zhu  写道:
> > >>>
> > >>> Thanks everyone for the warm welcome!
> > >>> It's my honor and pleasure to improve Flink with all of you in
> the
> > >>> community!
> > >>>
> > >>> Thanks,
> > >>> Zhu Zhu
> > >>>
> > >>> Benchao Li  于2019年12月15日周日 下午3:54写道:
> > >>>
> >  Congratulations!:)
> > 
> >  Hequn Cheng  于2019年12月15日周日 上午11:47写道:
> > 
> > > Congrats, Zhu Zhu!
> > >
> > > Best, Hequn
> > >
> > > On Sun, Dec 15, 2019 at 6:11 AM Shuyi Chen  >
> >  wrote:
> > >
> > >> Congratulations!
> > >>
> > >> On Sat, Dec 14, 2019 at 7:59 AM Rong Rong <
> walter...@gmail.com>
> > >> wrote:
> > >>
> > >>> Congrats Zhu Zhu :-)
> > >>>
> > >>> --
> > >>> Rong
> > >>>
> > >>> On Sat, Dec 14, 2019 at 4:47 AM tison 
> >  wrote:
> > >>>
> >  Congratulations!:)
> > 
> >  Best,
> >  tison.
> > 
> > 
> >  OpenInx  于2019年12月14日周六 下午7:34写道:
> > 
> > > Congrats Zhu Zhu!
> > >
> > > On Sat, Dec 14, 2019 at 2:38 PM Jeff Zhang <
> zjf...@gmail.com
> > >>>
> > > wrote:
> > >
> > >> Congrats, Zhu Zhu!
> > >>
> > >> Paul Lam  于2019年12月14日周六
> 上午10:29写道:
> > >>
> > >>> Congrats Zhu Zhu!
> > >>>
> > >>> Best,
> > >>> Paul Lam
> > >>>
> > >>> Kurt Young  于2019年12月14日周六 上午10:22写道:
> > >>>
> >  Congratulations Zhu Zhu!
> > 
> >  Best,
> >  Kurt
> > 
> > 
> >  On Sat, Dec 14, 2019 at 10:04 AM jincheng sun <
> > >> sunjincheng...@gmail.com>
> >  wrote:
> > 
> > > Congrats ZhuZhu and welcome on board!
> > >
> > > Best,
> > > Jincheng
> > >
> > >
> > > Jark Wu  于2019年12月14日周六 上午9:55写道:
> > >
> > >> Congratulations, Zhu Zhu!
> > >>
> > >> Best,
> > >> Jark
> > >>
> > >> On Sat, 14 Dec 2019 at 08:20, Yangze Guo <
> > >> karma...@gmail.com
> > 
> > >> wrote:
> > >>
> > >>> Congrats, ZhuZhu!
> > >>>
> > >>> Bowen Li  于 2019年12月14日周六
> > > 上午5:37写道:
> > >>>
> >  Congrats!
> > 
> >  On Fri, Dec 13, 2019 at 10:42 AM Xuefu Z <
> >  usxu...@gmail.com>
> >  wrote:
> > 
> > > Congratulations, Zhu Zhu!
> > >
> > > On Fri, Dec 13, 2019 at 10:37 AM Peter Huang <
> > >>> huangzhenqiu0...@gmail.com
> > >
> > > wrote:
> > >
> > >> Congratulations!:)
> > >>
> > >> On Fri, Dec 13, 2019 at 9:45 AM Piotr Nowojski
> >  <
> > >> pi...@ververica.com>
> > >> wrote:
> > >>
> > >>> Congratulations! :)
> > >>>
> >  On 13 Dec 2019, at 18:05, Fabian Hueske <
> > >>> fhue...@gmail.com
> > >
> > >>> wrote:
> > 
> >  Congrats Zhu Zhu and welcome on board!
> > 
> >  Best, Fabian
> > 
> >  Am Fr., 13. Dez. 2019 um 17:51 Uhr schrieb
> > > Till

Re: Potential side-effect of connector code to JM/TM

2019-12-17 Thread Biao Liu
Hi Yingjie,

Thanks for tracking down this impressive bug and bringing up this
discussion.

I'm afraid there is no silver bullet for isolation from third-party
libraries. However, I agree that resource-checking utils might help.
It seems that you and Till have already raised some feasible ideas.
Resource leaking issues seem quite common. It would be great if someone
could share some experience. I will keep an eye on this discussion.

Thanks,
Biao /'bɪ.aʊ/



On Tue, 17 Dec 2019 at 20:27, Till Rohrmann  wrote:

> Hi Yingjie,
>
> thanks for reporting this issue and starting this discussion. If we are
> dealing with third party libraries I believe there is always the risk that
> one overlooks closing resources. Ideally we make it as hard from Flink's
> perspective as possible but realistically it is hard to completely avoid.
> Hence, I believe that it would be beneficial to have some tooling (e.g.
> stress tests) which could help to surface these kind of problems. Maybe one
> could automate it so that a dev only needs to provide a user jar and then
> this jar is being executed several times and the cluster is checked for
> anomalies.
>
> Cheers,
> Till
>
> On Tue, Dec 17, 2019 at 8:43 AM Yingjie Cao 
> wrote:
>
> > Hi community,
> >
> >   After running tpc-ds test suite for several days on a session cluster,
> we
> > found a resource leak problem of OrcInputFormat which was reported in
> > FLINK-15239. The problem comes from the dependent third party library
> which
> > creates new internal thread (pool) and never release it. As a result, the
> > user class loader which is referenced by these threads will never be
> > garbage collected as well as other classes loaded by the user class
> loader,
> > which finally lead to the continually grow of meta space size for JM (AM)
> > whose meta space size is not limited currently. And for TM whose meta
> space
> > size is limited, it will result in meta space oom eventually. I am not
> sure
> > if any other connectors/input formats incurs the similar problem.
> >   In general, it is hard for Flink to restrict the behavior of the third
> > party dependencies, especially the dependencies of connectors. However,
> it
> > will be better if we can supply some mechanism like stronger isolation or
> > some test facilities to find potential problems, for example, we can run
> > jobs on a cluster and automatically check something like whether user
> class
> > loader can be garbage collected, whether there is thread leak, whether
> some
> > shutdown hooks have been registered and so on.
> >   What do you think? Or should we treat it as a problem?
> >
> > Best,
> > Yingjie
> >
>


Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Biao Liu
Congrats!

Thanks,
Biao /'bɪ.aʊ/



On Fri, 17 Jan 2020 at 13:43, Rui Li  wrote:

> Congratulations Dian, well deserved!
>
> On Thu, Jan 16, 2020 at 5:58 PM jincheng sun 
> wrote:
>
>> Hi everyone,
>>
>> I'm very happy to announce that Dian accepted the offer of the Flink PMC
>> to become a committer of the Flink project.
>>
>> Dian Fu has been contributing to Flink for many years. Dian Fu played an
>> essential role in PyFlink/CEP/SQL/Table API modules. Dian Fu has
>> contributed several major features, reported and fixed many bugs, spent a
>> lot of time reviewing pull requests and also frequently helping out on the
>> user mailing lists and check/vote the release.
>>
>> Please join in me congratulating Dian for becoming a Flink committer !
>>
>> Best,
>> Jincheng(on behalf of the Flink PMC)
>>
>
>
> --
> Best regards!
> Rui Li
>


Re: [DISCUSS] Change default for RocksDB timers: Java Heap => in RocksDB

2020-01-16 Thread Biao Liu
+1

I think that's how it should be. Timers should align with other regular
state.

If a user wants better performance without memory concerns, the memory or
FS state backend might be considered. Or maybe we could optimize the
performance by introducing a specific column family for timers. It could
have its own tuned options.
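For readers looking for the switch being discussed: the timer storage of
the RocksDB backend can be selected via configuration. A hedged sketch of
the relevant `flink-conf.yaml` entry follows — the option name is my
assumption from Flink 1.x; please verify it against the docs of your
version:

```yaml
# Assumed flink-conf.yaml fragment: store timers in RocksDB instead of on
# the Java heap (values HEAP or ROCKSDB).
state.backend.rocksdb.timer-service.factory: ROCKSDB
```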

Thanks,
Biao /'bɪ.aʊ/



On Fri, 17 Jan 2020 at 10:11, Jingsong Li  wrote:

> Hi Stephan,
>
> Thanks for starting this discussion.
> +1 for stores times in RocksDB by default.
> In the past, when Flink didn't save the times with RocksDb, I had a
> headache. I always adjusted parameters carefully to ensure that there was
> no risk of Out of Memory.
>
> Just curious, how much impact of heap and RocksDb for times on performance
> - if there is no order of magnitude difference between heap and RocksDb,
> there is no problem in using RocksDb.
> - if there is, maybe we should improve our documentation to let users know
> about this option. (Looks like a lot of users didn't know)
>
> Best,
> Jingsong Lee
>
> On Fri, Jan 17, 2020 at 3:18 AM Yun Tang  wrote:
>
>> Hi Stephan,
>>
>> I am +1 for the change which stores timers in RocksDB by default.
>>
>> Some users hope the checkpoint could be completed as fast as possible,
>> which also need the timer stored in RocksDB to not affect the sync part of
>> checkpoint.
>>
>> Best
>> Yun Tang
>> --
>> *From:* Andrey Zagrebin 
>> *Sent:* Friday, January 17, 2020 0:07
>> *To:* Stephan Ewen 
>> *Cc:* dev ; user 
>> *Subject:* Re: [DISCUSS] Change default for RocksDB timers: Java Heap =>
>> in RocksDB
>>
>> Hi Stephan,
>>
>> Thanks for starting this discussion. I am +1 for this change.
>> In general, number of timer state keys can have the same order as number
>> of main state keys.
>> So if RocksDB is used for main state for scalability, it makes sense to
>> have timers there as well
>> unless timers are used for only very limited subset of keys which fits
>> into memory.
>>
>> Best,
>> Andrey
>>
>> On Thu, Jan 16, 2020 at 4:27 PM Stephan Ewen  wrote:
>>
>> Hi all!
>>
>> I would suggest a change of the current default for timers. A bit of
>> background:
>>
>>   - Timers (for windows, process functions, etc.) are state that is
>> managed and checkpointed as well.
>>   - When using the MemoryStateBackend and the FsStateBackend, timers are
>> kept on the JVM heap, like regular state.
>>   - When using the RocksDBStateBackend, timers can be kept in RocksDB
>> (like other state) or on the JVM heap. The JVM heap is the default though!
>>
>> I find this a bit un-intuitive and would propose to change this to let
>> the RocksDBStateBackend store all state in RocksDB by default.
>> The rationale being that if there is a tradeoff (like here), safe and
>> scalable should be the default and unsafe performance be an explicit choice.
>>
>> This sentiment seems to be shared by various users as well, see
>> https://twitter.com/StephanEwen/status/1214590846168903680 and
>> https://twitter.com/StephanEwen/status/1214594273565388801
>> We would of course keep the switch and mention in the performance tuning
>> section that this is an option.
>>
>> # RocksDB State Backend Timers on Heap
>>   - Pro: faster
>>   - Con: not memory safe, GC overhead, longer synchronous checkpoint
>> time, no incremental checkpoints
>>
>> #  RocksDB State Backend Timers on in RocksDB
>>   - Pro: safe and scalable, asynchronously and incrementally checkpointed
>>   - Con: performance overhead.
>>
>> Please chime in and let me know what you think.
>>
>> Best,
>> Stephan
>>
>>
>
> --
> Best, Jingsong Lee
>


Re: [ANNOUNCE] Yu Li became a Flink committer

2020-01-29 Thread Biao Liu
Congrats!

On Wed, Jan 29, 2020 at 10:37 aihua li  wrote:

> Congratulations Yu LI, well deserved.
>
> > 2020年1月23日 下午4:59,Stephan Ewen  写道:
> >
> > Hi all!
> >
> > We are announcing that Yu Li has joined the rank of Flink committers.
> >
> > Yu joined already in late December, but the announcement got lost because
> > of the Christmas and New Years season, so here is a belated proper
> > announcement.
> >
> > Yu is one of the main contributors to the state backend components in the
> > recent year, working on various improvements, for example the RocksDB
> > memory management for 1.10.
> > He has also been one of the release managers for the big 1.10 release.
> >
> > Congrats for joining us, Yu!
> >
> > Best,
> > Stephan
>
> --

Thanks,
Biao /'bɪ.aʊ/


Re: [DISCUSS] FLIP-27: Refactor Source Interface

2018-11-05 Thread Biao Liu
Thanks Aljoscha for bringing us this discussion!

1. I think one of the reasons for separating `advance()` and
`getCurrent()` is that we have several different types returned by the
source: not just the record, but also the timestamp of the record and the
watermark. If we don't separate these into different methods, the source
has to return a tuple3, which is not so user-friendly. The prototype of
Aljoscha is acceptable to me. Regarding the specific method names, I'm not
sure which one is better. Both of them are reasonable to me.

2. As Thomas and Becket mentioned before, I think a non-blocking API is
necessary. Moreover, IMO we should not offer a blocking API. It doesn't
help but makes things more complicated.

3. About the thread model.
I agree with Thomas about the thread-less IO model. A standard workflow
should look like below.
  - If there is data available, Flink would read it.
  - If there is no data available temporarily, Flink would check again a
moment later, maybe waiting on a semaphore until a timer wakes it up.
Furthermore, we can offer an optional optimization for sources which have
an external thread. Like Guowei mentioned, there can be a listener through
which the reader can wake the framework up as soon as new data comes. This
can solve Piotr's concern about efficiency.

4. One more thing. After taking a look at the prototype code, off the top
of my head, the implementation is a better fit for batch jobs than
streaming jobs. There are two types of tasks in the prototype. The first is
a source task that discovers the splits. The source passes the splits to
the second task, which processes the splits one by one. And then the source
keeps watching to discover more splits.

However, I think the more common scenario of a streaming job is:
there are fixed splits, and each of the subtasks takes several splits. The
subtasks just keep processing the fixed splits. There would be continuous
data in each split. We don't need a source task to discover more splits.
This can not work for a streaming job, since we don't want the processing
task to finish even when there are no more splits.

So IMO we should offer another source operator for the new interface. It
would discover all splits when it is opened, then pick the splits belonging
to this subtask, and keep processing these splits until all of them are
finished.
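To make the non-blocking style above a bit more concrete, here is a
minimal, self-contained Java sketch of the `advance()`/`getCurrent()`
pattern. All names are illustrative assumptions on my side, not the actual
FLIP-27 interfaces; the point is only that the caller drives the read loop
and the reader owns no threads:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Hedged sketch of the advance()/getCurrent() reader style under
// discussion. Interface and class names are illustrative only.
public class SplitReaderSketch {

    /** Non-blocking reader: advance() moves the offset, getCurrent() peeks. */
    interface SplitReader<T> {
        boolean advance();   // try to move to the next record; false = none ready
        T getCurrent();      // record at the current offset, stable until advance()
    }

    /** Toy in-memory implementation backed by a queue. */
    static final class QueueReader implements SplitReader<Integer> {
        private final Queue<Integer> buffer;
        private Integer current;

        QueueReader(Queue<Integer> buffer) {
            this.buffer = buffer;
        }

        @Override
        public boolean advance() {
            current = buffer.poll(); // non-blocking: null when nothing is ready
            return current != null;
        }

        @Override
        public Integer getCurrent() {
            return current;
        }
    }

    /** The caller (the framework) drives the loop; no threads in the reader. */
    static int sumAll(SplitReader<Integer> reader) {
        int sum = 0;
        while (reader.advance()) {
            sum += reader.getCurrent();
        }
        return sum;
    }

    public static void main(String[] args) {
        Queue<Integer> data = new ArrayDeque<>(Arrays.asList(1, 2, 3));
        System.out.println(sumAll(new QueueReader(data))); // prints 6
    }
}
```

In this toy model, a real reader would replace `buffer.poll()` with an
asynchronous IO check, returning false when no record is ready yet.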


Becket Qin  于2018年11月5日周一 上午11:00写道:

> Hi Thomas,
>
> The iterator-like API was also the first thing that came to me. But it
> seems a little confusing that hasNext() does not mean "the stream has not
> ended", but means "the next record is ready", which is repurposing the well
> known meaning of hasNext(). If we follow the hasNext()/next() pattern, an
> additional isNextReady() method to indicate whether the next record is
> ready seems more intuitive to me.
>
> Similarly, in poll()/take() pattern, another method of isDone() is needed
> to indicate whether the stream has ended or not.
>
> Compared with hasNext()/next()/isNextReady() pattern,
> isDone()/poll()/take() seems more flexible for the reader implementation.
> When I am implementing a reader, I could have a couple of choices:
>
>- A thread-less reader that does not have any internal thread.
>- When poll() is called, the same calling thread will perform a bunch of
>   IO asynchronously.
>   - When take() is called, the same calling thread will perform a bunch
>   of IO and wait until the record is ready.
>- A reader with internal threads performing network IO and put records
>into a buffer.
>   - When poll() is called, the calling thread simply reads from the
>   buffer and return empty result immediately if there is no record.
>   - When take() is called, the calling thread reads from the buffer and
>   block waiting if the buffer is empty.
>
> On the other hand, with the hasNext()/next()/isNextReady() API, it is less
> intuitive for the reader developers to write the thread-less pattern.
> Although technically speaking one can still do the asynchronous IO to
> prepare the record in isNextReady(). But it is inexplicit and seems
> somewhat hacky.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
> On Mon, Nov 5, 2018 at 6:55 AM Thomas Weise  wrote:
>
> > Couple more points regarding discovery:
> >
> > The proposal mentions that discovery could be outside the execution
> graph.
> > Today, discovered partitions/shards are checkpointed. I believe that will
> > also need to be the case in the future, even when discovery and reading
> are
> > split between different tasks.
> >
> > For cases such as resharding of a Kinesis stream, the relationship
> between
> > splits needs to be considered. Splits cannot be randomly distributed over
> > readers in certain situations. An example was mentioned here:
> > https://github.com/apache/flink/pull/6980#issuecomment-435202809
> >
> > Thomas
> >
> >
> > On Sun, Nov 4, 2018 at 1:43 PM Thomas Weise  wrote:
> >
> > > Thanks for getting the ball rolling on this!
> > >
> > > Can the number of splits decrease? Yes, split

Re: [DISCUSS] FLIP-27: Refactor Source Interface

2018-11-06 Thread Biao Liu
Regarding the naming style.

The advantage of the `poll()` style is that the name `poll` basically means
it should be a non-blocking operation, the same as `Queue` in the Java API.
It's easy to understand. We don't need to write too much in the docs to
imply that the implementation should not do something heavy.
However, `poll` also implies it should return the thing we want. In our
scenario, there are currently 3 types: record, timestamp and watermark. So
the return type of `poll` would have to be a tuple3 or something like that.
It looks a little hacky IMO.

The `advance()` style is more like the RecordReader
<https://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/RecordReader.html>
of MapReduce, or the ISpout
<https://storm.apache.org/releases/1.1.2/javadocs/org/apache/storm/spout/ISpout.html>
of Storm. It means moving the offset forward indeed. It makes sense to me.
To be honest I like the `advance()` style more.

And there is another small point I don't get.

Why use `start()` and `close()` in `SplitReader`? `start()` makes me think
of "starting a thread" or something like that. We should not assume there
would be some thread. I prefer `open()`, which also matches `close()`
better.

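To illustrate why a `poll()`-style API pushes the interface towards a
tuple3-like return type, here is a hedged Java sketch of the envelope such
a method would have to hand back. All names are hypothetical, not from any
Flink proposal:

```java
// Hypothetical envelope for a poll()-style API: a single call must be able
// to deliver either a record (with its timestamp) or a watermark.
public class SourceOutputSketch {

    enum Kind { RECORD, WATERMARK }

    /** Poor man's tuple3: element kind, payload, timestamp. */
    static final class Element<T> {
        final Kind kind;
        final T record;        // null for watermarks
        final long timestamp;  // event time or watermark time

        Element(Kind kind, T record, long timestamp) {
            this.kind = kind;
            this.record = record;
            this.timestamp = timestamp;
        }
    }

    static Element<String> record(String value, long ts) {
        return new Element<>(Kind.RECORD, value, ts);
    }

    static Element<String> watermark(long ts) {
        return new Element<>(Kind.WATERMARK, null, ts);
    }

    public static void main(String[] args) {
        Element<String> r = record("a", 100L);
        Element<String> w = watermark(150L);
        // The caller must branch on the kind for every polled element,
        // which is the "a little hacky" part of the tuple3 approach.
        System.out.println(r.kind + " " + w.kind); // prints RECORD WATERMARK
    }
}
```

Splitting the API into `advance()`-style methods avoids forcing every
caller through this tagged-union dance.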

Becket Qin  于2018年11月6日周二 上午11:04写道:

> Thanks for updating the wiki, Aljoscha.
>
> The isDone()/advance()/getCurrent() API looks more similar to
> hasNext()/isNextReady()/getNext(), but implying some different behaviors.
>
> If users call getCurrent() twice without calling advance() in between, will
> they get the same record back? From the API itself, users might think
> advance() is the API that moves the offset forward, and getCurrent() just
> return the record at the current offset.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Nov 5, 2018 at 10:41 PM Aljoscha Krettek 
> wrote:
>
> > I updated the FLIP [1] with some Javadoc for the SplitReader to outline
> > what I had in mind with the interface. Sorry for not doing that earlier,
> > it's not quite clear how the methods should work from the name alone.
> >
> > The gist of it is that advance() should be non-blocking, so
> > isDone/advance()/getCurrent() are very similar to isDone()/poll()/take()
> > that I have seen mentioned.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> > <
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27:+Refactor+Source+Interface
> > >
> >
> > > On 5. Nov 2018, at 11:05, Biao Liu  wrote:
> > >
> > > Thanks Aljoscha for bringing us this discussion!
> > >
> > > 1. I think one of the reason about separating `advance()` and
> > > `getCurrent()` is that we have several different types returned by
> > source.
> > > Not just the `record`, but also the timestamp of record and the
> > watermark.
> > > If we don't separate these into different methods, the source has to
> > return
> > > a tuple3 which is not so user friendly. The prototype of Aljoscha is
> > > acceptable to me. Regarding the specific method name, I'm not sure
> which
> > > one is better. Both of them are reasonable for me.
> > >
> > > 2. As Thomas and Becket mentioned before, I think a non-blocking API is
> > > necessary. Moreover, IMO we should not offer a blocking API. It doesn't
> > > help but makes things more complicated.
> > >
> > > 3. About the thread model.
> > > I agree with Thomas about the thread-less IO model. A standard workflow
> > > should look like below.
> > >  - If there is available data, Flink would read it.
> > >  - If there is no data available temporary, Flink would check again a
> > > moment later. Maybe waiting on a semaphore until a timer wake it up.
> > > Furthermore, we can offer an optional optimization for source which has
> > > external thread. Like Guowei mentioned, there can be a listener which
> the
> > > reader can wake the framework up as soon as new data comes. This can
> > solve
> > > Piotr's concern about efficiency.
> > >
> > > 4. One more thing. After taking a look at the prototype codes. Off the
> > top
> > > of my head, the implementation is more fit for batch job not streaming
> > job.
> > > There are two types of tasks in prototype. First is a source task that
> > > discovers the splits. The source passes the splits to the second task
> > which
> > > process the splits one by one. And then the source keeps watch to
> > discover
> > > more splits.
> > >
> > > However, I think the more common scenario of streaming job is

Re: [DISCUSS] FLIP-27: Refactor Source Interface

2018-11-18 Thread Biao Liu
Hi community,

Thank you guys for sharing ideas.

The thing I'm really concerned about is the thread model.
Actually at Alibaba, we implemented our "split reader" based source
two years ago. It's based on "SourceFunction"; it's just an extension, not
a refactoring. It's almost the same as the version Thomas and Jamie described
in the Google Doc, and it really helps in many scenarios.

However, I don't like the thread model which starts a thread for each split.
Starting extra threads in an operator is not ideal IMO, especially when the
thread count is decided by the split count. So I was wondering if there is a
more elegant way. Do we really want these threads in Flink core?

I agree that a blocking interface is easier to implement. Could we at
least separate the split reader from the source function into different
interfaces? Not all sources would want to read all splits concurrently; in
the batch scenario, reading splits one by one is more common. And not all
sources are partitioned, right?
I'd prefer a new source interface with "pull mode" only and no notion of
splits, plus a splittable source that extends it, and one implementation
that starts a thread for each split, reading all splits concurrently.
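A minimal sketch of the split-up proposed above, using purely hypothetical names (this is not any existing Flink API; the trivial in-memory source is only there to make the sketch runnable):

```java
import java.util.List;

public class SourceSketch {

    /** Basic pull-mode source: no notion of splits. */
    interface PullSource<T> {
        /** Non-blocking: returns false when no record is currently available. */
        boolean advance() throws Exception;
        T getCurrent();
    }

    /** Extension for partitioned/splittable sources. */
    interface SplittableSource<T, SplitT> extends PullSource<T> {
        List<SplitT> discoverSplits() throws Exception;
        void addSplit(SplitT split);
    }

    /** Trivial in-memory implementation, for illustration only. */
    static class ListSource implements PullSource<Integer> {
        private final int[] data;
        private int pos = -1;

        ListSource(int... data) { this.data = data; }

        @Override
        public boolean advance() {
            if (pos + 1 >= data.length) {
                return false;
            }
            pos++;
            return true;
        }

        @Override
        public Integer getCurrent() { return data[pos]; }
    }

    public static void main(String[] args) throws Exception {
        PullSource<Integer> source = new ListSource(1, 2, 3);
        int sum = 0;
        while (source.advance()) {
            sum += source.getCurrent();
        }
        System.out.println(sum); // prints 6
    }
}
```

With such a split, the "one thread per split, read all splits concurrently" behavior would be just one possible implementation of the splittable extension, rather than being baked into the base interface.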


Thomas Weise wrote on Sun, Nov 18, 2018, 3:18 AM:

> @Aljoscha to address your question first: In the case of the Kinesis
> consumer (with current Kinesis consumer API), there would also be N+1
> threads. I have implemented a prototype similar to what is shown in Jamie's
> document, where the thread ownership is similar to what you have done for
> Kafka.
>
> The equivalent of split reader manages its own thread and the "source main
> thread" is responsible for emitting the data. The interface between the N
> reader threads and the 1 emitter is a blocking queue per consumer thread.
> The emitter can now control which queue to consume from based on the event
> time progress.
>
> This is akin to a "non-blocking" interface *between emitter and split
> reader*. Emitter uses poll to retrieve records from the N queues (which
> requires non-blocking interaction). The emitter is independent of the split
> reader implementation, that part could live in Flink.
>
> Regarding whether or not to assume that split readers always need a thread
> and in addition that these reader threads should be managed by Flink: It
> depends on the API of respective external systems and I would not bake that
> assumption into Flink. Some client libraries manage their own threads (see
> push based API like JMS and as I understand it may also apply to the new
> fan-out Kinesis API:
>
> https://docs.aws.amazon.com/streams/latest/dev/building-enhanced-consumers-api.html
> ).
> In such cases it would not make sense to layer another reader thread on
> top. It may instead be better if Flink provides to the split reader the
> queue/buffer to push records to.
>
> The discussion so far has largely ignored the discovery aspect. There are
> some important considerations such as ordering dependency of splits and
> work rebalancing that may affect the split reader interface. Should we fork
> this into a separate thread?
>
> Thanks,
> Thomas
>
>
> On Fri, Nov 16, 2018 at 8:09 AM Piotr Nowojski 
> wrote:
>
> > Hi Jamie,
> >
> > As it was already covered with my discussion with Becket, there is an
> easy
> > way to provide blocking API on top of non-blocking API. And yes we both
> > agreed that blocking API is easier to implement by users.
> >
> > I also do not agree with respect to usefulness of non blocking API.
> > Actually Kafka connector is the one that could be more efficient thanks
> to
> > the removal of the one layer of threading.
> >
> > Piotrek
> >
> > > On 16 Nov 2018, at 02:21, Jamie Grier  wrote:
> > >
> > > Thanks Aljoscha for getting this effort going!
> > >
> > > There's been plenty of discussion here already and I'll add my big +1
> to
> > > making this interface very simple to implement for a new
> > > Source/SplitReader.  Writing a new production quality connector for
> Flink
> > > is very difficult today and requires a lot of detailed knowledge about
> > > Flink, event time progress, watermarking, idle shard detection, etc and
> > it
> > > would be good to move almost all of this type of code into Flink itself
> > and
> > > out of source implementations.  I also think this is totally doable and
> > I'm
> > > really excited to see this happening.
> > >
> > > I do have a couple of thoughts about the API and the implementation..
> > >
> > > In a perfect world there would be a single thread per Flink source
> > sub-task
> > > and no additional threads for SplitReaders -- but this assumes a world
> > > where you have true async IO APIs for the upstream systems (like Kafka
> > and
> > > Kinesis, S3, HDFS, etc).  If that world did exist the single thread
> could
> > > just sit in an efficient select() call waiting for new data to arrive
> on
> > > any Split.  That'd be awesome..
> > >
> > > But, that world doesn't exist and given that practical consideration I
> > > wo

Re: [DISCUSS] FLIP-27: Refactor Source Interface

2018-11-25 Thread Biao Liu
Hi community,
Glad to see this topic is still so active.

Thanks for replying @Piotrek and @Becket.

Last time, I expressed some rough ideas about the thread model. However, I
found that it's hard to describe clearly on the mailing list. So I wrote it
down with some graphs illustrating several kinds of models; see Thread Model of
Source
<https://docs.google.com/document/d/1XpYkkJo97CUw-UMVrKU6b0ZZuJJ2V7mBb__L6UzdWTw/edit?usp=sharing>.
I hope that can be helpful.

IMO the thread model is an important part. Without thinking through the
implementation clearly, it's difficult to decide what the upper-level
interface should look like.
It would be better if we draw the whole picture first and then fill in the
detailed parts one by one.

@Piotrek: About adding new splits to an existing split reader, it's an
interesting idea. Not only for solving the too-many-threads problem, but also
for supporting more complicated systems. I know that in some storage
systems there are scenarios where the partitioning is dynamic (partitions
dynamically splitting or merging). Though I haven't thought it through
completely yet; I will give you a more detailed reply as soon as possible :)
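To make the idea a bit more concrete, here is a rough, purely illustrative sketch of a reader that can accept new splits while it is running (e.g. when partitions split or merge dynamically). All names here are made up for illustration; this is not an actual Flink interface:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class DynamicSplitReaderSketch {

    /** A split of some dynamically partitioned source. */
    static class Split {
        final String id;
        Split(String id) { this.id = id; }
    }

    /** Reader that can receive new splits while it is running. */
    static class DynamicSplitReader {
        // Thread-safe queue so the discovery logic can hand over splits
        // from another thread at any time.
        private final Queue<Split> pendingSplits = new ConcurrentLinkedQueue<>();
        private Split current;

        /** Called by the split discovery / enumerator logic, possibly from another thread. */
        void addSplit(Split split) {
            pendingSplits.add(split);
        }

        /** Called from the reader loop: picks up the next pending split, if any. */
        boolean moveToNextSplit() {
            current = pendingSplits.poll();
            return current != null;
        }

        Split currentSplit() {
            return current;
        }
    }
}
```

The point of the sketch is only that the set of splits owned by a reader need not be fixed at construction time; the reader drains a thread-safe hand-over queue instead.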


Guowei Ma wrote on Fri, Nov 23, 2018, 6:37 PM:

> Hi,Piotr
> Sorry  for so late to response.
>
>
> First of all I think Flink runtime can assigned a thread for a StreamTask,
> which likes  'Actor' model. The number of threads for a StreamTask should
> not be proportional to the operator or other things. This will give Flink
> the ability to scale horizontally. So I think it's not just the
> network(flush),checkpoint and  source, but some operators' threads can also
> be removed in the future, like AsyncWaitOperator.
>
>
>
> for b)
> When using event time, some sources want to assign a timestamp to each
> element. In current Flink interface, user will write like this
> public class EventTimeSource implements SourceFunction<Element> {
>   public void run() {
>  while(...){
>  Element record = // get from file or some queue;
>  long timestamp = parseTimestampFromElement(record);
>  sourceContext.collectWithTimestamp(record, timestamp);
>  }
>   }
> }
> Using the interfaces from this FLIP, user can write like this
>
> public class EventTimeSplitReader implements SplitReader<Element> {
> Element currentRecord = null;
>
>
> // Please ignore the handling of boundary conditions
> public boolean advance(){
>currentRecord = //move a pointer forward
>return true;
>  }
>
> public Element getCurrent(){
>return currentRecord;
> }
> public long getCurrentTimestamp() {
>   return parseTimestampFromElement(currentRecord);
> }
> }
>
> If merging advance/getCurrent into a single method like getNext(), the SplitReader
> interface may need to change a little, like this:
>
> public interface SplitReader2<T> {
> public class ElementWithTimestamp<T> {
> T element;
> long timestamp;
> }
>
> public ElementWithTimestamp<T> getNext();
>
> }
> Now the user may need to implement the source like this:
> public class EventTimeSplitReader implements SplitReader2<Element> {
> Element currentRecord = null;
>
> // Please ignore the handling of boundary conditions
> public ElementWithTimestamp<Element> getNext(){
>return new ElementWithTimestamp<>(currentRecord,
> parseTimestampFromElement(currentRecord));
> }
> }
> The user can use a constant ElementWithTimestamp, but I think this needs
> every connector developer to know this trick. The current FLIP will not
> have this burden.
> Maybe there is another way, like "void getCurrent(ElementWithTimestamp)",
> to avoid creating a new object. But my personal preference is
> 'advance/getCurrent'.
>
>
>
> Piotr Nowojski wrote on Wed, Nov 7, 2018, 4:31 PM:
>
> > Hi,
> >
> > a)
> >
> > > BTW, regarding the isBlock() method, I have a few more questions. 21,
> Is
> > a method isReady() with boolean as a return value
> > > equivalent? Personally I found it is a little bit confusing in what is
> > supposed to be returned when the future is completed. 22. if
> > > the implementation of isBlocked() is optional, how do the callers know
> > whether the method is properly implemented or not?
> > > Does not implemented mean it always return a completed future?
> >
> > `CompletableFuture isBlocked()` is more or less an equivalent to
> > `boolean hasNext()` which in case of “false” provides some kind of a
> > listener/callback that notifies about presence of next element. There are
> > some minor details, like `CompletableFuture` has a minimal two state
> > logic:
> >
> > 1. Future is completed - we have more data
> > 2. Future not yet completed - we don’t have data now, but we might/we
> will
> > have in the future
> >
> > While `boolean hasNext()` and `notify()` callback are a bit more
> > complicated/dispersed and can lead/encourage `notify()` spam.
> >
> > b)
> >
> > > 3. If merge the `advance` and `getCurrent`  to one method like
> `getNext`
> > the `getNext` would need return a
> > >`ElementWithTimestamp` because some sources want to add timestamp to
> > every element. IMO, this is n

Re: [DISCUSS] FLIP-27: Refactor Source Interface

2018-11-26 Thread Biao Liu
Hi Kostas,

Regarding checkpointing in the "one thread per split" mode, IMO there
are several things the source operator needs to do:
1. The source operator needs to record all splits in the checkpoint. The
unfinished splits must be recorded; I'm not sure whether we could skip
recording the finished splits, as that depends on the split discovery
implementation.
2. The source operator needs to collect the last record polled from each split
queue and put them into the checkpoint.
3. A SplitReader can then be restored by giving it a specific split together
with the position of the last record.

And I think you raised another important issue: the queue between the task
thread and the split readers.
1. I agree that it must be a thread-safe, size-limited queue, such as an
ArrayBlockingQueue.
2. It's also hard to decide the size of the queue. We have to consider the
split count and the size of the items in the queue to make sure the memory of
the source operator does not get out of control. Giving a unified queue size
is not proper, since there may be several different sources in one job. It's
better that each source can decide its own queue size.
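As a rough illustration of points 1-3 and the per-source queue sizing above, one could imagine something like the following. All class and field names here are assumptions for the sketch, not existing Flink classes:

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class SourceCheckpointSketch {

    /** Per-split state the source operator would record in a checkpoint. */
    static class SplitState implements Serializable {
        final String splitId;
        final long lastRecordPosition; // position of the last record handed downstream
        final boolean finished;

        SplitState(String splitId, long lastRecordPosition, boolean finished) {
            this.splitId = splitId;
            this.lastRecordPosition = lastRecordPosition;
            this.finished = finished;
        }
    }

    /** Snapshot of all unfinished splits, as it might go into a checkpoint. */
    static Map<String, SplitState> snapshot(Map<String, Long> lastPositions) {
        Map<String, SplitState> state = new HashMap<>();
        for (Map.Entry<String, Long> e : lastPositions.entrySet()) {
            state.put(e.getKey(), new SplitState(e.getKey(), e.getValue(), false));
        }
        return state;
    }

    /**
     * Per-reader hand-over queue: thread-safe and size-limited, with the
     * capacity chosen by the source itself rather than by a global setting.
     */
    static <T> BlockingQueue<T> handOverQueue(int capacityChosenBySource) {
        return new ArrayBlockingQueue<>(capacityChosenBySource);
    }
}
```

On restore, the reader would be re-created from each SplitState's split id and lastRecordPosition, which is exactly what point 3 above requires.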


Kostas Kloudas wrote on Mon, Nov 26, 2018, 8:42 PM:

> Hi all,
>
> From the discussion, I understand that we are leaning towards a design
> where the user writes a single-threaded SplitReader, which Flink executes
> on another thread (not the main task thread). This way the task can have
> multiple readers running concurrently, each one reading a different split.
>
> Each of these threads writes in its own queue. These queues are then polled
> by the main thread (based on a potentially user-defined prioritization),
> which is responsible for emitting data downstream. There were also
> proposals for a single shared queue, but I believe that 1) the contention
> for the lock in such a queue can be a limitation and 2) it is not easy to
> prioritise which elements to consume first (assuming that we want to
> support different prioritisation strategies).
>
> Assuming the above model, I have the following question:
>
> We have the split/shard/partition discovery logic outside the "reader"
> operator. For now it can be a plain old source function with parallelism of
> 1 that periodically checks for new splits (for an example see the existing
> ContinuousFileMonitoringFunction).[1]
>
> This source sends the split to be read downstream to the multi-threaded
> readers. In these settings, there must be a "throttling" or
> "rate-limitting" mechanism that guaranttees that we do not surpass the
> capabilities of the machines. The first thing that comes to mind is some
> kind of a fixed size (blocking) queue or a fixed size thread pool. The main
> thread adds splits to the queue and the readers consume them. When the
> queue or the pool is full, then we block (backpressure).
>
> In the case above, how do we make sure that the checkpoints still go
> through?
>
> Cheers,
> Kostas
>
> PS: I am assuming the current task implementation and not an "actor" based
> one.
>
> *[1] The ContinuousFileReaderOperator has a single thread (different from
> the main task thread) consuming the splits one by one. Unfortunately, there
> is no rate-limiting mechanism.
>
>
> On Sun, Nov 25, 2018 at 6:40 PM Biao Liu  wrote:
>
> > Hi community,
> > Glad to see this topic is still so active.
> >
> > Thanks for replying @Piotrek and @Becket.
> >
> > Last time, I expressed some rough ideas about the thread model. However I
> > found that it's hard to describe clearly in mailing list. So I wrote it
> > down with some graphs, exampled some kinds of models, see Thread Model of
> > Source
> > <
> >
> https://docs.google.com/document/d/1XpYkkJo97CUw-UMVrKU6b0ZZuJJ2V7mBb__L6UzdWTw/edit?usp=sharing
> > >.
> > I wish that can be helpful.
> >
> > IMO thread model is an important part. Without thinking of implementation
> > clearly, it's difficult to decide what the up level interface should look
> > like.
> > It would be better if we draw the whole picture first and then fill the
> > detail parts one by one.
> >
> > @Piotrek About adding new splits to existing split reader. It's an
> > interesting idea. Not only for solving too many threads problem, but also
> > for supporting some more complicated system. I know in some storage
> > systems, there is some scenario which the partition is
> dynamic(dynamically
> > splitting or merging). Though I have not think of it very clearly now. I
> > would give you more detailed reply asap :)
> >
> >
> > > Guowei Ma wrote on Fri, Nov 23, 2018, 6:37 PM:
> >
> > > Hi,Piotr
> > > Sorry  for so late to response.
> > >
> > >
> > >

Re: [DISCUSS] FLIP-27: Refactor Source Interface

2018-11-26 Thread Biao Liu
Hi Kostas again,

Did I misunderstand you in my last response?

If you mean checkpointing in the scenario where the source and the split reader
are in different operators, like in Aljoscha's prototype, that's indeed a
problem, so I think that would not be the final version. Aljoscha also
said in the FLIP doc that it is an MVP (minimum viable product; correct me if
I'm wrong).

There are other problems in this scenario. For example, if the split
count is fixed, the source discovery node will finish soon. If a split
is infinite, such as a message queue, checkpoints would then never be
triggered, since the source node has already finished.


Biao Liu wrote on Tue, Nov 27, 2018, 10:37 AM:

> Hi Kostas,
>
> Regarding the checkpoint of "per thread for each split mode". IMO, there
> are severals things source operator need to do.
> 1. Source operator need to record all splits in checkpoint. The unfinished
> splits must be recorded. I'm not sure whether we could skip recording the
> finished splits, it depends on split discovery implementation.
> 2. Source operator need to collect the last record polled from each split
> queue. And put them into checkpoint.
> 3. SplitReader can be restored by giving a specific split with a position
> of last record.
>
> And I think you raised another important issue. The queue between task
> thread and split readers.
> 1. I agree that it must be a thread-safe, size limited queue, such as
> ArrayBlockingQueue.
> 2. Also it's hard to decide the size of queue. We have to consider the
> split count, the size of item in queue to make sure the memory of source
> operator will not be out of control. Giving a unified queue size is not
> proper since there may be several different sources in one job. It's better
> that each source can decide the queue size of itself.
>
>
> Kostas Kloudas wrote on Mon, Nov 26, 2018, 8:42 PM:
>
>> Hi all,
>>
>> From the discussion, I understand that we are leaning towards a design
>> where the user writes a single-threaded SplitReader, which Flink executes
>> on another thread (not the main task thread). This way the task can have
>> multiple readers running concurrently, each one reading a different split.
>>
>> Each of these threads writes in its own queue. These queues are then
>> polled
>> by the main thread (based on a potentially user-defined prioritization),
>> which is responsible for emitting data downstream. There were also
>> proposals for a single shared queue, but I believe that 1) the contention
>> for the lock in such a queue can be a limitation and 2) it is not easy to
>> prioritise which elements to consume first (assuming that we want to
>> support different prioritisation strategies).
>>
>> Assuming the above model, I have the following question:
>>
>> We have the split/shard/partition discovery logic outside the "reader"
>> operator. For now it can be a plain old source function with parallelism
>> of
>> 1 that periodically checks for new splits (for an example see the existing
>> ContinuousFileMonitoringFunction).[1]
>>
>> This source sends the split to be read downstream to the multi-threaded
>> readers. In these settings, there must be a "throttling" or
>> "rate-limitting" mechanism that guaranttees that we do not surpass the
>> capabilities of the machines. The first thing that comes to mind is some
>> kind of a fixed size (blocking) queue or a fixed size thread pool. The
>> main
>> thread adds splits to the queue and the readers consume them. When the
>> queue or the pool is full, then we block (backpressure).
>>
>> In the case above, how do we make sure that the checkpoints still go
>> through?
>>
>> Cheers,
>> Kostas
>>
>> PS: I am assuming the current task implementation and not an "actor" based
>> one.
>>
>> *[1] The ContinuousFileReaderOperator has a single thread (different from
>> the main task thread) consuming the splits one by one. Unfortunately,
>> there
>> is no rate-limiting mechanism.
>>
>>
>> On Sun, Nov 25, 2018 at 6:40 PM Biao Liu  wrote:
>>
>> > Hi community,
>> > Glad to see this topic is still so active.
>> >
>> > Thanks for replying @Piotrek and @Becket.
>> >
>> > Last time, I expressed some rough ideas about the thread model. However
>> I
>> > found that it's hard to describe clearly in mailing list. So I wrote it
>> > down with some graphs, exampled some kinds of models, see Thread Model
>> of
>> > Source
>> > <
>> >
>> https://docs.google.com/document/d/1XpYkkJo97

Re: Disable local data transportation

2019-01-14 Thread Biao Liu
Hi Chris,
I'm not sure what you want to test. As far as I know there isn't an option
that forces all data to go through the network, and I don't think it's a
generic feature we should support. I think Zhijiang has given a good
suggestion: changing the runtime code would be a fast way to satisfy the
requirement. Another choice is changing the code of Execution.java to
force generating the "LocationType.REMOTE" type of
"ResultPartitionLocation". That would probably work.

zhijiang wrote on Mon, Jan 14, 2019, 3:44 PM:

> Hi Chris,
>
> I am not sure why you do not want to use local channel. Are there any
> problems for local channel in your case?
>
> The root cause of local channel is determined by scheduler which schedules
> both producer and consumer tasks into the same task manager. So if you want
> to change this behaviour, it is better to change the logic or limit in
> scheduler instead of network stack.  Another simple way for your
> requirement is setting only one slot per task manager, then there would be
> only one task running in each task manager.
>
> Best,
> Zhijiang
>
>
> --
> From:Chris Miller 
> Send Time: Monday, Jan 14, 2019, 00:18
> To:dev 
> Subject:Disable local data transportation
>
>
>
> Hi all,
>
> let's have a look at a simple Join with two DataSources and parallelism
> p=5.
>
> The whole Job consists of 3 parts:
>
> 1. DataSource Task
>
> 2. Join Task
>
> 3. DataSink Task
>
> In the first task, the data is provided and prepared for the Join task.
> In particular each DataSource task creates a ResultPartition which is
> divided into 5 subpartitions. Since 1/5 of the Join Task will be located
> in the same node, one of these subpartitions does not have to be shipped
> over the network.
>
> This one subpartition will be shipped to a LocalInputChannel (not
> RemoteInputChannel) and therefore will not get in touch with the
> network.
>
> Now I made some changes in the network part for my research and would
> like them to affect all subpartitions.
>
> Question:
>
> Is there a feature build into flink to completely disable the local
> stuff and send all subpartitions via network even if they have the same
> location and destination?
>
> If not - does anyone have an idea where to tweak this?
>
> Thanks.
>
> Chris
>
>
>


Re: [DISCUSS] FLIP-27: Refactor Source Interface

2019-01-20 Thread Biao Liu
Hi community,
The summary from Stephan makes a lot of sense to me. It is indeed much clearer
after splitting the complex topic into small ones.
I was wondering, is there any detailed plan for the next steps? If not, I would
like to push this forward by creating some JIRA issues.
Another question: should version 1.8 include these features?

Stephan Ewen wrote on Sat, Dec 1, 2018, 4:20 AM:

> Thanks everyone for the lively discussion. Let me try to summarize where I
> see convergence in the discussion and open issues.
> I'll try to group this by design aspect of the source. Please let me know
> if I got things wrong or missed something crucial here.
>
> For issues 1-3, if the below reflects the state of the discussion, I would
> try and update the FLIP in the next days.
> For the remaining ones we need more discussion.
>
> I would suggest to fork each of these aspects into a separate mail thread,
> or will loose sight of the individual aspects.
>
> *(1) Separation of Split Enumerator and Split Reader*
>
>   - All seem to agree this is a good thing
>   - Split Enumerator could in the end live on JobManager (and assign splits
> via RPC) or in a task (and assign splits via data streams)
>   - this discussion is orthogonal and should come later, when the interface
> is agreed upon.
>
> *(2) Split Readers for one or more splits*
>
>   - Discussion seems to agree that we need to support one reader that
> possibly handles multiple splits concurrently.
>   - The requirement comes from sources where one poll()-style call fetches
> data from different splits / partitions
> --> example sources that require that would be for example Kafka,
> Pravega, Pulsar
>
>   - Could have one split reader per source, or multiple split readers that
> share the "poll()" function
>   - To not make it too complicated, we can start with thinking about one
> split reader for all splits initially and see if that covers all
> requirements
>
> *(3) Threading model of the Split Reader*
>
>   - Most active part of the discussion ;-)
>
>   - A non-blocking way for Flink's task code to interact with the source is
> needed in order to a task runtime code based on a
> single-threaded/actor-style task design
> --> I personally am a big proponent of that, it will help with
> well-behaved checkpoints, efficiency, and simpler yet more robust runtime
> code
>
>   - Users care about simple abstraction, so as a subclass of SplitReader
> (non-blocking / async) we need to have a BlockingSplitReader which will
> form the basis of most source implementations. BlockingSplitReader lets
> users do blocking simple poll() calls.
>   - The BlockingSplitReader would spawn a thread (or more) and the
> thread(s) can make blocking calls and hand over data buffers via a blocking
> queue
>   - This should allow us to cover both, a fully async runtime, and a simple
> blocking interface for users.
>   - This is actually very similar to how the Kafka connectors work. Kafka
> 9+ with one thread, Kafka 8 with multiple threads
>
>   - On the base SplitReader (the async one), the non-blocking method that
> gets the next chunk of data would signal data availability via a
> CompletableFuture, because that gives the best flexibility (can await
> completion or register notification handlers).
>   - The source task would register a "thenHandle()" (or similar) on the
> future to put a "take next data" task into the actor-style mailbox
>
> *(4) Split Enumeration and Assignment*
>
>   - Splits may be generated lazily, both in cases where there is a limited
> number of splits (but very many), or splits are discovered over time
>   - Assignment should also be lazy, to get better load balancing
>   - Assignment needs support locality preferences
>
>   - Possible design based on discussion so far:
>
> --> SplitReader has a method "addSplits(SplitT...)" to add one or more
> splits. Some split readers might assume they have only one split ever,
> concurrently, others assume multiple splits. (Note: idea behind being able
> to add multiple splits at the same time is to ease startup where multiple
> splits may be assigned instantly.)
> --> SplitReader has a context object on which it can call indicate when
> splits are completed. The enumerator gets that notification and can use to
> decide when to assign new splits. This should help both in cases of sources
> that take splits lazily (file readers) and in case the source needs to
> preserve a partial order between splits (Kinesis, Pravega, Pulsar may need
> that).
> --> SplitEnumerator gets notification when SplitReaders start and when
> they finish splits. They can decide at that moment to push more splits to
> that reader
> --> The SplitEnumerator should probably be aware of the source
> parallelism, to build its initial distribution.
>
>   - Open question: Should the source expose something like "host
> preferences", so that yarn/mesos/k8s can take this into account when
> selecting a node to start a TM on?
>
> *(5) Watermarks 

Re: [DISCUSS] FLIP-27: Refactor Source Interface

2019-01-28 Thread Biao Liu
Hi Stephan & Piotrek,

Thank you for the feedback.

It seems that there are a lot of things to do in the community. I am just
afraid that this discussion may be forgotten, since there are so many proposals
recently.
Anyway, I wish to see the split topics soon :)

Piotr Nowojski wrote on Thu, Jan 24, 2019, 8:21 PM:

> Hi Biao!
>
> This discussion was stalled because of preparations for the open sourcing
> & merging Blink. I think before creating the tickets we should split this
> discussion into topics/areas outlined by Stephan and create Flips for that.
>
> I think there is no chance for this to be completed in couple of remaining
> weeks/1 month before 1.8 feature freeze, however it would be good to aim
> with those changes for 1.9.
>
> Piotrek
>
> > On 20 Jan 2019, at 16:08, Biao Liu  wrote:
> >
> > Hi community,
> > The summary of Stephan makes a lot sense to me. It is much clearer indeed
> > after splitting the complex topic into small ones.
> > I was wondering is there any detail plan for next step? If not, I would
> > like to push this thing forward by creating some JIRA issues.
> > Another question is that should version 1.8 include these features?
> >
> > Stephan Ewen wrote on Sat, Dec 1, 2018, 4:20 AM:
> >
> >> Thanks everyone for the lively discussion. Let me try to summarize
> where I
> >> see convergence in the discussion and open issues.
> >> I'll try to group this by design aspect of the source. Please let me
> know
> >> if I got things wrong or missed something crucial here.
> >>
> >> For issues 1-3, if the below reflects the state of the discussion, I
> would
> >> try and update the FLIP in the next days.
> >> For the remaining ones we need more discussion.
> >>
> >> I would suggest to fork each of these aspects into a separate mail
> thread,
> >> or will loose sight of the individual aspects.
> >>
> >> *(1) Separation of Split Enumerator and Split Reader*
> >>
> >>  - All seem to agree this is a good thing
> >>  - Split Enumerator could in the end live on JobManager (and assign
> splits
> >> via RPC) or in a task (and assign splits via data streams)
> >>  - this discussion is orthogonal and should come later, when the
> interface
> >> is agreed upon.
> >>
> >> *(2) Split Readers for one or more splits*
> >>
> >>  - Discussion seems to agree that we need to support one reader that
> >> possibly handles multiple splits concurrently.
> >>  - The requirement comes from sources where one poll()-style call
> fetches
> >> data from different splits / partitions
> >>--> example sources that require that would be for example Kafka,
> >> Pravega, Pulsar
> >>
> >>  - Could have one split reader per source, or multiple split readers
> that
> >> share the "poll()" function
> >>  - To not make it too complicated, we can start with thinking about one
> >> split reader for all splits initially and see if that covers all
> >> requirements
> >>
> >> *(3) Threading model of the Split Reader*
> >>
> >>  - Most active part of the discussion ;-)
> >>
> >>  - A non-blocking way for Flink's task code to interact with the source
> is
> >> needed in order to a task runtime code based on a
> >> single-threaded/actor-style task design
> >>--> I personally am a big proponent of that, it will help with
> >> well-behaved checkpoints, efficiency, and simpler yet more robust
> runtime
> >> code
> >>
> >>  - Users care about simple abstraction, so as a subclass of SplitReader
> >> (non-blocking / async) we need to have a BlockingSplitReader which will
> >> form the basis of most source implementations. BlockingSplitReader lets
> >> users do blocking simple poll() calls.
> >>  - The BlockingSplitReader would spawn a thread (or more) and the
> >> thread(s) can make blocking calls and hand over data buffers via a
> blocking
> >> queue
> >>  - This should allow us to cover both, a fully async runtime, and a
> simple
> >> blocking interface for users.
> >>  - This is actually very similar to how the Kafka connectors work. Kafka
> >> 9+ with one thread, Kafka 8 with multiple threads
> >>
> >>  - On the base SplitReader (the async one), the non-blocking method that
> >> gets the next chunk of data would signal data availability via a
> >> CompletableFuture, because that gives the best flexibility (can await
> >> completion or 

Re: [DISCUSS] FLIP-33: Terminate/Suspend Job with Savepoint

2019-02-12 Thread Biao Liu
Thanks for bringing up this discussion.
I like the idea. It's really useful in production scenarios.

+1 for the proposal.

jincheng sun wrote on Wed, Feb 13, 2019, 9:35 AM:

> Thank you for starting the discussion about cancel-with-savepoint Kostas.
>
> +1 for the FLIP.
>
> Cheers,
> Jincheng
>
> Fabian Hueske wrote on Wed, Feb 13, 2019, 4:31 AM:
>
> > Thanks for working on improving cancel-with-savepoint Kostas.
> > Distinguishing the termination modes would be a big step forward, IMO.
> >
> > Btw. there is already another FLIP-33 on the way.
> > This one should be FLIP-34.
> >
> > Cheers,
> > Fabian
> >
> > On Tue, Feb 12, 2019 at 6:49 PM, Stephan Ewen <se...@apache.org> wrote:
> >
> > > Thank you for starting this feature discussion.
> > > This is a feature that has been requested various times, great to see
> it
> > > happening.
> > >
> > > +1 for this FLIP
> > >
> > > On Tue, Feb 12, 2019 at 5:28 PM Kostas Kloudas <
> k.klou...@ververica.com>
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > >  A commonly used functionality offered by Flink is the
> > > > "cancel-with-savepoint" operation. When applied to the current
> > > exactly-once
> > > > sinks, the current implementation of the feature can be problematic,
> as
> > > it
> > > > does not guarantee that side-effects will be committed by Flink to
> the
> > > 3rd
> > > > party storage system.
> > > >
> > > >  This discussion targets fixing this issue and proposes the addition
> of
> > > two
> > > > termination modes, namely:
> > > > 1) SUSPEND, for temporarily stopping the job, e.g. for upgrading
> > > > the Flink version in your cluster
> > > > 2) TERMINATE, for a terminal shutdown, which ends the stream, sends
> > > > MAX_WATERMARK to advance event time, and flushes any state associated
> > > > with (event-time) timers
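As a side note for readers, the effect of sending MAX_WATERMARK can be illustrated with a tiny stand-alone timer queue; `TimerQueue` and its method names are a made-up sketch, not Flink code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Toy model of an event-time timer service: advancing the watermark to
// Long.MAX_VALUE (Flink's MAX_WATERMARK) fires every registered timer,
// which is what the TERMINATE mode relies on to flush timer state.
public class TimerQueue {
    private final PriorityQueue<Long> timers = new PriorityQueue<>();
    private final List<Long> fired = new ArrayList<>();

    void registerEventTimeTimer(long timestamp) {
        timers.add(timestamp);
    }

    // Fire all timers whose timestamp is <= the new watermark.
    void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.peek() <= watermark) {
            fired.add(timers.poll());
        }
    }

    public static void main(String[] args) {
        TimerQueue q = new TimerQueue();
        q.registerEventTimeTimer(1_000L);
        q.registerEventTimeTimer(5_000L);
        q.advanceWatermark(2_000L);         // only the first timer fires
        q.advanceWatermark(Long.MAX_VALUE); // TERMINATE: everything fires
        System.out.println(q.fired);        // [1000, 5000]
    }
}
```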
> > > >
> > > > A google doc with the FLIP proposal can be found here:
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1EZf6pJMvqh_HeBCaUOnhLUr9JmkhfPgn6Mre_z6tgp8/edit?usp=sharing
> > > >
> > > > And the page for the FLIP is here:
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212
> > > >
> > > >  The implementation sketch is far from complete, but it is worth
> > having a
> > > > discussion on the semantics as soon as possible. The implementation
> > > section
> > > > is going to be updated soon.
> > > >
> > > >  Looking forward to the discussion,
> > > >  Kostas
> > > >
> > > > --
> > > >
> > > > Kostas Kloudas | Software Engineer
> > > >
> > > >
> > > > 
> > > >
> > > > Follow us @VervericaData
> > > >
> > > > --
> > > >
> > > > Join Flink Forward  - The Apache Flink
> > > > Conference
> > > >
> > > > Stream Processing | Event Driven | Real Time
> > > >
> > > > --
> > > >
> > > > Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> > > >
> > > > --
> > > > Data Artisans GmbH
> > > > Registered at Amtsgericht Charlottenburg: HRB 158244 B
> > > > Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
> > > >
> > >
> >
>


Re: [DISCUSS] FLIP-27: Refactor Source Interface

2019-03-28 Thread Biao Liu
Hi Steven,
Thank you for the feedback. Please take a look at the FLIP-27 document
<https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface>,
which was updated recently. A lot of details about the enumerator were
added to that document. I think it would help.

Steven Wu  于2019年3月28日周四 下午12:52写道:

> This proposal mentioned that SplitEnumerator might run on the JobManager or
> in a single task on a TaskManager.
>
> if enumerator is a single task on a taskmanager, then the job DAG can never
> be embarrassingly parallel anymore. That will nullify the leverage of
> fine-grained recovery for embarrassingly parallel jobs.
>
> It's not clear to me what's the implication of running enumerator on the
> jobmanager. So I will leave that out for now.
>
> On Mon, Jan 28, 2019 at 3:05 AM Biao Liu  wrote:
>
> > Hi Stephan & Piotrek,
> >
> > Thank you for feedback.
> >
> > It seems that there are a lot of things to do in the community. I am just
> > afraid that this discussion may be forgotten since there are so many
> > proposals recently.
> > Anyway, wish to see the split topics soon :)
> >
> > Piotr Nowojski  于2019年1月24日周四 下午8:21写道:
> >
> > > Hi Biao!
> > >
> > > This discussion was stalled because of preparations for the open
> sourcing
> > > & merging Blink. I think before creating the tickets we should split
> this
> > > discussion into topics/areas outlined by Stephan and create Flips for
> > that.
> > >
> > > I think there is no chance for this to be completed in couple of
> > remaining
> > > weeks/1 month before 1.8 feature freeze, however it would be good to
> aim
> > > with those changes for 1.9.
> > >
> > > Piotrek
> > >
> > > > On 20 Jan 2019, at 16:08, Biao Liu  wrote:
> > > >
> > > > Hi community,
> > > > The summary of Stephan makes a lot of sense to me. It is much clearer
> > > > indeed after splitting the complex topic into small ones.
> > > > I was wondering, is there any detailed plan for the next step? If not,
> > > > I would like to push this forward by creating some JIRA issues.
> > > > Another question: should version 1.8 include these features?
> > > >
> > > > Stephan Ewen  于2018年12月1日周六 上午4:20写道:
> > > >
> > > >> Thanks everyone for the lively discussion. Let me try to summarize
> > > where I
> > > >> see convergence in the discussion and open issues.
> > > >> I'll try to group this by design aspect of the source. Please let me
> > > know
> > > >> if I got things wrong or missed something crucial here.
> > > >>
> > > >> For issues 1-3, if the below reflects the state of the discussion, I
> > > would
> > > >> try and update the FLIP in the next days.
> > > >> For the remaining ones we need more discussion.
> > > >>
> > > >> I would suggest to fork each of these aspects into a separate mail
> > > thread,
> > > >> or we will lose sight of the individual aspects.
> > > >>
> > > >> *(1) Separation of Split Enumerator and Split Reader*
> > > >>
> > > >>  - All seem to agree this is a good thing
> > > >>  - Split Enumerator could in the end live on JobManager (and assign
> > > splits
> > > >> via RPC) or in a task (and assign splits via data streams)
> > > >>  - this discussion is orthogonal and should come later, when the
> > > interface
> > > >> is agreed upon.
> > > >>
> > > >> *(2) Split Readers for one or more splits*
> > > >>
> > > >>  - Discussion seems to agree that we need to support one reader that
> > > >> possibly handles multiple splits concurrently.
> > > >>  - The requirement comes from sources where one poll()-style call
> > > fetches
> > > >> data from different splits / partitions
> > > >>--> example sources that require that would be for example Kafka,
> > > >> Pravega, Pulsar
> > > >>
> > > >>  - Could have one split reader per source, or multiple split readers
> > > that
> > > >> share the "poll()" function
> > > >>  - To not make it too complicated, we can start with thinking about
> > one
> > > >> split reader for all splits initially and see if that covers all
> > > >

Re: [jira] [Created] (FLINK-12153) Error when submitting a Flink job to the Flink environment

2019-04-10 Thread Biao Liu
Hi gaojunjie,
1. Please use English to describe your JIRA issue although I think this is
more like a question not a bug report
2. You can send your question in flink-user-zh mailing list which Chinese
is supported
3. I think the exception is clear enough that this feature is not supported
in your Hadoop version
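For reference, the kind of version gate that produces such an exception can be sketched like this; `isAtLeastVersion`/`checkSupported` are illustrative helpers, not Flink's actual implementation:

```java
// Illustrative sketch of a Hadoop version gate similar to the check that
// throws the UnsupportedOperationException quoted below: recoverable
// writers need Hadoop 2.7+, so older versions are rejected up front.
public class VersionGate {
    // Compare a "major.minor.patch" version string against major.minor.
    static boolean isAtLeastVersion(String actual, int major, int minor) {
        String[] parts = actual.split("\\.");
        int actualMajor = Integer.parseInt(parts[0]);
        int actualMinor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return actualMajor > major || (actualMajor == major && actualMinor >= minor);
    }

    static void checkSupported(String hadoopVersion) {
        if (!isAtLeastVersion(hadoopVersion, 2, 7)) {
            throw new UnsupportedOperationException(
                "Recoverable writers on Hadoop are only supported for HDFS and "
                    + "for Hadoop version 2.7 or newer");
        }
    }

    public static void main(String[] args) {
        System.out.println(isAtLeastVersion("2.7.7", 2, 7)); // true
        System.out.println(isAtLeastVersion("2.6.5", 2, 7)); // false
        checkSupported("2.7.7"); // passes silently
    }
}
```

Note that what matters is the Hadoop version Flink actually loads at runtime, which may differ from the version installed on the cluster.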

gaojunjie (JIRA)  于2019年4月10日周三 下午5:06写道:

> gaojunjie created FLINK-12153:
> -
>
>  Summary: Error when submitting a Flink job to the Flink environment
>  Key: FLINK-12153
>  URL: https://issues.apache.org/jira/browse/FLINK-12153
>  Project: Flink
>   Issue Type: Bug
> Affects Versions: 1.7.2
>  Environment: flink maven
>
>
> <dependency>
>   <groupId>org.apache.flink</groupId>
>   <artifactId>flink-streaming-java_2.12</artifactId>
>   <version>1.7.1</version>
> </dependency>
>
> <dependency>
>   <groupId>org.apache.flink</groupId>
>   <artifactId>flink-connector-kafka-0.11_2.12</artifactId>
>   <version>1.7.1</version>
> </dependency>
>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-hdfs</artifactId>
>   <version>2.7.2</version>
>   <exclusions>
>     <exclusion>
>       <groupId>xml-apis</groupId>
>       <artifactId>xml-apis</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-common</artifactId>
>   <version>2.7.2</version>
> </dependency>
>
> <dependency>
>   <groupId>org.apache.flink</groupId>
>   <artifactId>flink-hadoop-compatibility_2.12</artifactId>
>   <version>1.7.1</version>
> </dependency>
>
> <dependency>
>   <groupId>org.apache.flink</groupId>
>   <artifactId>flink-connector-filesystem_2.12</artifactId>
>   <version>1.7.1</version>
> </dependency>
>
> <dependency>
>   <groupId>org.apache.flink</groupId>
>   <artifactId>flink-connector-elasticsearch5_2.12</artifactId>
>   <version>1.7.1</version>
> </dependency>
>
>
>
>
>
>
> Hadoop environment version: 2.7.7
>
>
> Reporter: gaojunjie
>
>
> java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are
> only supported for HDFS and for Hadoop version 2.7 or newer
> at
> org.apache.flink.runtime.fs.hdfs.HadoopRecoverableWriter.<init>(HadoopRecoverableWriter.java:57)
> at
> org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.createRecoverableWriter(HadoopFileSystem.java:202)
> at
> org.apache.flink.core.fs.SafetyNetWrapperFileSystem.createRecoverableWriter(SafetyNetWrapperFileSystem.java:69)
> at
> org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.<init>(Buckets.java:112)
> at
> org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink$RowFormatBuilder.createBuckets(StreamingFileSink.java:242)
> at
> org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink.initializeState(StreamingFileSink.java:327)
> at
> org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
> at
> org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
> at
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
> at
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:278)
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:738)
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:289)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704)
> at java.lang.Thread.run(Thread.java:748)
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v7.6.3#76005)
>


Re: [ANNOUNCE] Apache Flink 1.8.0 released

2019-04-10 Thread Biao Liu
Great news! Thanks Aljoscha and all the contributors.

Till Rohrmann  于2019年4月10日周三 下午6:11写道:

> Thanks a lot to Aljoscha for being our release manager and to the
> community making this release possible!
>
> Cheers,
> Till
>
> On Wed, Apr 10, 2019 at 12:09 PM Hequn Cheng  wrote:
>
>> Thanks a lot for the great release Aljoscha!
>> Also thanks for the work by the whole community. :-)
>>
>> Best, Hequn
>>
>> On Wed, Apr 10, 2019 at 6:03 PM Fabian Hueske  wrote:
>>
>>> Congrats to everyone!
>>>
>>> Thanks Aljoscha and all contributors.
>>>
>>> Cheers, Fabian
>>>
>>> Am Mi., 10. Apr. 2019 um 11:54 Uhr schrieb Congxian Qiu <
>>> qcx978132...@gmail.com>:
>>>
 Cool!

 Thanks Aljoscha a lot for being our release manager, and all the others
 who make this release possible.

 Best, Congxian
 On Apr 10, 2019, 17:47 +0800, Jark Wu , wrote:
 > Cheers!
 >
 > Thanks Aljoscha and all others who make 1.8.0 possible.
 >
 > On Wed, 10 Apr 2019 at 17:33, vino yang 
 wrote:
 >
 > > Great news!
 > >
 > > Thanks Aljoscha for being the release manager and thanks to all the
 > > contributors!
 > >
 > > Best,
 > > Vino
 > >
 > > Driesprong, Fokko  于2019年4月10日周三 下午4:54写道:
 > >
 > > > Great news! Great effort by the community to make this happen.
 Thanks all!
 > > >
 > > > Cheers, Fokko
 > > >
 > > > Op wo 10 apr. 2019 om 10:50 schreef Shaoxuan Wang <
 wshaox...@gmail.com>:
 > > >
 > > > > Thanks Aljoscha and all others who made contributions to FLINK
 1.8.0.
 > > > > Looking forward to FLINK 1.9.0.
 > > > >
 > > > > Regards,
 > > > > Shaoxuan
 > > > >
 > > > > On Wed, Apr 10, 2019 at 4:31 PM Aljoscha Krettek <
 aljos...@apache.org>
 > > > > wrote:
 > > > >
 > > > > > The Apache Flink community is very happy to announce the
 release of
 > > > > Apache
 > > > > > Flink 1.8.0, which is the next major release.
 > > > > >
 > > > > > Apache Flink® is an open-source stream processing framework
 for
 > > > > > distributed, high-performing, always-available, and accurate
 data
 > > > > streaming
 > > > > > applications.
 > > > > >
 > > > > > The release is available for download at:
 > > > > > https://flink.apache.org/downloads.html
 > > > > >
 > > > > > Please check out the release blog post for an overview of the
 > > > > improvements
 > > > > > for this bugfix release:
 > > > > > https://flink.apache.org/news/2019/04/09/release-1.8.0.html
 > > > > >
 > > > > > The full release notes are available in Jira:
 > > > > >
 > > > > >
 > > > >
 > > >
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
 > > > > >
 > > > > > We would like to thank all contributors of the Apache Flink
 community
 > > > who
 > > > > > made this release possible!
 > > > > >
 > > > > > Regards,
 > > > > > Aljoscha
 > > > >
 > > >
 > >

>>>


Re: Introducing Flink's Plugin mechanism

2019-04-10 Thread Biao Liu
Hi Stefan & Piotr,
Thank you for bringing up this discussion. As Zhijiang said, class conflicts
cause a lot of trouble in our production environment.
I was wondering, is there any design document currently? It might be helpful
for understanding the PR, and even the whole picture, since as Piotr said it
could be extended to other modules in the future.

Stefan Richter  于2019年4月10日周三 下午11:22写道:

> Thank you Piotr for bringing this discussion to the mailing list! As it
> was not explicitly mentioned in the first email, I wanted to add that there
> is also already an open PR[1] with my implementation of the basic plugin
> mechanism for FileSystem. Looking forward to some feedback from the
> community.
>
>
> [1] https://github.com/apache/flink/pull/8038 <
> https://github.com/apache/flink/pull/8038>
>
> Best,
> Stefan
>
> > On 10. Apr 2019, at 17:08, zhijiang 
> wrote:
> >
> > Thanks Piotr for proposing this new feature.
> >
> > The solution for the class loader issue is really helpful in production,
> > and we often encountered this pain point before.
> > This pluggable mechanism might bring more possibilities.
> > Hope to see the progress soon. :)
> >
> > Best,
> > Zhijiang
> > --
> > From:Jeff Zhang 
> > Send Time:2019年4月10日(星期三) 22:01
> > To:dev 
> > Subject:Re: Introducing Flink's Plugin mechanism
> >
> > Thank Piotr for driving this plugin mechanism.  Pluggability is pretty
> > important for the ecosystem of flink.
> >
> > Piotr Nowojski  于2019年4月10日周三 下午5:48写道:
> >
> >> Hi Flink developers,
> >>
> >> I would like to introduce a new plugin loading mechanism that we are
> >> working on right now [1]. The idea is quite simple: isolate services in
> >> separate independent class loaders, so that classes and dependencies do
> not
> >> leak between them and/or Flink runtime itself. Currently we have quite
> some
> >> problems with dependency convergence in multiple places. Some of them we
> >> are solving by shading (built in file systems, metrics), some we are
> >> forcing users to deal with them (custom file systems/metrics) and
> others we
> >> do not solve (connectors - we do not support using different Kafka
> versions
> >> in the same job/SQL). With proper plugins, that are loaded in
> independent
> >> class loaders, those issues could be solved in a generic way.
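A minimal sketch of the "service in its own class loader" idea; `loadPlugin` and its signature are assumptions for illustration, not the API from the PR:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ServiceLoader;

// Illustrative sketch only: each plugin gets its own URLClassLoader so its
// dependencies cannot leak into Flink's runtime or into other plugins. The
// real mechanism would additionally restrict what the parent loader
// exposes (e.g. only core API classes); that whitelisting is omitted here.
public class PluginLoaderSketch {
    static <P> ServiceLoader<P> loadPlugin(Class<P> service, URL[] pluginJars) {
        ClassLoader isolated =
            new URLClassLoader(pluginJars, PluginLoaderSketch.class.getClassLoader());
        // ServiceLoader discovers implementations via META-INF/services
        // entries visible to the given class loader.
        return ServiceLoader.load(service, isolated);
    }

    public static void main(String[] args) {
        // With no plugin jars there are no registered providers to discover.
        ServiceLoader<CharSequence> plugins = loadPlugin(CharSequence.class, new URL[0]);
        System.out.println(plugins.iterator().hasNext()); // false
    }
}
```

Because each plugin's loader is independent, two plugins could bundle conflicting versions of the same dependency without affecting each other.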
> >>
> >> Current scope of implementation targets only file systems, without a
> >> centralised Plugin architecture and with Plugins that are only
> “statically”
> >> initialised at the TaskManager and JobManager start up. More or less we
> are
> >> just replacing the way how FileSystem’s implementations are discovered &
> >> loaded.
> >>
> >> In the future this idea could be extended to different modules, like
> >> metric reporters, connectors, functions/data types (especially in SQL),
> >> state backends, internal storage or other future efforts. Some of those
> >> would be easier than others: the metric reporters would require some
> >> smaller refactor, while connectors would require some bigger API design
> >> discussions, which I would like to avoid at the moment. Nevertheless I
> >> wanted to reach out with this idea so if some other potential use cases
> pop
> >> up in the future, more people will be aware.
> >>
> >> Piotr Nowojski
> >>
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-11952 <
> >> https://issues.apache.org/jira/browse/FLINK-11952>
> >
> >
> >
> > --
> > Best Regards
> >
> > Jeff Zhang
> >
>
>


Re: [DISCUSS] Features for Apache Flink 1.9.0

2019-05-28 Thread Biao Liu
Thanks for being the release manager, Gordon & Kurt.

For FLIP-27, there are still some details that need to be discussed. I don't
think it can catch up with the release of 1.9. @Aljoscha, @Stephan, do you
agree?

zhijiang  于2019年5月28日周二 下午11:28写道:

> Hi Gordon,
>
> Thanks for the kind reminder of feature freeze date for 1.9.0. I think the
> date makes sense on my side.
>
> For FLIP-31, I and Andrey could be done within two weeks or so.
> And I already finished my side work for FLIP-1.
>
> Best,
> Zhijiang
>
>
> --
> From:Timo Walther 
> Send Time:2019年5月28日(星期二) 19:26
> To:dev 
> Subject:Re: [DISCUSS] Features for Apache Flink 1.9.0
>
> Thanks for being the release managers, Kurt and Gordon!
>
>  From the Table & SQL API side, there are still a lot of open issues
> that need to be solved to decouple the API from a planner and enable the
> Blink planner. Also we need to make sure that the Blink planner supports
> at least everything of Flink 1.8 to not introduce a regression. We might
> need to focus more on the main features which is a runnable Blink
> planner and might need to postpone other discussions such as DDL, new
> source/sink interfaces, or proper type inference logic. However, in many
> cases there are shortcuts that we could take in order to achieve our
> goals. So I'm confident that we solve the big blockers until the feature
> freeze :)
>
> I will keep you updated.
>
> Thanks,
> Timo
>
>
> Am 28.05.19 um 05:07 schrieb Kurt Young:
> > Thanks Gordon for bringing this up.
> >
> > I'm glad to say that blink planner merge work is almost done, and i will
> > follow up the work of
> > integrating blink planner with Table API to co-exist with current flink
> > planner.
> >
> > In addition to this, the following features:
> > 1. FLIP-32: Restructure flink-table for future contributions [1]
> > 2. FLIP-37: Rework of the Table API Type System [2]
> > 3. Hive integration work (including hive meta [3] and connectors)
> >
> > are also going well, i will spend some time to keep track of them.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-32%3A+Restructure+flink-table+for+future+contributions
> > [2]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System
> > [3]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs
> >
> > Best,
> > Kurt
> >
> >
> > On Mon, May 27, 2019 at 7:18 PM jincheng sun 
> > wrote:
> >
> >> Hi Gordon,
> >>
> >> Thanks for mention the feature freeze date for 1.9.0, that's very
> helpful
> >> for contributors to evaluate their dev plan!
> >>
> >> Regarding FLIP-29, we are glad to do our best to finish the dev of
> FLIP-29,
> >> then catch up with the release of 1.9.
> >>
> >> Thanks again for push the release of 1.9.0 forward!
> >>
> >> Cheers,
> >> Jincheng
> >>
> >>
> >>
> >> Tzu-Li (Gordon) Tai  于2019年5月27日周一 下午5:48写道:
> >>
> >>> Hi all,
> >>>
> >>> I want to kindly remind the community that we're now 5 weeks away from
> >> the
> >>> proposed feature freeze date for 1.9.0, which is June 28.
> >>>
> >>> This is not yet a final date we have agreed on, so I would like to
> start
> >>> collecting feedback on how the mentioned features are going, and in
> >>> general, whether or not the date sounds reasonable given the current
> >> status
> >>> of the ongoing efforts.
> >>> Please let me know what you think!
> >>>
> >>> Cheers,
> >>> Gordon
> >>>
> >>>
> >>> On Mon, May 27, 2019 at 5:40 PM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org
> >>>
> >>> wrote:
> >>>
>  @Hequn @Jincheng
> 
>  Thanks for bringing up FLIP-29 to attention.
>  As previously mentioned, the original list is not a fixed feature set,
> >> so
>  if FLIP-29 has ongoing efforts and can make it before the feature
> >> freeze,
>  then of course it should be included!
> 
>  @himansh1306
> 
>  Concerning the ORC format for StreamingFileSink, is there already a
> >> JIRA
>  ticket tracking that? If not, I suggest to first open one and see if
> >>> there
>  are similar interests from committers in adding that.
> 
> 
>  On Sun, May 5, 2019 at 11:19 PM Hequn Cheng 
> >>> wrote:
> > Hi,
> >
> > Great job, Gordon! Thanks a lot for driving this and wrapping
> features
> >>> up
> > to a detailed list. +1 on it!
> >
> > Would be great if we can also add flip29 to the list. @jincheng sun
> >   and I are focusing on it these days. I
> >>> think
> > these features in flip29 would bring big enhancements to the Table
> >> API.
> > :-)
> >
> > Best, Hequn
> >
> > On Sun, May 5, 2019 at 10:41 PM Becket Qin 
> >>> wrote:
> >> Thanks for driving this release, Gordon. +1 on the feature list.
> >>
> >> This is a pretty exciting and ambitious release!
> >>
> >> Cheers,
> >>
> >> Jiangjie (Becket) Qin
> >>
> >> On Sun, May 5, 2019 at 4:28 

Re: [DISCUSS] Support Local Aggregation in Flink

2019-06-04 Thread Biao Liu
Hi Vino,

+1 for this feature. It's useful for data skew, and it could also reduce
the amount of shuffled data.

I have some concerns about the API part. From my side, this feature should
be more of an improvement. I'm afraid the proposal is overkill on the API
side. Many other systems support pre-aggregation as an optimization of
global aggregation. The optimization might be applied automatically or
manually, but with a simple API. The proposal introduces a series of
flexible local-aggregation APIs that can be independent of the global
aggregation. It doesn't look like an improvement so much as a set of new
features. I'm not sure if there is a bigger picture later. As of now, the
API part looks a little heavy to me.
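To see why local (pre-)aggregation helps with skew, here is a small self-contained simulation using plain Java maps as stand-ins for the proposed operators; none of this is the proposed API:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simulates local pre-aggregation: each "task" first combines its own
// records per key, so the downstream global aggregation receives at most
// one record per key per task instead of one record per input element.
public class LocalAggSketch {
    static Map<String, Integer> localAggregate(List<String> taskInput) {
        Map<String, Integer> partial = new HashMap<>();
        for (String key : taskInput) {
            partial.merge(key, 1, Integer::sum);
        }
        return partial;
    }

    public static void main(String[] args) {
        // A skewed input: "hot" dominates both tasks.
        List<String> task1 = List.of("hot", "hot", "hot", "hot", "cold");
        List<String> task2 = List.of("hot", "hot", "warm");

        Map<String, Integer> global = new HashMap<>();
        int shuffled = 0;
        for (List<String> task : List.of(task1, task2)) {
            Map<String, Integer> partial = localAggregate(task);
            shuffled += partial.size(); // records sent over the network
            partial.forEach((k, v) -> global.merge(k, v, Integer::sum));
        }

        System.out.println(shuffled);          // 4 instead of 8 raw records
        System.out.println(global.get("hot")); // 6
    }
}
```

The skewed "hot" key no longer overwhelms a single downstream task, because each upstream task forwards only one combined record for it.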


vino yang  于2019年6月5日周三 上午10:38写道:

> Hi Litree,
>
> From an implementation level, the localKeyBy API returns a general
> KeyedStream, you can call all the APIs which KeyedStream provides, we did
> not restrict its usage, although we can do this (for example returns a new
> stream object named LocalKeyedStream).
>
> However, to achieve the goal of local aggregation, it only makes sense to
> call the window API.
>
> Best,
> Vino
>
> litree  于2019年6月4日周二 下午10:41写道:
>
> > Hi Vino,
> >
> >
> > I have read your design. Something I want to know is the usage of these
> > new APIs. It looks like when I use localKeyBy, I must then use a window
> > operator to return a DataStream, and then use keyBy and another window
> > operator to get the final result?
> >
> >
> > thanks,
> > Litree
> >
> >
> > On 06/04/2019 17:22, vino yang wrote:
> > Hi Dian,
> >
> > Thanks for your reply.
> >
> > I know what you mean. However, if you think deeply, you will find your
> > implementation needs to provide an operator which looks like a window
> > operator. You need to use state, receive an aggregation function, and
> > specify the trigger time. It looks like a lightweight window operator.
> > Right?
> >
> > We try to reuse Flink-provided functions and reduce complexity. IMO, it
> > is more user-friendly because users are familiar with the window API.
> >
> > Best,
> > Vino
> >
> >
> > Dian Fu  于2019年6月4日周二 下午4:19写道:
> >
> > > Hi Vino,
> > >
> > > Thanks a lot for starting this discussion. +1 to this feature as I
> think
> > > it will be very useful.
> > >
> > > Regarding to using window to buffer the input elements, personally I
> > don't
> > > think it's a good solution for the following reasons:
> > > 1) As we know that WindowOperator will store the accumulated results in
> > > states, this is not necessary for Local Aggregate operator.
> > > 2) For WindowOperator, each input element will be accumulated to
> states.
> > > This is also not necessary for Local Aggregate operator and storing the
> > > input elements in memory is enough.
> > >
> > > Thanks,
> > > Dian
> > >
> > > > 在 2019年6月4日,上午10:03,vino yang  写道:
> > > >
> > > > Hi Ken,
> > > >
> > > > Thanks for your reply.
> > > >
> > > > As I said before, we try to reuse Flink's state concept (fault
> > tolerance
> > > > and guarantee "Exactly-Once" semantics). So we did not consider
> cache.
> > > >
> > > > In addition, if we use Flink's state, the OOM related issue is not a
> > key
> > > > problem we need to consider.
> > > >
> > > > Best,
> > > > Vino
> > > >
> > > > Ken Krugler  于2019年6月4日周二 上午1:37写道:
> > > >
> > > >> Hi all,
> > > >>
> > > >> Cascading implemented this “map-side reduce” functionality with an
> LLR
> > > >> cache.
> > > >>
> > > >> That worked well, as then the skewed keys would always be in the
> > cache.
> > > >>
> > > >> The API let you decide the size of the cache, in terms of number of
> > > >> entries.
> > > >>
> > > >> Having a memory limit would have been better for many of our use
> > cases,
> > > >> though FWIR there’s no good way to estimate in-memory size for
> > objects.
> > > >>
> > > >> — Ken
> > > >>
> > > >>> On Jun 3, 2019, at 2:03 AM, vino yang 
> wrote:
> > > >>>
> > > >>> Hi Piotr,
> > > >>>
> > > >>> The localKeyBy API returns an instance of KeyedStream (we just
> added
> > an
> > > >>> inner flag to identify the local mode) which is Flink has provided
> > > >> before.
> > > >>> Users can call all the APIs(especially *window* APIs) which
> > KeyedStream
> > > >>> provided.
> > > >>>
> > > >>> So if users want to use local aggregation, they should call the
> > window
> > > >> API
> > > >>> to build a local window that means users should (or say "can")
> > specify
> > > >> the
> > > >>> window length and other information based on their needs.
> > > >>>
> > > >>> I think you described another idea different from us. We did not
> try
> > to
> > > >>> react after triggering some predefined threshold. We tend to give
> > users
> > > >> the
> > > >>> discretion to make decisions.
> > > >>>
> > > >>> Our design idea tends to reuse Flink provided concept and functions
> > > like
> > > >>> state and window (IMO, we do not need to worry about OOM and the
> > issues
> > > >> you
> > > >>> mentioned).
> > > >>>
> > > >>> Best,
> > > >>> V

Re: [DISCUSS] Allow at-most-once delivery in case of failures

2019-06-11 Thread Biao Liu
Hi Xiaogang, it's an interesting discussion.

I have heard similar feature requirements before. Some users need a
lighter failover strategy since, as you mentioned, correctness is not so
critical for their scenario. Moreover, some jobs may not enable
checkpointing at all; a global/region failover strategy doesn't really
make sense for these jobs. The individual failover strategy doesn't work
well for these scenarios either, since it currently only supports a
topology without edges.
Actually, we have implemented a best-effort failover strategy in our
private branch. It differs a little from your proposal in that it doesn't
support an at-most-once mechanism. It has a weaker consistency model but
faster recovery. I think it would satisfy your scenario.


SHI Xiaogang  于2019年6月11日周二 下午4:33写道:

> Flink offers a fault-tolerance mechanism to guarantee at-least-once and
> exactly-once message delivery in case of failures. The mechanism works well
> in practice and makes Flink stand out among stream processing systems.
>
> But the guarantee on at-least-once and exactly-once delivery does not come
> without price. It typically requires to restart multiple tasks and fall
> back to the place where the last checkpoint is taken. (Fined-grained
> recovery can help alleviate the cost, but it still needs certain efforts to
> recover jobs.)
>
> In some scenarios, users prefer quick recovery and will trade correctness
> off. For example, in some online recommendation systems, timeliness is far
> more important than consistency. In such cases, we can restart only those
> failed tasks individually, and do not need to perform any rollback. Though
> some messages delivered to failed tasks may be lost, other tasks can
> continuously provide service to users.
>
> Many of our users are demanding for at-most-once delivery in Flink. What do
> you think of the proposal? Any feedback is appreciated.
>
> Regards,
> Xiaogang Shi
>


Re: [DISCUSS] Allow at-most-once delivery in case of failures

2019-06-11 Thread Biao Liu
Hi Piotrek,
I agree with you that the community's resources for supporting such a
feature are strained. I was planning to start a similar discussion after
1.9 is released. Anyway, we don't have enough time to support this feature
now, but I think a discussion is fine.
Your question about checkpoint semantics is very interesting. I think it is
worth supporting, though it might not be a small modification.

There is also a big gap that needs discussion. Currently, network error
handling is tightly coupled with the task failover strategy. A typical
scenario: if a TM crashes, all the tasks of the TMs connected with the
failed TM fail automatically. In our internal implementation, this was the
biggest part of supporting the best-effort failover strategy.


Piotr Nowojski  于2019年6月11日周二 下午5:31写道:

> Hi Xiaogang,
>
> It sounds interesting and definitely a useful feature, however the
> questions for me would be how useful, how much effort would it require and
> is it worth it? We simply can not do all things at once, and currently
> people that could review/drive/mentor this effort are pretty much strained
> :( For me one would have to investigate answers to those questions and
> prioritise it compared to other ongoing efforts, before I could vote +1 for
> this.
>
> Couple of things to consider:
> - would it be only a job manager/failure region recovery feature?
> - would it require changes in CheckpointBarrierHandler,
> CheckpointCoordinator classes?
> - with `at-most-once` semantic theoretically speaking we could just drop
> the current `CheckpointBarrier` handling/injecting code and avoid all of
> the checkpoint alignment issues - we could just checkpoint all of the tasks
> independently of one another. However maybe that could be a follow up
> optimisation step?
>
> Piotrek
>
> > On 11 Jun 2019, at 10:53, Zili Chen  wrote:
> >
> > Hi Xiaogang,
> >
> > It is an interesting topic.
> >
> > Notice that there is some effort to build a mature ML library for Flink
> > these days; it could also be possible for some ML cases to trade off
> > correctness for timeliness or throughput. Exactly-once delivery is
> > exactly what makes Flink stand out, but an at-most-once option would
> > adapt Flink to more scenarios.
> >
> > Best,
> > tison.
> >
> >
> > SHI Xiaogang  于2019年6月11日周二 下午4:33写道:
> >
> >> Flink offers a fault-tolerance mechanism to guarantee at-least-once and
> >> exactly-once message delivery in case of failures. The mechanism works
> well
> >> in practice and makes Flink stand out among stream processing systems.
> >>
> >> But the guarantee on at-least-once and exactly-once delivery does not
> come
> >> without price. It typically requires to restart multiple tasks and fall
> >> back to the place where the last checkpoint is taken. (Fined-grained
> >> recovery can help alleviate the cost, but it still needs certain
> efforts to
> >> recover jobs.)
> >>
> >> In some scenarios, users prefer quick recovery and will trade correctness
> >> off. For example, in some online recommendation systems, timeliness is
> far
> >> more important than consistency. In such cases, we can restart only
> those
> >> failed tasks individually, and do not need to perform any rollback.
> Though
> >> some messages delivered to failed tasks may be lost, other tasks can
> >> continuously provide service to users.
> >>
> >> Many of our users are demanding for at-most-once delivery in Flink.
> What do
> >> you think of the proposal? Any feedback is appreciated.
> >>
> >> Regards,
> >> Xiaogang Shi
> >>
>
>


Re: Something wrong with travis?

2019-06-18 Thread Biao Liu
It has been broken for more than 14 hours. Hope it recovers soon.

Jeff Zhang  于2019年6月18日周二 下午3:21写道:

> If it is travis caching issue, we can file apache infra ticket and ask them
> to clean the cache.
>
>
>
> Chesnay Schepler  于2019年6月18日周二 下午3:18写道:
>
> > This is (hopefully a short-lived) hiccup on the Travis caching
> > infrastructure.
> >
> > There's nothing we can do to _fix_ it; if it persists we'll have to
> > rework our travis setup again to not rely on caching.
> >
> > On 18/06/2019 08:34, Kurt Young wrote:
> > > Hi dev,
> > >
> > > I noticed that all the travis tests triggered by pull request are
> failed
> > > with the same error:
> > >
> > > "Cached flink dir /home/travis/flink_cache/x/flink does not exist.
> > > Exiting build."
> > >
> > > Anyone have a clue on what happened and how to fix this?
> > >
> > > Best,
> > > Kurt
> > >
> >
> >
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: [DISCUSS] Start a user...@flink.apache.org mailing list for the Chinese-speaking community?

2019-06-24 Thread Biao Liu
Hi community,

I have tested the two most widely used mailbox websites, mail.qq.com and
mail.163.com (which belongs to Netease).

QQ mail works fine, good job @vino yang !
163 mail almost works fine. I have received all the response mails,
although the last one was marked as spam.

I received the new mail of the mailing list from both of them. But there is
a delay of 7 or 8 minutes with 163.com. I'm not sure whether it's always
like this. I will keep observing it.

Another thing: the survey of @Hequn Cheng  shows that
about 14 people said their questions were not answered in the mailing list.
There are always some questions that are hard to answer in the user-zh
mailing list, like "why is my job delayed" without providing any detail, or
questions that are more relevant to Java or the operating system than to
Flink. Mails like these are probably ignored.

I have a proposal: we Chinese speakers of the community could do better if
we give them a response even if it's not an answer.
Positive feedback is helpful for users without a good technical
background. I think the user-zh mailing list would be more active if each
question got a response.


On Mon, Jun 24, 2019 at 10:34 AM Hequn Cheng  wrote:

> Hi,
>
> I'd like to share you the result of the survey. Thanks for the help
> from @Gordon, @Jark and the Flink China operation team when conducting the
> survey.
>
> A total of 81 people participated in the survey. 46 of them chose not to
> use the mailing list. Among these people, the reasons are(Note that a
> respondent may choose multiple reasons):
> 1. 22 people report that problems can be solved in the Dingtalk group and
> it's more convenient than the mailing list.
> 2. 22 people don't even know there is a Chinese user mailing list.
> 3. 20 people don't know how to use the mailing list even though they have
> ever heard of it.
> 4. 14 people said that problems can't be solved in the mailing list even
> they asked in it.
> 5. 16 people chose to use the English user mailing list.
>
> From the result, the biggest obstacle that stops more people from getting
> involved in the mailing list is that people don't know about it or don't
> know how to use it. To
> solve the problem, we can do more publicity. I have also recorded a usage
> video about how to subscribe and use the mailing list. Hope it will help.
>
> However, doing more publicity is not enough as 14 people said problems
> can't be solved efficiently in the mailing list. More people should also be
> involved to answer the problems. I don't know whether it is possible for
> the Chinese Flink team on duty to answer the problems. I think that would
> help.
>
> Great to have other opinions.
>
> Best, Hequn
>
>
> On Fri, Jun 21, 2019 at 7:50 PM Hequn Cheng  wrote:
>
> > Hi vino,
> >
> > Thanks a lot for unblocking the email address. I have told the user about
> > this.
> > Hope things can get better.
> >
> > Best, Hequn
> >
> > On Fri, Jun 21, 2019 at 3:14 PM vino yang  wrote:
> >
> >> Hi Hequn,
> >>
> >> Thanks for reporting this case.
> >>
> >> The reason replied by QQ mail team is also caused by *bounce attack*.
> So
> >> this mail address has been intercepted and it's an IP level
> interception.
> >>
> >> Today, the QQ mail team has unblocked this email address. So it can
> >> receive
> >> the follow-up email from Apache mail server normally.
> >>
> >> If this email address still can not work normally in the future. Please
> >> report it here again.
> >>
> >> Best,
> >> Vino
> >>
> >>
> >> On Fri, Jun 21, 2019 at 2:39 PM Hequn Cheng  wrote:
> >>
> >> > Hi Vino,
> >> >
> >> > Great thanks for your help.
> >> >
> >> > > So if someone reports that they can't receive the email from Apache
> >> mail
> >> > server, they can provide more detailed information to the QQ mailbox
> to
> >> > facilitate the location problem.
> >> >
> >> > I just got one feedback.
> >> > A user(173855...@qq.com) report that he can't receive the emails from
> >> the
> >> > Chinese-speaking mailing list. He had subscripted successfully on
> >> > 2019-05-10. Everything goes well until 2019-05-10 and no more emails
> >> come
> >> > again from the mailing list.
> >> >
> >> > Best, Hequn
> >> >
> >> > On Fri, Jun 21, 2019 at 12:56 PM vino yang 
> >> wrote:
> >> >
> >> > > Hi Kurt,
> >> > >
> >> > > I have copied my reply to the Jira issue of INFRA[1].
> >> > >
> >> > > Within my ability, I am happy to coordinate and promote this
> problem.
> >> > >
> >> > > Best,
> >> > > Vino
> >> > >
> >> > > [1]: https://issues.apache.org/jira/browse/INFRA-18249
> >> > >
> >> > > On Fri, Jun 21, 2019 at 12:11 PM Kurt Young  wrote:
> >> > >
> >> > > > Hi vino,
> >> > > >
> >> > > > Thanks for your effort. Could you also share this information with
> >> > apache
> >> > > > INFRA? Maybe we can find a workable solution together.
> >> > > > You can try to leave comments in this jira:
> >> > > > https://issues.apache.org/jira/browse/INFRA-18249)
> >> > > >
> >> > > > Best,
> >> > > > Kurt
> >> > > >
> >> > > >
> >> > > > On Fri, Jun 21, 2019 at 11:45 AM vino yang  >
> >> > > wrote:
> >> > > 

Re: [ANNOUNCE] Jincheng Sun is now part of the Flink PMC

2019-06-24 Thread Biao Liu
Congratulations Jincheng!


On Tue, Jun 25, 2019 at 10:25 AM qianjin Xu  wrote:

> Congratulations Jincheng!
>
>
> Best
>
> Forward
>
>
>
>
> On Mon, Jun 24, 2019 at 11:09 PM Robert Metzger  wrote:
>
> > Hi all,
> >
> > On behalf of the Flink PMC, I'm happy to announce that Jincheng Sun is
> now
> > part of the Apache Flink Project Management Committee (PMC).
> >
> > Jincheng has been a committer since July 2017. He has been very active on
> > Flink's Table API / SQL component, as well as helping with releases.
> >
> > Congratulations & Welcome Jincheng!
> >
> > Best,
> > Robert
> >
>


Re: [DISCUSS] META-FLIP: Sticking (or not) to a strict FLIP voting process

2019-06-27 Thread Biao Liu
Hi community,

Thanks Aljoscha for bringing us this discussion.

As Aljoscha said, "lazy majority" has always been the voting rule of FLIPs.
It seems that people just ignored or didn't realize this rule.
My concern is what we can do to make sure developers will obey the rules.
I think Kurt has given a good suggestion. Since the community is growing
bigger and bigger, maybe we need some volunteers to host the progress of
FLIPs, like starting a discussion/vote on the ML or updating the sheet of
the FLIP document [1].

1.
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals



On Thu, Jun 27, 2019 at 2:56 PM Dawid Wysakowicz  wrote:

> Hi all,
>
> I do very much agree with the statement from Aljosha's initial message,
> which is currently also expressed in the description page of a FLIP.
>
> These will stick around for quite a while after they're implemented and the
> PMC (and the committers) has the burden of maintaining them. I think that
> therefore FLIP votes are even more important than release votes, because they
> steer the long-term direction of Flink.
>
>
> Therefore I think we should enforce following the lazy majority approach.
> I will probably just repeat what was already said, but I do think this
> would make the decisions more visible, easier to reference in case of
> related decisions, and also this would show if the community has capacity
> to implement the FLIP. Nowadays, even if a FLIP is "accepted" it might be
> just stale because there are no committers that have the capacity to help
> with the changes.
>
> Another, maybe an orthogonal issue, is that we could maybe use this
> process for agreeing on a scope of a release. I think it might make sense
> to construct a release plan of an accepted FLIPs. This would enforce better
> scoping of FLIPs, as they would have to fit into a single release. In my
> opinion FLIPs that span multiple releases (thus even over multiple years)
> are rarely relevant in the future anymore, as the project evolves and it
> usually makes sense to revisit the original proposal anyway. This would
> have the benefits that:
>
>- we have a clear scope for a release rather than just a vague list of
>features that we want to have.
>- the whole community is on the same page what a certain feature means
>- the scope does not change drastically during the development period
>
> As for what should and what should not deserve a FLIP, I actually quite
> like the definition in the FLIPs page[1]. I think it does make sense to
> have a FLIP, and as a result a voting process, for any *public* or major
> change. I agree with Gordon. Even if the change is trivial it might affect
> external systems/users and it is also a commitment from the community.
> Therefore I think they deserve a vote.
>
> Lastly, I think Jark raised a valid point. We should have a clear
> understanding what binding votes in this case mean. I think it makes sense
> to consider PMC's and committers' votes as binding for FLIPs voting.
> Otherwise we would lose the aspect of committing to help with getting the
> FLIP into the codebase.
>
> To sum up I would opt for enforcing the lazy majority. I would suggest to
> consider constructing a release plan with a list of accepted FLIPs.
>
> Best,
>
> Dawid
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Whatisconsidereda%22majorchange%22thatneedsaFLIP
> ?
> On 27/06/2019 04:15, Jark Wu wrote:
>
> +1 for sticking to the lazy majority voting.
>
> A question from my side, the 3+1 votes are binding votes which only active
> (i.e. non-emeritus) committers and PMC members have?
>
>
> Best,
> Jark
>
>
> On Wed, 26 Jun 2019 at 19:07, Tzu-Li (Gordon) Tai  
> 
> wrote:
>
>
> +1 to enforcing lazy majority voting for future FLIPs, starting from FLIPs
> that are still currently under discussion (by the time we've agreed on the
> FLIP voting process).
>
> My two cents concerning "what should and shouldn't be a FLIP":
>
> I can understand Chesnay's argument about how some FLIPs, while meeting the
> criteria defined by the FLIP guidelines, feel to not be sufficiently large
> to justify a FLIP.
> As a matter of fact, the FLIP guidelines explicitly mention that "Exposed
> Monitoring Information" is considered public interface; I guess that was
> why this FLIP came around in the first place.
> I was also hesitant in whether or not the recent FLIP about keyed state
> snapshot binary format unification (FLIP-41) deserves to be a FLIP, since
> the complexity of the change is rather small.
>
> However, with the fact that these changes indeed touch the general public
> interface of Flink, the scope (including all potential 3rd party projects)
> is strictly speaking hard to define.
> Outcomes of such changes, even if the complexity of the change is rather
> trivial, can still stick around for quite a while.
> In this case, IMO the value of proposing a FLIP for such a change is less
> about discussing design or i

Re: [ANNOUNCE] New Apache Flink Committer - Congxian Qiu

2020-10-29 Thread Biao Liu
Congrats!

Thanks,
Biao /'bɪ.aʊ/



On Thu, 29 Oct 2020 at 15:14, Xingbo Huang  wrote:

> Congratulations Congxian.
>
> Best,
> Xingbo
>
> On Thu, Oct 29, 2020 at 3:05 PM Dian Fu  wrote:
>
> > Congratulations Congxian!
> >
> > Regards,
> > Dian
> >
> > > On Oct 29, 2020, at 2:35 PM, Yangze Guo  wrote:
> > >
> > > Congratulations!
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Thu, Oct 29, 2020 at 2:31 PM Jark Wu  wrote:
> > >>
> > >> Congrats Congxian!
> > >>
> > >> Best,
> > >> Jark
> > >>
> > >> On Thu, 29 Oct 2020 at 14:28, Yu Li  wrote:
> > >>
> > >>> Hi all,
> > >>>
> > >>> On behalf of the PMC, I’m very happy to announce Congxian Qiu as a
> new
> > >>> Flink committer.
> > >>>
> > >>> Congxian has been an active contributor for more than two years, with
> > 226
> > >>> contributions including 76 commits and many PR reviews.
> > >>>
> > >>> Congxian mainly works on state backend and checkpoint modules,
> > meantime is
> > >>> one of the main maintainers of our Chinese document translation.
> > >>>
> > >>> Besides his work on the code, he has been driving initiatives on dev@
> > >>> list,
> > >>> supporting users and giving talks at conferences.
> > >>>
> > >>> Please join me in congratulating Congxian for becoming a Flink
> > committer!
> > >>>
> > >>> Cheers,
> > >>> Yu
> > >>>
> >
> >
>


[DISCUSS] FLIP-281 Sink Supports Speculative Execution For Batch Job

2022-12-22 Thread Biao Liu
Hi everyone,

I would like to start a discussion about making Sink support speculative
execution for batch jobs. This proposal is a follow-up to "FLIP-168:
Speculative Execution For Batch Job"[1]. Speculative execution is very
meaningful for batch jobs, and it would be more complete after supporting
speculative execution of Sink. Please find more details in the FLIP
document [2].

Looking forward to your feedback.
[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-281+Sink+Supports+Speculative+Execution+For+Batch+Job

Thanks,
Biao /'bɪ.aʊ/


Re: [DISCUSS] FLIP-281 Sink Supports Speculative Execution For Batch Job

2022-12-28 Thread Biao Liu
Thanks for all your feedback!

To @Yuxia,

> What does the sink expect to do to isolate data produced by speculative
> executions?  IIUC, if the task fails over, it also generates a new attempt.
> Does it make a difference in isolating data produced?


Yes, there is something different from the task failover scenario. The
attempt number is more necessary for speculative execution than for
failover, because there can be only one subtask instance running at the
same time in the failover scenario.

Let's take FileSystemOutputFormat as an example. In the failover scenario,
the temporary directory to store produced data can be something like
"$root_dir/task-$taskNumber/". At the initialization phase, the subtask
deletes and re-creates the temporary directory.

However, in the speculative execution scenario, this does not work because
there might be several attempts of the same subtask running at the same
time. These attempts might delete, re-create and write to the same
temporary directory concurrently. The correct temporary directory should be
something like "$root_dir/task-$taskNumber/attempt-$attemptNumber". So it's
necessary to expose the attempt number to the Sink implementation to do the
data isolation.
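To make the layout concrete, here is a minimal sketch of the two directory
schemes in plain Java. It mirrors the "$root_dir/task-$taskNumber/attempt-$attemptNumber"
pattern above; the helper names are made up for the example and this is not
Flink's actual FileSystemOutputFormat code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class TempDirSketch {

    // Failover-style layout: one directory per subtask. Unsafe under
    // speculative execution, because concurrent attempts of the same
    // subtask would delete/write the same directory.
    static Path failoverDir(Path root, int taskNumber) {
        return root.resolve("task-" + taskNumber);
    }

    // Speculative-execution-safe layout: the attempt number isolates
    // data produced by concurrent attempts of the same subtask.
    static Path speculativeDir(Path root, int taskNumber, int attemptNumber) {
        return root.resolve("task-" + taskNumber)
                   .resolve("attempt-" + attemptNumber);
    }

    public static void main(String[] args) {
        Path root = Paths.get("/tmp/job");
        Path a0 = speculativeDir(root, 3, 0);
        Path a1 = speculativeDir(root, 3, 1);
        if (a0.equals(a1)) {
            throw new IllegalStateException("attempts must not share a directory");
        }
        System.out.println(a0); // prints /tmp/job/task-3/attempt-0
        System.out.println(a1); // prints /tmp/job/task-3/attempt-1
    }
}
```

With the attempt number in the path, two concurrent attempts of subtask 3
write to distinct directories, so neither can clobber the other's output.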


To @Lijie,

> I have a question about this: does SinkV2 need to do the same thing?


Actually, yes.

> Should we/users do it in the committer? If yes, how does the committer
> know which one is the right subtask attempt?


Yes, we/users should do it in the committer.

In the current design, the Committer of Sink V2 should get the "which one
is the right subtask attempt" information from the "committable data"
produced by the SinkWriter. Let's take the FileSink as an example: the
"committable data" sent to the Committer contains the full paths of the
files produced by the SinkWriter. Users could also pass the attempt number
through the "committable data" from the SinkWriter to the Committer.
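As a rough illustration of how a committer could tell the right attempt
apart, consider the plain-Java sketch below. The `Committable` fields and
the explicit `finishedAttempt` filter are assumptions made for the example;
in Sink V2 the committable type is user-defined and this is not the actual
FileSink implementation.

```java
import java.util.List;
import java.util.stream.Collectors;

public class CommittableSketch {

    // Hypothetical committable record carrying enough information for the
    // committer to identify data from the finished attempt.
    static final class Committable {
        final int subtaskIndex;
        final int attemptNumber;
        final String path; // full path produced by the SinkWriter

        Committable(int subtaskIndex, int attemptNumber, String path) {
            this.subtaskIndex = subtaskIndex;
            this.attemptNumber = attemptNumber;
            this.path = path;
        }
    }

    // Commit only the data of the attempt marked as finished; files from
    // slower attempts are leaked data that must not be committed.
    static List<String> pathsToCommit(List<Committable> received, int finishedAttempt) {
        return received.stream()
                .filter(c -> c.attemptNumber == finishedAttempt)
                .map(c -> c.path)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Committable> received = List.of(
                new Committable(0, 0, "/out/task-0/attempt-0/part-0"),
                new Committable(0, 1, "/out/task-0/attempt-1/part-0"));
        System.out.println(pathsToCommit(received, 1));
        // prints [/out/task-0/attempt-1/part-0]
    }
}
```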

In the "Rejected Alternatives -> Introduce a way to clean leaked data of
Sink V2" section of the FLIP document, we discussed some of the reasons
that we didn't provide the API like OutputFormat.

To @Jing Zhang

> I have a question about this: Speculative execution of Committer will be
> disabled.

> I agree with your point and I saw the similar requirements to disable
> speculative execution for specified operators.

> However the requirement is not supported currently. I think there should
> be some place to describe how to support it.


In this FLIP's design, the speculative execution of the Committer of Sink
V2 will be disabled by Flink. It's not an optional operation; users cannot
change it.
And as you said, "disable speculative execution for specified operators" is
not supported in the FLIP, because it's a bit out of the scope of "Sink
Supports Speculative Execution For Batch Job". I think it's better to start
another FLIP to discuss it. "Fine-grained control of enabling speculative
execution for operators" could be the title of that FLIP, and we can
discuss there how to enable or disable speculative execution for specified
operators, including the Committer and pre/post-committer of Sink V2.

What do you think?

Thanks,
Biao /'bɪ.aʊ/



On Wed, 28 Dec 2022 at 11:30, Jing Zhang  wrote:

> Hi Biao,
>
> Thanks for driving this FLIP. It's meaningful to support speculative
> execution of sinks.
>
> I have a question about this: Speculative execution of Committer will be
> disabled.
>
> I agree with your point and I saw the similar requirements to disable
> speculative execution for specified operators.
>
> However the requirement is not supported currently. I think there should be
> some place to describe how to support it.
>
> Best,
> Jing Zhang
>
> On Tue, Dec 27, 2022 at 6:51 PM Lijie Wang  wrote:
>
> > Hi Biao,
> >
> > Thanks for driving this FLIP.
> > In this FLIP, it introduces "int getFinishedAttempt(int subtaskIndex)"
> for
> > OutputFormat to know which subtask attempt is the one marked as finished
> by
> > JM and commit the right data.
> > I have a question about this: does SinkV2 need to do the same thing?
> Should
> > we/users do it in the committer? If yes, how does the committer know which
> > one is the right subtask attempt?
> >
> > Best,
> > Lijie
> >
> > On Tue, Dec 27, 2022 at 10:01 AM yuxia  wrote:
> >
> > > HI, Biao.
> > > Thanks for driving this FLIP.
> > > After quick look of this FLIP, I have a question about "expose the
> > attempt
> > > number which can be used to isolate data produced by speculative
> > executions
> > > with the same subtask id".
> > > What the sink expect to do to isolate data produced by speculative
> > > executions?  IIUC, if the taks failover, it a

[VOTE] FLIP-281: Sink Supports Speculative Execution For Batch Job

2023-01-04 Thread Biao Liu
Hi everyone,

Thanks for all the feedback!

Based on the discussion[1], we seem to have a consensus. So I'd like to
start a vote on FLIP-281: Sink Supports Speculative Execution For Batch
Job[2]. The vote will last for 72 hours, unless there is an objection or
insufficient votes.

[1] https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-281+Sink+Supports+Speculative+Execution+For+Batch+Job

Thanks,
Biao /'bɪ.aʊ/


Re: [VOTE] FLIP-281: Sink Supports Speculative Execution For Batch Job

2023-01-04 Thread Biao Liu
Hi Martijn,

Sure, thanks for the reminder about the holiday period.
Looking forward to your feedback!

Thanks,
Biao /'bɪ.aʊ/



On Thu, 5 Jan 2023 at 03:07, Martijn Visser 
wrote:

> Hi Biao,
>
> To be honest, I haven't read the FLIP yet since this is still a holiday
> period in Europe. I would like to read it in the next few days. Can you
> keep the vote open a little longer?
>
> Best regards,
>
> Martijn
>
> On Wed, Jan 4, 2023 at 1:31 PM Biao Liu  wrote:
>
> > Hi everyone,
> >
> > Thanks for all the feedback!
> >
> > Based on the discussion[1], we seem to have a consensus. So I'd like to
> > start a vote on FLIP-281: Sink Supports Speculative Execution For Batch
> > Job[2]. The vote will last for 72 hours, unless there is an objection or
> > insufficient votes.
> >
> > [1] https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-281+Sink+Supports+Speculative+Execution+For+Batch+Job
> >
> > Thanks,
> > Biao /'bɪ.aʊ/
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Lincoln Lee

2023-01-09 Thread Biao Liu
Congratulations, Lincoln!

Thanks,
Biao /'bɪ.aʊ/



On Tue, 10 Jan 2023 at 14:59, Hang Ruan  wrote:

> Congratulations, Lincoln!
>
> Best,
> Hang
>
> On Tue, Jan 10, 2023 at 2:57 PM Biao Geng  wrote:
>
> > Congrats, Lincoln!
> > Best,
> > Biao Geng
> >
> > Get Outlook for iOS
> > 
> > From: Wencong Liu 
> > Sent: Tuesday, January 10, 2023 2:39:47 PM
> > To: dev@flink.apache.org 
> > Subject: Re: [ANNOUNCE] New Apache Flink Committer - Lincoln Lee
> >
> > Congratulations, Lincoln!
> >
> > Best regards,
> > Wencong
> >
> > On 2023-01-10 13:25:09, "Yanfei Lei"  wrote:
> > >Congratulations, well deserved!
> > >
> > >Best,
> > >Yanfei
> > >
> > >On Tue, Jan 10, 2023 at 1:16 PM Yuan Mei  wrote:
> > >
> > >> Congratulations, Lincoln!
> > >>
> > >> Best,
> > >> Yuan
> > >>
> > >> On Tue, Jan 10, 2023 at 12:23 PM Lijie Wang  >
> > >> wrote:
> > >>
> > >> > Congratulations, Lincoln!
> > >> >
> > >> > Best,
> > >> > Lijie
> > >> >
> > >> > On Tue, Jan 10, 2023 at 12:07 PM Jingsong Li  wrote:
> > >> >
> > >> > > Congratulations, Lincoln!
> > >> > >
> > >> > > Best,
> > >> > > Jingsong
> > >> > >
> > >> > > On Tue, Jan 10, 2023 at 11:56 AM Leonard Xu 
> > wrote:
> > >> > > >
> > >> > > > Congratulations, Lincoln!
> > >> > > >
> > >> > > > Impressive work in streaming semantics, well deserved!
> > >> > > >
> > >> > > >
> > >> > > > Best,
> > >> > > > Leonard
> > >> > > >
> > >> > > >
> > >> > > > > On Jan 10, 2023, at 11:52 AM, Jark Wu 
> wrote:
> > >> > > > >
> > >> > > > > Hi everyone,
> > >> > > > >
> > >> > > > > On behalf of the PMC, I'm very happy to announce Lincoln Lee
> as
> > a
> > >> new
> > >> > > Flink
> > >> > > > > committer.
> > >> > > > >
> > >> > > > > Lincoln Lee has been a long-term Flink contributor since 2017.
> > He
> > >> > > mainly
> > >> > > > > works on Flink
> > >> > > > > SQL parts and drives several important FLIPs, e.g., FLIP-232
> > (Retry
> > >> > > Async
> > >> > > > > I/O), FLIP-234 (
> > >> > > > > Retryable Lookup Join), FLIP-260 (TableFunction Finish).
> > Besides,
> > >> He
> > >> > > also
> > >> > > > > contributed
> > >> > > > > much to Streaming Semantics, including the non-determinism
> > problem
> > >> > and
> > >> > > the
> > >> > > > > message
> > >> > > > > ordering problem.
> > >> > > > >
> > >> > > > > Please join me in congratulating Lincoln for becoming a Flink
> > >> > > committer!
> > >> > > > >
> > >> > > > > Cheers,
> > >> > > > > Jark Wu
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
>


Re: [DISCUSS] FLIP-281 Sink Supports Speculative Execution For Batch Job

2023-01-11 Thread Biao Liu
Hi Martijn,

Thanks for your feedback!

Yes, we propose to support speculative execution for SinkFunction.
1. From the perspective of compatibility, SinkFunction is the most original
Sink implementation. There are lots of implementations based on
SinkFunction, not only in the official Flink codebase but also in users'
private codebases. It's a more serious issue than Sink V1. Of course we
hope users could migrate the legacy implementations to the new interface;
however, migration is always hard.
2. From the perspective of cost, we don't need to do much extra work to
support speculative execution for SinkFunction. All we need to do is check
whether the SinkFunction implementation inherits
SupportsConcurrentExecutionAttempts or not. The other parts of the work are
the same as for Sink V2.

To summarize, it's cheap to support speculative execution for SinkFunction,
and it may allow more existing scenarios to run with speculative execution.
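The opt-in check described in point 2 can be sketched as follows.
`SupportsConcurrentExecutionAttempts` and `SinkFunction` here are
simplified local stand-ins for the Flink interfaces of the same names, and
the check itself lives inside Flink's scheduler, so treat this purely as an
illustration of the idea, not framework code.

```java
// Simplified stand-in for Flink's marker interface of the same name.
interface SupportsConcurrentExecutionAttempts {}

// Drastically simplified stand-in for Flink's SinkFunction.
interface SinkFunction<T> {
    void invoke(T value);
}

// A sink that declares it tolerates several concurrent attempts of the
// same subtask, e.g. because it writes to attempt-scoped directories.
class AttemptAwareSink implements SinkFunction<String>, SupportsConcurrentExecutionAttempts {
    @Override
    public void invoke(String value) {
        System.out.println(value);
    }
}

public class SpeculativeCheckSketch {

    // The framework-side check: speculative execution is enabled for a
    // sink only if the implementation opted in via the marker interface.
    static boolean speculativeExecutionAllowed(SinkFunction<?> sink) {
        return sink instanceof SupportsConcurrentExecutionAttempts;
    }

    public static void main(String[] args) {
        SinkFunction<String> plain = value -> {};
        System.out.println(speculativeExecutionAllowed(new AttemptAwareSink())); // prints true
        System.out.println(speculativeExecutionAllowed(plain)); // prints false
    }
}
```

An existing SinkFunction only needs to add the marker interface to its
`implements` clause to opt in, which is why the migration cost is low.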

Thanks,
Biao /'bɪ.aʊ/



On Wed, 11 Jan 2023 at 21:22, Martijn Visser 
wrote:

> Hi Biao,
>
> Apologies for the late jumping in. My only question is about SinkFunction,
> does this imply that we want to add support for this to the SinkFunction?
> If so, I would not be in favour of that since we would like to deprecate (I
> actually thought that was already the case) the SinkFunction in favour of
> SinkV2.
>
> Besides that, I have no other comments.
>
> Best regards,
>
> Martijn
>
> On Wed, Jan 4, 2023 at 7:28 AM Jing Zhang  wrote:
>
> > Hi Biao,
> >
> > Thanks for explanation.
> >
> > +1 for the proposal.
> >
> > Best,
> > Jing Zhang
> >
> > > On Wed, Jan 4, 2023 at 12:11 PM Lijie Wang  wrote:
> >
> > > Hi Biao,
> > >
> > > Thanks for the explanation of how SinkV2  knows the right subtask
> > > attempt. I have no more questions, +1 for the proposal.
> > >
> > > Best,
> > > Lijie
> > >
> > > > On Wed, Dec 28, 2022 at 5:22 PM Biao Liu  wrote:
> > >
> > > > Thanks for all your feedback!
> > > >
> > > > To @Yuxia,
> > > >
> > > > > What the sink expect to do to isolate data produced by speculative
> > > > > executions?  IIUC, if the taks failover, it also generate a new
> > > attempt.
> > > > > Does it make difference in isolating data produced?
> > > >
> > > >
> > > > Yes there is something different from the task failover scenario. The
> > > > attempt number is more necessary for speculative execution than
> > failover.
> > > > Because there can be only one subtask instance running at the same
> time
> > > in
> > > > the failover scenario.
> > > >
> > > > Let's take FileSystemOutputFormat as an example. For the failover
> > > scenario,
> > > > the temporary directory to store produced data can be something like
> > > > "$root_dir/task-$taskNumber/". At the initialization phase, subtask
> > > deletes
> > > > and re-creates the temporary directory.
> > > >
> > > > However in the speculative execution scenario, it does not work
> because
> > > > there might be several subtasks running at the same time. These
> > subtasks
> > > > might delete, re-create and write the same temporary directory at the
> > > > same time. The correct temporary directory should be like
> > > > "$root_dir/task-$taskNumber/attempt-$attemptNumber". So it's
> necessary
> > to
> > > > expose the attempt number to the Sink implementation to do the data
> > > > isolation.
> > > >
> > > >
> > > > To @Lijie,
> > > >
> > > > > I have a question about this: does SinkV2 need to do the same
> thing?
> > > >
> > > >
> > > > Actually, yes.
> > > >
> > > > Should we/users do it in the committer? If yes, how does the commiter
> > > know
> > > > > which one is the right subtask attempt?
> > > >
> > > >
> > > > Yes, we/users should do it in the committer.
> > > >
> > > > In the current design, the Committer of Sink V2 should get the "which
> > one
> > > > is the right subtask attempt" information from the "committable
> data''
> > > > produced by SinkWriter. Let's take the FileSink as example, the
> > > > "committable data" sent to the Committer contains the full path of
> the
> > > > files produced by SinkWriter. Users could also p

Re: [VOTE] FLIP-281: Sink Supports Speculative Execution For Batch Job

2023-01-13 Thread Biao Liu
Hi everyone,

I'm happy to announce that FLIP-281[1] has been accepted.

Thanks for all your feedback and votes. Here is the voting result:

+1 (binding), 3 in total:
- Zhu Zhu
- Lijie Wang
- Jing Zhang

+1 (non-binding), 2 in total:
- yuxia
- Jing Ge

+0 (binding), 1 in total:
- Martijn

There are no disapproving votes.

By the way, the discussion of deprecating SinkFunction can be continued in
the discussion thread [2].
I think it's more of an orthogonal issue. We might need more time to come
to an agreement about it.

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-281+Sink+Supports+Speculative+Execution+For+Batch+Job
[2] https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1

Thanks,
Biao /'bɪ.aʊ/



On Fri, 13 Jan 2023 at 08:46, Jing Ge  wrote:

> + 1(not binding)
>
> Best Regards,
> Jing
>
> On Thu, Jan 12, 2023 at 10:01 AM Jing Zhang  wrote:
>
> > +1 (binding)
> >
> > Best,
> > Jing Zhang
> >
> > On Thu, Jan 12, 2023 at 4:39 PM Lijie Wang  wrote:
> >
> > > +1 (binding)
> > >
> > > Best,
> > > Lijie
> > >
> > > On Thu, Jan 12, 2023 at 3:56 PM Martijn Visser  wrote:
> > >
> > > > +0 (binding)
> > > >
> > > > On Tue, Jan 10, 2023 at 1:11 PM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
> > > >
> > > > > +1 (non-binding).
> > > > >
> > > > > Best regards,
> > > > > Yuxia
> > > > >
> > > > > ----- Original Message -----
> > > > > From: "Zhu Zhu" 
> > > > > To: "dev" 
> > > > > Sent: Tuesday, Jan 10, 2023 5:50:39 PM
> > > > > Subject: Re: [VOTE] FLIP-281: Sink Supports Speculative Execution For
> > Batch
> > > > Job
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > > Thanks,
> > > > > Zhu
> > > > >
> > > > > On Thu, Jan 5, 2023 at 10:37 AM Biao Liu  wrote:
> > > > > >
> > > > > > Hi Martijn,
> > > > > >
> > > > > > Sure, thanks for the reminder about the holiday period.
> > > > > > Looking forward to your feedback!
> > > > > >
> > > > > > Thanks,
> > > > > > Biao /'bɪ.aʊ/
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, 5 Jan 2023 at 03:07, Martijn Visser <
> > > martijnvis...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Biao,
> > > > > > >
> > > > > > > To be honest, I haven't read the FLIP yet since this is still a
> > > > holiday
> > > > > > > period in Europe. I would like to read it in the next few days.
> > Can
> > > > you
> > > > > > > keep the vote open a little longer?
> > > > > > >
> > > > > > > Best regards,
> > > > > > >
> > > > > > > Martijn
> > > > > > >
> > > > > > > On Wed, Jan 4, 2023 at 1:31 PM Biao Liu 
> > > wrote:
> > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > Thanks for all the feedback!
> > > > > > > >
> > > > > > > > Based on the discussion[1], we seem to have a consensus. So
> I'd
> > > > like
> > > > > to
> > > > > > > > start a vote on FLIP-281: Sink Supports Speculative Execution
> > For
> > > > > Batch
> > > > > > > > Job[2]. The vote will last for 72 hours, unless there is an
> > > > > objection or
> > > > > > > > insufficient votes.
> > > > > > > >
> > > > > > > > [1]
> > > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > [2]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-281+Sink+Supports+Speculative+Execution+For+Batch+Job
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Biao /'bɪ.aʊ/
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] FLIP-281 Sink Supports Speculative Execution For Batch Job

2023-01-18 Thread Biao Liu
Hi Martijn & Jing,

Thanks for feedback!

Currently, SinkFunction is in a subtle situation. As Jing pointed out,
SinkFunction is still marked as public. Technically, according to the
Flink Bylaws[1], the deprecation decision should be approved through an
official vote. Although many of the community maintainers (including me)
think it should be deprecated, we still should not assume it is already a
fact. Considering that the discussion and voting may last 1 or 2 weeks, and
longer if someone has an objection, I'd like to keep pushing FLIP-281
forward with the current design. I hope it can catch up with the release
of 1.17.

By the way, if nobody drives the deprecation, I would like to start
another discussion to talk about it. What do you think?

[1] https://cwiki.apache.org/confluence/display/FLINK/Flink+Bylaws

Thanks,
Biao /'bɪ.aʊ/



On Fri, 13 Jan 2023 at 08:43, Jing Ge  wrote:

> Hi Biao,
>
> Thanks for driving this. Like Martijn already pointed out. We will spend
> effort to remove SinkFunction after we deprecate it. The more
> functionality added into it, the bigger effort we will have to deprecate
> and remove the SinkFunction. Commonly, it is not recommended to add new
> features into an interface which we have already decided to deprecate but
> have not yet done so. But this FLIP is a special case and there are some reasons that
> lead us to support this proposal.
>
> First, the FLIP offered an equivalent solution for the new SinkV2, which
> means the migration from SinkFunction to SinkV2 for this feature is
> predictable and acceptable. The concern I raised above has been solved.
>
> Second, since the SinkFunction is still marked as public now [1], it should
> be fine to add new features into it (follow the rules), especially if the
> requirement is urgent. Similar to [2] described for API graduation, it
> should also take 8 months (two release cycles, ideal case is 8 months,
> could be longer) to go from @Public to @Deprecated and to be removed.
> Additionally, considering the SinkFunction is one core function whose
> deletion will trigger a lot of further downstream deletions. The duration
> will be increased to 16 months (again, ideal case) or even longer, e.g. 2
> years.
>
> Third, the SinkV2 is still marked as @PublicEvolving, which means a few
> more months (8 months?) before we can start the deprecation of
> SinkFunction. It is not rational to say no features should be added into
> SinkFunction during the upcoming 2 or 3 years.
>
> After thinking about all these aspects, I would support this FLIP, so +1
>
> This discussion leads us to another issue: we should graduate SinkV2
> and deprecate and remove SinkFunction asap. The longer we keep
> the SinkFunction in the code base, the bigger the effort we will have while
> working on anything that might depend on sink or has impact on sink.
>
> Best regards,
> Jing
>
> [1]
>
> https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/SinkFunction.java
> [2]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
>
> On Thu, Jan 12, 2023 at 8:56 AM Martijn Visser 
> wrote:
>
> > Hi Biao,
> >
> > While I would rather not add new features to (to-be) deprecated APIs,
> I
> > would be +0 for this.
> >
> > Best regards,
> >
> > Martijn
> >
> > Op do 12 jan. 2023 om 08:42 schreef Biao Liu :
> >
> > > Hi Martijn,
> > >
> > > Thanks for your feedback!
> > >
> > > Yes, we propose to support speculative execution for SinkFunction.
> > > 1. From the perspective of compatibility, SinkFunction is the most
> > original
> > > Sink implementation. There are lots of implementations based on
> > > SinkFunction, not only in Flink official codebase but also in user's
> > > private codebase. It's a more serious issue than Sink V1. Of course we
> > hope
> > > users could migrate the legacy implementation to the new interface.
> > However
> > > migration is always hard.
> > > 2. From the perspective of cost, we don't need to do much extra work to
> > > support speculative execution for SinkFunction. All we need to do is
> > check
> > > whether the SinkFunction implementation
> > > inherits SupportsConcurrentExecutionAttempts or not. The other parts of
> > > work are the same with Sink V2.
> > >
> > > To summarize, it's cheap to support speculative execution for
> > SinkFunction.
> > > And it may allow more existing scenarios to run with speculative
> > execution.
> > >
> > > T

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

2023-01-18 Thread Biao Liu
Hi Martijn,

Thanks for bringing us this discussion!

I think it's time to mark SinkFunction as deprecated. It may help a lot to
encourage users to migrate existing sink connectors to the new interface.

About Lijie's concern, I'm not sure whether it's OK to make compatible
changes to an interface with the @Public annotation, like adding a new
method. If that is allowed, I think it would be fine to promote SinkV2 to
@Public.

Thanks,
Biao /'bɪ.aʊ/



On Thu, 19 Jan 2023 at 10:26, Lijie Wang  wrote:

> Hi Martijn,
>
> Thanks for driving this. I have only one concern, about the Sink.InitContext.
>
> Does the Sink.InitContext will also be changed to @Public ? As described in
> FLIP-287, currently the Sink.InitContext still lacks some necessary
> information to migrate existing connectors to new sinks. If it is marked as
> public/stable, we can no longer modify it in the future (since most
> connectors are not migrated to SinkV2 currently, we may find we need more
> information via InitContext in the future migrations).
>
> Best,
> Lijie
>
> Yun Tang  于2023年1月18日周三 21:13写道:
>
> > SinkV2 was introduced in Flink-1.15 and annotated as @PublicEvolving from
> > the 1st day [1]. Per FLIP-197, we can promote it to @Public since it
> > has already existed for two releases.
> > And I didn't find a FLIP to discuss the process to deprecate APIs,
> > considering the SinkFunction has actually been stale for some time, I
> think
> > we can deprecate it with the @Public SinkV2.
> >
> > Thus, +1 (binding) for this proposal.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-2
> >
> > Best
> > Yun Tang
> >
> > 
> > From: Martijn Visser 
> > Sent: Wednesday, January 18, 2023 18:50
> > To: dev ; Jing Ge ; Yun Tang <
> > myas...@live.com>
> > Subject: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction
> >
> > Hi all,
> >
> > While discussing FLIP-281 [1] the discussion also turned to the
> > SinkFunction and the SinkV2 API. For a broader discussion I'm opening up
> a
> > separate discussion thread.
> >
> > As Yun Tang has mentioned in that discussion thread, it would be a good
> > time to deprecate the SinkFunction to avoid the need to introduce new
> > functions towards (to be) deprecated APIs. Jing rightfully mentioned that
> > it would be confusing to deprecate the SinkFunction if its successor is
> not
> > yet marked as @Public (it's currently @PublicEvolving).
> >
> > My proposal would be to promote the SinkV2 API to @public in Flink 1.17
> > and mark the SinkFunction as @deprecated in Flink 1.17
> >
> > The original Sink interface was introduced in Flink 1.12 with FLIP-143
> [2]
> > and extended with FLIP-177 in Flink 1.14 [3] and has been improved on
> > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> >
> > Looking at the API stability graduation process [5], the fact that Sink
> V2
> > was introduced in Flink 1.15 would mean that we could warrant a promotion
> > to @public already (given that there have been two releases with 1.15 and
> > 1.16 where it was introduced). Combined with the fact that SinkV2 has
> been
> > the result of iteration over the introduction of the original Sink API
> > since Flink 1.12, I would argue that the promotion is overdue.
> >
> > If we promote the Sink API to @public, I think we should also immediately
> > mark the SinkFunction as @deprecated.
> >
> > Looking forward to your thoughts.
> >
> > Best regards,
> >
> > Martijn
> >
> >
> > [1] https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > [2]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > [3]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > [4]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > [5]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> >
> >
>


Re: [ANNOUNCE] Flink Table Store Joins Apache Incubator as Apache Paimon(incubating)

2023-03-27 Thread Biao Liu
Congrats!

Thanks,
Biao /'bɪ.aʊ/



On Tue, 28 Mar 2023 at 10:29, Hang Ruan  wrote:

> Congratulations!
>
> Best,
> Hang
>
> yu zelin  于2023年3月28日周二 10:27写道:
>
>> Congratulations!
>>
>> Best,
>> Yu Zelin
>>
>> 2023年3月27日 17:23,Yu Li  写道:
>>
>> Dear Flinkers,
>>
>>
>>
>> As you may have noticed, we are pleased to announce that Flink Table Store 
>> has joined the Apache Incubator as a separate project called Apache 
>> Paimon(incubating) [1] [2] [3]. The new project still aims at building a 
>> streaming data lake platform for high-speed data ingestion, change data 
>> tracking and efficient real-time analytics, with the vision of supporting a 
>> larger ecosystem and establishing a vibrant and neutral open source 
>> community.
>>
>>
>>
>> We would like to thank everyone for their great support and efforts for the 
>> Flink Table Store project, and warmly welcome everyone to join the 
>> development and activities of the new project. Apache Flink will continue to 
>> be one of the first-class citizens supported by Paimon, and we believe that 
>> the Flink and Paimon communities will maintain close cooperation.
>>
>>
>> 亲爱的Flinkers,
>>
>>
>> 正如您可能已经注意到的,我们很高兴地宣布,Flink Table Store 已经正式加入 Apache
>> 孵化器独立孵化 [1] [2] [3]。新项目的名字是
>> Apache 
>> Paimon(incubating),仍致力于打造一个支持高速数据摄入、流式数据订阅和高效实时分析的新一代流式湖仓平台。此外,新项目将支持更加丰富的生态,并建立一个充满活力和中立的开源社区。
>>
>>
>> 在这里我们要感谢大家对 Flink Table Store
>> 项目的大力支持和投入,并热烈欢迎大家加入新项目的开发和社区活动。Apache Flink 将继续作为 Paimon 支持的主力计算引擎之一,我们也相信
>> Flink 和 Paimon 社区将继续保持密切合作。
>>
>>
>> Best Regards,
>> Yu (on behalf of the Apache Flink PMC and Apache Paimon PPMC)
>>
>> 致礼,
>> 李钰(谨代表 Apache Flink PMC 和 Apache Paimon PPMC)
>>
>> [1] https://paimon.apache.org/
>> [2] https://github.com/apache/incubator-paimon
>> [3] https://cwiki.apache.org/confluence/display/INCUBATOR/PaimonProposal
>>
>>
>>


Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache Flink Committer

2021-01-20 Thread Biao Liu
Congrats, Guowei!

Thanks,
Biao /'bɪ.aʊ/



On Thu, 21 Jan 2021 at 09:30, Paul Lam  wrote:

> Congrats, Guowei!
>
> Best,
> Paul Lam
>
> > 2021年1月21日 07:21,Steven Wu  写道:
> >
> > Congrats, Guowei!
> >
> > On Wed, Jan 20, 2021 at 10:32 AM Seth Wiesman 
> wrote:
> >
> >> Congratulations!
> >>
> >> On Wed, Jan 20, 2021 at 3:41 AM hailongwang <18868816...@163.com>
> wrote:
> >>
> >>> Congratulations, Guowei!
> >>>
> >>> Best,
> >>> Hailong
> >>>
> >>> 在 2021-01-20 15:55:24,"Till Rohrmann"  写道:
>  Congrats, Guowei!
> 
>  Cheers,
>  Till
> 
>  On Wed, Jan 20, 2021 at 8:32 AM Matthias Pohl  >
>  wrote:
> 
> > Congrats, Guowei!
> >
> > On Wed, Jan 20, 2021 at 8:22 AM Congxian Qiu  >
> > wrote:
> >
> >> Congrats Guowei!
> >>
> >> Best,
> >> Congxian
> >>
> >>
> >> Danny Chan  于2021年1月20日周三 下午2:59写道:
> >>
> >>> Congratulations Guowei!
> >>>
> >>> Best,
> >>> Danny
> >>>
> >>> Jark Wu  于2021年1月20日周三 下午2:47写道:
> >>>
>  Congratulations Guowei!
> 
>  Cheers,
>  Jark
> 
>  On Wed, 20 Jan 2021 at 14:36, SHI Xiaogang <
> >>> shixiaoga...@gmail.com>
> >>> wrote:
> 
> > Congratulations MA!
> >
> > Regards,
> > Xiaogang
> >
> > Yun Tang  于2021年1月20日周三 下午2:24写道:
> >
> >> Congratulations Guowei!
> >>
> >> Best
> >> Yun Tang
> >> 
> >> From: Yang Wang 
> >> Sent: Wednesday, January 20, 2021 13:59
> >> To: dev 
> >> Subject: Re: Re: [ANNOUNCE] Welcome Guowei Ma as a new
> >> Apache
> > Flink
> >> Committer
> >>
> >> Congratulations Guowei!
> >>
> >>
> >> Best,
> >> Yang
> >>
> >> Yun Gao  于2021年1月20日周三
> >>> 下午1:52写道:
> >>
> >>> Congratulations Guowei!
> >>>
> >>> Best,
> >>>
> 
> >>> Yun--
> >>> Sender:Yangze Guo
> >>> Date:2021/01/20 13:48:52
> >>> Recipient:dev
> >>> Theme:Re: [ANNOUNCE] Welcome Guowei Ma as a new Apache
> >> Flink
>  Committer
> >>>
> >>> Congratulations, Guowei! Well deserved.
> >>>
> >>>
> >>> Best,
> >>> Yangze Guo
> >>>
> >>> On Wed, Jan 20, 2021 at 1:46 PM Xintong Song <
> >>> tonysong...@gmail.com>
> >>> wrote:
> 
>  Congratulations, Guowei~!
> 
> 
>  Thank you~
> 
>  Xintong Song
> 
> 
> 
>  On Wed, Jan 20, 2021 at 1:42 PM Yuan Mei <
> >> yuanmei.w...@gmail.com
> 
> >> wrote:
> 
> > Congrats Guowei :-)
> >
> > Best,
> > Yuan
> >
> > On Wed, Jan 20, 2021 at 1:36 PM tison <
> > wander4...@gmail.com>
> > wrote:
> >
> >> Congrats Guowei!
> >>
> >> Best,
> >> tison.
> >>
> >>
> >> Kurt Young  于2021年1月20日周三
> >> 下午1:34写道:
> >>
> >>> Hi everyone,
> >>>
> >>> I'm very happy to announce that Guowei Ma has
> >>> accepted
> >> the
> >>> invitation
> > to
> >>> become a Flink committer.
> >>>
> >>> Guowei is a very long term Flink developer, he has
> >>> been
> > extremely
> > helpful
> >>> with
> >>> some important runtime changes, and also been
> >>> active
> >> with
> >>> answering
> > user
> >>> questions as well as discussing designs.
> >>>
> >>> Please join me in congratulating Guowei for
> >>> becoming a
> >>> Flink
> >>> committer!
> >>>
> >>> Best,
> >>> Kurt
> >>>
> >>
> >
> >>>
> >>
> >
> 
> >>>
> >
> >>>
> >>
>
>


Re: [ANNOUNCE] Apache Flink 1.16.0 released

2022-10-28 Thread Biao Liu
Congrats! Glad to hear that.

BTW, I just found that the 1.16 documentation link on
https://flink.apache.org/ is not correct.

[image: 截屏2022-10-28 17.01.28.png]

Thanks,
Biao /'bɪ.aʊ/



On Fri, 28 Oct 2022 at 14:46, Xingbo Huang  wrote:

> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.16.0, which is the first release for the Apache Flink 1.16 series.
>
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data streaming
> applications.
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> Please check out the release blog post for an overview of the
> improvements for this release:
> https://flink.apache.org/news/2022/10/28/1.16-announcement.html
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351275
>
> We would like to thank all contributors of the Apache Flink community
> who made this release possible!
>
> Regards,
> Chesnay, Martijn, Godfrey & Xingbo
>


Re: [ANNOUNCE] New Apache Flink Committer - Xintong Song

2020-06-04 Thread Biao Liu
Congrats!

Thanks,
Biao /'bɪ.aʊ/



On Fri, 5 Jun 2020 at 13:32, Thomas Weise  wrote:

> Congratulations!
>
>
> On Thu, Jun 4, 2020, 10:17 PM Yuan Mei  wrote:
>
> > Congrats, Xintong!
> >
> > On Fri, Jun 5, 2020 at 12:45 PM Becket Qin  wrote:
> >
> > > Hi all,
> > >
> > > On behalf of the PMC, I’m very happy to announce Xintong Song as a new
> > > Flink committer.
> > >
> > > Xintong started to contribute to Flink about two years ago and has been
> > > active since. His major work is in Flink resource management, and have
> > also
> > > participated in discussions, bug fixes and answering questions.
> > >
> > > Please join me in congratulating Xintong for becoming a Flink
> committer!
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> >
>


Re: [ANNOUNCE] Yu Li is now part of the Flink PMC

2020-06-21 Thread Biao Liu
Congrats!

Thanks,
Biao /'bɪ.aʊ/



On Thu, 18 Jun 2020 at 19:41, Ufuk Celebi  wrote:

> Congrats! :-)
>
> – Ufuk
>
>
> On Thu, Jun 18, 2020 at 9:47 AM Marta Paes Moreira 
> wrote:
>
> > Awesome! Congratulations, Yu!
> >
> > On Thu, Jun 18, 2020 at 9:16 AM Zhu Zhu  wrote:
> >
> > > Congratulations!
> > >
> > > Thanks,
> > > Zhu Zhu
> > >
> > > Guowei Ma  于2020年6月18日周四 上午10:41写道:
> > >
> > > > Congratulations , Yu!
> > > > Best,
> > > > Guowei
> > > >
> > > >
> > > > Yang Wang  于2020年6月18日周四 上午10:36写道:
> > > >
> > > > > Congratulations , Yu!
> > > > >
> > > > > Best,
> > > > > Yang
> > > > >
> > > > > Piotr Nowojski  于2020年6月17日周三 下午9:21写道:
> > > > >
> > > > > > Congratulations :)
> > > > > >
> > > > > > > On 17 Jun 2020, at 14:53, Yun Tang  wrote:
> > > > > > >
> > > > > > > Congratulations , Yu! well deserved.
> > > > > > >
> > > > > > > Best
> > > > > > > Yun Tang
> > > > > > > 
> > > > > > > From: Yu Li 
> > > > > > > Sent: Wednesday, June 17, 2020 20:03
> > > > > > > To: dev 
> > > > > > > Subject: Re: [ANNOUNCE] Yu Li is now part of the Flink PMC
> > > > > > >
> > > > > > > Thanks everyone! Really happy to work in such a great and
> > > encouraging
> > > > > > > community!
> > > > > > >
> > > > > > > Best Regards,
> > > > > > > Yu
> > > > > > >
> > > > > > >
> > > > > > > On Wed, 17 Jun 2020 at 19:59, Congxian Qiu <
> > qcx978132...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > >> Congratulations Yu !
> > > > > > >> Best,
> > > > > > >> Congxian
> > > > > > >>
> > > > > > >>
> > > > > > >> Thomas Weise  于2020年6月17日周三 下午6:23写道:
> > > > > > >>
> > > > > > >>> Congratulations!
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On Wed, Jun 17, 2020, 2:59 AM Fabian Hueske <
> fhue...@gmail.com
> > >
> > > > > wrote:
> > > > > > >>>
> > > > > >  Congrats Yu!
> > > > > > 
> > > > > >  Cheers, Fabian
> > > > > > 
> > > > > >  Am Mi., 17. Juni 2020 um 10:20 Uhr schrieb Till Rohrmann <
> > > > > >  trohrm...@apache.org>:
> > > > > > 
> > > > > > > Congratulations Yu!
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Till
> > > > > > >
> > > > > > > On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li <
> > > > > jingsongl...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Congratulations Yu, well deserved!
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Jingsong
> > > > > > >>
> > > > > > >> On Wed, Jun 17, 2020 at 1:42 PM Yuan Mei <
> > > > yuanmei.w...@gmail.com>
> > > > > >  wrote:
> > > > > > >>
> > > > > > >>> Congrats, Yu!
> > > > > > >>>
> > > > > > >>> GXGX & well deserved!!
> > > > > > >>>
> > > > > > >>> Best Regards,
> > > > > > >>>
> > > > > > >>> Yuan
> > > > > > >>>
> > > > > > >>> On Wed, Jun 17, 2020 at 9:15 AM jincheng sun <
> > > > > >  sunjincheng...@gmail.com>
> > > > > > >>> wrote:
> > > > > > >>>
> > > > > >  Hi all,
> > > > > > 
> > > > > >  On behalf of the Flink PMC, I'm happy to announce that
> Yu
> > Li
> > > > is
> > > > > > >> now
> > > > > >  part of the Apache Flink Project Management Committee
> > (PMC).
> > > > > > 
> > > > > >  Yu Li has been very active on Flink's Statebackend
> > > component,
> > > > > > >>> working
> > > > > > > on
> > > > > >  various improvements, for example the RocksDB memory
> > > > management
> > > > > > >> for
> > > > > > > 1.10.
> > > > > >  and keeps checking and voting for our releases, and also
> > has
> > > > > > > successfully
> > > > > >  produced two releases(1.10.0&1.10.1) as RM.
> > > > > > 
> > > > > >  Congratulations & Welcome Yu Li!
> > > > > > 
> > > > > >  Best,
> > > > > >  Jincheng (on behalf of the Flink PMC)
> > > > > > 
> > > > > > >>>
> > > > > > >>
> > > > > > >> --
> > > > > > >> Best, Jingsong Lee
> > > > > > >>
> > > > > > >
> > > > > > 
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] New PMC member: Piotr Nowojski

2020-07-06 Thread Biao Liu
Congrats Piotr! Well deserved!

Thanks,
Biao /'bɪ.aʊ/



On Tue, 7 Jul 2020 at 13:03, Congxian Qiu  wrote:

> Congratulations Piotr!
>
> Best,
> Congxian
>
>
> Zhijiang  于2020年7月7日周二 下午12:25写道:
>
> > Congratulations Piotr!
> >
> > Best,
> > Zhijiang
> >
> >
> > --
> > From:Rui Li 
> > Send Time:2020年7月7日(星期二) 11:55
> > To:dev 
> > Cc:pnowojski 
> > Subject:Re: [ANNOUNCE] New PMC member: Piotr Nowojski
> >
> > Congrats!
> >
> > On Tue, Jul 7, 2020 at 11:25 AM Yangze Guo  wrote:
> >
> > > Congratulations!
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Tue, Jul 7, 2020 at 11:01 AM Jiayi Liao 
> > > wrote:
> > > >
> > > > Congratulations Piotr!
> > > >
> > > > Best,
> > > > Jiayi Liao
> > > >
> > > > On Tue, Jul 7, 2020 at 10:54 AM Jark Wu  wrote:
> > > >
> > > > > Congratulations Piotr!
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > On Tue, 7 Jul 2020 at 10:50, Yuan Mei 
> > wrote:
> > > > >
> > > > > > Congratulations, Piotr!
> > > > > >
> > > > > > On Tue, Jul 7, 2020 at 1:07 AM Stephan Ewen 
> > > wrote:
> > > > > >
> > > > > > > Hi all!
> > > > > > >
> > > > > > > It is my pleasure to announce that Piotr Nowojski joined the
> > Flink
> > > PMC.
> > > > > > >
> > > > > > > Many of you may know Piotr from the work he does on the data
> > > processing
> > > > > > > runtime and the network stack, from the mailing list, or the
> > > release
> > > > > > > manager work.
> > > > > > >
> > > > > > > Congrats, Piotr!
> > > > > > >
> > > > > > > Best,
> > > > > > > Stephan
> > > > > > >
> > > > > >
> > > > >
> > >
> >
> >
> > --
> > Best regards!
> > Rui Li
> >
> >
>


Re: [DISCUSS] FLIP-19: Improved BLOB storage architecture

2017-06-15 Thread Biao Liu
I have the same concern as Chesnay Schepler. AFAIK Flink does not support
DC as well as MapReduce and Spark do. We only support DC in the DataSet API,
and DC in Flink does not support local files. Is this a good chance to
refactor DC too?

I have another concern: currently the BLOB server has some conflicts with
the FLIP-6 architecture. In FLIP-6 we start the JM while submitting the job
instead of starting it beforehand. If the BLOB server is embedded in the JM,
we cannot upload jars and files before the JM has started. But the fact is
that we need the jars uploaded before starting the JM. Correct me if I am wrong.
To solve this problem we could separate job submission into different stages,
or we could make the BLOB server an independent component parallel to the RM.

Maybe we can think more about these in FLIP-19. What do you think? @Nico


Re: [DISCUSS] FLIP-19: Improved BLOB storage architecture

2017-06-16 Thread Biao Liu
Hi Till

I agree with you about Flink's DC. It is indeed another topic. I just
thought that we could think more about it before refactoring the BLOB
service, to make sure it's easy to implement DC on the refactored
architecture.

I have another question about the BLOB service. Can we abstract it into
some high-level interfaces? Maybe just some put/get methods. Being easy to
extend would be useful in some scenarios.

For example in Yarn mode, there are some cool features interesting to us.
1. Yarn can localize files only once on each slave machine, and all TMs of
the same job can share these files. That may save lots of bandwidth for
large-scale jobs or jobs which have large BLOBs.
2. We can skip uploading files if they are already on DFS. That's a common
scenario for a distributed cache.
3. Even more, we actually don't need a BlobServer component in Yarn mode.
We can rely on DFS to distribute files; there is always a DFS available in
a Yarn cluster.

If we do so, the network-based BLOB service can be the default
implementation. It could work in any situation, and it clearly does not
depend on Hadoop explicitly. We could then optimize for different kinds of
clusters without any hacking.

Those are just some rough ideas, but I think well-abstracted interfaces
would be very helpful.


[jira] [Created] (FLINK-13042) Make slot sharing configurable on ExecutionConfig

2019-07-01 Thread Biao Liu (JIRA)
Biao Liu created FLINK-13042:


 Summary: Make slot sharing configurable on ExecutionConfig
 Key: FLINK-13042
 URL: https://issues.apache.org/jira/browse/FLINK-13042
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Reporter: Biao Liu
Assignee: Biao Liu
 Fix For: 1.9.0


The Blink batch planner requires a global setting that disables slot sharing. 
To support that, we will expose a {{PublicEvolving}} method on 
{{ExecutionConfig}} to globally disable slot sharing.

Note that this method might be removed if there is a better approach to 
satisfy the Blink batch planner in the future.
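A minimal sketch of what such a global switch could look like from a user's perspective; the class and the method names `disableSlotSharing`/`isSlotSharingEnabled` are assumptions for the sake of the example, not Flink's actual {{ExecutionConfig}} API:

```java
/** Illustrative stand-in for a global execution configuration; not the real
 *  org.apache.flink.api.common.ExecutionConfig. */
class ExecutionConfigSketch {
    // Slot sharing stays enabled by default, matching the existing behavior.
    private boolean slotSharingEnabled = true;

    /** Hypothetical PublicEvolving-style switch disabling slot sharing globally. */
    public ExecutionConfigSketch disableSlotSharing() {
        this.slotSharingEnabled = false;
        return this; // fluent style, common for config setters
    }

    public boolean isSlotSharingEnabled() {
        return slotSharingEnabled;
    }
}

public class SlotSharingSwitchSketch {
    public static void main(String[] args) {
        ExecutionConfigSketch config = new ExecutionConfigSketch();
        System.out.println(config.isSlotSharingEnabled());
        config.disableSlotSharing();
        System.out.println(config.isSlotSharingEnabled());
    }
}
```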



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-13098) Add a new type UNDEFINED of shuffle mode

2019-07-04 Thread Biao Liu (JIRA)
Biao Liu created FLINK-13098:


 Summary: Add a new type UNDEFINED of shuffle mode
 Key: FLINK-13098
 URL: https://issues.apache.org/jira/browse/FLINK-13098
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Reporter: Biao Liu
Assignee: Biao Liu
 Fix For: 1.9.0


The {{UNDEFINED}} type is the default value of shuffle mode. If there is no 
specific {{PartitionTransformation}}, the shuffle mode would be {{UNDEFINED}}.
 This new shuffle type leaves some space for optimization later. The 
optimization might be based on resources or some global settings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-13099) Add a new type UNDEFINED of shuffle mode

2019-07-04 Thread Biao Liu (JIRA)
Biao Liu created FLINK-13099:


 Summary: Add a new type UNDEFINED of shuffle mode
 Key: FLINK-13099
 URL: https://issues.apache.org/jira/browse/FLINK-13099
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Reporter: Biao Liu
Assignee: Biao Liu
 Fix For: 1.9.0


The {{UNDEFINED}} type is the default value of shuffle mode. If there is no 
specific {{PartitionTransformation}}, the shuffle mode would be {{UNDEFINED}}.
 This new shuffle type leaves some space for optimization later. The 
optimization might be based on resources or some global settings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-13101) Introduce "blocking after chaining off" property of StreamGraph

2019-07-04 Thread Biao Liu (JIRA)
Biao Liu created FLINK-13101:


 Summary: Introduce "blocking after chaining off" property of 
StreamGraph
 Key: FLINK-13101
 URL: https://issues.apache.org/jira/browse/FLINK-13101
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Reporter: Biao Liu
Assignee: Biao Liu
 Fix For: 1.9.0


The property "blocking after chaining off" means: if there are stream edges 
that cannot be chained and the shuffle mode of the edge is not specified, 
translate these edges into the {{BLOCKING}} result partition type.

It is introduced to satisfy a requirement of the Blink batch planner, 
because the current scheduling strategy is a bit simple and cannot support 
some complex scenarios, like a batch job with limited resources.

To be honest, it's probably a work-around. However, since it's an internal 
implementation detail, we can replace it once batch jobs can be supported by 
the scheduling strategy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-13635) Unexpectedly interrupted in AsyncFunction#timeout

2019-08-07 Thread Biao Liu (JIRA)
Biao Liu created FLINK-13635:


 Summary: Unexpectedly interrupted in AsyncFunction#timeout
 Key: FLINK-13635
 URL: https://issues.apache.org/jira/browse/FLINK-13635
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Affects Versions: 1.9.0
Reporter: Biao Liu
 Fix For: 1.10.0


Currently the way of handling {{AsyncFunction#timeout}} is a bit weird in 
{{AsyncWaitOperator#processElement}}.
 
There are two methods in {{AsyncFunction}}, {{asyncInvoke}} and {{timeout}}. 
{{asyncInvoke}} is executed in the task thread, while {{timeout}} is executed 
by the system time service. When {{asyncInvoke}} finishes, it might complete 
the {{ResultFuture}} and then cancel the registered timer of {{timeout}}. 
However, there is no synchronization between {{asyncInvoke}}, {{timeout}} and 
the cancelation. Moreover, this cancelation is performed with interruption 
enabled.

{{timeout}} must therefore be implemented very carefully, because while it is 
executing, an interruption might be triggered at the same time (due to a 
completion of the {{ResultFuture}}). That means {{timeout}} must handle 
{{InterruptedException}} well everywhere it performs any operation that 
reacts to this exception.

My proposals are described below.
1. It should be written down in the documentation that {{asyncInvoke}} and 
{{timeout}} might be invoked at the same time.
2. The interruption of {{timeout}} should be avoided. There should be 
synchronization between the cancelation and {{timeout}}: if {{timeout}} is 
executing, the cancelation should be skipped; if the cancelation has already 
been invoked, {{timeout}} should not be invoked anymore. Alternatively, we 
could simply cancel the timer without an interruption.

CC [~kkl0u], [~till.rohrmann]
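A minimal sketch (not Flink code) of the idea in proposal 2: a single atomic flag lets the completion path and the timer race safely, so whichever loses the race becomes a no-op and no thread ever needs to be interrupted:

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Sketch only: guards the race between the async result and the timeout. */
class CompletionGuard {
    private final AtomicBoolean done = new AtomicBoolean(false);

    /** Called when asyncInvoke completes the ResultFuture; true if it won the race. */
    boolean tryComplete() {
        return done.compareAndSet(false, true);
    }

    /** Called by the timer; true only if no completion has happened yet. */
    boolean tryTimeout() {
        return done.compareAndSet(false, true);
    }
}

public class AsyncTimeoutRaceSketch {
    public static void main(String[] args) {
        CompletionGuard guard = new CompletionGuard();
        // The async result arrives first, so the later timer firing is skipped.
        boolean resultWins = guard.tryComplete();
        boolean timeoutRuns = guard.tryTimeout();
        System.out.println(resultWins + " " + timeoutRuns);
    }
}
```

With this guard, cancelling the timer without interruption is sufficient: even if the timer has already started, its `tryTimeout()` call returns false and it exits immediately.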



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13732) Enhance JobManagerMetricGroup with FLIP-6 architecture

2019-08-14 Thread Biao Liu (JIRA)
Biao Liu created FLINK-13732:


 Summary: Enhance JobManagerMetricGroup with FLIP-6 architecture
 Key: FLINK-13732
 URL: https://issues.apache.org/jira/browse/FLINK-13732
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Reporter: Biao Liu
 Fix For: 1.10.0


This is a requirement from the user mailing list [1]. I think it's reasonable 
enough to support.

The scenario is that when deploying a Flink cluster on Yarn, there might be 
several {{JM(RM)}}s running on the same host. IMO that's quite a common 
scenario. However, we can't distinguish the metrics of different 
{{JobManagerMetricGroup}}s, because "hostname" is the only variable we can use.

I think there are some problems with the current implementation of 
{{JobManagerMetricGroup}}. It's still non-FLIP-6 style. We should split the 
metric group into {{RM}} and {{Dispatcher}} to match the FLIP-6 architecture, 
and an identification variable should be supported, just like {{tm_id}}.

CC [~StephanEwen], [~till.rohrmann], [~Zentol]

1. 
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-metrics-scope-for-YARN-single-job-td29389.html]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13848) Support “scheduleAtFixedRate/scheduleAtFixedDelay” in RpcEndpoint#MainThreadExecutor

2019-08-25 Thread Biao Liu (Jira)
Biao Liu created FLINK-13848:


 Summary: Support “scheduleAtFixedRate/scheduleAtFixedDelay” in 
RpcEndpoint#MainThreadExecutor
 Key: FLINK-13848
 URL: https://issues.apache.org/jira/browse/FLINK-13848
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Biao Liu
 Fix For: 1.10.0


Currently the methods "scheduleAtFixedRate/scheduleAtFixedDelay" of 
{{RpcEndpoint#MainThreadExecutor}} are not implemented, because there was no 
requirement for them before.

Now we are planning to implement these methods to support periodic checkpoint 
triggering.
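For reference, a sketch of the standard java.util.concurrent semantics these methods would follow; the real MainThreadExecutor would run each tick in the endpoint's main thread rather than a plain scheduler thread:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class PeriodicTriggerSketch {
    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch threeTicks = new CountDownLatch(3);

        // Fixed rate: the next run is scheduled relative to the *start* of the
        // previous one -- the usual choice for periodic checkpoint triggering.
        ScheduledFuture<?> task = timer.scheduleAtFixedRate(
                threeTicks::countDown, 0, 10, TimeUnit.MILLISECONDS);

        threeTicks.await();   // the trigger has fired at least three times
        task.cancel(false);   // cancel without interrupting a running tick
        timer.shutdown();
        System.out.println("fired 3 times");
    }
}
```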



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13904) Avoid competition between different rounds of checkpoint triggering

2019-08-29 Thread Biao Liu (Jira)
Biao Liu created FLINK-13904:


 Summary: Avoid competition between different rounds of checkpoint 
triggering
 Key: FLINK-13904
 URL: https://issues.apache.org/jira/browse/FLINK-13904
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Biao Liu
 Fix For: 1.10.0


As a part of the {{CheckpointCoordinator}} refactoring, I'd like to simplify 
the concurrent triggering logic. The different rounds of checkpoint 
triggering would be processed sequentially. The final target is getting rid 
of the timer thread and {{triggerLock}}.

Note that we can't avoid all triggering competition for now. There is still a 
competition between normal checkpoint triggering and savepoint triggering. We 
could avoid it by executing the triggering in the main thread, but that 
cannot be achieved until all blocking operations are handled well in IO 
threads.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13905) Separate checkpoint triggering into stages

2019-08-29 Thread Biao Liu (Jira)
Biao Liu created FLINK-13905:


 Summary: Separate checkpoint triggering into stages
 Key: FLINK-13905
 URL: https://issues.apache.org/jira/browse/FLINK-13905
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Biao Liu
 Fix For: 1.10.0


Currently {{CheckpointCoordinator#triggerCheckpoint}} includes some heavy IO 
operations. We plan to separate the triggering into different stages: the IO 
operations are executed in IO threads, while the in-memory operations are not.

This is a preparation for making all in-memory operations of 
{{CheckpointCoordinator}} single-threaded (in the main thread).
Note that we cannot move the in-memory operations of triggering into the main 
thread directly yet, because there are still some operations holding a heavy 
coordinator-wide lock.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-16561) Resuming Externalized Checkpoint (rocks, incremental, no parallelism change) end-to-end test fails on Azure

2020-03-11 Thread Biao Liu (Jira)
Biao Liu created FLINK-16561:


 Summary: Resuming Externalized Checkpoint (rocks, incremental, no 
parallelism change) end-to-end test fails on Azure
 Key: FLINK-16561
 URL: https://issues.apache.org/jira/browse/FLINK-16561
 Project: Flink
  Issue Type: Test
  Components: Tests
Affects Versions: 1.11.0
Reporter: Biao Liu


{quote}Caused by: java.io.IOException: Cannot access file system for 
checkpoint/savepoint path 'file://.'.
at 
org.apache.flink.runtime.state.filesystem.AbstractFsCheckpointStorage.resolveCheckpointPointer(AbstractFsCheckpointStorage.java:233)
at 
org.apache.flink.runtime.state.filesystem.AbstractFsCheckpointStorage.resolveCheckpoint(AbstractFsCheckpointStorage.java:110)
at 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1332)
at 
org.apache.flink.runtime.scheduler.SchedulerBase.tryRestoreExecutionGraphFromSavepoint(SchedulerBase.java:314)
at 
org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:247)
at 
org.apache.flink.runtime.scheduler.SchedulerBase.(SchedulerBase.java:223)
at 
org.apache.flink.runtime.scheduler.DefaultScheduler.(DefaultScheduler.java:118)
at 
org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:103)
at 
org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:281)
at 
org.apache.flink.runtime.jobmaster.JobMaster.(JobMaster.java:269)
at 
org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:98)
at 
org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:40)
at 
org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.(JobManagerRunnerImpl.java:146)
... 10 more
Caused by: java.io.IOException: Found local file path with authority '.' in 
path 'file://.'. Hint: Did you forget a slash? (correct path would be 
'file:///.')
at 
org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:441)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:389)
at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
at 
org.apache.flink.runtime.state.filesystem.AbstractFsCheckpointStorage.resolveCheckpointPointer(AbstractFsCheckpointStorage.java:230)
... 22 more
{quote}

The original log is here, 
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6073&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=2b7514ee-e706-5046-657b-3430666e7bd9

There are some similar tickets about this case, but the stack trace here looks 
different. 
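
The root cause in the stack trace is a URI-parsing quirk that is easy to reproduce with the JDK alone: in {{file://.}} the {{.}} sits in the authority position (between {{//}} and the next {{/}}), not in the path. A small sketch, using only {{java.net.URI}}:

```java
import java.net.URI;

public class PathAuthority {
    public static void main(String[] args) throws Exception {
        // '.' lands in the authority component, leaving an empty path --
        // this is the "Found local file path with authority '.'" case.
        URI bad = new URI("file://.");
        // With the extra slash the authority is empty and '/.' becomes the path,
        // matching the hint in the exception message.
        URI good = new URI("file:///.");

        System.out.println("bad authority: " + bad.getAuthority());
        System.out.println("good path: " + good.getPath());
    }
}
```

This matches the hint Flink prints ("Did you forget a slash? (correct path would be 'file:///.')"), which suggests the test script passed a malformed checkpoint path rather than the restore logic itself being broken.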



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16945) Execute CheckpointFailureManager.FailJobCallback directly in main thread executor

2020-04-02 Thread Biao Liu (Jira)
Biao Liu created FLINK-16945:


 Summary: Execute CheckpointFailureManager.FailJobCallback directly 
in main thread executor
 Key: FLINK-16945
 URL: https://issues.apache.org/jira/browse/FLINK-16945
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing
Affects Versions: 1.10.0
Reporter: Biao Liu
 Fix For: 1.11.0


Since we have put all non-IO operations of {{CheckpointCoordinator}} into the main 
thread executor, the {{CheckpointFailureManager.FailJobCallback}} can now be 
executed directly. This way the execution graph fails immediately when 
{{CheckpointFailureManager}} invokes the callback, and we avoid the 
inconsistent scenario of FLINK-13497.
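
The idea can be sketched as follows. All names here are illustrative stand-ins, not Flink's real classes: the point is only that when the caller already runs in the main thread executor, the fail-job callback can mutate job state directly, with no asynchronous hand-off during which the coordinator and the execution graph could disagree.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class FailJobDirectly {
    // Hypothetical stand-in for CheckpointFailureManager.FailJobCallback.
    interface FailJobCallback { void failJob(Throwable cause); }

    static final ExecutorService mainThreadExecutor = Executors.newSingleThreadExecutor();
    static final AtomicReference<String> jobState = new AtomicReference<>("RUNNING");

    // Executed directly in the main thread executor: the job fails immediately,
    // with no window where checkpointing continues against a dead job.
    static final FailJobCallback callback =
        cause -> jobState.set("FAILED: " + cause.getMessage());

    static void onCheckpointFailure(Throwable cause) throws Exception {
        // The real coordinator is already on this executor; submit-and-wait
        // here just simulates that context in a self-contained example.
        mainThreadExecutor.submit(() -> callback.failJob(cause)).get();
    }

    public static void main(String[] args) throws Exception {
        onCheckpointFailure(new Exception("checkpoint failure threshold exceeded"));
        System.out.println(jobState.get());
        mainThreadExecutor.shutdown();
    }
}
```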



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14344) Snapshot master hook state asynchronously

2019-10-08 Thread Biao Liu (Jira)
Biao Liu created FLINK-14344:


 Summary: Snapshot master hook state asynchronously
 Key: FLINK-14344
 URL: https://issues.apache.org/jira/browse/FLINK-14344
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing
Reporter: Biao Liu
 Fix For: 1.10.0


Currently we snapshot the master hook state synchronously. As part of 
reworking the threading model of {{CheckpointCoordinator}}, we have to make this 
non-blocking to satisfy the requirement of running in the main thread.

The behavior of snapshotting the master hook state should be similar to task 
state snapshotting: it should be launched after the {{PendingCheckpoint}} is 
created, and it can complete or fail the {{PendingCheckpoint}} just like task 
state snapshotting does. 
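
The asynchronous launch-after-creation pattern can be sketched like this. The interface and method names are hypothetical (the real {{MasterTriggerRestoreHook}} API differs); the sketch only shows the shape: the hook snapshot runs off the main thread and the resulting future later completes or fails the pending checkpoint, just as task-state acknowledgements do.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncMasterHookSnapshot {
    // Illustrative stand-in for a master hook; not Flink's actual interface.
    interface MasterHook { String snapshot(long checkpointId) throws Exception; }

    static final ExecutorService ioExecutor = Executors.newCachedThreadPool();

    // Launched after the pending checkpoint is created; the returned future
    // carries either the hook state (completes the checkpoint) or the
    // failure cause (fails it), without blocking the main thread.
    static CompletableFuture<String> snapshotMasterState(MasterHook hook, long checkpointId) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return hook.snapshot(checkpointId);
            } catch (Exception e) {
                throw new CompletionException(e);
            }
        }, ioExecutor);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(snapshotMasterState(id -> "hook-state-" + id, 7L).get());
        ioExecutor.shutdown();
    }
}
```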



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

