Re: [VOTE] FLIP-136: Improve interoperability between DataStream and Table API

2020-09-14 Thread Dawid Wysakowicz
+1

On 11/09/2020 16:19, Timo Walther wrote:
> Hi all,
>
> after the discussion in [1], I would like to open a voting thread for
> FLIP-136 [2] which covers different topics to improve the
> back-and-forth communication between DataStream API and Table API.
>
> The vote will be open until September 16th (72h + weekend), unless
> there is an objection or not enough votes.
>
> Regards,
> Timo
>
> [1]
> https://lists.apache.org/thread.html/r62b47ec6812ddbafed65ac79e31ca0305099893559f1e5a991dee550%40%3Cdev.flink.apache.org%3E
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-136%3A++Improve+interoperability+between+DataStream+and+Table+API



signature.asc
Description: OpenPGP digital signature


Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Robert Metzger
Thanks a lot for putting a release candidate together!

+1

Checks:
- Manually checked the git diff
https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1
  - in flink-kubernetes, the shading configuration got changed, but the
NOTICE file is correct (checked the shade plugin output)
- maven clean install from source
- checked staging repo files

Note: I observed that in one local build from source the log/ directory was
missing. I could not reproduce this issue (and I triggered the build before
the weekend .. maybe I did something weird in between). I believe this is a
problem caused by me, but it would be nice if you could keep your eyes open
for this issue, just to make sure.


On Thu, Sep 10, 2020 at 9:04 AM Zhu Zhu  wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #1 for the version 1.11.2,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint C63E230EFFF519A5BBF2C9AE6767487CD505859C [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.11.2-rc1" [5],
> * website pull request listing the new release and adding announcement blog
> post [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Zhu
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348575
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.2-rc1/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1397/
> [5]
>
> https://github.com/apache/flink/commit/fe3613574f76201a8d55d572a639a4ce7e18a9db
> [6] https://github.com/apache/flink-web/pull/377
>


Re: [DISCUSS] Deprecate and remove UnionList OperatorState

2020-09-14 Thread Alexey Trenikhun
-1

We use union state to generate sequences: each operator instance generates
offset0 + number-of-tasks - task-index + task-specific-counter * number-of-tasks
(e.g. with 2 instances of an operator, one instance produces even numbers, the
other odd). The last generated sequence number is stored in union list state;
on restart it tells us where we should resume to avoid collisions with already
generated numbers. To do so we calculate offset0 as the max over the union list state.
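A runnable sketch of this scheme (the offset formula in the mail reads slightly garbled, so this sketch assumes the equivalent `offset0 + task-index + counter * number-of-tasks` form):

```python
# Hedged sketch of the union-state sequence scheme described above.
# Each of N tasks emits values from a distinct residue class mod N, so two
# tasks never produce the same number; on restart, every task reads the
# union of all tasks' last values and derives offset0 from their max.
NUM_TASKS = 2

def next_value(offset0, task_index, counter):
    # Task-local generator: disjoint sequences across tasks.
    return offset0 + task_index + counter * NUM_TASKS

task0 = [next_value(0, 0, c) for c in range(3)]  # [0, 2, 4]
task1 = [next_value(0, 1, c) for c in range(3)]  # [1, 3, 5]

# Each task stores its last value in union list state; on restore every
# task sees all entries and restarts past the global maximum.
union_state = [task0[-1], task1[-1]]
offset0_after_restore = max(union_state) + 1     # 5 + 1 = 6
resumed = [next_value(offset0_after_restore, 0, c) for c in range(2)]
assert not set(resumed) & set(task0 + task1)     # no collisions after restart
print(resumed)  # [6, 8]
```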

Alexey


From: Seth Wiesman 
Sent: Wednesday, September 9, 2020 9:37:03 AM
To: dev 
Cc: Aljoscha Krettek ; user 
Subject: Re: [DISCUSS] Deprecate and remove UnionList OperatorState

Generally +1

The one use case of union state I've seen in production (outside of sources
and sinks) is as a "poor man's" broadcast state. This was obviously before
that feature was added, which is now a few years ago, so I don't know if
those pipelines still exist. FWIW, if they do, the State Processor API can
provide a migration path, as it supports rewriting union state as broadcast
state.

Seth

On Wed, Sep 9, 2020 at 10:21 AM Arvid Heise <ar...@ververica.com> wrote:
+1 to getting rid of non-keyed state as-is in general, and of union state
in particular. I had a hard time wrapping my head around the semantics of
non-keyed state when designing the rescaling of unaligned checkpoints.

The only plausible use cases are legacy sources and sinks. Both should also
be reworked and deprecated.

My main question is how to represent state in these two cases. For sources,
state should probably be bound to splits. In that regard, split (id) may
act as a key. More generally, there should probably be a concept that
supersedes keys and includes splits.

For sinks, I can see two cases:
- Either we are in a keyed context, then state should be bound to the key.
- Or we are in a non-keyed context, then state might be bound to the split
(?) in case of a source->sink chaining.
- Maybe it should also be a new(?) concept like output partition.

It's not clear to me if there are more cases and if we can always find a
good way to bind state to some sort of key, especially for arbitrary
communication patterns (which we may also need to replace).

On Wed, Sep 9, 2020 at 4:09 PM Aljoscha Krettek <aljos...@apache.org> wrote:

> Hi Devs,
>
> @Users: I'm cc'ing the user ML to see if there are any users that are
> relying on this feature. Please comment here if that is the case.
>
> I'd like to discuss the deprecation and eventual removal of UnionList
> Operator State, aka Operator State with Union Redistribution. If you
> don't know what I'm talking about you can take a look in the
> documentation: [1]. It's not documented thoroughly because it started
> out as mostly an internal feature.
>
> The immediate main reason for removing this is also mentioned in the
> documentation: "Do not use this feature if your list may have high
> cardinality. Checkpoint metadata will store an offset to each list
> entry, which could lead to RPC framesize or out-of-memory errors." The
> insidious part of this limitation is that you will only notice that
> there is a problem when it is too late. Checkpointing will still work
> and a program can continue when the state size is too big. The system
> will only fail when trying to restore from a snapshot that has union
> state that is too big. This could be fixed by working around that issue
> but I think there are more long-term issues with this type of state.
>
> I think we need to deprecate and remove API for state that is not tied
> to a key. Keyed state is easy to reason about, the system can
> re-partition state and also re-partition records and therefore scale the
> system in and out. Operator state, on the other hand is not tied to a
> key but an operator. This is a more "physical" concept, if you will,
> that potentially ties business logic closer to the underlying runtime
> execution model, which in turns means less degrees of freedom for the
> framework, that is Flink. This is future work, though, but we should
> start with deprecating union list state because it is the potentially
> most dangerous type of state.
>
> We currently use this state type internally in at least the
> StreamingFileSink, FlinkKafkaConsumer, and FlinkKafkaProducer. However,
> we're in the process of hopefully getting rid of it there with our work
> on sources and sinks. Before we fully remove it, we should of course
> signal this to users by deprecating it.
>
> What do you think?
>
> Best,
> Aljoscha
>


--

Arvid Heise | Senior Java Developer



Follow us @VervericaData

--

Join Flink Forward  - The Apache Flink
Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
(T

[jira] [Created] (FLINK-19213) Update the Chinese documentation

2020-09-14 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-19213:


 Summary: Update the Chinese documentation
 Key: FLINK-19213
 URL: https://issues.apache.org/jira/browse/FLINK-19213
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Dawid Wysakowicz


We should update the Chinese documentation with the changes introduced in 
FLINK-18802



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19214) Update the flink-web

2020-09-14 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-19214:


 Summary: Update the flink-web 
 Key: FLINK-19214
 URL: https://issues.apache.org/jira/browse/FLINK-19214
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Project Website
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz
 Fix For: 1.12.0


Update the avro sql format link in https://flink.apache.org/downloads.html 
(Optional components) 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Robert Metzger
Hi all,

On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
Flink committer.

Niels has been an active community member since the early days of Flink,
with 19 commits dating back until 2015.
Besides his work on the code, he has been driving initiatives on dev@ list,
supporting users and giving talks at conferences.

Please join me in congratulating Niels for becoming a Flink committer!

Best,
Robert Metzger


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Aljoscha Krettek

Congratulations! 💐

Aljoscha

On 14.09.20 10:37, Robert Metzger wrote:

Hi all,

On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
Flink committer.

Niels has been an active community member since the early days of Flink,
with 19 commits dating back until 2015.
Besides his work on the code, he has been driving initiatives on dev@ list,
supporting users and giving talks at conferences.

Please join me in congratulating Niels for becoming a Flink committer!

Best,
Robert Metzger





Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread tison
Congrats!

Best,
tison.


On Mon, Sep 14, 2020 at 4:38 PM, Aljoscha Krettek wrote:

> Congratulations! 💐
>
> Aljoscha
>
> On 14.09.20 10:37, Robert Metzger wrote:
> > Hi all,
> >
> > On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
> > Flink committer.
> >
> > Niels has been an active community member since the early days of Flink,
> > with 19 commits dating back until 2015.
> > Besides his work on the code, he has been driving initiatives on dev@
> list,
> > supporting users and giving talks at conferences.
> >
> > Please join me in congratulating Niels for becoming a Flink committer!
> >
> > Best,
> > Robert Metzger
> >
>
>


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Konstantin Knauf
Congratulations!

On Mon, Sep 14, 2020 at 10:51 AM tison  wrote:

> Congrats!
>
> Best,
> tison.
>
>
> On Mon, Sep 14, 2020 at 4:38 PM, Aljoscha Krettek wrote:
>
> > Congratulations! 💐
> >
> > Aljoscha
> >
> > On 14.09.20 10:37, Robert Metzger wrote:
> > > Hi all,
> > >
> > > On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
> > > Flink committer.
> > >
> > > Niels has been an active community member since the early days of
> Flink,
> > > with 19 commits dating back until 2015.
> > > Besides his work on the code, he has been driving initiatives on dev@
> > list,
> > > supporting users and giving talks at conferences.
> > >
> > > Please join me in congratulating Niels for becoming a Flink committer!
> > >
> > > Best,
> > > Robert Metzger
> > >
> >
> >
>


-- 

Konstantin Knauf | Head of Product

+49 160 91394525


Follow us @VervericaData Ververica 


--

Join Flink Forward  - The Apache Flink
Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Yip Park Tung Jason, Jinwei (Kevin) Zhang, Karl Anton
Wehner


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Matt Wang
Congratulations, Niels!


--

Best,
Matt Wang


On 09/14/2020 17:02, Konstantin Knauf wrote:
Congratulations!

On Mon, Sep 14, 2020 at 10:51 AM tison  wrote:

Congrats!

Best,
tison.


On Mon, Sep 14, 2020 at 4:38 PM, Aljoscha Krettek wrote:

Congratulations! 💐

Aljoscha

On 14.09.20 10:37, Robert Metzger wrote:
Hi all,

On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
Flink committer.

Niels has been an active community member since the early days of
Flink,
with 19 commits dating back until 2015.
Besides his work on the code, he has been driving initiatives on dev@
list,
supporting users and giving talks at conferences.

Please join me in congratulating Niels for becoming a Flink committer!

Best,
Robert Metzger








Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread David Anderson
+1

Checks:
- Verified that the fix for FLINK-19109 solves the problem I reported,
running against the maven artifacts



On Thu, Sep 10, 2020 at 9:04 AM Zhu Zhu  wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #1 for the version 1.11.2,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint C63E230EFFF519A5BBF2C9AE6767487CD505859C [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.11.2-rc1" [5],
> * website pull request listing the new release and adding announcement blog
> post [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Zhu
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348575
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.2-rc1/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1397/
> [5]
>
> https://github.com/apache/flink/commit/fe3613574f76201a8d55d572a639a4ce7e18a9db
> [6] https://github.com/apache/flink-web/pull/377
>


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Zhu Zhu
Congratulations!

Thanks,
Zhu

On Mon, Sep 14, 2020 at 5:22 PM, Matt Wang wrote:

> Congratulations, Niels!
>
>
> --
>
> Best,
> Matt Wang
>
>
> On 09/14/2020 17:02, Konstantin Knauf wrote:
> Congratulations!
>
> On Mon, Sep 14, 2020 at 10:51 AM tison  wrote:
>
> Congrats!
>
> Best,
> tison.
>
>
> On Mon, Sep 14, 2020 at 4:38 PM, Aljoscha Krettek wrote:
>
> Congratulations! 💐
>
> Aljoscha
>
> On 14.09.20 10:37, Robert Metzger wrote:
> Hi all,
>
> On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
> Flink committer.
>
> Niels has been an active community member since the early days of
> Flink,
> with 19 commits dating back until 2015.
> Besides his work on the code, he has been driving initiatives on dev@
> list,
> supporting users and giving talks at conferences.
>
> Please join me in congratulating Niels for becoming a Flink committer!
>
> Best,
> Robert Metzger
>
>
>
>
>
>
>


Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Congxian Qiu
+1 (non-binding)

Checked:
- sha verified ok
- gpg verified ok
- build from source, ok
- license check ok, using the diff generated here [1]
- ran some demos locally, ok
[1]

https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1
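The sha check above can be exercised locally like this (a hedged sketch using a stand-in file; a real verification runs against the artifacts downloaded from dist.apache.org):

```shell
# Stand-in demonstration of the checksum verification step; the artifact
# file here is fabricated for illustration only.
set -e
workdir=$(mktemp -d)
cd "$workdir"
printf 'release artifact bytes\n' > flink-1.11.2-src.tgz      # stand-in file
sha512sum flink-1.11.2-src.tgz > flink-1.11.2-src.tgz.sha512
sha512sum -c flink-1.11.2-src.tgz.sha512                      # prints "... OK"
# The gpg step additionally needs the KEYS file from the release directory:
#   gpg --import KEYS
#   gpg --verify flink-1.11.2-src.tgz.asc flink-1.11.2-src.tgz
```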


Best,
Congxian


On Mon, Sep 14, 2020 at 5:33 PM, David Anderson wrote:

> +1
>
> Checks:
> - Verified that the fix for FLINK-19109 solves the problem I reported,
> running against the maven artifacts
>
>
>
> On Thu, Sep 10, 2020 at 9:04 AM Zhu Zhu  wrote:
>
> > Hi everyone,
> >
> > Please review and vote on the release candidate #1 for the version
> 1.11.2,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release and binary convenience releases to
> be
> > deployed to dist.apache.org [2], which are signed with the key with
> > fingerprint C63E230EFFF519A5BBF2C9AE6767487CD505859C [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-1.11.2-rc1" [5],
> > * website pull request listing the new release and adding announcement
> blog
> > post [6].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Zhu
> >
> > [1]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348575
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.2-rc1/
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1397/
> > [5]
> >
> >
> https://github.com/apache/flink/commit/fe3613574f76201a8d55d572a639a4ce7e18a9db
> > [6] https://github.com/apache/flink-web/pull/377
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Dian Fu
Congratulations!

Regards,
Dian

> On Sep 14, 2020, at 7:45 PM, Zhu Zhu wrote:
> 
> Congratulations!
> 
> Thanks,
> Zhu
> 
> On Mon, Sep 14, 2020 at 5:22 PM, Matt Wang wrote:
> 
>> Congratulations, Niels!
>> 
>> 
>> --
>> 
>> Best,
>> Matt Wang
>> 
>> 
>> On 09/14/2020 17:02, Konstantin Knauf wrote:
>> Congratulations!
>> 
>> On Mon, Sep 14, 2020 at 10:51 AM tison  wrote:
>> 
>> Congrats!
>> 
>> Best,
>> tison.
>> 
>> 
>> On Mon, Sep 14, 2020 at 4:38 PM, Aljoscha Krettek wrote:
>> 
>> Congratulations! 💐
>> 
>> Aljoscha
>> 
>> On 14.09.20 10:37, Robert Metzger wrote:
>> Hi all,
>> 
>> On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
>> Flink committer.
>> 
>> Niels has been an active community member since the early days of
>> Flink,
>> with 19 commits dating back until 2015.
>> Besides his work on the code, he has been driving initiatives on dev@
>> list,
>> supporting users and giving talks at conferences.
>> 
>> Please join me in congratulating Niels for becoming a Flink committer!
>> 
>> Best,
>> Robert Metzger
>> 
>> 
>> 
>> 
>> 
>> 
>> 



[VOTE] FLIP-134: Batch execution for the DataStream API

2020-09-14 Thread Aljoscha Krettek



Hi all,

After the discussion in [1], I would like to open a voting thread for 
FLIP-134 (https://s.apache.org/FLIP-134) [2] which discusses a new BATCH 
execution mode for the DataStream API.


The vote will be open until September 17, unless there is an objection 
or not enough votes.


Regards,
Aljoscha

[1] 
https://lists.apache.org/thread.html/reb368f095ec13638b95cd5d885a0aa8e69af06d6e982a5f045f50022%40%3Cdev.flink.apache.org%3E

[2] https://cwiki.apache.org/confluence/x/4i94CQ


[jira] [Created] (FLINK-19215) "Resuming Savepoint (rocks, scale down, rocks timers) end-to-end test" failed with "Dispatcher REST endpoint has not started within a timeout of 20 sec"

2020-09-14 Thread Dian Fu (Jira)
Dian Fu created FLINK-19215:
---

 Summary: "Resuming Savepoint (rocks, scale down, rocks timers) 
end-to-end test" failed with "Dispatcher REST endpoint has not started within a 
timeout of 20 sec"
 Key: FLINK-19215
 URL: https://issues.apache.org/jira/browse/FLINK-19215
 Project: Flink
  Issue Type: Bug
  Components: Runtime / State Backends
Affects Versions: 1.11.0
Reporter: Dian Fu


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=6476&view=logs&j=08866332-78f7-59e4-4f7e-49a56faa3179&t=3e8647c1-5a28-5917-dd93-bf78594ea994

{code}
2020-09-13T21:26:23.3646770Z Running 'Resuming Savepoint (rocks, scale down, 
rocks timers) end-to-end test'
2020-09-13T21:26:23.3647852Z 
==
2020-09-13T21:26:23.3689605Z TEST_DATA_DIR: 
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-23367497881
2020-09-13T21:26:23.7122791Z Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT
2020-09-13T21:26:23.9988115Z Starting cluster.
2020-09-13T21:26:27.3702750Z Starting standalonesession daemon on host fv-az655.
2020-09-13T21:26:35.1213853Z Starting taskexecutor daemon on host fv-az655.
2020-09-13T21:26:35.2756714Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:36.4111928Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:37.5358508Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:38.7156039Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:39.8602294Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:41.0399056Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:42.1680966Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:43.2520250Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:44.3833552Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:45.5204296Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:46.6730448Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:47.8274365Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:49.0147447Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:51.5463623Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:52.7366058Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:53.8867521Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:55.0469025Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:56.1901349Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:57.3124935Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:58.4596457Z Waiting for Dispatcher REST endpoint to come up...
2020-09-13T21:26:59.4828675Z Dispatcher REST endpoint has not started within a 
timeout of 20 sec
2020-09-13T21:26:59.4831446Z [FAIL] Test script contains errors.
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Zhu Zhu
Thank you all for the verification and voting!

@Robert
I tried building from sources again and the log/ dir is there.
I will keep watching if anyone else encounters this problem.

Thanks,
Zhu

On Mon, Sep 14, 2020 at 7:47 PM, Congxian Qiu wrote:

> +1 (no-binding)
>
>  checked
> - sha verified ok
> - gpg verifed ok
> - build from source, ok
> - check license ok, use the diff generated here[1]
> - run some demo locally, ok
> [1]
>
> https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1
>
>
> Best,
> Congxian
>
>
On Mon, Sep 14, 2020 at 5:33 PM, David Anderson wrote:
>
> > +1
> >
> > Checks:
> > - Verified that the fix for FLINK-19109 solves the problem I reported,
> > running against the maven artifacts
> >
> >
> >
> > On Thu, Sep 10, 2020 at 9:04 AM Zhu Zhu  wrote:
> >
> > > Hi everyone,
> > >
> > > Please review and vote on the release candidate #1 for the version
> > 1.11.2,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release and binary convenience releases to
> > be
> > > deployed to dist.apache.org [2], which are signed with the key with
> > > fingerprint C63E230EFFF519A5BBF2C9AE6767487CD505859C [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag "release-1.11.2-rc1" [5],
> > > * website pull request listing the new release and adding announcement
> > blog
> > > post [6].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > Zhu
> > >
> > > [1]
> > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12348575
> > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.2-rc1/
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1397/
> > > [5]
> > >
> > >
> >
> https://github.com/apache/flink/commit/fe3613574f76201a8d55d572a639a4ce7e18a9db
> > > [6] https://github.com/apache/flink-web/pull/377
> > >
> >
>


Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Fabian Paul
+1 (non-binding)

Checks:

- Verified signature
- Built from source (Java8)
- Ran custom jobs on Kubernetes

Regards,
Fabian


Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-14 Thread Piotr Nowojski
Hi,

I've just briefly skimmed over the proposed interfaces. I would suggest one
addition to the Writer interface (as I understand this is the runtime
interface in this proposal?): add some availability method, to avoid, if
possible, blocking calls on the sink. We already have similar
availability methods in the new sources [1] and in various places in the
network stack [2].

I'm aware that many implementations wouldn't be able to implement it, but
some may. For example `FlinkKafkaProducer` could
use `FlinkKafkaProducer#pendingRecords` to control `Writer`'s availability.
Also any sink that would be implementing records handover to some writer
thread could easily provide availability as well.

Non blocking calls are important for many things, for example they are
crucial for unaligned checkpoints to complete quickly.

Piotrek

[1] org.apache.flink.api.connector.source.SourceReader#isAvailable
[2] org.apache.flink.runtime.io.AvailabilityProvider
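The availability pattern suggested above could be sketched like this — a hedged illustration with made-up names (`AvailabilityWriter`, `is_available`, `ack_one`), not the API proposed in the FLIP:

```python
from collections import deque

class AvailabilityWriter:
    """A sink writer that reports availability instead of blocking."""
    def __init__(self, max_pending=3):
        self.pending = deque()
        self.max_pending = max_pending

    def is_available(self):
        # A Kafka-style sink could back this by something like
        # pendingRecords staying below a configured limit.
        return len(self.pending) < self.max_pending

    def write(self, record):
        assert self.is_available(), "caller must await availability first"
        self.pending.append(record)

    def ack_one(self):
        # Simulates the external system acknowledging one in-flight record.
        return self.pending.popleft()

w = AvailabilityWriter(max_pending=2)
w.write("a"); w.write("b")
print(w.is_available())  # False: backpressure is signalled without blocking
w.ack_one()
print(w.is_available())  # True
```

The runtime would only call `write()` while `is_available()` holds, keeping the task's mailbox thread responsive — which is exactly what fast unaligned checkpoints need.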

pon., 14 wrz 2020 o 01:23 Steven Wu  napisał(a):

> Aljoscha, thanks a lot for the detailed response. Now I have a better
> understanding of the initial scope.
>
> To me, there are possibly three different committer behaviors. For the lack
> of better names, let's call them
> * No/NoopCommitter
> * Committer / LocalCommitter (file sink?)
> * GlobalCommitter (Iceberg)
>
> ## Writer interface
>
> For the Writer interface, should we add "*prepareSnapshot"* before the
> checkpoint barrier emitted downstream?  IcebergWriter would need it. Or
> would the framework call "*flush*" before the barrier emitted downstream?
> that guarantee would achieve the same goal.
> -
> // before barrier emitted to downstream
> void prepareSnapshot(long checkpointId) throws Exception;
>
> // or will flush be called automatically before the barrier emitted
> downstream?
> // if yes, we need the checkpointId arg for the reason listed in [1]
> void flush(WriterOutput output) throws IOException;
>
> In [1], we discussed the reason for Writer to emit (checkpointId, CommT)
> tuple to the committer. The committer needs checkpointId to separate out
> data files for different checkpoints if concurrent checkpoints are enabled.
> For that reason, writers need to know the checkpointId where the restore
> happened. Can we add a RestoreContext interface to the restoreWriter
> method?
> ---
> Writer restoreWriter(InitContext context,
> RestoreContext restoreContext, List state, List share);
>
> interface RestoreContext {
>   long getCheckpointId();
> }
>
>
> ## Committer interface
>
> For the Committer interface, I am wondering if we should split the single
> commit method into separate "*collect"* and "*commit"* methods? This way,
> it can handle both single and multiple CommT objects.
> --
> void commit(CommT committable) throws Exception;
>   ==>
> void collect(CommT committable) throws Exception;
> void commit() throws Exception;
>
> As discussed in [1] and mentioned above, the Iceberg committer needs to
> know which checkpointId is the commit for. So can we add checkpiontId arg
> to the commit API. However, I don't know how this would affect the batch
> execution where checkpoints are disabled.
> --
> void commit(long checkpointId) throws Exception;
>
> For Iceberg, writers don't need any state. But the GlobalCommitter needs to
> checkpoint StateT. For the committer, CommT is "DataFile". Since a single
> committer can collect thousands (or more) data files in one checkpoint
> cycle, as an optimization we checkpoint a single "ManifestFile" (for the
> collected thousands data files) as StateT. This allows us to absorb
> extended commit outages without losing written/uploaded data files, as
> operator state size is as small as one manifest file per checkpoint cycle
> [2].
> --
> StateT snapshotState(SnapshotContext context) throws Exception;
>
> That means we also need the restoreCommitter API in the Sink interface
> ---
> Committer restoreCommitter(InitContext context, StateT
> state);
>
>
> Thanks,
> Steven
>
> [1] https://github.com/apache/iceberg/pull/1185#discussion_r479589663
> [2] https://github.com/apache/iceberg/pull/1185#discussion_r479457104
>
>
>
> On Fri, Sep 11, 2020 at 3:27 AM Aljoscha Krettek 
> wrote:
>
> > Regarding the FLIP itself, I like the motivation section and the
> > What/How/When/Where section a lot!
> >
> > I don't understand why we need the "Drain and Snapshot" section. It
> > seems to be some details about stop-with-savepoint and drain, and the
> > relation to BATCH execution but I don't know if it is needed to
> > understand the rest of the document. I'm happy to be wrong here, though,
> > if there's good reasons for the section.
> >
> > On the question of Alternative 1 and 2, I have a strong preference for
> > Alternative 1 because we could avoid strong coupling to other modules.
> > With Alternative 2 we would depend on `flink-streaming-java` and even
> > `flink-runtime`. For the n

Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-14 Thread Aljoscha Krettek

On 14.09.20 01:23, Steven Wu wrote:

## Writer interface

For the Writer interface, should we add "*prepareSnapshot"* before the
checkpoint barrier emitted downstream?  IcebergWriter would need it. Or
would the framework call "*flush*" before the barrier emitted downstream?
that guarantee would achieve the same goal.


I would think that we only need flush() and the semantics are that it 
prepares for a commit, so on a physical level it would be called from 
"prepareSnapshotPreBarrier". Now that I'm thinking about it more I think 
flush() should be renamed to something like "prepareCommit()".


@Guowei, what do you think about this?


In [1], we discussed the reason for Writer to emit (checkpointId, CommT)
tuple to the committer. The committer needs checkpointId to separate out
data files for different checkpoints if concurrent checkpoints are enabled.


When can this happen? Even with concurrent checkpoints the snapshot 
barriers would still cleanly segregate the input stream of an operator 
into tranches that should manifest in only one checkpoint. With 
concurrent checkpoints, all that can happen is that we start a 
checkpoint before a last one is confirmed completed.


Unless there is some weirdness in the sources and some sources start 
chk1 first and some other ones start chk2 first?


@Piotrek, do you think this is a problem?


For the Committer interface, I am wondering if we should split the single
commit method into separate "*collect"* and "*commit"* methods? This way,
it can handle both single and multiple CommT objects.


I think we can't do this. If the sink only needs a regular Committer, we 
can perform the commits in parallel, possibly on different machines. 
Only when the sink needs a GlobalCommitter would we need to ship all 
commits to a single process and perform the commit there. If both 
methods were unified in one interface we couldn't make the decision of 
where to commit in the framework code.
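To make the distinction concrete, here is a minimal sketch of the two committer flavors; the interface names, type parameters, and signatures are assumptions for illustration only, not the final FLIP-143 API.

```java
import java.util.ArrayList;
import java.util.List;

// A per-committable committer; calls can run in parallel across subtasks,
// e.g. committing individual multipart uploads to S3.
interface Committer<CommT> {
    void commit(CommT committable) throws Exception;
}

// A global committer; receives all committables of a checkpoint in a single
// place, e.g. one Iceberg table commit for all data files of the checkpoint.
interface GlobalCommitter<CommT> {
    void commit(List<CommT> committables) throws Exception;
}

// Toy implementation standing in for an S3-style per-file commit; it merely
// records what was committed so the behavior can be observed.
class CollectingCommitter implements Committer<String> {
    final List<String> committed = new ArrayList<>();

    @Override
    public void commit(String committable) {
        committed.add(committable);
    }
}
```

A sink would then implement one (or conceivably both) of these, and the framework would decide whether commits can stay parallel or must be funneled to a single task.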



For Iceberg, writers don't need any state. But the GlobalCommitter needs to
checkpoint StateT. For the committer, CommT is "DataFile". Since a single
committer can collect thousands (or more) data files in one checkpoint
cycle, as an optimization we checkpoint a single "ManifestFile" (for the
collected thousands of data files) as StateT. This allows us to absorb
extended commit outages without losing written/uploaded data files, as
operator state size is as small as one manifest file per checkpoint cycle


You could have a point here. Is the code for this available in 
open source? I was checking out 
https://github.com/apache/iceberg/blob/master/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitter.java 
and didn't find the ManifestFile optimization there.


Best,
Aljoscha



[jira] [Created] (FLINK-19216) Reduce the duplicate argument check

2020-09-14 Thread darion yaphet (Jira)
darion yaphet created FLINK-19216:
-

 Summary: Reduce the duplicate argument check
 Key: FLINK-19216
 URL: https://issues.apache.org/jira/browse/FLINK-19216
 Project: Flink
  Issue Type: Improvement
  Components: API / Core
Reporter: darion yaphet


Whether or not the parameter is a number, we put it into the map and then 
increment the index, so maybe the two branches can be merged into one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Xingbo Huang
+1 (non-binding)

Checks:

- Pip install PyFlink from wheel packages with Python 3.5, 3.6 and 3.7 on
Mac and Linux.
- Test Python UDF/Pandas UDF
- Test from_pandas/to_pandas

Best,
Xingbo

Fabian Paul  于2020年9月14日周一 下午8:46写道:

> +1 (non-binding)
>
> Checks:
>
> - Verified signature
> - Built from source (Java8)
> - Ran custom jobs on Kubernetes
>
> Regards,
> Fabian
>


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Xingbo Huang
Congratulations!

Best,
Xingbo

Dian Fu  于2020年9月14日周一 下午8:06写道:

> Congratulations!
>
> Regards,
> Dian
>
> > 在 2020年9月14日,下午7:45,Zhu Zhu  写道:
> >
> > Congratulations!
> >
> > Thanks,
> > Zhu
> >
> > Matt Wang  于2020年9月14日周一 下午5:22写道:
> >
> >> Congratulations, Niels!
> >>
> >>
> >> --
> >>
> >> Best,
> >> Matt Wang
> >>
> >>
> >> On 09/14/2020 17:02,Konstantin Knauf wrote:
> >> Congratulations!
> >>
> >> On Mon, Sep 14, 2020 at 10:51 AM tison  wrote:
> >>
> >> Congrats!
> >>
> >> Best,
> >> tison.
> >>
> >>
> >> Aljoscha Krettek  于2020年9月14日周一 下午4:38写道:
> >>
> >> Congratulations! 💐
> >>
> >> Aljoscha
> >>
> >> On 14.09.20 10:37, Robert Metzger wrote:
> >> Hi all,
> >>
> >> On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
> >> Flink committer.
> >>
> >> Niels has been an active community member since the early days of
> >> Flink,
> >> with 19 commits dating back until 2015.
> >> Besides his work on the code, he has been driving initiatives on dev@
> >> list,
> >> supporting users and giving talks at conferences.
> >>
> >> Please join me in congratulating Niels for becoming a Flink committer!
> >>
> >> Best,
> >> Robert Metzger
> >>
> >>
> >>
> >>
> >>
> >>
> >> --
> >>
> >> Konstantin Knauf | Head of Product
> >>
> >> +49 160 91394525
> >>
> >>
> >> Follow us @VervericaData Ververica 
> >>
> >>
> >> --
> >>
> >> Join Flink Forward  - The Apache Flink
> >> Conference
> >>
> >> Stream Processing | Event Driven | Real Time
> >>
> >> --
> >>
> >> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> >>
> >> --
> >> Ververica GmbH
> >> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> >> Managing Directors: Yip Park Tung Jason, Jinwei (Kevin) Zhang, Karl
> Anton
> >> Wehner
> >>
>
>


Re: [VOTE] FLIP-140: Introduce bounded style execution for keyed streams

2020-09-14 Thread Seth Wiesman
+1 (binding)

Seth

On Thu, Sep 10, 2020 at 9:13 AM Aljoscha Krettek 
wrote:

> +1 (binding)
>
> Aljoscha
>


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Benchao Li
Congratulations!

Xingbo Huang  于2020年9月14日周一 下午9:36写道:

> Congratulations!
>
> Best,
> Xingbo
>
> Dian Fu  于2020年9月14日周一 下午8:06写道:
>
> > Congratulations!
> >
> > Regards,
> > Dian
> >
> > > 在 2020年9月14日,下午7:45,Zhu Zhu  写道:
> > >
> > > Congratulations!
> > >
> > > Thanks,
> > > Zhu
> > >
> > > Matt Wang  于2020年9月14日周一 下午5:22写道:
> > >
> > >> Congratulations, Niels!
> > >>
> > >>
> > >> --
> > >>
> > >> Best,
> > >> Matt Wang
> > >>
> > >>
> > >> On 09/14/2020 17:02,Konstantin Knauf wrote:
> > >> Congratulations!
> > >>
> > >> On Mon, Sep 14, 2020 at 10:51 AM tison  wrote:
> > >>
> > >> Congrats!
> > >>
> > >> Best,
> > >> tison.
> > >>
> > >>
> > >> Aljoscha Krettek  于2020年9月14日周一 下午4:38写道:
> > >>
> > >> Congratulations! 💐
> > >>
> > >> Aljoscha
> > >>
> > >> On 14.09.20 10:37, Robert Metzger wrote:
> > >> Hi all,
> > >>
> > >> On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
> > >> Flink committer.
> > >>
> > >> Niels has been an active community member since the early days of
> > >> Flink,
> > >> with 19 commits dating back until 2015.
> > >> Besides his work on the code, he has been driving initiatives on dev@
> > >> list,
> > >> supporting users and giving talks at conferences.
> > >>
> > >> Please join me in congratulating Niels for becoming a Flink committer!
> > >>
> > >> Best,
> > >> Robert Metzger
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> --
> > >>
> > >> Konstantin Knauf | Head of Product
> > >>
> > >> +49 160 91394525
> > >>
> > >>
> > >> Follow us @VervericaData Ververica 
> > >>
> > >>
> > >> --
> > >>
> > >> Join Flink Forward  - The Apache Flink
> > >> Conference
> > >>
> > >> Stream Processing | Event Driven | Real Time
> > >>
> > >> --
> > >>
> > >> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> > >>
> > >> --
> > >> Ververica GmbH
> > >> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> > >> Managing Directors: Yip Park Tung Jason, Jinwei (Kevin) Zhang, Karl
> > Anton
> > >> Wehner
> > >>
> >
> >
>


-- 

Best,
Benchao Li


Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-14 Thread Aljoscha Krettek
I thought about this some more. One of the important parts of the 
Iceberg sink is to know whether we have already committed some 
DataFiles. Currently, this is implemented by writing a (JobId, 
MaxCheckpointId) tuple to the Iceberg table when committing. When 
restoring from a failure we check this and discard committables 
(DataFile) that we know to already be committed.


I think this can have some problems, for example when checkpoint ids are 
not strictly sequential, when we wrap around, or when the JobID changes. 
This will happen when doing a stop/start-from-savepoint cycle, for example.


I think we could fix this by having Flink provide a nonce to the 
GlobalCommitter where Flink guarantees that this nonce is unique and 
will not change for repeated invocations of the GlobalCommitter with the 
same set of committables. The GlobalCommitter could use this to 
determine whether a set of committables has already been committed to 
the Iceberg table.


It seems very tailor-made to Iceberg for now, but other systems could 
suffer from the same problem.
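As a minimal sketch of the nonce idea, assuming the framework guarantees the same nonce for repeated invocations with the same set of committables (all class and method names here are illustrative, not a proposed API):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// A global committer that uses a framework-provided nonce to make the
// commit idempotent across retries after a failover.
class DeduplicatingGlobalCommitter {
    // Stands in for the committed-nonce marker a sink like Iceberg would
    // persist in the table itself (here kept in memory for illustration).
    private final Set<Long> committedNonces = new HashSet<>();

    // Returns true if this call actually committed, false if it was a retry
    // of an already-committed set of committables.
    boolean commit(long nonce, List<String> committables) {
        if (committedNonces.contains(nonce)) {
            return false; // already committed, skip the duplicate
        }
        // ... write the committables to the external system here ...
        committedNonces.add(nonce);
        return true;
    }
}
```

In a real sink the set of seen nonces would have to be stored durably in the external system (as Iceberg does with its JobId/checkpoint-id marker today), so the check survives process restarts.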


Best,
Aljoscha


[jira] [Created] (FLINK-19217) Functions repeated extend Serializable interface

2020-09-14 Thread darion yaphet (Jira)
darion yaphet created FLINK-19217:
-

 Summary: Functions repeated extend Serializable interface
 Key: FLINK-19217
 URL: https://issues.apache.org/jira/browse/FLINK-19217
 Project: Flink
  Issue Type: Improvement
Reporter: darion yaphet


While studying the functions, I found that the Function interface already 
extends Serializable, yet other function interfaces such as EdgesFunction, 
CoFlatMapFunction and MapFunction all extend Serializable again. This 
duplication seems unnecessary; could we remove it?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19218) Remove inconsistent host logic for LocalFileSystem

2020-09-14 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-19218:


 Summary: Remove inconsistent host logic for LocalFileSystem
 Key: FLINK-19218
 URL: https://issues.apache.org/jira/browse/FLINK-19218
 Project: Flink
  Issue Type: Bug
  Components: API / Core
Affects Versions: 1.11.1
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.12.0


The {{LocalFileSystem}} returns file splits with host information as returned 
via
{code}
InetAddress.getLocalHost().getHostName();
{code}

This might be different, though, from the host name that the TaskManager is 
configured to use, which results in incorrect location matching if this 
information is used.

It is also incorrect in cases where the file system is in fact not local, but a 
mounted NAS.

Since this information is not useful anyway (there is no good way to support 
locality-aware file access for the LocalFileSystem), I would suggest removing 
this code. That would be better than having code in place that tries to suggest 
locality information that is frequently incorrect.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19219) Run JobManager initialization in a separate thread, to make it cancellable

2020-09-14 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19219:
--

 Summary: Run JobManager initialization in a separate thread, to 
make it cancellable
 Key: FLINK-19219
 URL: https://issues.apache.org/jira/browse/FLINK-19219
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Robert Metzger
Assignee: Robert Metzger


FLINK-16866 made the job submission non-blocking. The job submission will be 
executed asynchronously in a thread pool, submitted through a future.
The problem is that we cannot cancel a hanging job submission once it is 
running in the thread pool.
This ticket is about running the initialization in a separate thread, so that 
we can interrupt it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Java and Scala code format

2020-09-14 Thread Darion Yaphet
Hi team:

I have an idea about code formatting. It would make the code more readable
and be good for development, but it may also bring a lot of changes. Could you
tell me what you think? Thanks ~

-- 

long is the way and hard  that out of Hell leads up to light


Re: Java and Scala code format

2020-09-14 Thread Aljoscha Krettek
For some reference, this is a Jira issue that was created by the OP 
about using Scalafmt for the Flink code base: 
https://issues.apache.org/jira/browse/FLINK-19159.


Best,
Aljoscha


[VOTE] FLIP-142: Disentangle StateBackends from Checkpointing

2020-09-14 Thread Seth Wiesman
Hi all,

After the discussion in [1], I would like to open a voting thread for
FLIP-142 [2] which discusses disentangling state backends from
checkpointing.

The vote will be open until 16th September (72h), unless there is an
objection or not enough votes.

Seth

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-142-Disentangle-StateBackends-from-Checkpointing-td44496.html
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-142%3A+Disentangle+StateBackends+from+Checkpointing


Re: Java and Scala code format

2020-09-14 Thread Darion Yaphet
Yes, this is my submission

Aljoscha Krettek  于2020年9月14日周一 下午10:58写道:

> For some reference, this is a Jira issue that was created by the OP
> about using Scalafmt for the Flink code base:
> https://issues.apache.org/jira/browse/FLINK-19159.
>
> Best,
> Aljoscha
>


-- 

long is the way and hard  that out of Hell leads up to light


[jira] [Created] (FLINK-19220) Add a way to close internal resources

2020-09-14 Thread Igal Shilman (Jira)
Igal Shilman created FLINK-19220:


 Summary: Add a way to close internal resources
 Key: FLINK-19220
 URL: https://issues.apache.org/jira/browse/FLINK-19220
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Igal Shilman


Currently the internal HTTP stateful function transitively obtains a few 
resources, such as a thread pool and a connection pool, via an HTTP client. 
However, it is unaware of task cancellation and redeployment, which creates the 
potential for these resources to leak.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-14 Thread Dawid Wysakowicz
Hi all,

> I would think that we only need flush() and the semantics are that it
> prepares for a commit, so on a physical level it would be called from
> "prepareSnapshotPreBarrier". Now that I'm thinking about it more I
> think flush() should be renamed to something like "prepareCommit()". 

Generally speaking it is a good point that emitting the committables
should happen before emitting the checkpoint barrier downstream.
However, if I remember offline discussions well, the idea behind
Writer#flush and Writer#snapshotState was to differentiate commit on
checkpoint vs final checkpoint at the end of the job. Both of these
methods could emit committables, but the flush should not leave any
in-progress state (e.g. in the case of the file sink in STREAM mode, in
snapshotState it could leave some open files that would be committed in
a subsequent cycle, whereas flush should close all files). The
snapshotState as it is now cannot be called in
prepareSnapshotPreBarrier as it can store some state, which should
happen in Operator#snapshotState as otherwise it would always be
synchronous. Therefore I think we would need something like:

void prepareCommit(boolean flush, WriterOutput output);

ver 1:

List snapshotState();

ver 2:

void snapshotState(); // not sure if we need that method at all in option 2

> The Committer is as described in the FLIP, it's basically a function
> "void commit(Committable)". The GlobalCommitter would be a function "void
> commit(List)". The former would be used by an S3 sink where
> we can individually commit files to S3, a committable would be the list
> of part uploads that will form the final file and the commit operation
> creates the metadata in S3. The latter would be used by something like
> Iceberg where the Committer needs a global view of all the commits to be
> efficient and not overwhelm the system.
>
> I don't know yet if sinks would only implement on type of commit
> function or potentially both at the same time, and maybe Commit can
> return some CommitResult that gets shipped to the GlobalCommit function.
I must admit I did not get the need for Local/Normal + Global
committer at first. The Iceberg example helped a lot. I think it makes a
lot of sense.

> For Iceberg, writers don't need any state. But the GlobalCommitter
> needs to
> checkpoint StateT. For the committer, CommT is "DataFile". Since a single
> committer can collect thousands (or more) data files in one checkpoint
> cycle, as an optimization we checkpoint a single "ManifestFile" (for the
> collected thousands of data files) as StateT. This allows us to absorb
> extended commit outages without losing written/uploaded data files, as
> operator state size is as small as one manifest file per checkpoint cycle
> [2].
> --
> StateT snapshotState(SnapshotContext context) throws Exception;
>
> That means we also need the restoreCommitter API in the Sink interface
> ---
> Committer restoreCommitter(InitContext context, StateT
> state);
I think this might be a valid case. Not sure though if I would go with a
"state" there. Having a state in a committer would imply we need a
collect method as well. So far we needed a single method commit(...) and
the bookkeeping of the committables could be handled by the framework. I
think something like an optional combiner in the GlobalCommitter would
be enough. What do you think?

GlobalCommitter {

    void commit(GlobalCommT globalCommittables);

    GlobalCommT combine(List<CommT> committables);

}
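A minimal sketch of how such an optional combiner could work, modeling CommT as a data-file path (String) and GlobalCommT as a "manifest" bundling those paths, mirroring the Iceberg ManifestFile optimization; all names are illustrative, not a proposed API:

```java
import java.util.ArrayList;
import java.util.List;

// A global committer with an optional combine step: the framework collects
// the per-subtask committables and hands them to combine(), and only the
// combined result needs to be committed (and checkpointed) per cycle.
class ManifestGlobalCommitter {
    final List<List<String>> committedManifests = new ArrayList<>();

    // combine: many committables -> one global committable (a "manifest")
    List<String> combine(List<String> committables) {
        return new ArrayList<>(committables);
    }

    // commit: one global committable per checkpoint cycle
    void commit(List<String> manifest) {
        committedManifests.add(manifest);
    }
}
```

With this shape the bookkeeping of individual committables stays in the framework, while the sink keeps its operator state as small as one manifest per checkpoint cycle.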

A different problem that I see here is how we handle commit failures.
Should the committables (both normal and global) be included in the next
cycle, should we retry, ...? I think it would be worth laying this out
in the FLIP.

@Aljoscha I think you can find the code Steven was referring in here:
https://github.com/Netflix-Skunkworks/nfflink-connector-iceberg/blob/master/nfflink-connector-iceberg/src/main/java/com/netflix/spaas/nfflink/connector/iceberg/sink/IcebergCommitter.java

Best,

Dawid

On 14/09/2020 15:19, Aljoscha Krettek wrote:
> On 14.09.20 01:23, Steven Wu wrote:
>> ## Writer interface
>>
>> For the Writer interface, should we add "*prepareSnapshot"* before the
>> checkpoint barrier emitted downstream?  IcebergWriter would need it. Or
>> would the framework call "*flush*" before the barrier emitted
>> downstream?
>> that guarantee would achieve the same goal.
>
> I would think that we only need flush() and the semantics are that it
> prepares for a commit, so on a physical level it would be called from
> "prepareSnapshotPreBarrier". Now that I'm thinking about it more I
> think flush() should be renamed to something like "prepareCommit()".
>
> @Guowei, what do you think about this?
>
>> In [1], we discussed the reason for Writer to emit (checkpointId, CommT)
>> tuple to the committer. The committer needs checkpointId to separate out
>> data files for different checkpoints if concurrent checkpoints are
>> enabled.
>
> When can this happen? Even with concurrent

[jira] [Created] (FLINK-19221) Exploit LocatableFileStatus from Hadoop

2020-09-14 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-19221:


 Summary: Exploit LocatableFileStatus from Hadoop
 Key: FLINK-19221
 URL: https://issues.apache.org/jira/browse/FLINK-19221
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hadoop Compatibility
Affects Versions: 1.11.1
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.12.0


When the HDFS Client returns a {{FileStatus}} (description of a file) it 
sometimes returns a {{LocatedFileStatus}} which already contains all the 
{{BlockLocation}} information.

We should expose this on the Flink side, because it may save us a lot of RPC 
calls to the name node. The file enumerators often request block locations for 
all files, currently doing an RPC call for each file.

When the FileStatus obtained from listing the directory (or getting details for 
a file) already has all the block locations, we can save the extra RPC call per 
file.

The suggested implementation is as follows:

  1. We introduce a {{LocatedInputSplit}} in Flink that we integrate with the 
built-in LocalFileSystem
  2. We integrate this with the HadoopFileSystems by creating a Flink 
{{LocatedInputSplit}} whenever the underlying file system created a {{Hadoop 
LocatedInputSplit}}
  3. As a safety net, the FS methods to access block information check whether 
the presented file status already contains the block information and return 
that information directly.

Steps one and two are for simplification of FileSystem users (no need to ask 
for extra info if it is available).

Step three is the transparent shortcut that all applications get even if they 
do not explicitly use the {{LocatedInputSplit}} and keep calling 
{{FileSystem.getBlockLocations()}}.
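The safety net in step three could be sketched as follows, using minimal stand-ins for Hadoop's FileStatus / LocatedFileStatus (the real classes live in org.apache.hadoop.fs); the control flow, not the types, is the point here.

```java
// Stand-in for Hadoop's FileStatus: a file description without locations.
class FileStatus {
    final long len;
    FileStatus(long len) { this.len = len; }
}

// Stand-in for Hadoop's LocatedFileStatus: a listing result that already
// carries the block locations.
class LocatedFileStatus extends FileStatus {
    final String[] blockHosts;
    LocatedFileStatus(long len, String[] blockHosts) {
        super(len);
        this.blockHosts = blockHosts;
    }
}

class BlockLocationHelper {
    static int nameNodeRpcCalls = 0; // counts simulated name-node round trips

    static String[] getBlockHosts(FileStatus status) {
        if (status instanceof LocatedFileStatus) {
            // The listing already carried the locations: no extra RPC needed.
            return ((LocatedFileStatus) status).blockHosts;
        }
        nameNodeRpcCalls++; // would be one RPC to the name node per file
        return new String[] {"unknown"};
    }
}
```

This way even callers that keep calling the generic block-location method transparently avoid the per-file round trip whenever the directory listing already provided the information.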



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19222) Elevate external SDKs

2020-09-14 Thread Seth Wiesman (Jira)
Seth Wiesman created FLINK-19222:


 Summary: Elevate external SDKs
 Key: FLINK-19222
 URL: https://issues.apache.org/jira/browse/FLINK-19222
 Project: Flink
  Issue Type: Improvement
  Components: Documentation, Stateful Functions
Reporter: Seth Wiesman
Assignee: Seth Wiesman






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Arvid Heise
Congrats Niels!

On Mon, Sep 14, 2020 at 4:04 PM Benchao Li  wrote:

> Congratulations!
>
> Xingbo Huang  于2020年9月14日周一 下午9:36写道:
>
> > Congratulations!
> >
> > Best,
> > Xingbo
> >
> > Dian Fu  于2020年9月14日周一 下午8:06写道:
> >
> > > Congratulations!
> > >
> > > Regards,
> > > Dian
> > >
> > > > 在 2020年9月14日,下午7:45,Zhu Zhu  写道:
> > > >
> > > > Congratulations!
> > > >
> > > > Thanks,
> > > > Zhu
> > > >
> > > > Matt Wang  于2020年9月14日周一 下午5:22写道:
> > > >
> > > >> Congratulations, Niels!
> > > >>
> > > >>
> > > >> --
> > > >>
> > > >> Best,
> > > >> Matt Wang
> > > >>
> > > >>
> > > >> On 09/14/2020 17:02,Konstantin Knauf
> wrote:
> > > >> Congratulations!
> > > >>
> > > >> On Mon, Sep 14, 2020 at 10:51 AM tison 
> wrote:
> > > >>
> > > >> Congrats!
> > > >>
> > > >> Best,
> > > >> tison.
> > > >>
> > > >>
> > > >> Aljoscha Krettek  于2020年9月14日周一 下午4:38写道:
> > > >>
> > > >> Congratulations! 💐
> > > >>
> > > >> Aljoscha
> > > >>
> > > >> On 14.09.20 10:37, Robert Metzger wrote:
> > > >> Hi all,
> > > >>
> > > >> On behalf of the PMC, I’m very happy to announce Niels Basjes as a
> new
> > > >> Flink committer.
> > > >>
> > > >> Niels has been an active community member since the early days of
> > > >> Flink,
> > > >> with 19 commits dating back until 2015.
> > > >> Besides his work on the code, he has been driving initiatives on
> dev@
> > > >> list,
> > > >> supporting users and giving talks at conferences.
> > > >>
> > > >> Please join me in congratulating Niels for becoming a Flink
> committer!
> > > >>
> > > >> Best,
> > > >> Robert Metzger
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >>
> > > >> Konstantin Knauf | Head of Product
> > > >>
> > > >> +49 160 91394525
> > > >>
> > > >>
> > > >> Follow us @VervericaData Ververica 
> > > >>
> > > >>
> > > >> --
> > > >>
> > > >> Join Flink Forward  - The Apache Flink
> > > >> Conference
> > > >>
> > > >> Stream Processing | Event Driven | Real Time
> > > >>
> > > >> --
> > > >>
> > > >> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> > > >>
> > > >> --
> > > >> Ververica GmbH
> > > >> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> > > >> Managing Directors: Yip Park Tung Jason, Jinwei (Kevin) Zhang, Karl
> > > Anton
> > > >> Wehner
> > > >>
> > >
> > >
> >
>
>
> --
>
> Best,
> Benchao Li
>


-- 

Arvid Heise | Senior Java Developer



Follow us @VervericaData

--

Join Flink Forward  - The Apache Flink
Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
(Toni) Cheng


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread David Anderson
Congratulations!

--David

On Mon, Sep 14, 2020 at 8:24 PM Arvid Heise  wrote:

> Congrats Niels!
>
> On Mon, Sep 14, 2020 at 4:04 PM Benchao Li  wrote:
>
> > Congratulations!
> >
> > Xingbo Huang  于2020年9月14日周一 下午9:36写道:
> >
> > > Congratulations!
> > >
> > > Best,
> > > Xingbo
> > >
> > > Dian Fu  于2020年9月14日周一 下午8:06写道:
> > >
> > > > Congratulations!
> > > >
> > > > Regards,
> > > > Dian
> > > >
> > > > > 在 2020年9月14日,下午7:45,Zhu Zhu  写道:
> > > > >
> > > > > Congratulations!
> > > > >
> > > > > Thanks,
> > > > > Zhu
> > > > >
> > > > > Matt Wang  于2020年9月14日周一 下午5:22写道:
> > > > >
> > > > >> Congratulations, Niels!
> > > > >>
> > > > >>
> > > > >> --
> > > > >>
> > > > >> Best,
> > > > >> Matt Wang
> > > > >>
> > > > >>
> > > > >> On 09/14/2020 17:02,Konstantin Knauf
> > wrote:
> > > > >> Congratulations!
> > > > >>
> > > > >> On Mon, Sep 14, 2020 at 10:51 AM tison 
> > wrote:
> > > > >>
> > > > >> Congrats!
> > > > >>
> > > > >> Best,
> > > > >> tison.
> > > > >>
> > > > >>
> > > > >> Aljoscha Krettek  于2020年9月14日周一 下午4:38写道:
> > > > >>
> > > > >> Congratulations! 💐
> > > > >>
> > > > >> Aljoscha
> > > > >>
> > > > >> On 14.09.20 10:37, Robert Metzger wrote:
> > > > >> Hi all,
> > > > >>
> > > > >> On behalf of the PMC, I’m very happy to announce Niels Basjes as a
> > new
> > > > >> Flink committer.
> > > > >>
> > > > >> Niels has been an active community member since the early days of
> > > > >> Flink,
> > > > >> with 19 commits dating back until 2015.
> > > > >> Besides his work on the code, he has been driving initiatives on
> > dev@
> > > > >> list,
> > > > >> supporting users and giving talks at conferences.
> > > > >>
> > > > >> Please join me in congratulating Niels for becoming a Flink
> > committer!
> > > > >>
> > > > >> Best,
> > > > >> Robert Metzger
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >>
> > > > >> Konstantin Knauf | Head of Product
> > > > >>
> > > > >> +49 160 91394525
> > > > >>
> > > > >>
> > > > >> Follow us @VervericaData Ververica 
> > > > >>
> > > > >>
> > > > >> --
> > > > >>
> > > > >> Join Flink Forward  - The Apache
> Flink
> > > > >> Conference
> > > > >>
> > > > >> Stream Processing | Event Driven | Real Time
> > > > >>
> > > > >> --
> > > > >>
> > > > >> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> > > > >>
> > > > >> --
> > > > >> Ververica GmbH
> > > > >> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> > > > >> Managing Directors: Yip Park Tung Jason, Jinwei (Kevin) Zhang,
> Karl
> > > > Anton
> > > > >> Wehner
> > > > >>
> > > >
> > > >
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >
>
>
> --
>
> Arvid Heise | Senior Java Developer
>
> 
>
> Follow us @VervericaData
>
> --
>
> Join Flink Forward  - The Apache Flink
> Conference
>
> Stream Processing | Event Driven | Real Time
>
> --
>
> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>
> --
> Ververica GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
> (Toni) Cheng
>


Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-14 Thread Guowei Ma
Hi all,


Many thanks for the discussion and the valuable opinions! Currently there
are several ongoing issues, and we would like to show what we are thinking
in the next few mails.

It seems that the biggest issue now is about the topology of the sinks.
Before deciding what the sink API would look like, I would like to first
summarize the different topologies we have mentioned so that we could sync
on the same page and gain more insights about this issue. There are four
types of topology I could see. Please correct me if I misunderstand what
you mean:

   1. Commit individual files (StreamingFileSink)
      a. FileWriter -> FileCommitter
   2. Commit a directory (HiveSink)
      a. FileWriter -> FileCommitter -> GlobalCommitter
   3. Commit a bundle of files (Iceberg)
      a. DataFileWriter -> GlobalCommitter
   4. Commit a directory with merged files (some users want to merge the files
      in a directory before committing the directory to the Hive meta store)
      a. FileWriter -> SingleFileCommit -> FileMergeWriter -> GlobalCommitter


It can be seen from the above that the topologies differ according to the
requirements. Moreover, there may be other options for the second and third
categories. E.g.

An alternative topology option for the IcebergSink might be: DataFileWriter
-> Agg -> GlobalCommitter. One advantage of this method is that we can let Agg
take care of the cleanup instead of coupling the cleanup logic to the
committer.


In the long run I think we might provide sink developers the ability to
build arbitrary topologies. Maybe Flink could provide only a basic commit
transformation and let the user build the other parts of the topology. In
1.12 we might first provide different patterns for these different
scenarios, and I think these components could be reused in the
future.

Best,
Guowei


On Mon, Sep 14, 2020 at 11:19 PM Dawid Wysakowicz 
wrote:

> Hi all,
>
> > I would think that we only need flush() and the semantics are that it
> > prepares for a commit, so on a physical level it would be called from
> > "prepareSnapshotPreBarrier". Now that I'm thinking about it more I
> > think flush() should be renamed to something like "prepareCommit()".
>
> Generally speaking it is a good point that emitting the committables
> should happen before emitting the checkpoint barrier downstream.
> However, if I remember offline discussions well, the idea behind
> Writer#flush and Writer#snapshotState was to differentiate commit on
> checkpoint vs final checkpoint at the end of the job. Both of these
> methods could emit committables, but the flush should not leave any in
> progress state (e.g. in case of file sink in STREAM mode, in
> snapshotState it could leave some open files that would be committed in
> a subsequent cycle, however flush should close all files). The
> snapshotState as it is now can not be called in
> prepareSnapshotPreBarrier as it can store some state, which should
> happen in Operator#snapshotState as otherwise it would always be
> synchronous. Therefore I think we would need sth like:
>
> void prepareCommit(boolean flush, WriterOutput output);
>
> ver 1:
>
> List snapshotState();
>
> ver 2:
>
> void snapshotState(); // not sure if we need that method at all in option 2
>
> > The Committer is as described in the FLIP, it's basically a function
> > "void commit(Committable)". The GlobalCommitter would be a function "void
> > commit(List)". The former would be used by an S3 sink where
> > we can individually commit files to S3, a committable would be the list
> > of part uploads that will form the final file and the commit operation
> > creates the metadata in S3. The latter would be used by something like
> > Iceberg where the Committer needs a global view of all the commits to be
> > efficient and not overwhelm the system.
> >
> > I don't know yet if sinks would only implement on type of commit
> > function or potentially both at the same time, and maybe Commit can
> > return some CommitResult that gets shipped to the GlobalCommit function.
> I must admit it I did not get the need for Local/Normal + Global
> committer at first. The Iceberg example helped a lot. I think it makes a
> lot of sense.
>
> > For Iceberg, writers don't need any state. But the GlobalCommitter
> > needs to
> > checkpoint StateT. For the committer, CommT is "DataFile". Since a single
> > committer can collect thousands (or more) data files in one checkpoint
> > cycle, as an optimization we checkpoint a single "ManifestFile" (for the
> > collected thousands of data files) as StateT. This allows us to absorb
> > extended commit outages without losing written/uploaded data files, as
> > operator state size is as small as one manifest file per checkpoint cycle
> > [2].
> > --
> > StateT snapshotState(SnapshotContext context) throws Exception;
> >
> > That means we also need the restoreCommitter API in the Sink interface
> > ---
> > C

Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Marta Paes Moreira
Congrats, Niels!

On Mon, Sep 14, 2020 at 8:56 PM David Anderson 
wrote:

> Congratulations!
>
> --David
>
> On Mon, Sep 14, 2020 at 8:24 PM Arvid Heise  wrote:
>
> > Congrats Niels!
> >
> > On Mon, Sep 14, 2020 at 4:04 PM Benchao Li  wrote:
> >
> > > Congratulations!
> > >
> > > Xingbo Huang wrote on Mon, Sep 14, 2020 at 9:36 PM:
> > >
> > > > Congratulations!
> > > >
> > > > Best,
> > > > Xingbo
> > > >
> > > > Dian Fu wrote on Mon, Sep 14, 2020 at 8:06 PM:
> > > >
> > > > > Congratulations!
> > > > >
> > > > > Regards,
> > > > > Dian
> > > > >
> > > > > > On Sep 14, 2020, at 7:45 PM, Zhu Zhu wrote:
> > > > > >
> > > > > > Congratulations!
> > > > > >
> > > > > > Thanks,
> > > > > > Zhu
> > > > > >
> > > > > > Matt Wang wrote on Mon, Sep 14, 2020 at 5:22 PM:
> > > > > >
> > > > > >> Congratulations, Niels!
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >>
> > > > > >> Best,
> > > > > >> Matt Wang
> > > > > >>
> > > > > >>
> > > > > >> On 09/14/2020 17:02,Konstantin Knauf
> > > wrote:
> > > > > >> Congratulations!
> > > > > >>
> > > > > >> On Mon, Sep 14, 2020 at 10:51 AM tison 
> > > wrote:
> > > > > >>
> > > > > >> Congrats!
> > > > > >>
> > > > > >> Best,
> > > > > >> tison.
> > > > > >>
> > > > > >>
> > > > > >> Aljoscha Krettek wrote on Mon, Sep 14, 2020 at 4:38 PM:
> > > > > >>
> > > > > >> Congratulations! 💐
> > > > > >>
> > > > > >> Aljoscha
> > > > > >>
> > > > > >> On 14.09.20 10:37, Robert Metzger wrote:
> > > > > >> Hi all,
> > > > > >>
> > > > > >> On behalf of the PMC, I’m very happy to announce Niels Basjes
> as a
> > > new
> > > > > >> Flink committer.
> > > > > >>
> > > > > >> Niels has been an active community member since the early days
> of
> > > > > >> Flink,
> > > > > >> with 19 commits dating back until 2015.
> > > > > >> Besides his work on the code, he has been driving initiatives on
> > > dev@
> > > > > >> list,
> > > > > >> supporting users and giving talks at conferences.
> > > > > >>
> > > > > >> Please join me in congratulating Niels for becoming a Flink
> > > committer!
> > > > > >>
> > > > > >> Best,
> > > > > >> Robert Metzger
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >>
> > > > > >> Konstantin Knauf | Head of Product
> > > > > >>
> > > > > >> +49 160 91394525
> > > > > >>
> > > > > >>
> > > > > >> Follow us @VervericaData Ververica 
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >>
> > > > > >> Join Flink Forward  - The Apache
> > Flink
> > > > > >> Conference
> > > > > >>
> > > > > >> Stream Processing | Event Driven | Real Time
> > > > > >>
> > > > > >> --
> > > > > >>
> > > > > >> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> > > > > >>
> > > > > >> --
> > > > > >> Ververica GmbH
> > > > > >> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> > > > > >> Managing Directors: Yip Park Tung Jason, Jinwei (Kevin) Zhang,
> > Karl
> > > > > Anton
> > > > > >> Wehner
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> > >
> >
> >
> > --
> >
> > Arvid Heise | Senior Java Developer
> >
> > 
> >
> > Follow us @VervericaData
> >
> > --
> >
> > Join Flink Forward  - The Apache Flink
> > Conference
> >
> > Stream Processing | Event Driven | Real Time
> >
> > --
> >
> > Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> >
> > --
> > Ververica GmbH
> > Registered at Amtsgericht Charlottenburg: HRB 158244 B
> > Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
> > (Toni) Cheng
> >
>


[jira] [Created] (FLINK-19223) Simplify Availability Future Model in Base Connector

2020-09-14 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-19223:


 Summary: Simplify Availability Future Model in Base Connector
 Key: FLINK-19223
 URL: https://issues.apache.org/jira/browse/FLINK-19223
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.12.0


The current model implemented by the {{FutureNotifier}} and the 
{{SourceReaderBase}} has a shortcoming:
  - It does not support availability notifications where the notification comes 
before the check. In that case the notification is lost.

  - One can also see the added complexity created by this model in 
{{SourceReaderBase#isAvailable()}}, where the returned future needs to be 
"post-processed" and eagerly completed if the reader is in fact available. This 
is based on queue size, which makes it hard to support other conditions.

I think we can do something that is both easier and a bit more efficient by 
following a similar model as the 
{{org.apache.flink.runtime.io.AvailabilityProvider.AvailabilityHelper}}.

Furthermore, I believe we can gain more efficiency by integrating this better 
with the {{FutureCompletingBlockingQueue}}.

I suggest implementing something similar to the {{AvailabilityHelper}} directly 
in the {{FutureCompletingBlockingQueue}}.
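The notify-before-check problem can be sketched with plain JDK futures. The class and method names below are illustrative only (not Flink's actual API): the key point is that a notification simply completes the currently shared future, so a notification that arrives before anyone asks for the future is never lost.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Minimal sketch of the AvailabilityHelper idea (hypothetical names).
class AvailabilityHelper {
    // The future handed out to callers; swapped atomically when re-armed.
    private final AtomicReference<CompletableFuture<Void>> availableFuture =
            new AtomicReference<>(new CompletableFuture<>());

    // Producer side; safe to call before any availability check happens.
    void notifyAvailable() {
        availableFuture.get().complete(null);
    }

    // Consumer side; already complete if a notification came first.
    CompletableFuture<Void> getAvailableFuture() {
        return availableFuture.get();
    }

    // Consumer calls this after draining everything; re-arms the future.
    void resetUnavailable() {
        CompletableFuture<Void> current = availableFuture.get();
        if (current.isDone()) {
            availableFuture.compareAndSet(current, new CompletableFuture<>());
        }
    }
}
```

Flink's {{org.apache.flink.runtime.io.AvailabilityProvider.AvailabilityHelper}} follows the same idea; the sketch only shows why the notification cannot be lost under this model.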



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Caito Scherr
Congratulations, Niels!!

On Mon, Sep 14, 2020 at 1:37 AM Robert Metzger  wrote:

> Hi all,
>
>
>
> On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
>
> Flink committer.
>
>
>
> Niels has been an active community member since the early days of Flink,
>
> with 19 commits dating back until 2015.
>
> Besides his work on the code, he has been driving initiatives on dev@
> list,
>
> supporting users and giving talks at conferences.
>
>
>
> Please join me in congratulating Niels for becoming a Flink committer!
>
>
>
> Best,
>
> Robert Metzger
>
> --
*Caito Scherr*


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Peter Huang
Congratulations, Niels!!


On Mon, Sep 14, 2020 at 2:28 PM Caito Scherr  wrote:

> Congratulations, Niels!!
>
> On Mon, Sep 14, 2020 at 1:37 AM Robert Metzger 
> wrote:
>
> > Hi all,
> >
> >
> >
> > On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
> >
> > Flink committer.
> >
> >
> >
> > Niels has been an active community member since the early days of Flink,
> >
> > with 19 commits dating back until 2015.
> >
> > Besides his work on the code, he has been driving initiatives on dev@
> > list,
> >
> > supporting users and giving talks at conferences.
> >
> >
> >
> > Please join me in congratulating Niels for becoming a Flink committer!
> >
> >
> >
> > Best,
> >
> > Robert Metzger
> >
> > --
> *Caito Scherr*
>


[jira] [Created] (FLINK-19224) Provide an easy way to read window state

2020-09-14 Thread Seth Wiesman (Jira)
Seth Wiesman created FLINK-19224:


 Summary: Provide an easy way to read window state
 Key: FLINK-19224
 URL: https://issues.apache.org/jira/browse/FLINK-19224
 Project: Flink
  Issue Type: Sub-task
  Components: API / State Processor
Reporter: Seth Wiesman
Assignee: Seth Wiesman








[jira] [Created] (FLINK-19225) Improve code and logging in SourceReaderBase

2020-09-14 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-19225:


 Summary: Improve code and logging in SourceReaderBase
 Key: FLINK-19225
 URL: https://issues.apache.org/jira/browse/FLINK-19225
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.12.0


An umbrella issue for minor improvements to the {{SourceReaderBase}}, such as 
logging, thread names, code simplifications.

The concrete changes are described in the messages of the commits tagged with 
this issue (separate commits to better track the changes).





Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-14 Thread Steven Wu
## concurrent checkpoints

@Aljoscha Krettek  regarding the concurrent
checkpoints, let me illustrate with a simple DAG below.
[image: image.png]

Let's assume each writer emits one file per checkpoint cycle and *writer-2
is slow*. Now let's look at what the global committer receives

timeline: --> Now
from Writer-1: file-1-1, ck-1, file-1-2, ck-2
from Writer-2:                               file-2-1, ck-1

In this case, the committer shouldn't include "file-1-2" into the commit
for ck-1.

## Committable bookkeeping and combining

I like David's proposal where the framework takes care of the
bookkeeping of committables and provides a combiner API (CommT ->
GlobalCommT) for GlobalCommitter. The only requirement is to tie the
commit/CommT/GlobalCommT to a checkpoint.

When a commit is successful for checkpoint-N, the framework needs to remove
the GlobalCommT from the state corresponding to checkpoints <= N. If a
commit fails, the GlobalCommT accumulates and will be included in the next
cycle. That is how the Iceberg sink works. I think it is good to piggyback
retries with Flink's periodic checkpoints for Iceberg sink. Otherwise, it
can get complicated to implement retry logic that won't interfere with
Flink checkpoints.
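This checkpoint-tied bookkeeping can be sketched with a sorted map keyed by checkpoint id. The class and method names below are hypothetical (plain JDK, not the actual FLIP-143 interfaces); the point is that a successful commit for checkpoint N drops all state up to and including N, while failed commits simply stay pending and are retried with the next cycle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Sketch of framework-side bookkeeping tying one combined GlobalCommT
// (e.g. an Iceberg ManifestFile) to the checkpoint that produced it.
class CommittableBookkeeper<GlobalCommT> {
    private final TreeMap<Long, GlobalCommT> pending = new TreeMap<>();

    void addForCheckpoint(long checkpointId, GlobalCommT committable) {
        pending.put(checkpointId, committable);
    }

    // Everything up to and including checkpointId, oldest first.
    List<GlobalCommT> committablesUpTo(long checkpointId) {
        return new ArrayList<>(pending.headMap(checkpointId, true).values());
    }

    // Called once the external system confirmed the commit for checkpointId;
    // clears the headMap view, which removes the entries from the map.
    void onCommitSuccess(long checkpointId) {
        pending.headMap(checkpointId, true).clear();
    }

    boolean hasPending() {
        return !pending.isEmpty();
    }
}
```

Restoring from a checkpoint would re-populate the map from operator state, which is how accumulated GlobalCommT survive an extended commit outage.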

The main question is if this pattern is generic to be put into the sink
framework or not.

> An alternative topology option for the IcebergSink might be: DataFileWriter
> -> Agg -> GlobalCommitter. One pro of this method is that we can let Agg
> take care of the cleanup instead of coupling the cleanup logic to the
> committer.

@Guowei Ma  I would favor David's suggestion of
"combine" API rather than a separate "Agg" operator.

## Using checkpointId

> I think this can have some problems, for example when checkpoint ids are
not strictly sequential, when we wrap around, or when the JobID changes.
This will happen when doing a stop/start-from-savepoint cycle, for example.

checkpointId can work if it is monotonically increasing, which I believe is
the case for Flink today. Restoring from checkpoint or savepoint will
resume the checkpointIds.

We can deal with JobID change by saving it into the state and Iceberg
snapshot metadata. There is already a PR [1] for that.

## Nonce

> Flink provide a nonce to the GlobalCommitter where Flink guarantees that
this nonce is unique

That is actually how we implemented it internally. The Flink Iceberg sink
basically hashes the Manifest file location as the nonce. Since the
Flink-generated Manifest file location is unique, it guarantees the nonce
is unique.

IMO, checkpointId is also one way of implementing Nonce based on today's
Flink behavior.

> and will not change for repeated invocations of the GlobalCommitter with
the same set of committables

If the same set of committables is combined into one GlobalCommT (like
ManifestFile in Iceberg), then the Nonce could be part of the GlobalCommT
interface.

BTW, as David pointed out, the ManifestFile optimization is only in our
internal implementation [2] right now. For the open source version, there
is a github issue [3] to track follow-up improvements.

Thanks,
Steven

[1] https://github.com/apache/iceberg/pull/1404
[2]
https://github.com/Netflix-Skunkworks/nfflink-connector-iceberg/blob/master/nfflink-connector-iceberg/src/main/java/com/netflix/spaas/nfflink/connector/iceberg/sink/IcebergCommitter.java#L363
[3] https://github.com/apache/iceberg/issues/1403


On Mon, Sep 14, 2020 at 12:03 PM Guowei Ma  wrote:

> Hi all,
>
>
> Very thanks for the discussion and the valuable opinions! Currently there
> are several ongoing issues and we would like to show what we are thinking
> in the next few mails.
>
> It seems that the biggest issue now is about the topology of the sinks.
> Before deciding what the sink API would look like, I would like to first
> summarize the different topologies we have mentioned so that we could sync
> on the same page and gain more insights about this issue. There are four
> types of topology I could see. Please correct me if I misunderstand what
> you mean:
>
>    1. Commit individual files (StreamingFileSink):
>       FileWriter -> FileCommitter
>    2. Commit a directory (HiveSink):
>       FileWriter -> FileCommitter -> GlobalCommitter
>    3. Commit a bundle of files (Iceberg):
>       DataFileWriter -> GlobalCommitter
>    4. Commit a directory with merged files (some users want to merge the files
>       in a directory before committing the directory to the Hive metastore):
>       FileWriter -> SingleFileCommit -> FileMergeWriter -> GlobalCommitter
>
>
> It can be seen from the above that the topologies differ according
> to the requirements. Moreover, there may be other options for the
> second and third categories. E.g.
>
> An alternative topology option for the IcebergSink might be: DataFileWriter
> -> Agg -> GlobalCommitter. One pro of this method is that we can let Agg
> 

[jira] [Created] (FLINK-19226) [Kinesis] [EFO] Connector reaches default max attempts for describeStream and describeStreamConsumer when parallelism is high

2020-09-14 Thread Hong Liang Teoh (Jira)
Hong Liang Teoh created FLINK-19226:
---

 Summary: [Kinesis] [EFO] Connector reaches default max attempts 
for describeStream and describeStreamConsumer when parallelism is high
 Key: FLINK-19226
 URL: https://issues.apache.org/jira/browse/FLINK-19226
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kinesis
Reporter: Hong Liang Teoh
 Fix For: 1.12.0


*Background*

When lazily registering the stream consumer on apps with high parallelism, the 
EFO connector hits the default maximum number of attempts when calling 
describeStream and describeStreamConsumer on the Kinesis Streams API.

The default FullJitterBackoff constants should be tuned to prevent this even 
when a parallelism of 1024 is used.

*Scope*
 * See 
[FLIP|https://cwiki.apache.org/confluence/display/FLINK/FLIP-128%3A+Enhanced+Fan+Out+for+AWS+Kinesis+Consumers]
 for full list of configuration options
 * Suggested changes:
|flink.stream.describe.maxretries|50|
|flink.stream.describe.backoff.base|2000L|
|flink.stream.describe.backoff.max|5000L|
|flink.stream.describestreamconsumer.maxretries|50|
|flink.stream.describestreamconsumer.backoff.base|2000L|
|flink.stream.describestreamconsumer.backoff.max|5000L|
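As an illustration, the suggested values above could be bundled as consumer configuration. This is only a sketch with plain `Properties` so it stays self-contained (the keys are taken verbatim from the table; in a real job the `Properties` would typically be passed to the Kinesis consumer's constructor):

```java
import java.util.Properties;

// Sketch: bundle the suggested retry/backoff values from the table above.
// Backoff values are milliseconds (the table's "L" suffix marks long literals).
class EfoRetryDefaults {
    static Properties suggested() {
        Properties p = new Properties();
        p.setProperty("flink.stream.describe.maxretries", "50");
        p.setProperty("flink.stream.describe.backoff.base", "2000");
        p.setProperty("flink.stream.describe.backoff.max", "5000");
        p.setProperty("flink.stream.describestreamconsumer.maxretries", "50");
        p.setProperty("flink.stream.describestreamconsumer.backoff.base", "2000");
        p.setProperty("flink.stream.describestreamconsumer.backoff.max", "5000");
        return p;
    }
}
```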





Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Leonard Xu
Congrats, Niels!

Best,
Leonard
> On Sep 15, 2020, at 5:29 AM, Peter Huang wrote:
> 
> Congratulations, Niels!!
> 
> 
> On Mon, Sep 14, 2020 at 2:28 PM Caito Scherr  wrote:
> 
>> Congratulations, Niels!!
>> 
>> On Mon, Sep 14, 2020 at 1:37 AM Robert Metzger 
>> wrote:
>> 
>>> Hi all,
>>> 
>>> 
>>> 
>>> On behalf of the PMC, I’m very happy to announce Niels Basjes as a new
>>> 
>>> Flink committer.
>>> 
>>> 
>>> 
>>> Niels has been an active community member since the early days of Flink,
>>> 
>>> with 19 commits dating back until 2015.
>>> 
>>> Besides his work on the code, he has been driving initiatives on dev@
>>> list,
>>> 
>>> supporting users and giving talks at conferences.
>>> 
>>> 
>>> 
>>> Please join me in congratulating Niels for becoming a Flink committer!
>>> 
>>> 
>>> 
>>> Best,
>>> 
>>> Robert Metzger
>>> 
>>> --
>> *Caito Scherr*
>> 



[jira] [Created] (FLINK-19227) The catalog is still created after opening failed in catalog registering

2020-09-14 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-19227:


 Summary: The catalog is still created after opening failed in 
catalog registering
 Key: FLINK-19227
 URL: https://issues.apache.org/jira/browse/FLINK-19227
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Jingsong Lee
 Fix For: 1.11.3


In CatalogManager.registerCatalog.
Consider open is a relatively easy operation to fail, we should put catalog 
into catalog manager after its open.





Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Danny Chan
Congratulations! 💐

Best,
Danny Chan
On Sep 15, 2020 at 9:31 AM +0800, dev@flink.apache.org wrote:
>
> Congratulations! 💐


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread 刘建刚
Congratulations!

Best,
liujiangang

Danny Chan wrote on Tue, Sep 15, 2020 at 9:44 AM:

> Congratulations! 💐
>
> Best,
> Danny Chan
> On Sep 15, 2020 at 9:31 AM +0800, dev@flink.apache.org wrote:
> >
> > Congratulations! 💐
>


[jira] [Created] (FLINK-19228) Avoid accessing FileSystem in client for file system connector

2020-09-14 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-19228:


 Summary: Avoid accessing FileSystem in client for file system 
connector
 Key: FLINK-19228
 URL: https://issues.apache.org/jira/browse/FLINK-19228
 Project: Flink
  Issue Type: Bug
  Components: Connectors / FileSystem
Reporter: Jingsong Lee
 Fix For: 1.11.3


On the client, there may not be a corresponding file system plugin, so we 
cannot access the specific file system. We should therefore avoid accessing the 
file system on the client and move this work to the job manager or task manager.

Currently, it seems that the only client-side access is creating a temporary 
directory through FileSystem in {{toStagingPath}}, but this is completely 
avoidable.





Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Darion Yaphet
Congratulations!

刘建刚 wrote on Tue, Sep 15, 2020 at 9:53 AM:

> Congratulations!
>
> Best,
> liujiangang
>
> Danny Chan wrote on Tue, Sep 15, 2020 at 9:44 AM:
>
> > Congratulations! 💐
> >
> > Best,
> > Danny Chan
> > On Sep 15, 2020 at 9:31 AM +0800, dev@flink.apache.org wrote:
> > >
> > > Congratulations! 💐
> >
>


-- 

long is the way and hard  that out of Hell leads up to light


[jira] [Created] (FLINK-19229) Support ValueState and Python UDAF on blink stream planner

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19229:
-

 Summary: Support ValueState and Python UDAF on blink stream planner
 Key: FLINK-19229
 URL: https://issues.apache.org/jira/browse/FLINK-19229
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Wei Zhong








[jira] [Created] (FLINK-19231) Support ListState and ListView for Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19231:
-

 Summary: Support ListState and ListView for Python UDAF
 Key: FLINK-19231
 URL: https://issues.apache.org/jira/browse/FLINK-19231
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong








[jira] [Created] (FLINK-19232) Support MapState and MapView for Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19232:
-

 Summary: Support MapState and MapView for Python UDAF
 Key: FLINK-19232
 URL: https://issues.apache.org/jira/browse/FLINK-19232
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong








[jira] [Created] (FLINK-19230) Support Python UDAF on blink batch planner

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19230:
-

 Summary: Support Python UDAF on blink batch planner
 Key: FLINK-19230
 URL: https://issues.apache.org/jira/browse/FLINK-19230
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong








Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Dian Fu
+1 (binding)

- checked the signature and checksum
- reviewed the web-site PR and it looks good to me
- checked the diff for dependencies changes since 1.11.1: 
https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1 

- checked the release note

Thanks,
Dian

> On Sep 14, 2020, at 9:30 PM, Xingbo Huang wrote:
> 
> +1 (non-binding)
> 
> Checks:
> 
> - Pip install PyFlink from wheel packages with Python 3.5,3.6 and 3.7 in
> Mac and Linux.
> - Test Python UDF/Pandas UDF
> - Test from_pandas/to_pandas
> 
> Best,
> Xingbo
> 
> Fabian Paul wrote on Mon, Sep 14, 2020 at 8:46 PM:
> 
>> +1 (non-binding)
>> 
>> Checks:
>> 
>> - Verified signature
>> - Built from source (Java8)
>> - Ran custom jobs on Kubernetes
>> 
>> Regards,
>> Fabian
>> 



[jira] [Created] (FLINK-19234) Support FILTER KeyWord for Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19234:
-

 Summary: Support FILTER KeyWord for Python UDAF
 Key: FLINK-19234
 URL: https://issues.apache.org/jira/browse/FLINK-19234
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong








[jira] [Created] (FLINK-19235) Support mixed use with built-in aggs for Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19235:
-

 Summary: Support mixed use with built-in aggs for Python UDAF
 Key: FLINK-19235
 URL: https://issues.apache.org/jira/browse/FLINK-19235
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong








[jira] [Created] (FLINK-19233) Support DISTINCT KeyWord for Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19233:
-

 Summary: Support DISTINCT KeyWord for Python UDAF
 Key: FLINK-19233
 URL: https://issues.apache.org/jira/browse/FLINK-19233
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong








[jira] [Created] (FLINK-19236) Optimize the performance of Python UDAF

2020-09-14 Thread Wei Zhong (Jira)
Wei Zhong created FLINK-19236:
-

 Summary: Optimize the performance of Python UDAF
 Key: FLINK-19236
 URL: https://issues.apache.org/jira/browse/FLINK-19236
 Project: Flink
  Issue Type: Sub-task
Reporter: Wei Zhong








Re: [DISCUSS] FLIP-107: Reading table columns from different parts of source records

2020-09-14 Thread Leonard Xu
Hi, Timo

Thanks for your explanation, it makes sense to me.

Best,
Leonard


>> Hi, Timo
>> Thanks for the update
>> I have a minor suggestion about the Debezium metadata keys:
>> could we use the original Debezium keys rather than introducing new ones?
>> debezium-json.schema => debezium-json.schema
>> debezium-json.ingestion-timestamp => debezium-json.ts_ms
>> debezium-json.source.database => debezium-json.source.db
>> debezium-json.source.schema => debezium-json.source.schema
>> debezium-json.source.table => debezium-json.source.table
>> debezium-json.source.timestamp => debezium-json.source.ts_ms
>> debezium-json.source.properties => debezium-json.source MAP<STRING, STRING>
>> Users who are familiar with Debezium will understand the keys more easily, and
>> the key syntax is more JSON-path like. WDYT?
>> The other part looks really good to me.
>> Regards,
>> Leonard
>>> On Sep 10, 2020, at 18:26, Aljoscha Krettek wrote:
>>> 
>>> I've only been watching this from the sidelines but that latest proposal 
>>> looks very good to me!
>>> 
>>> Aljoscha
>>> 
>>> On 10.09.20 12:20, Kurt Young wrote:
 The new syntax looks good to me.
 Best,
 Kurt
 On Thu, Sep 10, 2020 at 5:57 PM Jark Wu  wrote:
> Hi Timo,
> 
> I have one minor suggestion.
> Maybe the default data type of `timestamp` can be `TIMESTAMP(3) WITH
> LOCAL TIME ZONE`, because this is the type that users want to use, which
> can avoid unnecessary casting.
> Besides, currently the bigint is cast to timestamp in seconds, so the
> implicit cast may not work...
> 
> I don't have other objections. But maybe we should wait for the
> opinion from @Kurt for the new syntax.
> 
> Best,
> Jark
> 
> 
> On Thu, 10 Sep 2020 at 16:21, Danny Chan  wrote:
> 
>> Thanks for driving this Timo, +1 for voting ~
>> 
>> Best,
>> Danny Chan
>> On Sep 10, 2020 at 3:47 PM +0800, Timo Walther wrote:
>>> Thanks everyone for this healthy discussion. I updated the FLIP with the
>>> outcome. I think the result is very powerful but also very easy to
>>> declare. Thanks for all the contributions.
>>> 
>>> If there are no objections, I would continue with a voting.
>>> 
>>> What do you think?
>>> 
>>> Regards,
>>> Timo
>>> 
>>> 
>>> On 09.09.20 16:52, Timo Walther wrote:
 "If virtual by default, when a user types "timestamp int" ==>
>> persisted
 column, then adds a "metadata" after that ==> virtual column, then
>> adds
 a "persisted" after that ==> persisted column."
 
 Thanks for this nice mental model explanation, Jark. This makes total
 sense to me. Also making the the most common case as short at just
 adding `METADATA` is a very good idea. Thanks, Danny!
 
 Let me update the FLIP again with all these ideas.
 
 Regards,
 Timo
 
 
 On 09.09.20 15:03, Jark Wu wrote:
> I'm also +1 to Danny's proposal: timestamp INT METADATA [FROM
> 'my-timestamp-field'] [VIRTUAL]
> Especially I like the shortcut: timestamp INT METADATA, this makes
>> the
> most
> common case to be supported in the simplest way.
> 
> I also think the default should be "PERSISTED", so VIRTUAL is
>> optional
> when
> you are accessing a read-only metadata. Because:
> 1. The "timestamp INT METADATA" should be a normal column, because
> "METADATA" is just a modifier to indicate it is from metadata, a
>> normal
> column should be persisted.
>  If virtual by default, when a user types "timestamp int" ==>
> persisted
> column, then adds a "metadata" after that ==> virtual column, then
>> adds a
> "persisted" after that ==> persisted column.
>  I think this looks reversed several times and makes users
>> confused.
> Physical fields are also prefixed with "fieldName TYPE", so
>> "timestamp
> INT
> METADATA" is persisted is very straightforward.
> 2. From the collected user question [1], we can see that "timestamp"
> is the
> most common use case. "timestamp" is a read-write metadata.
>> Persisted by
> default doesn't break the reading behavior.
> 
> Best,
> Jark
> 
> [1]: https://issues.apache.org/jira/browse/FLINK-15869
> 
> On Wed, 9 Sep 2020 at 20:56, Leonard Xu  wrote:
> 
>> Thanks @Dawid for the nice summary, I think you catch all
>> opinions of
>> the
>> long discussion well.
>> 
>> @Danny
>> “ timestamp INT METADATA [FROM 'my-timestamp-field'] [VIRTUAL]
>>   Note that the "FROM 'field name'" is only needed when the name
>> conflict
>>   wit

Re: [VOTE] FLIP-107: Handling of metadata in SQL connectors

2020-09-14 Thread Leonard Xu
+1(non-binding)

Leonard

> On Sep 12, 2020, at 21:46, Danny Chan wrote:
> 
> +1, non-binding ~
> 
> Konstantin Knauf wrote on Fri, Sep 11, 2020 at 2:04 AM:
> 
>> +1 (binding)
>> 
>> 
>> 
>> On Thu, Sep 10, 2020 at 4:29 PM Dawid Wysakowicz 
>> 
>> wrote:
>> 
>> 
>> 
>>> +1 (binding)
>> 
>>> 
>> 
>>> On 10/09/2020 14:03, Aljoscha Krettek wrote:
>> 
 +1 (binding)
>> 
 
>> 
 Aljoscha
>> 
 
>> 
 On 10.09.20 13:57, Timo Walther wrote:
>> 
> Hi all,
>> 
> 
>> 
> after the discussion in [1], I would like to open a voting thread for
>> 
> FLIP-107 [2] which discusses how to handle data next to the main
>> 
> payload (i.e. key and value formats as well as metadata in general)
>> 
> in SQL connectors and DDL.
>> 
> 
>> 
> The vote will be open until September 15th (72h + weekend), unless
>> 
> there is an objection or not enough votes.
>> 
> 
>> 
> Regards,
>> 
> Timo
>> 
> 
>> 
> [1]
>> 
> 
>> 
>>> 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-107-Reading-table-columns-from-different-parts-of-source-records-td38277.html
>> 
> 
>> 
> [2]
>> 
> 
>> 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-107%3A+Handling+of+metadata+in+SQL+connectors
>> 
> 
>> 
 
>> 
>>> 
>> 
>>> 
>> 
>> 
>> 
>> --
>> 
>> 
>> 
>> Konstantin Knauf
>> 
>> 
>> 
>> https://twitter.com/snntrable
>> 
>> 
>> 
>> https://github.com/knaufk
>> 
>> 



[ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread Zhijiang
Hi all,

On behalf of the PMC, I’m very happy to announce Arvid Heise as a new Flink 
committer.

Arvid has been an active community member for more than a year, with 138 
contributions including 116 commits, and has reviewed many PRs with high-quality 
comments.
He mainly works on the runtime scope and has been involved in critical features 
like the task mailbox model and unaligned checkpoints.
Besides that, he has been super active in replying to questions on the user 
mailing list (34 emails in March, 51 emails in June, etc.), and is also active in 
dev mailing list and Jira issue discussions.

Please join me in congratulating Arvid for becoming a Flink committer!

Best,
Zhijiang

Re: [ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread Xintong Song
Congratulations Arvid, welcome abord~!


Thank you~

Xintong Song



On Tue, Sep 15, 2020 at 10:38 AM Zhijiang
 wrote:

> Hi all,
>
> On behalf of the PMC, I’m very happy to announce Arvid Heise as a new
> Flink committer.
>
> Arvid has been an active community member for more than a year, with 138
> contributions including 116 commits, reviewed many PRs with good quality
> comments.
> He is mainly working on the runtime scope, involved in critical features
> like task mailbox model and unaligned checkpoint, etc.
> Besides that, he was super active to reply questions in the user mail list
> (34 emails in March, 51 emails in June, etc), also active in dev mail list
> and Jira issue discussions.
>
> Please join me in congratulating Arvid for becoming a Flink committer!
>
> Best,
> Zhijiang


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Zhijiang
Congrats, Niels!

Best,
Zhijiang


--
From:Darion Yaphet 
Send Time: Tue, Sep 15, 2020 10:02
To:dev 
Subject:Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

Congratulations!

刘建刚 wrote on Tue, Sep 15, 2020 at 9:53 AM:

> Congratulations!
>
> Best,
> liujiangang
>
Danny Chan wrote on Tue, Sep 15, 2020 at 9:44 AM:
>
> > Congratulations! 💐
> >
> > Best,
> > Danny Chan
> On Sep 15, 2020 at 9:31 AM +0800, dev@flink.apache.org wrote:
> > >
> > > Congratulations! 💐
> >
>


-- 

long is the way and hard  that out of Hell leads up to light



Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Xintong Song
Congrats!

Thank you~

Xintong Song



On Tue, Sep 15, 2020 at 10:40 AM Zhijiang
 wrote:

> Congrats, Niels!
>
> Best,
> Zhijiang
>
>
> --
> From:Darion Yaphet 
> Send Time: Tue, Sep 15, 2020 10:02
> To:dev 
> Subject:Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes
>
> Congratulations!
>
> 刘建刚 wrote on Tue, Sep 15, 2020 at 9:53 AM:
>
> > Congratulations!
> >
> > Best,
> > liujiangang
> >
> > Danny Chan wrote on Tue, Sep 15, 2020 at 9:44 AM:
> >
> > > Congratulations! 💐
> > >
> > > Best,
> > > Danny Chan
> > On Sep 15, 2020 at 9:31 AM +0800, dev@flink.apache.org wrote:
> > > >
> > > > Congratulations! 💐
> > >
> >
>
>
> --
>
> long is the way and hard  that out of Hell leads up to light
>
>


Re:[ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread Matt Wang
Congrats!


--

Best,
Matt Wang


On 09/15/2020 10:38,Zhijiang wrote:
Hi all,

On behalf of the PMC, I’m very happy to announce Arvid Heise as a new Flink 
committer.

Arvid has been an active community member for more than a year, with 138 
contributions including 116 commits, reviewed many PRs with good quality 
comments.
He is mainly working on the runtime scope, involved in critical features like 
task mailbox model and unaligned checkpoint, etc.
Besides that, he was super active to reply questions in the user mail list (34 
emails in March, 51 emails in June, etc), also active in dev mail list and Jira 
issue discussions.

Please join me in congratulating Arvid for becoming a Flink committer!

Best,
Zhijiang

Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Yang Wang
Congratulations!

Best,
Yang

Xintong Song wrote on Tue, Sep 15, 2020 at 10:41 AM:

> Congrats!
>
> Thank you~
>
> Xintong Song
>
>
>
> On Tue, Sep 15, 2020 at 10:40 AM Zhijiang
>  wrote:
>
> > Congrats, Niels!
> >
> > Best,
> > Zhijiang
> >
> >
> > --
> > From:Darion Yaphet 
> > Send Time: Tue, Sep 15, 2020 10:02
> > To:dev 
> > Subject:Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes
> >
> > Congratulations!
> >
> > 刘建刚 wrote on Tue, Sep 15, 2020 at 9:53 AM:
> >
> > > Congratulations!
> > >
> > > Best,
> > > liujiangang
> > >
> > > Danny Chan wrote on Tue, Sep 15, 2020 at 9:44 AM:
> > >
> > > > Congratulations! 💐
> > > >
> > > > Best,
> > > > Danny Chan
> > > On Sep 15, 2020 at 9:31 AM +0800, dev@flink.apache.org wrote:
> > > > >
> > > > > Congratulations! 💐
> > > >
> > >
> >
> >
> > --
> >
> > long is the way and hard  that out of Hell leads up to light
> >
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread Dian Fu
Congratulations!

Regards,
Dian

> 在 2020年9月15日,上午10:41,Matt Wang  写道:
> 
> Congrats!
> 
> 
> --
> 
> Best,
> Matt Wang
> 
> 
> On 09/15/2020 10:38,Zhijiang wrote:
> Hi all,
> 
> On behalf of the PMC, I’m very happy to announce Arvid Heise as a new Flink 
> committer.
> 
> Arvid has been an active community member for more than a year, with 138 
> contributions including 116 commits, reviewed many PRs with good quality 
> comments.
> He is mainly working on the runtime scope, involved in critical features like 
> task mailbox model and unaligned checkpoint, etc.
> Besides that, he was super active to reply questions in the user mail list 
> (34 emails in March, 51 emails in June, etc), also active in dev mail list 
> and Jira issue discussions.
> 
> Please join me in congratulating Arvid for becoming a Flink committer!
> 
> Best,
> Zhijiang



Re: [ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread jincheng sun
Congratulations and welcome, Arvid !

Best,
Jincheng


Dian Fu  于2020年9月15日周二 上午10:53写道:

> Congratulations!
>
> Regards,
> Dian
>
> > 在 2020年9月15日,上午10:41,Matt Wang  写道:
> >
> > Congrats!
> >
> >
> > --
> >
> > Best,
> > Matt Wang
> >
> >
> > On 09/15/2020 10:38,Zhijiang wrote:
> > Hi all,
> >
> > On behalf of the PMC, I’m very happy to announce Arvid Heise as a new
> Flink committer.
> >
> > Arvid has been an active community member for more than a year, with 138
> contributions including 116 commits, reviewed many PRs with good quality
> comments.
> > He is mainly working on the runtime scope, involved in critical features
> like task mailbox model and unaligned checkpoint, etc.
> > Besides that, he was super active to reply questions in the user mail
> list (34 emails in March, 51 emails in June, etc), also active in dev mail
> list and Jira issue discussions.
> >
> > Please join me in congratulating Arvid for becoming a Flink committer!
> >
> > Best,
> > Zhijiang
>
>


Re: [ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread Zhu Zhu
Congratulations, Arvid!

Thanks,
Zhu

jincheng sun  于2020年9月15日周二 上午10:55写道:

> Congratulations and welcome, Arvid !
>
> Best,
> Jincheng
>
>
> Dian Fu  于2020年9月15日周二 上午10:53写道:
>
> > Congratulations!
> >
> > Regards,
> > Dian
> >
> > > 在 2020年9月15日,上午10:41,Matt Wang  写道:
> > >
> > > Congrats!
> > >
> > >
> > > --
> > >
> > > Best,
> > > Matt Wang
> > >
> > >
> > > On 09/15/2020 10:38,Zhijiang
> wrote:
> > > Hi all,
> > >
> > > On behalf of the PMC, I’m very happy to announce Arvid Heise as a new
> > Flink committer.
> > >
> > > Arvid has been an active community member for more than a year, with
> 138
> > contributions including 116 commits, reviewed many PRs with good quality
> > comments.
> > > He is mainly working on the runtime scope, involved in critical
> features
> > like task mailbox model and unaligned checkpoint, etc.
> > > Besides that, he was super active to reply questions in the user mail
> > list (34 emails in March, 51 emails in June, etc), also active in dev
> mail
> > list and Jira issue discussions.
> > >
> > > Please join me in congratulating Arvid for becoming a Flink committer!
> > >
> > > Best,
> > > Zhijiang
> >
> >
>


[jira] [Created] (FLINK-19237) LeaderChangeClusterComponentsTest.testReelectionOfJobMaster failed with "NoResourceAvailableException: Could not allocate the required slot within slot request timeout"

2020-09-14 Thread Dian Fu (Jira)
Dian Fu created FLINK-19237:
---

 Summary: 
LeaderChangeClusterComponentsTest.testReelectionOfJobMaster failed with 
"NoResourceAvailableException: Could not allocate the required slot within slot 
request timeout"
 Key: FLINK-19237
 URL: https://issues.apache.org/jira/browse/FLINK-19237
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.12.0
Reporter: Dian Fu


[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=6499&view=logs&j=6bfdaf55-0c08-5e3f-a2d2-2a0285fd41cf&t=fd9796c3-9ce8-5619-781c-42f873e126a6]

{code}
2020-09-14T21:11:02.8200203Z [ERROR] 
testReelectionOfJobMaster(org.apache.flink.runtime.leaderelection.LeaderChangeClusterComponentsTest)
  Time elapsed: 300.14 s  <<< FAILURE!
2020-09-14T21:11:02.8201761Z java.lang.AssertionError: Job failed.
2020-09-14T21:11:02.8202749Zat 
org.apache.flink.runtime.jobmaster.utils.JobResultUtils.throwAssertionErrorOnFailedResult(JobResultUtils.java:54)
2020-09-14T21:11:02.8203794Zat 
org.apache.flink.runtime.jobmaster.utils.JobResultUtils.assertSuccess(JobResultUtils.java:30)
2020-09-14T21:11:02.8205177Zat 
org.apache.flink.runtime.leaderelection.LeaderChangeClusterComponentsTest.testReelectionOfJobMaster(LeaderChangeClusterComponentsTest.java:152)
2020-09-14T21:11:02.8206191Zat 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-09-14T21:11:02.8206985Zat 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-09-14T21:11:02.8207930Zat 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-09-14T21:11:02.8208927Zat 
java.lang.reflect.Method.invoke(Method.java:498)
2020-09-14T21:11:02.8209753Zat 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
2020-09-14T21:11:02.8210710Zat 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2020-09-14T21:11:02.8211608Zat 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
2020-09-14T21:11:02.8214473Zat 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2020-09-14T21:11:02.8215398Zat 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
2020-09-14T21:11:02.8216199Zat 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
2020-09-14T21:11:02.8216947Zat 
org.junit.rules.RunRules.evaluate(RunRules.java:20)
2020-09-14T21:11:02.8217695Zat 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
2020-09-14T21:11:02.8218635Zat 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
2020-09-14T21:11:02.8219499Zat 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
2020-09-14T21:11:02.8220313Zat 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
2020-09-14T21:11:02.8221060Zat 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
2020-09-14T21:11:02.8222171Zat 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
2020-09-14T21:11:02.8222937Zat 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
2020-09-14T21:11:02.8223688Zat 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
2020-09-14T21:11:02.8225191Zat 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
2020-09-14T21:11:02.8226086Zat 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
2020-09-14T21:11:02.8226761Zat 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
2020-09-14T21:11:02.8227453Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
2020-09-14T21:11:02.8228392Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
2020-09-14T21:11:02.8229256Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
2020-09-14T21:11:02.8235798Zat 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
2020-09-14T21:11:02.8237650Zat 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
2020-09-14T21:11:02.8239039Zat 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
2020-09-14T21:11:02.8239894Zat 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
2020-09-14T21:11:02.8240591Zat 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
2020-09-14T21:11:02.8241325Z Caused by: org.apache.flink.runtime.JobException: 
Recovery is suppressed by NoRestartBackoffTimeStrategy
2020-09-14T21:11:02.8242225Zat 
org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandl

[DISCUSS] FLIP-144: Native Kubernetes HA for Flink

2020-09-14 Thread Yang Wang
 Hi devs and users,

I would like to start the discussion about FLIP-144[1], which will introduce
a new native high availability service for Kubernetes.

Currently, Flink provides a Zookeeper HA service that is widely used
in production environments. It can be integrated with standalone,
Yarn, and Kubernetes deployments. However, using Zookeeper HA on K8s
incurs additional cost since we need to manage a Zookeeper cluster.
In the meantime, K8s provides public APIs for leader election[2]
and configuration storage (i.e. ConfigMap[3]). We could leverage these
features to make running an HA-configured Flink cluster on K8s more
convenient.

Both standalone on K8s and native K8s deployments could benefit from the
newly introduced KubernetesHaService.

[1].
https://cwiki.apache.org/confluence/display/FLINK/FLIP-144%3A+Native+Kubernetes+HA+for+Flink
[2].
https://kubernetes.io/blog/2016/01/simple-leader-election-with-kubernetes/
[3]. https://kubernetes.io/docs/concepts/configuration/configmap/

Looking forward to your feedback.

Best,
Yang
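
The leader-election pattern referenced above relies on Kubernetes' optimistic
concurrency control: whoever first writes the leader annotation on the shared
ConfigMap wins, and a stale write is rejected. A minimal, self-contained
simulation of that pattern is sketched below. This is not Flink or Kubernetes
client code: the FakeConfigMapStore class and its compare-and-swap on a
version counter (mimicking the Kubernetes resourceVersion check) are
assumptions made purely for illustration.

```python
# Minimal simulation of ConfigMap-based leader election. The store and its
# version-based compare-and-swap are illustrative stand-ins for the
# Kubernetes API's resourceVersion optimistic-concurrency mechanism.

class ConflictError(Exception):
    """Raised when a write is based on a stale version (lost race)."""

class FakeConfigMapStore:
    def __init__(self):
        self._data = {}  # name -> (annotations dict, version int)

    def get(self, name):
        return self._data.get(name)

    def create(self, name, annotations):
        if name in self._data:
            raise ConflictError(name)  # someone created it first
        self._data[name] = (dict(annotations), 1)

    def update(self, name, annotations, expected_version):
        _, version = self._data[name]
        if version != expected_version:
            raise ConflictError(name)  # concurrent update happened
        self._data[name] = (dict(annotations), version + 1)

def try_acquire_leadership(store, cm_name, candidate):
    """Try to become leader by annotating the ConfigMap; return current leader."""
    entry = store.get(cm_name)
    if entry is None:
        try:
            store.create(cm_name, {"leader": candidate})
        except ConflictError:
            pass  # lost the race to create
    else:
        annotations, version = entry
        if "leader" not in annotations:
            try:
                store.update(cm_name, {"leader": candidate}, version)
            except ConflictError:
                pass  # lost the race to update
    return store.get(cm_name)[0]["leader"]

store = FakeConfigMapStore()
leader = try_acquire_leadership(store, "flink-ha-leader", "jobmanager-0")
# A later candidate cannot steal leadership once it is taken.
leader2 = try_acquire_leadership(store, "flink-ha-leader", "jobmanager-1")
```

The key design point is that correctness comes from the server-side version
check, not from any client-side locking.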


Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Yang Wang
+1 (non-binding)

* Build from source and check the signature, checksum
* Check the K8s notice file is correct since the pom has changed
* Run standalone session/application clusters on K8s
* Run native Flink session/application clusters on K8s


Best,
Yang

Dian Fu  于2020年9月15日周二 上午10:11写道:

> +1 (binding)
>
> - checked the signature and checksum
> - reviewed the web-site PR and it looks good to me
> - checked the diff for dependencies changes since 1.11.1:
> https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1
> <
> https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1
> >
> - checked the release note
>
> Thanks,
> Dian
>
> > 在 2020年9月14日,下午9:30,Xingbo Huang  写道:
> >
> > +1 (non-binding)
> >
> > Checks:
> >
> > - Pip install PyFlink from wheel packages with Python 3.5,3.6 and 3.7 in
> > Mac and Linux.
> > - Test Python UDF/Pandas UDF
> > - Test from_pandas/to_pandas
> >
> > Best,
> > Xingbo
> >
> > Fabian Paul  于2020年9月14日周一 下午8:46写道:
> >
> >> +1 (non-binding)
> >>
> >> Checks:
> >>
> >> - Verified signature
> >> - Built from source (Java8)
> >> - Ran custom jobs on Kubernetes
> >>
> >> Regards,
> >> Fabian
> >>
>
>


Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-14 Thread Guowei Ma
## Concurrent checkpoints
AFAIK the committer would not see file-1-2 when ck-1 happens in
exactly-once mode.

## Committable bookkeeping and combining

I agree with you that the "CombineGlobalCommitter" would work. But it puts
more optimization logic into the committer, which will make the committer
more and more complicated until it eventually becomes the same as the
Writer. For example, the committer needs to clean up some unused manifest
files when restoring from a failure if we introduce these optimizations to the
committer.

In this case, another alternative might be to put this "merging"
optimization into a separate agg operator (maybe just like another `Writer`?).
The agg could produce an aggregated committable for the committer, and the agg
operator could manage the whole life cycle of the manifest files it creates.
That would leave the committer with a single responsibility.

>>The main question is if this pattern is generic to be put into the sink
framework or not.
Maybe I am wrong, but my impression from the current discussion is that
different requirements lead to different topologies.

## Using checkpointId
In batch execution mode there would be no normal checkpoints any more.
That is why we do not introduce the checkpoint id in the API. So it is a
great thing that the sink decouples its implementation from the checkpoint
id. :)

Best,
Guowei
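
The checkpoint-scoped bookkeeping pattern under discussion in this thread
(a successful commit for checkpoint N clears the retained state for all
checkpoints <= N, while a failed commit accumulates into the next cycle)
can be sketched as follows. The class and function names are illustrative
assumptions, not the FLIP-143 API.

```python
# Sketch of checkpoint-scoped committable bookkeeping: on a successful
# commit for checkpoint N, all retained committables for checkpoints <= N
# are dropped; on failure they are kept and retried with the next cycle.

class CommitBookkeeper:
    def __init__(self, commit_fn):
        self._pending = {}        # checkpoint_id -> list of committables
        self._commit_fn = commit_fn

    def add(self, checkpoint_id, committable):
        self._pending.setdefault(checkpoint_id, []).append(committable)

    def notify_checkpoint_complete(self, checkpoint_id):
        # Gather everything up to and including this checkpoint.
        due = sorted(ck for ck in self._pending if ck <= checkpoint_id)
        batch = [c for ck in due for c in self._pending[ck]]
        if not batch:
            return
        if self._commit_fn(batch):
            for ck in due:
                del self._pending[ck]  # success: clear state for ck <= N
        # on failure: keep everything; it is retried in the next cycle

results = []
fail_first = {"flag": True}

def commit(batch):
    # Simulated committer that fails on its first invocation only.
    if fail_first["flag"]:
        fail_first["flag"] = False
        return False
    results.append(list(batch))
    return True

bk = CommitBookkeeper(commit)
bk.add(1, "file-1-1")
bk.notify_checkpoint_complete(1)   # fails; file-1-1 is retained
bk.add(2, "file-1-2")
bk.notify_checkpoint_complete(2)   # retries file-1-1 together with file-1-2
```

This is why piggybacking retries on periodic checkpoints is attractive: the
retry loop falls out of the bookkeeping for free, with no extra timers.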


On Tue, Sep 15, 2020 at 7:33 AM Steven Wu  wrote:

>
> ## concurrent checkpoints
>
> @Aljoscha Krettek  regarding the concurrent
> checkpoints, let me illustrate with a simple DAG below.
> [image: image.png]
>
> Let's assume each writer emits one file per checkpoint cycle and *writer-2
> is slow*. Now let's look at what the global committer receives
>
> timeline (--> now):
> from Writer-1: file-1-1, ck-1, file-1-2, ck-2
> from Writer-2: file-2-1, ck-1
>
> In this case, the committer shouldn't include "file-1-2" into the commit
> for ck-1.
>
> ## Committable bookkeeping and combining
>
> I like David's proposal where the framework takes care of the
> bookkeeping of committables and provides a combiner API (CommT ->
> GlobalCommT) for GlobalCommitter. The only requirement is to tie the
> commit/CommT/GlobalCommT to a checkpoint.
>
> When a commit is successful for checkpoint-N, the framework needs to
> remove the GlobalCommT from the state corresponding to checkpoints <= N. If
> a commit fails, the GlobalCommT accumulates and will be included in the
> next cycle. That is how the Iceberg sink works. I think it is good to
> piggyback retries with Flink's periodic checkpoints for Iceberg sink.
> Otherwise, it can get complicated to implement retry logic that won't
> interfere with Flink checkpoints.
>
> The main question is if this pattern is generic to be put into the sink
> framework or not.
>
> > A alternative topology option for the IcebergSink might be :
> DataFileWriter
> -> Agg -> GlobalCommitter. One pro of this method is that we can let Agg
> take care of the cleanup instead of coupling the cleanup logic to the
> committer.
>
> @Guowei Ma  I would favor David's suggestion of
> "combine" API rather than a separate "Agg" operator.
>
> ## Using checkpointId
>
> > I think this can have some problems, for example when checkpoint ids are
> not strictly sequential, when we wrap around, or when the JobID changes.
> This will happen when doing a stop/start-from-savepoint cycle, for example.
>
> checkpointId can work if it is monotonically increasing, which I believe
> is the case for Flink today. Restoring from checkpoint or savepoint will
> resume the checkpointIds.
>
> We can deal with JobID change by saving it into the state and Iceberg
> snapshot metadata. There is already a PR [1] for that.
>
> ## Nonce
>
> > Flink provide a nonce to the GlobalCommitter where Flink guarantees that
> this nonce is unique
>
> That is actually how we implemented internally. Flink Iceberg sink
> basically hashes the Manifest file location as the nonce. Since the Flink
> generated Manifest file location is unique, it  guarantees the nonce is
> unique.
>
> IMO, checkpointId is also one way of implementing Nonce based on today's
> Flink behavior.
>
> > and will not change for repeated invocations of the GlobalCommitter with
> the same set of committables
>
>  if the same set of committables are combined into one GlobalCommT (like
> ManifestFile in Iceberg), then the Nonce could be part of the GlobalCommT
> interface.
>
> BTW, as David pointed out, the ManifestFile optimization is only in our
> internal implementation [2] right now. For the open source version, there
> is a github issue [3] to track follow-up improvements.
>
> Thanks,
> Steven
>
> [1] https://github.com/apache/iceberg/pull/1404
> [2]
> https://github.com/Netflix-Skunkworks/nfflink-connector-iceberg/blob/master/nfflink-connector-iceberg/src/main/java/com/netflix/spaas/nfflink/connector/iceberg/sink/IcebergCommitter.java#L363
> [3] https://github.com/apache/iceber

Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Zhijiang
+1 (binding)

- checked the checksums and GPG files 
- verified that the source archives do not contains any binaries
- checked that all POM files point to the same version
- reviewed the web site PR https://github.com/apache/flink-web/pull/377
- checked the release note

Best,
Zhijiang


--
From:Dian Fu 
Send Time:2020年9月15日(星期二) 10:11
To:dev 
Subject:Re: [VOTE] Release 1.11.2, release candidate #1

+1 (binding)

- checked the signature and checksum
- reviewed the web-site PR and it looks good to me
- checked the diff for dependencies changes since 1.11.1: 
https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1 

- checked the release note

Thanks,
Dian

> 在 2020年9月14日,下午9:30,Xingbo Huang  写道:
> 
> +1 (non-binding)
> 
> Checks:
> 
> - Pip install PyFlink from wheel packages with Python 3.5,3.6 and 3.7 in
> Mac and Linux.
> - Test Python UDF/Pandas UDF
> - Test from_pandas/to_pandas
> 
> Best,
> Xingbo
> 
> Fabian Paul  于2020年9月14日周一 下午8:46写道:
> 
>> +1 (non-binding)
>> 
>> Checks:
>> 
>> - Verified signature
>> - Built from source (Java8)
>> - Ran custom jobs on Kubernetes
>> 
>> Regards,
>> Fabian
>> 




Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-14 Thread Guowei Ma
Hi, aljoscha

>I don't understand why we need the "Drain and Snapshot" section. It
>seems to be some details about stop-with-savepoint and drain, and the
>relation to BATCH execution but I don't know if it is needed to
>understand the rest of the document. I'm happy to be wrong here, though,
>if there's good reasons for the section.

The new unified sink API should provide a way for the sink developer to
deal with EOI (drain) to guarantee exactly-once semantics. This is mostly
what I want to say in this section. The current streaming-style sink API does
not provide a good way to deal with it, which is why the `StreamingFileSink`
does not commit the last part of data in the bounded scenario. Our theme is
unification, and I am afraid users might misunderstand that adding this
requirement to the new sink API is only for bounded scenarios, so I
explained in this paragraph that stop-with-savepoint might also have a
similar requirement.

For the snapshot, I also want to prevent users from misunderstanding that it
is specially prepared for the unbounded scenario. Actually, it might also be
possible with bounded + batch execution mode in the future.

I could reorganize the section if it confuses the reader,
but I think we need to keep the drain at least. WDYT?

>On the question of Alternative 1 and 2, I have a strong preference for
>Alternative 1 because we could avoid strong coupling to other modules.
>With Alternative 2 we would depend on `flink-streaming-java` and even
>`flink-runtime`. For the new source API (FLIP-27) we managed to keep the
>dependencies slim and the code is in flink-core. I'd be very happy if we
>can manage the same for the new sink API.

I am open to alternative 1. Maybe I am missing something, but I do not get
why the second alternative would depend on `flink-runtime` or
`flink-streaming-java`. All the state APIs are currently in
flink-core. Could you give some further explanation? Thanks :)

Best,
Guowei


On Tue, Sep 15, 2020 at 12:05 PM Guowei Ma  wrote:

> ## Concurrent checkpoints
> AFAIK the committer would not see the file-1-2 when ck1 happens in the
> ExactlyOnce mode.
>
> ## Committable bookkeeping and combining
>
> I agree with you that the "CombineGlobalCommitter" would work. But we put
> more optimization logic in the committer, which will make the committer
> more and more complicated, and eventually become the same as the
> Writer. For example, The committer needs to clean up some unused manifest
> file when restoring from a failure if we introduce the optimizations to the
> committer.
>
> In this case another alternative might be to put this "merging"
> optimization to a separate agg operator(maybe just like another `Writer`?).
> The agg could produce an aggregated committable to the committer. The agg
> operator could manage the whole life cycle of the manifest file it created.
> It would make the committer have single responsibility.
>
> >>The main question is if this pattern is generic to be put into the sink
> framework or not.
> Maybe I am wrong. But what I can feel from the current discussion is that
> different requirements have different topological requirements.
>
> ## Using checkpointId
> In the batch execution mode there would be no normal checkpoint any more.
> That is why we do not introduce the checkpoint id in the API. So it is a
> great thing that sink decouples its implementation from checkpointid. :)
>
> Best,
> Guowei
>
>
> On Tue, Sep 15, 2020 at 7:33 AM Steven Wu  wrote:
>
>>
>> ## concurrent checkpoints
>>
>> @Aljoscha Krettek  regarding the concurrent
>> checkpoints, let me illustrate with a simple DAG below.
>> [image: image.png]
>>
>> Let's assume each writer emits one file per checkpoint cycle and *writer-2
>> is slow*. Now let's look at what the global committer receives
>>
>> timeline (--> now):
>> from Writer-1: file-1-1, ck-1, file-1-2, ck-2
>> from Writer-2: file-2-1, ck-1
>>
>> In this case, the committer shouldn't include "file-1-2" into the commit
>> for ck-1.
>>
>> ## Committable bookkeeping and combining
>>
>> I like David's proposal where the framework takes care of the
>> bookkeeping of committables and provides a combiner API (CommT ->
>> GlobalCommT) for GlobalCommitter. The only requirement is to tie the
>> commit/CommT/GlobalCommT to a checkpoint.
>>
>> When a commit is successful for checkpoint-N, the framework needs to
>> remove the GlobalCommT from the state corresponding to checkpoints <= N. If
>> a commit fails, the GlobalCommT accumulates and will be included in the
>> next cycle. That is how the Iceberg sink works. I think it is good to
>> piggyback retries with Flink's periodic checkpoints for Iceberg sink.
>> Otherwise, it can get complicated to implement retry logic that won't
>> interfere with Flink checkpoints.
>>
>> The main question is if this pattern is generic to be put into the sink
>> framework or not.
>>
>> > A alternative topology o

Re: [ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread Benchao Li
Congratulations!

Zhu Zhu  于2020年9月15日周二 上午11:05写道:

> Congratulations, Arvid!
>
> Thanks,
> Zhu
>
> jincheng sun  于2020年9月15日周二 上午10:55写道:
>
> > Congratulations and welcome, Arvid !
> >
> > Best,
> > Jincheng
> >
> >
> > Dian Fu  于2020年9月15日周二 上午10:53写道:
> >
> > > Congratulations!
> > >
> > > Regards,
> > > Dian
> > >
> > > > 在 2020年9月15日,上午10:41,Matt Wang  写道:
> > > >
> > > > Congrats!
> > > >
> > > >
> > > > --
> > > >
> > > > Best,
> > > > Matt Wang
> > > >
> > > >
> > > > On 09/15/2020 10:38,Zhijiang
> > wrote:
> > > > Hi all,
> > > >
> > > > On behalf of the PMC, I’m very happy to announce Arvid Heise as a new
> > > Flink committer.
> > > >
> > > > Arvid has been an active community member for more than a year, with
> > 138
> > > contributions including 116 commits, reviewed many PRs with good
> quality
> > > comments.
> > > > He is mainly working on the runtime scope, involved in critical
> > features
> > > like task mailbox model and unaligned checkpoint, etc.
> > > > Besides that, he was super active to reply questions in the user mail
> > > list (34 emails in March, 51 emails in June, etc), also active in dev
> > mail
> > > list and Jira issue discussions.
> > > >
> > > > Please join me in congratulating Arvid for becoming a Flink
> committer!
> > > >
> > > > Best,
> > > > Zhijiang
> > >
> > >
> >
>


-- 

Best,
Benchao Li


Re: [DISCUSS] FLIP-36 - Support Interactive Programming in Flink Table API

2020-09-14 Thread Xuannan Su
Hi Aljoscha,

Thanks for your comment. I agree that we should not introduce tight coupling 
with PipelineExecutor into the execution environment. With that in mind, to 
distinguish between the per-job and session modes, we can introduce a new method, 
named isPerJobModeExecutor, in the PipelineExecutorFactory or PipelineExecutor so 
that the execution environment can recognize per-job mode without instanceof 
checks on the PipelineExecutor. What do you think? Any thoughts or suggestions 
are very welcome.

Best,
Xuannan
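
The contrast being discussed (dispatching on the concrete executor class via
instanceof checks versus asking the executor itself through a capability
method on the interface) can be illustrated with a small sketch. All class
and method names below are hypothetical Python stand-ins, not Flink's actual
PipelineExecutor API.

```python
# Illustration of interface-level capability dispatch vs. isinstance checks:
# the environment consults only the interface method and never needs to know
# which concrete executor implementation it is talking to.

from abc import ABC, abstractmethod

class PipelineExecutor(ABC):
    @abstractmethod
    def is_per_job_mode(self):
        """Capability method: callers avoid instanceof/isinstance checks."""

class JobClusterExecutor(PipelineExecutor):
    def is_per_job_mode(self):
        return True

class SessionClusterExecutor(PipelineExecutor):
    def is_per_job_mode(self):
        return False

def should_cache_intermediate_results(executor):
    # Caching cluster-side results only makes sense when the cluster
    # outlives a single job, i.e. in session mode.
    return not executor.is_per_job_mode()

cache_session = should_cache_intermediate_results(SessionClusterExecutor())
cache_per_job = should_cache_intermediate_results(JobClusterExecutor())
```

New executor implementations then work without any change to the
environment, which is the loose coupling Aljoscha argues for.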
On Sep 10, 2020, 4:51 PM +0800, Aljoscha Krettek , wrote:
> On 10.09.20 09:00, Xuannan Su wrote:
> > > How do you imagine that? Where do you distinguish between per-job and
> > > session mode?
> > The StreamExecutionEnvironment can distinguish between per-job and session 
> > mode by the type of the PipelineExecutor, i.e, AbstractJobClusterExecutor 
> > vs AbstractSessionClusterExecutor.
>
> I can just comment on this last part but we should not do instanceof
> checks on the PipelineExecutor. The PipelineExecutor is an interface on
> purpose and the execution environments should not try and guess
> knowledge about the executor implementations. This would introduce tight
> coupling which might break in the future if the executors were to change.
>
> Best,
> Aljoscha


Re: [VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Zhu Zhu
Thank you all for helping to test and verify the release!
The vote has lasted for more than 72 hours and it already has enough
approvals.
I will finalize the vote result soon in a separate email.

Thanks,
Zhu

Zhijiang  于2020年9月15日周二 下午12:12写道:

> +1 (binding)
>
> - checked the checksums and GPG files
> - verified that the source archives do not contains any binaries
> - checked that all POM files point to the same version
> - reviewed the web site PR https://github.com/apache/flink-web/pull/377
> - checked the release note
>
> Best,
> Zhijiang
>
>
> --
> From:Dian Fu 
> Send Time:2020年9月15日(星期二) 10:11
> To:dev 
> Subject:Re: [VOTE] Release 1.11.2, release candidate #1
>
> +1 (binding)
>
> - checked the signature and checksum
> - reviewed the web-site PR and it looks good to me
> - checked the diff for dependencies changes since 1.11.1:
> https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1
> <
> https://github.com/apache/flink/compare/release-1.11.1..release-1.11.2-rc1
> >
> - checked the release note
>
> Thanks,
> Dian
>
> > 在 2020年9月14日,下午9:30,Xingbo Huang  写道:
> >
> > +1 (non-binding)
> >
> > Checks:
> >
> > - Pip install PyFlink from wheel packages with Python 3.5,3.6 and 3.7 in
> > Mac and Linux.
> > - Test Python UDF/Pandas UDF
> > - Test from_pandas/to_pandas
> >
> > Best,
> > Xingbo
> >
> > Fabian Paul  于2020年9月14日周一 下午8:46写道:
> >
> >> +1 (non-binding)
> >>
> >> Checks:
> >>
> >> - Verified signature
> >> - Built from source (Java8)
> >> - Ran custom jobs on Kubernetes
> >>
> >> Regards,
> >> Fabian
> >>
>
>
>


[RESULT][VOTE] Release 1.11.2, release candidate #1

2020-09-14 Thread Zhu Zhu
Hi everyone,

I'm happy to announce that we have unanimously approved this release.

There are 8 approving votes, 3 of which are binding:
* Robert Metzger (binding)
* Dian Fu (binding)
* Zhijiang (binding)
* David Anderson
* Congxian Qiu
* Fabian Paul
* Xingbo Huang
* Yang Wang

There are no disapproving votes.

Thanks everyone!

Thanks,
Zhu


Re: [ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread Congxian Qiu
Congratulations  Arvid

Best,
Congxian


Benchao Li  于2020年9月15日周二 下午12:52写道:

> Congratulations!
>
> Zhu Zhu  于2020年9月15日周二 上午11:05写道:
>
> > Congratulations, Arvid!
> >
> > Thanks,
> > Zhu
> >
> > jincheng sun  于2020年9月15日周二 上午10:55写道:
> >
> > > Congratulations and welcome, Arvid !
> > >
> > > Best,
> > > Jincheng
> > >
> > >
> > > Dian Fu  于2020年9月15日周二 上午10:53写道:
> > >
> > > > Congratulations!
> > > >
> > > > Regards,
> > > > Dian
> > > >
> > > > > 在 2020年9月15日,上午10:41,Matt Wang  写道:
> > > > >
> > > > > Congrats!
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Best,
> > > > > Matt Wang
> > > > >
> > > > >
> > > > > On 09/15/2020 10:38,Zhijiang
> > > wrote:
> > > > > Hi all,
> > > > >
> > > > > On behalf of the PMC, I’m very happy to announce Arvid Heise as a
> new
> > > > Flink committer.
> > > > >
> > > > > Arvid has been an active community member for more than a year,
> with
> > > 138
> > > > contributions including 116 commits, reviewed many PRs with good
> > quality
> > > > comments.
> > > > > He is mainly working on the runtime scope, involved in critical
> > > features
> > > > like task mailbox model and unaligned checkpoint, etc.
> > > > > Besides that, he was super active to reply questions in the user
> mail
> > > > list (34 emails in March, 51 emails in June, etc), also active in dev
> > > mail
> > > > list and Jira issue discussions.
> > > > >
> > > > > Please join me in congratulating Arvid for becoming a Flink
> > committer!
> > > > >
> > > > > Best,
> > > > > Zhijiang
> > > >
> > > >
> > >
> >
>
>
> --
>
> Best,
> Benchao Li
>


Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Congxian Qiu
Congratulations

Best,
Congxian


Yang Wang  于2020年9月15日周二 上午10:50写道:

> Congratulations!
>
> Best,
> Yang
>
> Xintong Song  于2020年9月15日周二 上午10:41写道:
>
> > Congrats!
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Tue, Sep 15, 2020 at 10:40 AM Zhijiang
> >  wrote:
> >
> > > Congrats, Niels!
> > >
> > > Best,
> > > Zhijiang
> > >
> > >
> > > --
> > > From:Darion Yaphet 
> > > Send Time:2020年9月15日(星期二) 10:02
> > > To:dev 
> > > Subject:Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes
> > >
> > > Congratulations!
> > >
> > > 刘建刚  于2020年9月15日周二 上午9:53写道:
> > >
> > > > Congratulations!
> > > >
> > > > Best,
> > > > liujiangang
> > > >
> > > > Danny Chan  于2020年9月15日周二 上午9:44写道:
> > > >
> > > > > Congratulations! 💐
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > > > 在 2020年9月15日 +0800 AM9:31,dev@flink.apache.org,写道:
> > > > > >
> > > > > > Congratulations! 💐
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > long is the way and hard  that out of Hell leads up to light
> > >
> > >
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread Yangze Guo
Congrats! Arvid.

Best,
Yangze Guo

On Tue, Sep 15, 2020 at 1:31 PM Congxian Qiu  wrote:
>
> Congratulations  Arvid
>
> Best,
> Congxian
>
>
> Benchao Li  于2020年9月15日周二 下午12:52写道:
>
> > Congratulations!
> >
> > Zhu Zhu  于2020年9月15日周二 上午11:05写道:
> >
> > > Congratulations, Arvid!
> > >
> > > Thanks,
> > > Zhu
> > >
> > > jincheng sun  于2020年9月15日周二 上午10:55写道:
> > >
> > > > Congratulations and welcome, Arvid !
> > > >
> > > > Best,
> > > > Jincheng
> > > >
> > > >
> > > > Dian Fu  于2020年9月15日周二 上午10:53写道:
> > > >
> > > > > Congratulations!
> > > > >
> > > > > Regards,
> > > > > Dian
> > > > >
> > > > > > 在 2020年9月15日,上午10:41,Matt Wang  写道:
> > > > > >
> > > > > > Congrats!
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Best,
> > > > > > Matt Wang
> > > > > >
> > > > > >
> > > > > > On 09/15/2020 10:38,Zhijiang
> > > > wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > On behalf of the PMC, I’m very happy to announce Arvid Heise as a
> > new
> > > > > Flink committer.
> > > > > >
> > > > > > Arvid has been an active community member for more than a year,
> > with
> > > > 138
> > > > > contributions including 116 commits, reviewed many PRs with good
> > > quality
> > > > > comments.
> > > > > > He is mainly working on the runtime scope, involved in critical
> > > > features
> > > > > like task mailbox model and unaligned checkpoint, etc.
> > > > > > Besides that, he was super active to reply questions in the user
> > mail
> > > > > list (34 emails in March, 51 emails in June, etc), also active in dev
> > > > mail
> > > > > list and Jira issue discussions.
> > > > > >
> > > > > > Please join me in congratulating Arvid for becoming a Flink
> > > committer!
> > > > > >
> > > > > > Best,
> > > > > > Zhijiang
> > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >


Re: [DISCUSS] FLIP-143: Unified Sink API

2020-09-14 Thread Guowei Ma
>> I would think that we only need flush() and the semantics are that it
>> prepares for a commit, so on a physical level it would be called from
>> "prepareSnapshotPreBarrier". Now that I'm thinking about it more I
>> think flush() should be renamed to something like "prepareCommit()".

> Generally speaking it is a good point that emitting the committables
> should happen before emitting the checkpoint barrier downstream.
> However, if I remember offline discussions well, the idea behind
> Writer#flush and Writer#snapshotState was to differentiate commit on
> checkpoint vs final checkpoint at the end of the job. Both of these
> methods could emit committables, but the flush should not leave any in
> progress state (e.g. in case of file sink in STREAM mode, in
> snapshotState it could leave some open files that would be committed in
> a subsequent cycle, however flush should close all files). The
> snapshotState as it is now can not be called in
> prepareSnapshotPreBarrier as it can store some state, which should
> happen in Operator#snapshotState as otherwise it would always be
> synchronous. Therefore I think we would need sth like:

> void prepareCommit(boolean flush, WriterOutput<CommT> output);

> ver 1:

> List<CommT> snapshotState();

> ver 2:

> void snapshotState(); // not sure if we need that method at all in option
2

I second Dawid's proposal. This is a valid scenario, and version 2 does not
need the snapshotState() method anymore.
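The "ver 2" contract endorsed here could be sketched roughly as follows. All names here (Writer, WriterOutput, prepareCommit, and the toy BufferingWriter) are assumptions reconstructed from this thread, not the final FLIP-143 API:

```java
import java.util.ArrayList;
import java.util.List;

// Collector handed to the writer for emitting committables downstream.
interface WriterOutput<CommT> {
    void emit(CommT committable);
}

// Sketch of the discussed Writer contract: prepareCommit is called before
// the checkpoint barrier is emitted. With flush == false (regular
// checkpoint) the writer may keep in-progress state; with flush == true
// (end of input / final checkpoint) everything must become committable.
interface Writer<InputT, CommT> {
    void write(InputT element);

    void prepareCommit(boolean flush, WriterOutput<CommT> output);
}

// Toy illustration only: buffers records and turns the buffer into a
// single "committable" string when prepareCommit is called.
class BufferingWriter implements Writer<String, String> {
    private final List<String> buffer = new ArrayList<>();

    @Override
    public void write(String element) {
        buffer.add(element);
    }

    @Override
    public void prepareCommit(boolean flush, WriterOutput<String> output) {
        if (!buffer.isEmpty()) {
            output.emit(String.join(",", buffer));
            buffer.clear();
        }
    }
}
```

In this shape the separate snapshotState() indeed becomes unnecessary, since all ready committables leave the writer through prepareCommit before the barrier.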

>> The Committer is as described in the FLIP, it's basically a function
>> "void commit(Committable)". The GlobalCommitter would be a function "void
>> commit(List<Committable>)". The former would be used by an S3 sink where
>> we can individually commit files to S3, a committable would be the list
>> of part uploads that will form the final file and the commit operation
>> creates the metadata in S3. The latter would be used by something like
>> Iceberg where the Committer needs a global view of all the commits to be
>> efficient and not overwhelm the system.
>>
>> I don't know yet if sinks would only implement one type of commit
>> function or potentially both at the same time, and maybe Commit can
>> return some CommitResult that gets shipped to the GlobalCommit function.
>> I must admit I did not get the need for Local/Normal + Global
>> committer at first. The Iceberg example helped a lot. I think it makes a
>> lot of sense.
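The two committer flavors described in the quoted paragraph could be sketched as below. The interface names and signatures are taken from this discussion, and the toy implementations are purely illustrative assumptions, not part of the actual FLIP-143 proposal:

```java
import java.util.ArrayList;
import java.util.List;

// Per-committable commit, e.g. finalizing one S3 multi-part upload.
interface Committer<CommT> {
    void commit(CommT committable);
}

// Global commit over all committables of a checkpoint, e.g. writing a
// single Iceberg snapshot that references all collected data files.
interface GlobalCommitter<CommT> {
    void commit(List<CommT> committables);
}

// Toy committer that records each committable individually.
class LoggingCommitter implements Committer<String> {
    final List<String> committed = new ArrayList<>();

    @Override
    public void commit(String committable) {
        committed.add(committable);
    }
}

// Toy global committer that produces one atomic "snapshot" per call,
// mirroring the Iceberg example from the thread.
class SnapshotCommitter implements GlobalCommitter<String> {
    final List<String> snapshots = new ArrayList<>();

    @Override
    public void commit(List<String> committables) {
        snapshots.add(String.join("+", committables));
    }
}
```

The key difference visible here is the unit of atomicity: the Committer acts once per committable, while the GlobalCommitter sees the whole checkpoint's committables at once and can combine them into a single metadata operation.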

@Dawid
My understanding is that HiveSink's implementation might need the local
committer (FileCommitter) because the file rename is needed.
But Iceberg only needs to write the manifest file. Would you like to
enlighten me on why Iceberg needs the local committer?
Thanks

Best,
Guowei


On Mon, Sep 14, 2020 at 11:19 PM Dawid Wysakowicz 
wrote:

> Hi all,
>
> > I would think that we only need flush() and the semantics are that it
> > prepares for a commit, so on a physical level it would be called from
> > "prepareSnapshotPreBarrier". Now that I'm thinking about it more I
> > think flush() should be renamed to something like "prepareCommit()".
>
> Generally speaking it is a good point that emitting the committables
> should happen before emitting the checkpoint barrier downstream.
> However, if I remember offline discussions well, the idea behind
> Writer#flush and Writer#snapshotState was to differentiate commit on
> checkpoint vs final checkpoint at the end of the job. Both of these
> methods could emit committables, but the flush should not leave any in
> progress state (e.g. in case of file sink in STREAM mode, in
> snapshotState it could leave some open files that would be committed in
> a subsequent cycle, however flush should close all files). The
> snapshotState as it is now can not be called in
> prepareSnapshotPreBarrier as it can store some state, which should
> happen in Operator#snapshotState as otherwise it would always be
> synchronous. Therefore I think we would need sth like:
>
> void prepareCommit(boolean flush, WriterOutput<CommT> output);
>
> ver 1:
>
> List<CommT> snapshotState();
>
> ver 2:
>
> void snapshotState(); // not sure if we need that method at all in option 2
>
> > The Committer is as described in the FLIP, it's basically a function
> > "void commit(Committable)". The GlobalCommitter would be a function "void
> > commit(List<Committable>)". The former would be used by an S3 sink where
> > we can individually commit files to S3, a committable would be the list
> > of part uploads that will form the final file and the commit operation
> > creates the metadata in S3. The latter would be used by something like
> > Iceberg where the Committer needs a global view of all the commits to be
> > efficient and not overwhelm the system.
> >
> > I don't know yet if sinks would only implement one type of commit
> > function or potentially both at the same time, and maybe Commit can
> > return some CommitResult that gets shipped to the GlobalCommit function.
> I must admit I did not get the need for Local/Normal + Global
> committer at first. The Iceberg example helped a lot. 

Re: Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread Yun Gao
Congratulations Niels!

Best,
Yun


--
Sender:Congxian Qiu
Date:2020/09/15 13:33:31
Recipient:dev@flink.apache.org
Theme:Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

Congratulations

Best,
Congxian


Yang Wang  于2020年9月15日周二 上午10:50写道:

> Congratulations!
>
> Best,
> Yang
>
> Xintong Song  于2020年9月15日周二 上午10:41写道:
>
> > Congrats!
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Tue, Sep 15, 2020 at 10:40 AM Zhijiang
> >  wrote:
> >
> > > Congrats, Niels!
> > >
> > > Best,
> > > Zhijiang
> > >
> > >
> > > --
> > > From:Darion Yaphet 
> > > Send Time:2020年9月15日(星期二) 10:02
> > > To:dev 
> > > Subject:Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes
> > >
> > > Congratulations!
> > >
> > > 刘建刚  于2020年9月15日周二 上午9:53写道:
> > >
> > > > Congratulations!
> > > >
> > > > Best,
> > > > liujiangang
> > > >
> > > > Danny Chan  于2020年9月15日周二 上午9:44写道:
> > > >
> > > > > Congratulations! 💐
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > > > 在 2020年9月15日 +0800 AM9:31,dev@flink.apache.org,写道:
> > > > > >
> > > > > > Congratulations! 💐
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > long is the way and hard, that out of Hell leads up to light
> > >
> > >
> >
>



Re: Re: [ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-14 Thread Yun Gao
Congratulations Arvid !

Best,
 Yun



 --Original Mail --
Sender:Yangze Guo 
Send Date:Tue Sep 15 13:39:56 2020
Recipients:dev 
Subject:Re: [ANNOUNCE] New Apache Flink Committer - Arvid Heise
Congrats! Arvid.

Best,
Yangze Guo

On Tue, Sep 15, 2020 at 1:31 PM Congxian Qiu  wrote:
>
> Congratulations  Arvid
>
> Best,
> Congxian
>
>
> Benchao Li  于2020年9月15日周二 下午12:52写道:
>
> > Congratulations!
> >
> > Zhu Zhu  于2020年9月15日周二 上午11:05写道:
> >
> > > Congratulations, Arvid!
> > >
> > > Thanks,
> > > Zhu
> > >
> > > jincheng sun  于2020年9月15日周二 上午10:55写道:
> > >
> > > > Congratulations and welcome, Arvid !
> > > >
> > > > Best,
> > > > Jincheng
> > > >
> > > >
> > > > Dian Fu  于2020年9月15日周二 上午10:53写道:
> > > >
> > > > > Congratulations!
> > > > >
> > > > > Regards,
> > > > > Dian
> > > > >
> > > > > > 在 2020年9月15日,上午10:41,Matt Wang  写道:
> > > > > >
> > > > > > Congrats!
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Best,
> > > > > > Matt Wang
> > > > > >
> > > > > >
> > > > > > On 09/15/2020 10:38,Zhijiang
> > > > wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > On behalf of the PMC, I’m very happy to announce Arvid Heise as a
> > new
> > > > > Flink committer.
> > > > > >
> > > > > > Arvid has been an active community member for more than a year,
> > with
> > > > 138
> > > > > contributions including 116 commits, reviewed many PRs with good
> > > quality
> > > > > comments.
> > > > > > He is mainly working on the runtime scope, involved in critical
> > > > features
> > > > > like task mailbox model and unaligned checkpoint, etc.
> > > > > > Besides that, he was super active to reply questions in the user
> > mail
> > > > > list (34 emails in March, 51 emails in June, etc), also active in dev
> > > > mail
> > > > > list and Jira issue discussions.
> > > > > >
> > > > > > Please join me in congratulating Arvid for becoming a Flink
> > > committer!
> > > > > >
> > > > > > Best,
> > > > > > Zhijiang
> > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >

Re: [VOTE] FLIP-142: Disentangle StateBackends from Checkpointing

2020-09-14 Thread Konstantin Knauf
+1 (binding)

On Mon, Sep 14, 2020 at 5:05 PM Seth Wiesman  wrote:

> Hi all,
>
> After the discussion in [1], I would like to open a voting thread for
> FLIP-142 [2] which discusses disentangling state backends from
> checkpointing.
>
> The vote will be open until 16th September (72h), unless there is an
> objection or not enough votes.
>
> Seth
>
> [1]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-142-Disentangle-StateBackends-from-Checkpointing-td44496.html
> [2]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-142%3A+Disentangle+StateBackends+from+Checkpointing
>


-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk


Re: Re: [VOTE] FLIP-140: Introduce bounded style execution for keyed streams

2020-09-14 Thread Yun Gao
Many thanks for bringing this up!

+1 (non-binding)

Best,
Yun


--
Sender:Seth Wiesman
Date:2020/09/14 21:56:55
Recipient:dev
Theme:Re: [VOTE] FLIP-140: Introduce bounded style execution for keyed streams

+1 (binding)

Seth

On Thu, Sep 10, 2020 at 9:13 AM Aljoscha Krettek 
wrote:

> +1 (binding)
>
> Aljoscha
>