Re: [DISCUSS] Flink client api enhancement for downstream project

2019-08-20 Thread Zili Chen
I would like to involve Till & Stephan here to clarify some concepts of
per-job mode.

The term per-job refers to one of the modes a cluster can run in. It is
mainly aimed at spawning a dedicated cluster for a specific job; the job can
be packaged with Flink itself so that the cluster is initialized with the
job, which gets rid of a separate submission step.

This is useful for container deployments where one creates an image with the
job and then simply deploys the container.

However, it is out of client scope, since a client (ClusterClient, for
example) is for communicating with an existing cluster and performing
actions on it. Currently, in per-job mode, we extract the job graph and
bundle it into the cluster deployment, so no concept of a client gets
involved. It looks reasonable to exclude the deployment of per-job clusters
from the client API and use dedicated utility classes (deployers) for
deployment.

Zili Chen wrote on Tue, Aug 20, 2019 at 12:37 PM:

> Hi Aljoscha,
>
> Thanks for your reply and participation. The Google Doc you linked to
> requires permission; I think you could use a sharing link instead.
>
> I agree that we have almost reached a consensus that JobClient is
> necessary to interact with a running Job.
>
> Let me address your open questions one by one.
>
> 1. Separate cluster creation and job submission for per-job mode.
>
> As you mentioned, here is where the opinions diverge. In my document there
> is an alternative [2] that proposes excluding per-job deployment from the
> client API scope, and now I find it more reasonable to do that exclusion.
>
> When in per-job mode, a dedicated JobCluster is launched to execute the
> specific job. It is more like a Flink application than a submission of a
> Flink job. The client only takes care of job submission and assumes there
> is an existing cluster. In this way we are able to consider per-job issues
> individually, and JobClusterEntrypoint would be the utility class for
> per-job deployment.
>
> Nevertheless, the user program works in both session mode and per-job mode
> without any need to change code. JobClient in per-job mode is returned
> from env.execute as usual. However, it would no longer be a wrapper of
> RestClusterClient but a wrapper of PerJobClusterClient, which communicates
> with the Dispatcher locally.
>
> 2. How to deal with plan preview.
>
> With env.compile functions, users can get the JobGraph or FlinkPlan and
> thus preview the plan programmatically. Typically it looks like
>
> if (previewConfigured) {
>     FlinkPlan plan = env.compile();
>     new JSONDumpGenerator(...).dump(plan);
> } else {
>     env.execute();
> }
>
> And `flink info` would no longer be valid.
>
> 3. How to deal with Jar Submission at the Web Frontend.
>
> There is another thread discussing this topic [1]. Apart from removing
> the functionality, there are two alternatives.
>
> One is to introduce an interface with a method that returns a
> JobGraph/FlinkPlan, and have Jar Submission only support main classes that
> implement this interface. We would then extract the JobGraph/FlinkPlan
> just by calling that method. In this way, it is even possible to consider
> a separation of job creation and job submission.
>
> The other is, as you mentioned, to let execute() do the actual execution.
> We would not execute the main method in the WebFrontend but spawn a
> process on the WebMonitor side to execute it. For the return part, we
> could generate the JobID from the WebMonitor and pass it to the execution
> environment.
>
> 4. How to deal with detached mode.
>
> I think detached mode is a temporary solution for non-blocking submission.
> In my document, both submission and execution return a CompletableFuture,
> and users control whether or not to wait for the result. At that point we
> don't need a detached option, but the functionality is covered.
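A sketch of the non-blocking flow described in point 4; the method names
(executeAsync, getJobResult) are illustrative of the proposal, not Flink's
current API:

    import java.util.concurrent.CompletableFuture;

    // Hypothetical client-side flow under the proposed API: both submission
    // and the execution result are futures, so "detached" vs. "attached" is
    // only a question of what the caller awaits.
    public class SubmissionSketch {
        static void run(ExecutionEnvironment env) throws Exception {
            CompletableFuture<JobClient> submission = env.executeAsync();
            JobClient jobClient = submission.get();  // wait for submission only

            CompletableFuture<JobResult> result = jobClient.getJobResult();
            // Detached behaviour: return here without awaiting `result`.
            // Attached behaviour: block until the job finishes.
            JobResult jobResult = result.get();
            System.out.println("Job finished: " + jobResult);
        }
    }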
>
> 5. How does per-job mode interact with interactive programming.
>
> All of the YARN, Mesos and Kubernetes scenarios now follow the pattern of
> launching a JobCluster, and I don't think there would be inconsistencies
> between different resource managers.
>
> Best,
> tison.
>
> [1]
> https://lists.apache.org/x/thread.html/6db869c53816f4e2917949a7c6992c2b90856d7d639d7f2e1cd13768@%3Cdev.flink.apache.org%3E
> [2]
> https://docs.google.com/document/d/1UWJE7eYWiMuZewBKS0YmdVO2LUTqXPd6-pbOCof9ddY/edit?disco=DZaGGfs
>
Aljoscha Krettek wrote on Fri, Aug 16, 2019 at 9:20 PM:
>
>> Hi,
>>
>> I read both Jeff's initial design document and the newer document by
>> Tison. I also finally found the time to collect our thoughts on the
>> issue; I had quite some discussions with Kostas and this is the result: [1].
>>
>> I think overall we agree that this part of the code is in dire need of
>> some refactoring/improvements, but I think there are still some open
>> questions and some differences in opinion about what those refactorings
>> should look like.
>>
>> I think the API-side is quite clear, i.e. we need some JobClient API that
>> allows interacting with a running Job. It could be worthwhile to spin that
>> off into a separate FLIP because we 

Re: [DISCUSS] Upgrade kinesis connector to Apache 2.0 License and include it in official release

2019-08-20 Thread Dyana Rose
Ahh, brilliant, I had myself on notifications for the streams adapter
releases, but must have missed it. That's great news.

I've got the branch prepped for moving over to Apache 2.0, but staying on
KCL 1.x, which requires the least amount of change.

Considering the large amount of change required to update to KCL/SDK 2.x, I
would recommend that be done as a parallel task, making both connectors,
1.x and 2.x, available for use, if that makes sense.

The branch I push will have the English-language documents updated, but not
the Chinese-language documents. Is there a process for this?

Thanks,
Dyana

On Mon, 19 Aug 2019 at 19:08, Bowen Li  wrote:

> Hi all,
>
> A while back we discussed upgrading flink-connector-kinesis module to
> Apache 2.0 license so that its jar can be published as part of official
> Flink releases. Given we have a large user base using Flink with
> kinesis/dynamodb streams, it'll free users from building and maintaining
> the module themselves, and improve user and developer experience. A ticket
> was created [1] but has been idle, mainly because new releases of some AWS
> libs were not yet available at the time.
>
> As of today I see that all flink-connector-kinesis's aws dependencies have
> been updated to Apache 2.0 license and are released. They include:
>
> - aws-java-sdk-kinesis
> - aws-java-sdk-sts
> - amazon-kinesis-client
> - amazon-kinesis-producer (Apache 2.0 from 0.13.1, released 18 days ago)
> [2]
> - dynamodb-streams-kinesis-adapter (Apache 2.0 from 1.5.0, released 7 days
> ago) [3]
>
> Therefore, I'd suggest we kick off the initiative and aim for release 1.10
> which is roughly 3 months away, leaving us plenty of time to finish.
> According to @Dyana 's comment in the ticket [1], it seems the work needs
> to be broken into multiple parts rather than simply upgrading lib versions,
> so we can further break the JIRA down into sub-tasks to limit the scope of
> each change for easier code review.
>
> @Dyana would you still be interested in taking on the responsibility and
> driving the effort forward?
>
> Thanks,
> Bowen
>
> [1] https://issues.apache.org/jira/browse/FLINK-12847
> [2] https://github.com/awslabs/amazon-kinesis-producer/releases
> [3] https://github.com/awslabs/dynamodb-streams-kinesis-adapter/releases
>
>
>

-- 

Dyana Rose
Software Engineer


W: www.salecycle.com 


[jira] [Created] (FLINK-13792) source and sink support manual rate limit

2019-08-20 Thread zzsmdfj (Jira)
zzsmdfj created FLINK-13792:
---

 Summary: source and sink support manual rate limit
 Key: FLINK-13792
 URL: https://issues.apache.org/jira/browse/FLINK-13792
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common
Affects Versions: 1.8.1
Reporter: zzsmdfj


Currently Flink implements automatic flow control via back pressure, which is
efficient for most scenarios. But in some special scenarios we need
fine-grained flow control to avoid impacting other systems. For example: if I
have a window over days (a lot of data), then when ProcessWindowFunction is
called on trigger, it produces a large burst of data to the sink; if the sink
writes to a message queue, this can have a huge impact on the queue. So a sink
rate limiter would be friendly to external systems. A source rate limiter is
appropriate when a window operator is accumulating a large amount of
historical data.
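A minimal sketch of the kind of manual sink rate limiting requested here,
built on Guava's RateLimiter inside a sink wrapper (an illustration under
stated assumptions, not an existing Flink API):

    import com.google.common.util.concurrent.RateLimiter;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

    // Hypothetical wrapper: caps the number of records per second that reach
    // the external system, independently of back pressure.
    public class RateLimitedSink<T> extends RichSinkFunction<T> {

        private final double recordsPerSecond;
        private transient RateLimiter rateLimiter;

        public RateLimitedSink(double recordsPerSecond) {
            this.recordsPerSecond = recordsPerSecond;
        }

        @Override
        public void open(Configuration parameters) {
            // One limiter per parallel sink instance.
            rateLimiter = RateLimiter.create(recordsPerSecond);
        }

        @Override
        public void invoke(T value, Context context) {
            rateLimiter.acquire(); // blocks until a permit is available
            // ... write `value` to the actual external system here ...
        }
    }

A source-side limiter could wrap a source function the same way, acquiring a
permit before emitting each record.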



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Flink client api enhancement for downstream project

2019-08-20 Thread Flavio Pompermaier
In my opinion the client should not use any environment to get the job
graph, because the jar should reside ONLY on the cluster (and not on the
client classpath; otherwise there are always inconsistencies between the
client's and the Flink Job Manager's classpaths).
In the YARN, Mesos and Kubernetes scenarios you have the jar, but you could
start a cluster that has the jar on the Job Manager as well (this is the
only case where I think you can assume that the client has the jar on the
classpath; in the REST job submission you don't have any classpath).

Thus, always in my opinion, the JobGraph should be generated by the Job
Manager REST API.


On Tue, Aug 20, 2019 at 9:00 AM Zili Chen  wrote:

> I would like to involve Till & Stephan here to clarify some concepts of
> per-job mode.
>
> The term per-job refers to one of the modes a cluster can run in. It is
> mainly aimed at spawning a dedicated cluster for a specific job; the job
> can be packaged with Flink itself so that the cluster is initialized with
> the job, which gets rid of a separate submission step.
>
> This is useful for container deployments where one creates an image with
> the job and then simply deploys the container.
>
> However, it is out of client scope, since a client (ClusterClient, for
> example) is for communicating with an existing cluster and performing
> actions on it. Currently, in per-job mode, we extract the job graph and
> bundle it into the cluster deployment, so no concept of a client gets
> involved. It looks reasonable to exclude the deployment of per-job
> clusters from the client API and use dedicated utility classes (deployers)
> for deployment.
>
> Zili Chen wrote on Tue, Aug 20, 2019 at 12:37 PM:
>
> > Hi Aljoscha,
> >
> > Thanks for your reply and participation. The Google Doc you linked to
> > requires permission; I think you could use a sharing link instead.
> >
> > I agree that we have almost reached a consensus that JobClient is
> > necessary to interact with a running Job.
> >
> > Let me address your open questions one by one.
> >
> > 1. Separate cluster creation and job submission for per-job mode.
> >
> > As you mentioned, here is where the opinions diverge. In my document
> > there is an alternative [2] that proposes excluding per-job deployment
> > from the client API scope, and now I find it more reasonable to do that
> > exclusion.
> >
> > When in per-job mode, a dedicated JobCluster is launched to execute the
> > specific job. It is more like a Flink application than a submission of a
> > Flink job. The client only takes care of job submission and assumes
> > there is an existing cluster. In this way we are able to consider
> > per-job issues individually, and JobClusterEntrypoint would be the
> > utility class for per-job deployment.
> >
> > Nevertheless, the user program works in both session mode and per-job
> > mode without any need to change code. JobClient in per-job mode is
> > returned from env.execute as usual. However, it would no longer be a
> > wrapper of RestClusterClient but a wrapper of PerJobClusterClient, which
> > communicates with the Dispatcher locally.
> >
> > 2. How to deal with plan preview.
> >
> > With env.compile functions, users can get the JobGraph or FlinkPlan and
> > thus preview the plan programmatically. Typically it looks like
> >
> > if (previewConfigured) {
> >     FlinkPlan plan = env.compile();
> >     new JSONDumpGenerator(...).dump(plan);
> > } else {
> >     env.execute();
> > }
> >
> > And `flink info` would no longer be valid.
> >
> > 3. How to deal with Jar Submission at the Web Frontend.
> >
> > There is another thread discussing this topic [1]. Apart from removing
> > the functionality, there are two alternatives.
> >
> > One is to introduce an interface with a method that returns a
> > JobGraph/FlinkPlan, and have Jar Submission only support main classes
> > that implement this interface. We would then extract the
> > JobGraph/FlinkPlan just by calling that method. In this way, it is even
> > possible to consider a separation of job creation and job submission.
> >
> > The other is, as you mentioned, to let execute() do the actual
> > execution. We would not execute the main method in the WebFrontend but
> > spawn a process on the WebMonitor side to execute it. For the return
> > part, we could generate the JobID from the WebMonitor and pass it to the
> > execution environment.
> >
> > 4. How to deal with detached mode.
> >
> > I think detached mode is a temporary solution for non-blocking
> > submission. In my document, both submission and execution return a
> > CompletableFuture, and users control whether or not to wait for the
> > result. At that point we don't need a detached option, but the
> > functionality is covered.
> >
> > 5. How does per-job mode interact with interactive programming.
> >
> > All of the YARN, Mesos and Kubernetes scenarios now follow the pattern
> > of launching a JobCluster, and I don't think there would be
> > inconsistencies between different resource managers.
> >
> > Best,
> > tison.
> >
> > [1]
> >
> https://lists.apache.org/x/th

Re: [DISCUSS][CODE STYLE] Usage of Java Optional

2019-08-20 Thread Dawid Wysakowicz
Hi Andrey,

Just wanted to quickly elaborate on my opinion. I wouldn't say I am -1,
just -0 for the Optionals in private methods. I am ok with not
forbidding them there. I just think in all cases there is a better
solution than passing the Optionals around, even in private methods. I
just hope the outcome of the discussion won't be that it is no longer
allowed to suggest those during review.

Best,

Dawid

On 19/08/2019 17:53, Andrey Zagrebin wrote:
> Hi all,
>
> Sorry for not getting back to this discussion for some time.
> It looks like in general we agree on the initially suggested points:
>
> - Always use Optional only to return nullable values in the API/public
>   methods
>   - Only if you can prove that Optional usage would lead to a performance
>     degradation in critical code, then return the nullable value directly
>     and annotate it with @Nullable
> - Passing an Optional argument to a method can be allowed if it is within
>   a private helper method and simplifies the code
> - Optional should not be used for class fields
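To illustrate the points above, a minimal sketch (a hypothetical class, not
actual Flink code):

    import java.util.Optional;
    import javax.annotation.Nullable;

    public class SlotOwner {

        // No Optional fields: use @Nullable on the field instead.
        @Nullable
        private String ownerName;

        // Public API: return an Optional for a nullable value.
        public Optional<String> getOwnerName() {
            return Optional.ofNullable(ownerName);
        }

        // Performance-critical variant: return the nullable value directly,
        // annotated with @Nullable.
        @Nullable
        public String getOwnerNameOrNull() {
            return ownerName;
        }

        // The contested case: an Optional parameter, allowed (if at all) only
        // in a private helper method where it simplifies the code.
        private static String describe(Optional<String> name) {
            return name.orElse("<unknown>");
        }
    }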
>
> The first point can also be elaborated by explicitly forbidding
> Optional/Nullable parameters in public methods.
> In general we can always avoid Optional parameters by either overloading
> the method or using a pojo with a builder to pass a set of parameters.
>
> The third point does not prevent using @Nullable/@Nonnull for class
> fields.
> If we agree not to use Optional for fields, then I am not sure I see any
> use case for SerializableOptional (please comment on that if you have more
> details).
>
> @Jingsong Lee
> Using Optional in Maps.
> I can see this as a possible use case.
> I would leave this decision to the developer/reviewer to reason about it
> and keep the scope of this discussion to the variables/parameters as it
> seems to be the biggest point of friction atm.
>
> Finally, I see a split regarding the second point: <passing an Optional
> argument to a method can be allowed if it is within a private helper method
> and simplifies the code>.
> There are people who have a strong opinion against allowing it.
> Let's vote then for whether to allow it or not.
> So far as I see we have the following votes (correct me if wrong and add
> more if you want):
> Piotr    +1
> Biao     +1
> Timo     -1
> Yu       -1
> Aljoscha -1
> Till     +1
> Igal     +1
> Dawid    -1
> Me       +1
>
> So far: +5 / -4 (Optional arguments in private methods)
>
> Best,
> Andrey
>
>
> On Tue, Aug 6, 2019 at 8:53 AM Piotr Nowojski  wrote:
>
>> Hi Qi,
>>
>>> For example, SingleInputGate is already creating Optional for every
>> BufferOrEvent in getNextBufferOrEvent(). How much performance gain would we
>> get if it’s replaced by null check?
>>
>> When I was introducing it there, I micro-benchmarked this change, and
>> there was no visible throughput change in a pure network-only micro
>> benchmark (with the whole of Flink running around it, any changes would be
>> even less visible).
>>
>> Piotrek
>>
>>> On 5 Aug 2019, at 15:20, Till Rohrmann  wrote:
>>>
>>> I'd be in favour of
>>>
>>> - Optional for method return values if not performance critical
>>> - Optional can be used for internal methods if it makes sense
>>> - No optional fields
>>>
>>> Cheers,
>>> Till
>>>
>>> On Mon, Aug 5, 2019 at 12:07 PM Aljoscha Krettek 
>>> wrote:
>>>
 Hi,

 I’m also in favour of using Optional only for method return values. I
 could be persuaded to allow them for parameters of internal methods but
>> I’m
 sceptical about that.

 Aljoscha

> On 2. Aug 2019, at 15:35, Yu Li  wrote:
>
> TL; DR: I second Timo that we should use Optional only as method return
> type for non-performance critical code.
>
> From the example given on our AvroFactory [1] I also noticed that
 Jetbrains
> marks the OptionalUsedAsFieldOrParameterType inspection as a warning.
 It's
> relatively easy to understand why it's not suggested to use (java.util)
> Optional as a field since it's not serializable. What made me feel
 curious
> is that why we shouldn't use it as a parameter type, so I did some
> investigation and here is what I found:
>
> There's a JB blog talking about java8 top tips [2] where we could find
 the
> advice around Optional, there I found another blog telling about the
> pragmatic approach of using Optional [3]. Reading further we could see
 the
> reason why we shouldn't use Optional as parameter type, please allow me
 to
> quote here:
>
> It is often the case that domain objects hang about in memory for a
>> fair
> while, as processing in the application occurs, making each optional
> instance rather long-lived (tied to the lifetime of the domain object).
 By
> contrast, the Optionalinstance returned from the getter is likely to be
> very short-lived. The caller will call the getter, interpret the
>> result,
> and then move on. If you know anything a

Re: [DISCUSS] FLIP-56: Dynamic Slot Allocation

2019-08-20 Thread Xintong Song
@Zili

As far as I know, Timo is drafting a FLIP that has taken the number 55.
There is a running number maintained on the FLIP wiki page [1] that shows
which number should be used for the next FLIP; it should be increased by
whoever takes a number for a new FLIP.

Thank you~

Xintong Song


[1]
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

On Tue, Aug 20, 2019 at 3:28 AM Zili Chen  wrote:

> We suddenly skipped FLIP-55 lol.
>
>
> Xintong Song wrote on Mon, Aug 19, 2019 at 10:23 PM:
>
> > Hi everyone,
> >
> > We would like to start a discussion thread on "FLIP-56: Dynamic Slot
> > Allocation" [1]. This is originally part of the discussion thread for
> > "FLIP-53: Fine Grained Resource Management" [2]. As Till suggested, we
> > would like to split the original discussion into two topics, and start a
> > separate new discussion thread as well as a FLIP process for this one.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
> >
> > [2]
> >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-53-Fine-Grained-Resource-Management-td31831.html
> >
>


[jira] [Created] (FLINK-13793) Build different language docs in parallel

2019-08-20 Thread Nico Kruber (Jira)
Nico Kruber created FLINK-13793:
---

 Summary: Build different language docs in parallel
 Key: FLINK-13793
 URL: https://issues.apache.org/jira/browse/FLINK-13793
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Nico Kruber
Assignee: Nico Kruber


Unfortunately, jekyll lacks parallel builds and thus does not make use of
idle resources. In the special case of building the documentation without
serving it, we could build each language (en, zh) in a separate sub-process
and thus get some parallelization.
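A sketch of the idea (the real docs build is shell/Ruby based; this only
illustrates the fan-out/join, and the jekyll config file names are
hypothetical):

    import java.util.ArrayList;
    import java.util.List;

    public class ParallelDocsBuild {
        public static void main(String[] args) throws Exception {
            // One jekyll build per language, run as separate sub-processes.
            List<ProcessBuilder> builds = List.of(
                    new ProcessBuilder("jekyll", "build", "--config", "_config.yml,_config_en.yml"),
                    new ProcessBuilder("jekyll", "build", "--config", "_config.yml,_config_zh.yml"));

            List<Process> running = new ArrayList<>();
            for (ProcessBuilder pb : builds) {
                running.add(pb.inheritIO().start());
            }
            // Join: fail the overall build if any language build fails.
            for (Process p : running) {
                if (p.waitFor() != 0) {
                    throw new IllegalStateException("docs build failed");
                }
            }
        }
    }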



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS][CODE STYLE] Usage of Java Optional

2019-08-20 Thread Stephan Ewen
I think Dawid raised a very good point here.
One of the outcomes should be that we are consistent in our recommendations
and requests during PR reviews. Otherwise we'll just confuse contributors.

So I would be
  +1 for someone to use Optional in a private method if they believe it is
helpful
  -1 to push contributors during reviews to do that


On Tue, Aug 20, 2019 at 9:42 AM Dawid Wysakowicz 
wrote:

> Hi Andrey,
>
> Just wanted to quickly elaborate on my opinion. I wouldn't say I am -1,
> just -0 for the Optionals in private methods. I am ok with not
> forbidding them there. I just think in all cases there is a better
> solution than passing the Optionals around, even in private methods. I
> just hope the outcome of the discussion won't be that it is no longer
> allowed to suggest those during review.
>
> Best,
>
> Dawid
>
> On 19/08/2019 17:53, Andrey Zagrebin wrote:
> > Hi all,
> >
> > Sorry for not getting back to this discussion for some time.
> > It looks like in general we agree on the initially suggested points:
> >
> > - Always use Optional only to return nullable values in the API/public
> >   methods
> >   - Only if you can prove that Optional usage would lead to a
> >     performance degradation in critical code, then return the nullable
> >     value directly and annotate it with @Nullable
> > - Passing an Optional argument to a method can be allowed if it is
> >   within a private helper method and simplifies the code
> > - Optional should not be used for class fields
> >
> > The first point can also be elaborated by explicitly forbidding
> > Optional/Nullable parameters in public methods.
> > In general we can always avoid Optional parameters by either overloading
> > the method or using a pojo with a builder to pass a set of parameters.
> >
> > The third point does not prevent from using @Nullable/@Nonnull for class
> > fields.
> > If we agree to not use Optional for fields then not sure I see any use
> case
> > for SerializableOptional (please comment on that if you have more
> details).
> >
> > @Jingsong Lee
> > Using Optional in Maps.
> > I can see this as a possible use case.
> > I would leave this decision to the developer/reviewer to reason about it
> > and keep the scope of this discussion to the variables/parameters as it
> > seems to be the biggest point of friction atm.
> >
> > Finally, I see a split regarding the second point: <passing an Optional
> > argument to a method can be allowed if it is within a private helper
> > method and simplifies the code>.
> > There are people who have a strong opinion against allowing it.
> > Let's vote then for whether to allow it or not.
> > So far as I see we have the following votes (correct me if wrong and add
> > more if you want):
> > Piotr    +1
> > Biao     +1
> > Timo     -1
> > Yu       -1
> > Aljoscha -1
> > Till     +1
> > Igal     +1
> > Dawid    -1
> > Me       +1
> >
> > So far: +5 / -4 (Optional arguments in private methods)
> >
> > Best,
> > Andrey
> >
> >
> > On Tue, Aug 6, 2019 at 8:53 AM Piotr Nowojski 
> wrote:
> >
> >> Hi Qi,
> >>
> >>> For example, SingleInputGate is already creating Optional for every
> >> BufferOrEvent in getNextBufferOrEvent(). How much performance gain
> would we
> >> get if it’s replaced by null check?
> >>
> >> When I was introducing it there, I have micro-benchmarked this change,
> and
> >> there was no visible throughput change in a pure network only micro
> >> benchmark (with whole Flink running around it any changes would be even
> >> less visible).
> >>
> >> Piotrek
> >>
> >>> On 5 Aug 2019, at 15:20, Till Rohrmann  wrote:
> >>>
> >>> I'd be in favour of
> >>>
> >>> - Optional for method return values if not performance critical
> >>> - Optional can be used for internal methods if it makes sense
> >>> - No optional fields
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Mon, Aug 5, 2019 at 12:07 PM Aljoscha Krettek 
> >>> wrote:
> >>>
>  Hi,
> 
>  I’m also in favour of using Optional only for method return values. I
>  could be persuaded to allow them for parameters of internal methods
> but
> >> I’m
>  sceptical about that.
> 
>  Aljoscha
> 
> > On 2. Aug 2019, at 15:35, Yu Li  wrote:
> >
> > TL; DR: I second Timo that we should use Optional only as method
> return
> > type for non-performance critical code.
> >
> > From the example given on our AvroFactory [1] I also noticed that
>  Jetbrains
> > marks the OptionalUsedAsFieldOrParameterType inspection as a warning.
>  It's
> > relatively easy to understand why it's not suggested to use
> (java.util)
> > Optional as a field since it's not serializable. What made me feel
>  curious
> > is that why we shouldn't use it as a parameter type, so I did some
> > investigation and here is what I found:
> >
> > There's a JB blog talking about java8 top tips [2] where we could
> find
>  the
> > advice around Optional, there I found anothe

Re: [DISCUSS] Flink client api enhancement for downstream project

2019-08-20 Thread Till Rohrmann
I would not be in favour of getting rid of the per-job mode since it
simplifies the process of running Flink jobs considerably. Moreover, it is
not only well suited for container deployments but also for deployments
where you want to guarantee job isolation. For example, a user could use
the per-job mode on Yarn to execute his job on a separate cluster.

I think that having two notions of cluster deployments (session vs. per-job
mode) does not necessarily contradict your ideas for the client API
refactoring. For example, one could have the following interfaces:

- ClusterDeploymentDescriptor: encapsulates the logic for deploying a
cluster.
- ClusterClient: allows to interact with a cluster
- JobClient: allows to interact with a running job

Now the ClusterDeploymentDescriptor could have two methods:

- ClusterClient deploySessionCluster()
- JobClusterClient/JobClient deployPerJobCluster(JobGraph)

where JobClusterClient is either a supertype of ClusterClient that does
not give you the functionality to submit jobs, or deployPerJobCluster
directly returns a JobClient.

When setting up the ExecutionEnvironment, one would then not provide a
ClusterClient to submit jobs but a JobDeployer which, depending on the
selected mode, either uses a ClusterClient (session mode) to submit jobs or
a ClusterDeploymentDescriptor to deploy a per-job-mode cluster with the job
to execute.
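To make that concrete, a rough sketch of how these pieces could fit together
(the exact signatures, and the JobDeployer name, are illustrative assumptions
from this thread, not Flink's actual API; JobGraph and JobResult stand in for
the corresponding Flink runtime types):

    import java.util.concurrent.CompletableFuture;

    interface ClusterDeploymentDescriptor {
        // Deploys a long-running session cluster that jobs can be submitted to.
        ClusterClient deploySessionCluster();

        // Deploys a dedicated cluster that runs only the given job.
        JobClient deployPerJobCluster(JobGraph jobGraph);
    }

    interface ClusterClient {
        // Submits a job to an already running (session) cluster.
        JobClient submitJob(JobGraph jobGraph);
    }

    interface JobClient {
        // Interaction with a single running job.
        CompletableFuture<JobResult> requestJobResult();
    }

    // The environment talks to a JobDeployer that hides the mode:
    // session mode -> ClusterClient.submitJob(...)
    // per-job mode -> ClusterDeploymentDescriptor.deployPerJobCluster(...)
    interface JobDeployer {
        JobClient deploy(JobGraph jobGraph);
    }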

These are just some thoughts on how one could make it work, because I
believe there is some value in using the per-job mode from the
ExecutionEnvironment.

Concerning the web submission, this is indeed a bit tricky. From a cluster
management standpoint, I would be in favour of not executing user code on the
REST endpoint. Especially when considering security, it would be good to
have a well-defined cluster behaviour where it is explicitly stated where
user code and, thus, potentially risky code is executed. Ideally we limit
it to the TaskExecutor and JobMaster.

Cheers,
Till

On Tue, Aug 20, 2019 at 9:40 AM Flavio Pompermaier 
wrote:

> In my opinion the client should not use any environment to get the Job
> graph because the jar should reside ONLY on the cluster (and not in the
> client classpath otherwise there are always inconsistencies between client
> and Flink Job manager's classpath).
> In the YARN, Mesos and Kubernetes scenarios you have the jar but you could
> start a cluster that has the jar on the Job Manager as well (but this is
> the only case where I think you can assume that the client has the jar on
> the classpath..in the REST job submission you don't have any classpath).
>
> Thus, always in my opinion, the JobGraph should be generated by the Job
> Manager REST API.
>
>
> On Tue, Aug 20, 2019 at 9:00 AM Zili Chen  wrote:
>
>> I would like to involve Till & Stephan here to clarify some concepts of
>> per-job mode.
>>
>> The term per-job refers to one of the modes a cluster can run in. It is
>> mainly aimed at spawning a dedicated cluster for a specific job; the job
>> can be packaged with Flink itself so that the cluster is initialized with
>> the job, which gets rid of a separate submission step.
>>
>> This is useful for container deployments where one creates an image with
>> the job and then simply deploys the container.
>>
>> However, it is out of client scope, since a client (ClusterClient, for
>> example) is for communicating with an existing cluster and performing
>> actions on it. Currently, in per-job mode, we extract the job graph and
>> bundle it into the cluster deployment, so no concept of a client gets
>> involved. It looks reasonable to exclude the deployment of per-job
>> clusters from the client API and use dedicated utility classes
>> (deployers) for deployment.
>>
>> Zili Chen wrote on Tue, Aug 20, 2019 at 12:37 PM:
>>
>> > Hi Aljoscha,
>> >
>> > Thanks for your reply and participation. The Google Doc you linked to
>> > requires permission; I think you could use a sharing link instead.
>> >
>> > I agree that we have almost reached a consensus that JobClient is
>> > necessary to interact with a running Job.
>> >
>> > Let me address your open questions one by one.
>> >
>> > 1. Separate cluster creation and job submission for per-job mode.
>> >
>> > As you mentioned, here is where the opinions diverge. In my document
>> > there is an alternative [2] that proposes excluding per-job deployment
>> > from the client API scope, and now I find it more reasonable to do that
>> > exclusion.
>> >
>> > When in per-job mode, a dedicated JobCluster is launched to execute the
>> > specific job. It is more like a Flink application than a submission of
>> > a Flink job. The client only takes care of job submission and assumes
>> > there is an existing cluster. In this way we are able to consider
>> > per-job issues individually, and JobClusterEntrypoint would be the
>> > utility class for per-job deployment.
>> >
>> > Nevertheless, user program works in both session mode and per-job mode
>> > without
>> > necessary to chan

Re: [DISCUSS][CODE STYLE] Usage of Java Optional

2019-08-20 Thread Yu Li
Thanks for the summary, Andrey!

I'd also like to adjust my -1 to +0 on using Optional as a parameter for
private methods, due to the existence of the very first rule - "Avoid using
Optional in any performance critical code". I'd regard the "possible GC
burden when using Optional as a parameter" as another performance-related
factor.

And besides the code convention itself, I believe it's even more important
to make contributors understand the reasoning behind it.

Thanks.

Best Regards,
Yu


On Tue, 20 Aug 2019 at 10:14, Stephan Ewen  wrote:

> I think Dawid raised a very good point here.
> One of the outcomes should be that we are consistent in our recommendations
> and requests during PR reviews. Otherwise we'll just confuse contributors.
>
> So I would be
>   +1 for someone to use Optional in a private method if they believe it is
> helpful
>   -1 to push contributors during reviews to do that
>
>
> On Tue, Aug 20, 2019 at 9:42 AM Dawid Wysakowicz 
> wrote:
>
> > Hi Andrey,
> >
> > Just wanted to quickly elaborate on my opinion. I wouldn't say I am -1,
> > just -0 for the Optionals in private methods. I am ok with not
> > forbidding them there. I just think in all cases there is a better
> > solution than passing the Optionals around, even in private methods. I
> > just hope the outcome of the discussion won't be that it is no longer
> > allowed to suggest those during review.
> >
> > Best,
> >
> > Dawid
> >
> > On 19/08/2019 17:53, Andrey Zagrebin wrote:
> > > Hi all,
> > >
> > > Sorry for not getting back to this discussion for some time.
> > > It looks like in general we agree on the initially suggested points:
> > >
> > >- Always use Optional only to return nullable values in the
> API/public
> > >methods
> > >   - Only if you can prove that Optional usage would lead to a
> > >   performance degradation in critical code then return nullable
> value
> > >   directly and annotate it with @Nullable
> > >- Passing an Optional argument to a method can be allowed if it is
> > >within a private helper method and simplifies the code
> > >- Optional should not be used for class fields
> > >
> > > The first point can also be elaborated by explicitly forbidding
> > > Optional/Nullable parameters in public methods.
> > > In general we can always avoid Optional parameters by either
> overloading
> > > the method or using a pojo with a builder to pass a set of parameters.
> > >
> > > The third point does not prevent from using @Nullable/@Nonnull for
> class
> > > fields.
> > > If we agree to not use Optional for fields then not sure I see any use
> > case
> > > for SerializableOptional (please comment on that if you have more
> > details).
> > >
> > > @Jingsong Lee
> > > Using Optional in Maps.
> > > I can see this as a possible use case.
> > > I would leave this decision to the developer/reviewer to reason about
> it
> > > and keep the scope of this discussion to the variables/parameters as it
> > > seems to be the biggest point of friction atm.
> > >
> > > Finally, I see a split regarding the second point: <passing an Optional
> > > argument to a method can be allowed if it is within a private helper
> > > method and simplifies the code>.
> > > There are people who have a strong opinion against allowing it.
> > > Let's vote then for whether to allow it or not.
> > > So far as I see we have the following votes (correct me if wrong and
> add
> > > more if you want):
> > > Piotr    +1
> > > Biao     +1
> > > Timo     -1
> > > Yu       -1
> > > Aljoscha -1
> > > Till     +1
> > > Igal     +1
> > > Dawid    -1
> > > Me       +1
> > >
> > > So far: +5 / -4 (Optional arguments in private methods)
> > >
> > > Best,
> > > Andrey
> > >
> > >
> > > On Tue, Aug 6, 2019 at 8:53 AM Piotr Nowojski 
> > wrote:
> > >
> > >> Hi Qi,
> > >>
> > >>> For example, SingleInputGate is already creating Optional for every
> > >> BufferOrEvent in getNextBufferOrEvent(). How much performance gain
> > would we
> > >> get if it’s replaced by null check?
> > >>
> > >> When I was introducing it there, I have micro-benchmarked this change,
> > and
> > >> there was no visible throughput change in a pure network only micro
> > >> benchmark (with whole Flink running around it any changes would be
> even
> > >> less visible).
> > >>
> > >> Piotrek
> > >>
> > >>> On 5 Aug 2019, at 15:20, Till Rohrmann  wrote:
> > >>>
> > >>> I'd be in favour of
> > >>>
> > >>> - Optional for method return values if not performance critical
> > >>> - Optional can be used for internal methods if it makes sense
> > >>> - No optional fields
> > >>>
> > >>> Cheers,
> > >>> Till
> > >>>
> > >>> On Mon, Aug 5, 2019 at 12:07 PM Aljoscha Krettek <
> aljos...@apache.org>
> > >>> wrote:
> > >>>
> >  Hi,
> > 
> >  I’m also in favour of using Optional only for method return values.
> I
> >  could be persuaded to allow them for parameters of internal methods
> > but
> > >> I’m
> >  sceptical about that.
> > 
> >  Aljoscha
> > >>>

Re: [DISCUSS] Flink client api enhancement for downstream project

2019-08-20 Thread Stephan Ewen
Till has made some good comments here.

Two things to add:

  - The per-job mode is very nice in the way that it runs the client inside
the cluster (in the same image/process as the JM) and thus unifies both
applications and what the Spark world calls the "driver mode".

  - Another thing I would add is that during the FLIP-6 design, we were
thinking about setups where Dispatcher and JobManager are separate
processes.
    A Yarn or Mesos Dispatcher of a session could run independently (even
as a privileged process executing no user code).
    Then the "per-job" mode could still be helpful: when a job is
submitted to the dispatcher, it launches the JM again in per-job mode, so
that JM and TM processes are bound to the job only. For higher-security
setups, it is important that processes are not reused across jobs.

On Tue, Aug 20, 2019 at 10:27 AM Till Rohrmann  wrote:

> I would not be in favour of getting rid of the per-job mode since it
> simplifies the process of running Flink jobs considerably. Moreover, it is
> not only well suited for container deployments but also for deployments
> where you want to guarantee job isolation. For example, a user could use
> the per-job mode on Yarn to execute his job on a separate cluster.
>
> I think that having two notions of cluster deployments (session vs. per-job
> mode) does not necessarily contradict your ideas for the client api
> refactoring. For example one could have the following interfaces:
>
> - ClusterDeploymentDescriptor: encapsulates the logic how to deploy a
> cluster.
> - ClusterClient: allows to interact with a cluster
> - JobClient: allows to interact with a running job
>
> Now the ClusterDeploymentDescriptor could have two methods:
>
> - ClusterClient deploySessionCluster()
> - JobClusterClient/JobClient deployPerJobCluster(JobGraph)
>
> where JobClusterClient is either a supertype of ClusterClient which does
> not give you the functionality to submit jobs or deployPerJobCluster
> returns directly a JobClient.
>
> When setting up the ExecutionEnvironment, one would then not provide a
> ClusterClient to submit jobs but a JobDeployer which, depending on the
> selected mode, either uses a ClusterClient (session mode) to submit jobs or
> a ClusterDeploymentDescriptor to deploy a per-job-mode cluster with the job
> to execute.
>
> These are just some thoughts how one could make it working because I
> believe there is some value in using the per job mode from the
> ExecutionEnvironment.
>
> Concerning the web submission, this is indeed a bit tricky. From a cluster
> management standpoint, I would be in favour of not executing user code on the
> REST endpoint. Especially when considering security, it would be good to
> have a well defined cluster behaviour where it is explicitly stated where
> user code and, thus, potentially risky code is executed. Ideally we limit
> it to the TaskExecutor and JobMaster.
>
> Cheers,
> Till
>
> On Tue, Aug 20, 2019 at 9:40 AM Flavio Pompermaier 
> wrote:
>
> > In my opinion the client should not use any environment to get the Job
> > graph because the jar should reside ONLY on the cluster (and not in the
> > client classpath otherwise there are always inconsistencies between
> client
> > and Flink Job manager's classpath).
> > In the YARN, Mesos and Kubernetes scenarios you have the jar but you
> could
> > start a cluster that has the jar on the Job Manager as well (but this is
> > the only case where I think you can assume that the client has the jar on
> > the classpath..in the REST job submission you don't have any classpath).
> >
> > Thus, always in my opinion, the JobGraph should be generated by the Job
> > Manager REST API.
> >
> >
> > On Tue, Aug 20, 2019 at 9:00 AM Zili Chen  wrote:
> >
> >> I would like to involve Till & Stephan here to clarify some concepts
> >> of per-job mode.
> >>
> >> The term per-job refers to one of the modes a cluster can run in. It is
> >> mainly aimed at spawning a dedicated cluster for a specific job; the
> >> job can be packaged with Flink itself so that the cluster is
> >> initialized with the job, which gets rid of a separate submission step.
> >>
> >> This is useful for container deployments where one creates an image
> >> with the job and then simply deploys the container.
> >>
> >> However, it is out of client scope, since a client (ClusterClient, for
> >> example) is for communicating with an existing cluster and performing
> >> actions on it. Currently, in per-job mode, we extract the job graph and
> >> bundle it into the cluster deployment, so no concept of a client gets
> >> involved. It looks reasonable to exclude the deployment of per-job
> >> clusters from the client API and use dedicated utility classes
> >> (deployers) for deployment.
> >>
> >> Zili Chen wrote on Tue, Aug 20, 2019 at 12:37 PM:
> >>
> >> > Hi Aljoscha,
> >> >
> >> > Thanks for your reply and participation. The Google Doc you linked to
> >> > requires permission and I think you could

Re: [DISCUSS] Upgrade kinesis connector to Apache 2.0 License and include it in official release

2019-08-20 Thread Stephan Ewen
Just FYI - Becket, Aljoscha, and I are working on fleshing out the
remaining details of FLIP-27 (new source API).
We will share this as soon as we have made some progress on some of the
details.

The Kinesis connector would be one of the first that we would try to also
implement in that new API, as a validation that it is powerful and flexible
enough.

If the upgrade involves major refactoring, would it make sense to combine
these efforts?

Best,
Stephan


On Tue, Aug 20, 2019 at 9:12 AM Dyana Rose  wrote:

> Ahh, brilliant, I had myself on notifications for the streams adapter
> releases, but must have missed it. That's great news.
>
> I've got the branch prepped for moving over to Apache 2.0, but staying on
> KCL 1.x, which requires the least amount of change.
>
> Considering the large amount of change required to update to KCL/SDK 2.x I
> would recommend that be done in a parallel task. Making both connectors
> available then for usage, 1.x and 2.x. If that makes sense.
>
> The branch I push will have the English Language documents updated, but not
> have the Chinese Language documents updated. Is there a process for this?
>
> Thanks,
> Dyana
>
> On Mon, 19 Aug 2019 at 19:08, Bowen Li  wrote:
>
> > Hi all,
> >
> > A while back we discussed upgrading flink-connector-kinesis module to
> > Apache 2.0 license so that its jar can be published as part of official
> > Flink releases. Given we have a large user base using Flink with
> > kinesis/dynamodb streams, it'll free users from building and maintaining
> > the module themselves, and improve user and developer experience. A
> ticket
> > was created [1] but has been idle mainly due to new releases of some aws
> > libs are not available yet then.
> >
> > As of today I see that all flink-connector-kinesis's aws dependencies
> have
> > been updated to Apache 2.0 license and are released. They include:
> >
> > - aws-java-sdk-kinesis
> > - aws-java-sdk-sts
> > - amazon-kinesis-client
> > - amazon-kinesis-producer (Apache 2.0 from 0.13.1, released 18 days ago)
> > [2]
> > - dynamodb-streams-kinesis-adapter (Apache 2.0 from 1.5.0, released 7
> days
> > ago) [3]
> >
> > Therefore, I'd suggest we kick off the initiative and aim for release
> 1.10
> > which is roughly 3 months away, leaving us plenty of time to finish.
> > According to @Dyana 's comment in the ticket [1], seems some large chunks
> > of changes need to be made into multiple parts than simply upgrading lib
> > versions, so we can further break the JIRA down into sub-tasks to limit
> > scope of each change for easier code review.
> >
> > @Dyana would you still be interested in carrying the responsibility and
> > forwarding the effort?
> >
> > Thanks,
> > Bowen
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-12847
> > [2] https://github.com/awslabs/amazon-kinesis-producer/releases
> > [3] https://github.com/awslabs/dynamodb-streams-kinesis-adapter/releases
> >
> >
> >
>
> --
>
> Dyana Rose
> Software Engineer
>
>
> W: www.salecycle.com 
>


Re: [VOTE] Flink Project Bylaws

2019-08-20 Thread Stephan Ewen
I see it somewhat similar to Henry.

Generally, all committers should go for a review by another committer,
unless it is a trivial comment or style fix. I personally do that, even
though being one of the committers that have been with the project longest.

For now, I was hoping though that we have a mature enough community that
this "soft rule" is enough. Whenever possible, working based on trust with
soft processes beats working with hard processes. We can still revisit this
in case we see that it does not work out.


On Mon, Aug 19, 2019 at 10:21 PM Henry Saputra 
wrote:

> One of the perks of being a committer is being able to commit code without
> asking another committer. Having said that, I think we rely on
> maturity of the committers to know when to ask for reviews and when to
> commit directly.
>
> For example, if someone just change typos on comments or simple rename of
> internal variables, I think we could trust the committer to safely commit
> the changes. When the changes alter existing code flows or introduce new
> ones, that's when reviews are needed and strongly encouraged.
> I think the balance is needed for this.
>
> PMCs have the ability and right to revert changes in source repo as
> necessary.
>
> - Henry
>
> On Sun, Aug 18, 2019 at 9:23 PM Thomas Weise  wrote:
>
> > +0 (binding)
> >
> > I don't think committers should be allowed to approve their own changes.
> I
> > would prefer if non-committer contributors can approve committer PRs as
> > that would encourage more participation in code review and ability to
> > contribute.
> >
> >
> > On Fri, Aug 16, 2019 at 9:02 PM Shaoxuan Wang 
> wrote:
> >
> > > +1 (binding)
> > >
> > > On Fri, Aug 16, 2019 at 7:48 PM Chesnay Schepler 
> > > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > Although I think it would be a good idea to always cc
> > > > priv...@flink.apache.org when modifying bylaws, if anything to speed
> > up
> > > > the voting process.
> > > >
> > > > On 16/08/2019 11:26, Ufuk Celebi wrote:
> > > > > +1 (binding)
> > > > >
> > > > > – Ufuk
> > > > >
> > > > >
> > > > > On Wed, Aug 14, 2019 at 4:50 AM Biao Liu 
> wrote:
> > > > >
> > > > >> +1 (non-binding)
> > > > >>
> > > > >> Thanks for pushing this!
> > > > >>
> > > > >> Thanks,
> > > > >> Biao /'bɪ.aʊ/
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Wed, 14 Aug 2019 at 09:37, Jark Wu  wrote:
> > > > >>
> > > > >>> +1 (non-binding)
> > > > >>>
> > > > >>> Best,
> > > > >>> Jark
> > > > >>>
> > > > >>> On Wed, 14 Aug 2019 at 09:22, Kurt Young 
> wrote:
> > > > >>>
> > > >  +1 (binding)
> > > > 
> > > >  Best,
> > > >  Kurt
> > > > 
> > > > 
> > > >  On Wed, Aug 14, 2019 at 1:34 AM Yun Tang 
> > wrote:
> > > > 
> > > > > +1 (non-binding)
> > > > >
> > > > > But I have a minor question about "code change" action, for
> those
> > > > > "[hotfix]" github pull requests [1], the dev mailing list would
> > not
> > > > >> be
> > > > > notified currently. I think we should change the description of
> > > this
> > > >  action.
> > > > >
> > > > > [1]
> > > > >
> > > > >>
> > > >
> > >
> >
> https://flink.apache.org/contributing/contribute-code.html#code-contribution-process
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > From: JingsongLee 
> > > > > Sent: Tuesday, August 13, 2019 23:56
> > > > > To: dev 
> > > > > Subject: Re: [VOTE] Flink Project Bylaws
> > > > >
> > > > > +1 (non-binding)
> > > > > Thanks Becket.
> > > > > I've learned a lot from current bylaws.
> > > > >
> > > > > Best,
> > > > > Jingsong Lee
> > > > >
> > > > >
> > > > >
> > --
> > > > > From:Yu Li 
> > > > > Send Time: Tuesday, Aug 13, 2019 17:48
> > > > > To:dev 
> > > > > Subject:Re: [VOTE] Flink Project Bylaws
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Thanks for the efforts Becket!
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > > >
> > > > > On Tue, 13 Aug 2019 at 16:09, Xintong Song <
> > tonysong...@gmail.com>
> > > >  wrote:
> > > > >> +1 (non-binding)
> > > > >>
> > > > >> Thank you~
> > > > >>
> > > > >> Xintong Song
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Tue, Aug 13, 2019 at 1:48 PM Robert Metzger <
> > > > >> rmetz...@apache.org>
> > > > >> wrote:
> > > > >>
> > > > >>> +1 (binding)
> > > > >>>
> > > > >>> On Tue, Aug 13, 2019 at 1:47 PM Becket Qin <
> > becket@gmail.com
> > > > > wrote:
> > > >  Thanks everyone for voting.
> > > > 
> > > >  For those who have already voted, just want to bring this up
> > to
> > > >  your
> > > >  attention that there is a minor clarification to the bylaws
> > > > >> wiki
> > > >  this
> > > >  morning. The change is in

Re: flink release-1.8.0 Flink-avro unit tests failed

2019-08-20 Thread Stephan Ewen
Thanks, it looks like you diagnosed it correctly: environment-specific
encoding settings.

Could you open a ticket (maybe a PR) to set the encoding and make the test
stable across environments?

On Mon, Aug 19, 2019 at 9:46 PM Ethan Li  wrote:

> It’s probably the encoding problem. The environment I ran the unit tests
> on uses ANSI_X3.4-1968
>
> It looks like we have to use "en_US.UTF-8"
>
>
> > On Aug 19, 2019, at 1:44 PM, Ethan Li  wrote:
> >
> > Hello,
> >
> > Not sure if anyone encountered this issue before.  I tried to run “mvn
> clean install” on flink release-1.8, but some unit tests in the flink-avro
> module failed:
> >
> >
> > [ERROR] Tests run: 12, Failures: 4, Errors: 0, Skipped: 0, Time elapsed:
> 4.81 s <<< FAILURE! - in
> org.apache.flink.formats.avro.typeutils.AvroTypeExtractionTest
> > [ERROR] testSimpleAvroRead[Execution mode =
> CLUSTER](org.apache.flink.formats.avro.typeutils.AvroTypeExtractionTest)
> Time elapsed: 0.438 s  <<< FAILURE!
> > java.lang.AssertionError:
> > Different elements in arrays: expected 2 elements and received 2
> > files: [/tmp/junit5386344396421857812/junit6023978980792200274.tmp/4,
> /tmp/junit5386344396421857812/junit6023978980792200274.tmp/2,
> /tmp/junit5386344396421857812/junit6023978980792200274.tmp/1,
> /tmp/junit5386344396421857812/junit6023978980792200274.tmp/3]
> >  expected: [{"name": "Alyssa", "favorite_number": 256, "favorite_color":
> null, "type_long_test": null, "type_double_test": 123.45, "type_null_test":
> null, "type_bool_test": true, "type_array_string": ["ELEMENT 1", "ELEMENT
> 2"], "type_array_boolean": [true, false], "type_nullable_array": null,
> "type_enum": "GREEN", "type_map": {"KEY 2": 17554, "KEY 1": 8546456},
> "type_fixed": null, "type_union": null, "type_nested": {"num": 239,
> "street": "Baker Street", "city": "London", "state": "London", "zip": "NW1
> 6XE"}, "type_bytes": {"bytes":
> "\u\u\u\u\u\u\u\u\u\u"},
> "type_date": 2014-03-01, "type_time_millis": 12:12:12.000,
> "type_time_micros": 123456, "type_timestamp_millis":
> 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 123456,
> "type_decimal_bytes": {"bytes": "\u0007?"}, "type_decimal_fixed": [7,
> -48]}, {"name": "Charlie", "favorite_number": null, "favorite_color":
> "blue", "type_long_test": 1337, "type_double_test": 1.337,
> "type_null_test": null, "type_bool_test": false, "type_array_string": [],
> "type_array_boolean": [], "type_nullable_array": null, "type_enum": "RED",
> "type_map": {}, "type_fixed": null, "type_union": null, "type_nested":
> {"num": 239, "street": "Baker Street", "city": "London", "state": "London",
> "zip": "NW1 6XE"}, "type_bytes": {"bytes":
> "\u\u\u\u\u\u\u\u\u\u"},
> "type_date": 2014-03-01, "type_time_millis": 12:12:12.000,
> "type_time_micros": 123456, "type_timestamp_millis":
> 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 123456,
> "type_decimal_bytes": {"bytes": "\u0007?"}, "type_decimal_fixed": [7, -48]}]
> >  received: [{"name": "Alyssa", "favorite_number": 256, "favorite_color":
> null, "type_long_test": null, "type_double_test": 123.45, "type_null_test":
> null, "type_bool_test": true, "type_array_string": ["ELEMENT 1", "ELEMENT
> 2"], "type_array_boolean": [true, false], "type_nullable_array": null,
> "type_enum": "GREEN", "type_map": {"KEY 2": 17554, "KEY 1": 8546456},
> "type_fixed": null, "type_union": null, "type_nested": {"num": 239,
> "street": "Baker Street", "city": "London", "state": "London", "zip": "NW1
> 6XE"}, "type_bytes": {"bytes":
> "\u\u\u\u\u\u\u\u\u\u"},
> "type_date": 2014-03-01, "type_time_millis": 12:12:12.000,
> "type_time_micros": 123456, "type_timestamp_millis":
> 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 123456,
> "type_decimal_bytes": {"bytes": "\u0007??"}, "type_decimal_fixed": [7,
> -48]}, {"name": "Charlie", "favorite_number": null, "favorite_color":
> "blue", "type_long_test": 1337, "type_double_test": 1.337,
> "type_null_test": null, "type_bool_test": false, "type_array_string": [],
> "type_array_boolean": [], "type_nullable_array": null, "type_enum": "RED",
> "type_map": {}, "type_fixed": null, "type_union": null, "type_nested":
> {"num": 239, "street": "Baker Street", "city": "London", "state": "London",
> "zip": "NW1 6XE"}, "type_bytes": {"bytes":
> "\u\u\u\u\u\u\u\u\u\u"},
> "type_date": 2014-03-01, "type_time_millis": 12:12:12.000,
> "type_time_micros": 123456, "type_timestamp_millis":
> 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 123456,
> "type_decimal_bytes": {"bytes": "\u0007??"}, "type_decimal_fixed": [7,
> -48]}]
> >   at
> org.apache.flink.formats.avro.typeutils.AvroTypeExtractionTest.after(AvroTypeExtractionTest.java:76)
> >
> >
> >
> > Comparing “expected” with “received”, there is indeed a question-mark
> > difference.
> >
> > For example, in “expected’, it’s
> >
> > "type_decimal_bytes": {"bytes": "\u0007?”}

Re: [VOTE] Flink Project Bylaws

2019-08-20 Thread Becket Qin
Thanks for sharing your thoughts, Thomas, Henry and Stephan. I also think
the committers are supposed to be mature enough to know when a review on
their own patch is needed.

@Henry, just want to confirm, are you +1 on the proposed bylaws?

Thanks,

Jiangjie (Becket) Qin

On Tue, Aug 20, 2019 at 10:54 AM Stephan Ewen  wrote:

> I see it somewhat similar to Henry.
>
> Generally, all committers should go for a review by another committer,
> unless it is a trivial comment or style fix. I personally do that, even
> though being one of the committers that have been with the project longest.
>
> For now, I was hoping though that we have a mature enough community that
> this "soft rule" is enough. Whenever possible, working based on trust with
> soft processes beats working with hard processes. We can still revisit this
> in case we see that it does not work out.
>
>
> On Mon, Aug 19, 2019 at 10:21 PM Henry Saputra 
> wrote:
>
> > One of the perks of being a committer is being able to commit code without
> > asking another committer. Having said that, I think we rely on
> > maturity of the committers to know when to ask for reviews and when to
> > commit directly.
> >
> > For example, if someone just change typos on comments or simple rename of
> > internal variables, I think we could trust the committer to safely commit
> > the changes. When the changes alter existing code flows or introduce new
> > ones, that's when reviews are needed and strongly encouraged.
> > I think the balance is needed for this.
> >
> > PMCs have the ability and right to revert changes in source repo as
> > necessary.
> >
> > - Henry
> >
> > On Sun, Aug 18, 2019 at 9:23 PM Thomas Weise  wrote:
> >
> > > +0 (binding)
> > >
> > > I don't think committers should be allowed to approve their own
> changes.
> > I
> > > would prefer if non-committer contributors can approve committer PRs as
> > > that would encourage more participation in code review and ability to
> > > contribute.
> > >
> > >
> > > On Fri, Aug 16, 2019 at 9:02 PM Shaoxuan Wang 
> > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > On Fri, Aug 16, 2019 at 7:48 PM Chesnay Schepler  >
> > > > wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > Although I think it would be a good idea to always cc
> > > > > priv...@flink.apache.org when modifying bylaws, if anything to
> speed
> > > up
> > > > > the voting process.
> > > > >
> > > > > On 16/08/2019 11:26, Ufuk Celebi wrote:
> > > > > > +1 (binding)
> > > > > >
> > > > > > – Ufuk
> > > > > >
> > > > > >
> > > > > > On Wed, Aug 14, 2019 at 4:50 AM Biao Liu 
> > wrote:
> > > > > >
> > > > > >> +1 (non-binding)
> > > > > >>
> > > > > >> Thanks for pushing this!
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Biao /'bɪ.aʊ/
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> On Wed, 14 Aug 2019 at 09:37, Jark Wu  wrote:
> > > > > >>
> > > > > >>> +1 (non-binding)
> > > > > >>>
> > > > > >>> Best,
> > > > > >>> Jark
> > > > > >>>
> > > > > >>> On Wed, 14 Aug 2019 at 09:22, Kurt Young 
> > wrote:
> > > > > >>>
> > > > >  +1 (binding)
> > > > > 
> > > > >  Best,
> > > > >  Kurt
> > > > > 
> > > > > 
> > > > >  On Wed, Aug 14, 2019 at 1:34 AM Yun Tang 
> > > wrote:
> > > > > 
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > But I have a minor question about the "code change" action: for
> > > > > > those "[hotfix]" GitHub pull requests [1], the dev mailing list
> > > > > > would currently not be notified. I think we should change the
> > > > > > description of this action.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://flink.apache.org/contributing/contribute-code.html#code-contribution-process
> > > > > > Best
> > > > > > Yun Tang
> > > > > > 
> > > > > > From: JingsongLee 
> > > > > > Sent: Tuesday, August 13, 2019 23:56
> > > > > > To: dev 
> > > > > > Subject: Re: [VOTE] Flink Project Bylaws
> > > > > >
> > > > > > +1 (non-binding)
> > > > > > Thanks Becket.
> > > > > > I've learned a lot from current bylaws.
> > > > > >
> > > > > > Best,
> > > > > > Jingsong Lee
> > > > > >
> > > > > >
> > > > > >
> > > --
> > > > > > From:Yu Li 
> > > > > > Send Time: Tue, Aug 13, 2019 17:48
> > > > > > To:dev 
> > > > > > Subject:Re: [VOTE] Flink Project Bylaws
> > > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > Thanks for the efforts Becket!
> > > > > >
> > > > > > Best Regards,
> > > > > > Yu
> > > > > >
> > > > > >
> > > > > > On Tue, 13 Aug 2019 at 16:09, Xintong Song <
> > > tonysong...@gmail.com>
> > > > >  wrote:
> > > > > >> +1 (non-binding)
> > > > > >>
> > > > > >> Thank you~
> > > > > >>
> > > > > >> Xintong Song
> > > >

Re: [DISCUSS][CODE STYLE] Breaking long function argument lists and chained method calls

2019-08-20 Thread Yu Li
I second Stephan's summary, and to be more explicit, +1 on:
- Set a hard line length limit
- Allow arguments on the same line if below length limit
- With consistent argument breaking when that length is exceeded
- Developers can break before that if they feel it helps with readability
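
To make this concrete, a quick Java sketch (all class, method and argument
names here are made up, not from our code base):

// below the hard limit: arguments stay on one line
runner.submit(jobName, jobConfig);

// above the hard limit: break all arguments consistently, one per line
scheduler.scheduleTask(
    taskDescriptor,
    resourceProfile,
    allocationTimeout,
    failureHandler);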

FWIW, the HBase project also sets the line length limit to 100 [1] (personally
I don't have a strong preference here, so JFYI).

[1]
https://github.com/apache/hbase/blob/a59f7d4ffc27ea23b9822c3c26d6aeb76ccdf9aa/hbase-checkstyle/src/main/resources/hbase/checkstyle.xml#L128

Best Regards,
Yu


On Mon, 19 Aug 2019 at 18:22, Stephan Ewen  wrote:

> I personally prefer not to break lines with few parameters.
> It just feels unnecessarily clumsy to parse the breaks if there are only
> two or three arguments with short names.
>
> So +1
>   - for a hard line length limit
>   - allowing arguments on the same line if below that limit
>   - with consistent argument breaking when that length is exceeded
>   - developers can break before that if they feel it helps with
> readability.
>
> This should be similar to what we have, except for enforcing the line
> length limit.
>
> I think our Java guide originally suggested a 120-character line length; we
> can reduce that to 100 if a majority argues for it, but that is a separate
> discussion.
> We use shorter lines in Scala (100 chars) because Scala code becomes hard
> to read more quickly with long lines.
>
>
> On Mon, Aug 19, 2019 at 10:45 AM Andrey Zagrebin 
> wrote:
>
> > Hi Everybody,
> >
> > Thanks for your feedback guys and sorry for not getting back to the
> > discussion for some time.
> >
> > @SHI Xiaogang
> > About breaking lines for thrown exceptions:
> > Indeed that would prevent growing the throw clause indefinitely.
> > I am a bit concerned about putting the right parenthesis and/or throws
> > clause on the next line,
> > because in general we do not do it, and there are a lot of variations of
> > how and what to put on the next line, so it needs explicit memorising.
> > Also, we do not have many checked exceptions and usually avoid them.
> > I am not a big fan of many function arguments either, but this
> > seems to be a bigger problem in the code base.
> > I would be ok with not enforcing anything for exceptions atm.
> >
> > @Chesnay Schepler 
> > Thanks for mentioning automatic checks.
> > Indeed, pointing out this kind of style issue during PR reviews is very
> > tedious,
> > and we cannot really enforce it without automated tools.
> > I would still consider the outcome of this discussion a soft
> > recommendation atm (as we also have for some other things in the code
> > style draft).
> > We need more investigation into how to enforce things. I am not so
> > knowledgeable about code style/IDE checks.
> > At first glance I do not see a simple way. If somebody has more
> > insight, please share your experience.
> >
> > @Biao Liu 
> > Line length limitation:
> > I do not see anything for Java, only for Scala: 100 (also enforced by
> > the build AFAIK).
> > From what I heard, there has already been some discussion about a hard
> > limit for the line length.
> > Although quite a few people are in favour of it (including me) and it
> > seems to be a nice limitation,
> > there are some practical implications.
> > Historically, Flink did not have any code style checks, and huge chunks
> > of the code base would have to be reformatted, destroying the commit
> > history.
> > Another thing is the value for the limit. Nowadays we have wide screens
> > and do not often even need to scroll.
> > Nevertheless, we can kick off another discussion about the line length
> > limit and enforcing it.
> > Atm I see people adhering to a soft recommendation of 120 characters for
> > Java because it is usually a bit more verbose compared to Scala.
> >
> > *Question 1*:
> > I would be ok with always breaking the line if there is more than one
> > chained call.
> > There are a lot of places with only a single short call; I would not
> > break the line in that case.
> > If that is too confusing, I would be ok with sticking to the rule of
> > breaking either all or none.
> > Thanks for pointing this out explicitly: for chained method calls, the
> > new line should start with the dot.
> > I think it should also be part of the rule if enforced.
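> >
> > For illustration (a made-up pipeline, not from our code base), a broken
> > chain starts every new line with the dot:
> >
> > env.addSource(source)
> >     .map(parser)
> >     .keyBy(selector)
> >     .process(handler);
> >
> > while a single short call such as env.execute() stays on one line.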
> >
> > *Question 2:*
> > Should the indent of the new line be 1 tab or 2 tabs? (I assume it
> > matters only for function arguments.)
> > This is a good point which again probably deserves a separate thread.
> > We also had an internal discussion about it. I would also be in favour of
> > resolving it one way.
> > Atm we indeed have 2 ways in our code base, which again are soft
> > recommendations.
> > The problem is mostly with enforcing it automatically.
> > The approach with extra indentation also needs IDE setup; otherwise it is
> > annoying that after every function cut/paste, e.g. IDEA changes the
> > format to one indentation automatically and often people do not notice it.

Re: [VOTE] Apache Flink 1.9.0, release candidate #3

2019-08-20 Thread Tzu-Li (Gordon) Tai
+1

Legal checks:
- verified signatures and hashes
- New bundled Javascript dependencies for flink-runtime-web are correctly
reflected under licenses-binary and NOTICE file.
- locally built from source (Scala 2.12, without Hadoop)
- No missing artifacts in staging repo
- No binaries in source release

Functional checks:
- Quickstart working (both in IDE + job submission)
- Simple State Processor API program that performs offline key schema
migration (RocksDB backend). Generated savepoint is valid to restore from.
- All E2E tests pass locally
- Didn’t notice any issues with the new WebUI

Cheers,
Gordon

On Tue, Aug 20, 2019 at 3:53 AM Zili Chen  wrote:

> +1 (non-binding)
>
> - build from source: OK(8u212)
> - check local setup tutorial works as expected
>
> Best,
> tison.
>
>
> Yu Li wrote on Tue, Aug 20, 2019 at 8:24 AM:
>
> > +1 (non-binding)
> >
> > - checked release notes: OK
> > - checked sums and signatures: OK
> > - repository appears to contain all expected artifacts
> > - source release
> >  - contains no binaries: OK
> >  - contains no 1.9-SNAPSHOT references: OK
> >  - build from source: OK (8u102)
> > - binary release
> >  - no examples appear to be missing
> >  - started a cluster; WebUI reachable, example ran successfully
> > - checked README.md file and found nothing unexpected
> >
> > Best Regards,
> > Yu
> >
> >
> > On Tue, 20 Aug 2019 at 01:16, Tzu-Li (Gordon) Tai 
> > wrote:
> >
> > > Hi all,
> > >
> > > Release candidate #3 for Apache Flink 1.9.0 is now ready for your
> review.
> > >
> > > Please review and vote on release candidate #3 for version 1.9.0, as
> > > follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release and binary convenience releases to
> > be
> > > deployed to dist.apache.org [2], which are signed with the key with
> > > fingerprint 1C1E2394D3194E1944613488F320986D35C33D6A [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag “release-1.9.0-rc3” [5].
> > > * pull requests for the release note documentation [6] and announcement
> > > blog post [7].
> > >
> > > As proposed in the RC2 vote thread [8], for RC3 we are only
> > cherry-picking
> > > minimal specific changes on top of RC2 to be able to reasonably carry
> > over
> > > previous testing efforts and effectively require a shorter voting time.
> > >
> > > The only extra commits in this RC, compared to RC2, are the following:
> > > - c2d9aeac [FLINK-13231] [pubsub] Replace Max outstanding
> acknowledgement
> > > ids limit with a FlinkConnectorRateLimiter
> > > - d8941711 [FLINK-13699][table-api] Fix TableFactory doesn’t work with
> > DDL
> > > when containing TIMESTAMP/DATE/TIME types
> > > - 04e95278 [FLINK-13752] Only references necessary variables when
> > > bookkeeping result partitions on TM
> > >
> > > Due to the minimal set of changes, the vote for RC3 will be *open for
> > only
> > > 48 hours*.
> > > Please cast your votes before *Aug. 21st (Wed.) 2019, 17:00 CET*.
> > >
> > > It is adopted by majority approval, with at least 3 PMC affirmative
> > votes.
> > >
> > > Thanks,
> > > Gordon
> > >
> > > [1]
> > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
> > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.0-rc3/
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1236
> > > [5]
> > >
> > >
> >
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=refs/tags/release-1.9.0-rc3
> > > [6] https://github.com/apache/flink/pull/9438
> > > [7] https://github.com/apache/flink-web/pull/244
> > > [8]
> > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Apache-Flink-Release-1-9-0-release-candidate-2-tp31542p31933.html
> > >
> >
>


Re: [DISCUSS] Flink client api enhancement for downstream project

2019-08-20 Thread Zili Chen
Thanks for the clarification.

The idea of a JobDeployer came to my mind when I was puzzling over
how to execute per-job mode and session mode with the same user code
and framework codepath.

With the JobDeployer concept we are back to the statement that the
environment knows all configs for cluster deployment and job submission.
We configure, or generate from the configuration, a specific JobDeployer
in the environment, and then the code aligns on

*JobClient client = env.execute().get();*

which in session mode is returned by clusterClient.submitJob and in
per-job mode by clusterDescriptor.deployJobCluster.

One problem is that currently we directly run the ClusterEntrypoint
with the extracted job graph. Following the JobDeployer approach, we
had better align the entry point of per-job deployment at the
JobDeployer: users run their main method directly, or via a CLI (which
finally calls the main method), to deploy the job cluster.
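
To make the idea concrete, a rough sketch, partly anticipating Till's
proposal quoted below; every name and signature here is hypothetical:

interface JobDeployer {
    // session mode: submit through a ClusterClient to an existing cluster;
    // per-job mode: deploy a dedicated job cluster via a ClusterDescriptor
    CompletableFuture<JobClient> deploy(JobGraph jobGraph);
}

Either way the user program stays unchanged and still ends with
env.execute().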

Best,
tison.


Stephan Ewen wrote on Tue, Aug 20, 2019 at 4:40 PM:

> Till has made some good comments here.
>
> Two things to add:
>
>   - The job mode is very nice in the way that it runs the client inside the
> cluster (in the same image/process that is the JM) and thus unifies both
> applications and what the Spark world calls the "driver mode".
>
>   - Another thing I would add is that during the FLIP-6 design, we were
> thinking about setups where Dispatcher and JobManager are separate
> processes.
> A Yarn or Mesos Dispatcher of a session could run independently (even
> as privileged processes executing no code).
> Then the "per-job" mode could still be helpful: when a job is
> submitted to the dispatcher, it launches the JM again in per-job mode, so
> that JM and TM processes are bound to the job only. For higher security
> setups, it is important that processes are not reused across jobs.
>
> On Tue, Aug 20, 2019 at 10:27 AM Till Rohrmann 
> wrote:
>
> > I would not be in favour of getting rid of the per-job mode since it
> > simplifies the process of running Flink jobs considerably. Moreover, it
> is
> > not only well suited for container deployments but also for deployments
> > where you want to guarantee job isolation. For example, a user could use
> > the per-job mode on Yarn to execute his job on a separate cluster.
> >
> > I think that having two notions of cluster deployments (session vs.
> per-job
> > mode) does not necessarily contradict your ideas for the client api
> > refactoring. For example one could have the following interfaces:
> >
> > - ClusterDeploymentDescriptor: encapsulates the logic how to deploy a
> > cluster.
> > - ClusterClient: allows to interact with a cluster
> > - JobClient: allows to interact with a running job
> >
> > Now the ClusterDeploymentDescriptor could have two methods:
> >
> > - ClusterClient deploySessionCluster()
> > - JobClusterClient/JobClient deployPerJobCluster(JobGraph)
> >
> > where JobClusterClient is either a supertype of ClusterClient that does
> > not give you the functionality to submit jobs, or deployPerJobCluster
> > directly returns a JobClient.
> >
> > When setting up the ExecutionEnvironment, one would then not provide a
> > ClusterClient to submit jobs but a JobDeployer which, depending on the
> > selected mode, either uses a ClusterClient (session mode) to submit jobs
> > or a ClusterDeploymentDescriptor to deploy a per-job mode cluster with
> > the job to execute.
> >
> > These are just some thoughts on how one could make it work, because I
> > believe there is some value in using the per-job mode from the
> > ExecutionEnvironment.
> >
> > Concerning the web submission, this is indeed a bit tricky. From a
> > cluster management standpoint, I would be in favour of not executing
> > user code on the
> > REST endpoint. Especially when considering security, it would be good to
> > have a well defined cluster behaviour where it is explicitly stated where
> > user code and, thus, potentially risky code is executed. Ideally we limit
> > it to the TaskExecutor and JobMaster.
> >
> > Cheers,
> > Till
> >
> > On Tue, Aug 20, 2019 at 9:40 AM Flavio Pompermaier  >
> > wrote:
> >
> > > In my opinion the client should not use any environment to get the job
> > > graph, because the jar should reside ONLY on the cluster (and not on the
> > > client classpath; otherwise there are always inconsistencies between the
> > > client's and the Flink JobManager's classpath).
> > > In the YARN, Mesos and Kubernetes scenarios you have the jar, and you
> > > could start a cluster that has the jar on the JobManager as well (but
> > > this is the only case where I think you can assume that the client has
> > > the jar on the classpath; in the REST job submission you don't have any
> > > classpath).
> > >
> > > Thus, always in my opinion, the JobGraph should be generated by the Job
> > > Manager REST API.
> > >
> > >
> > > On Tue, Aug 20, 2019 at 9:00 AM Zili Chen 
> wrote:
> > >
> > >> I would like to involve Till & Stephan here to clarify some concept of
> > >> per-job mode.
> > >>
> > >> The term 

[jira] [Created] (FLINK-13794) Remove unused field printStatusDuringExecution in ClusterClient

2019-08-20 Thread TisonKun (Jira)
TisonKun created FLINK-13794:


 Summary: Remove unused field printStatusDuringExecution in 
ClusterClient
 Key: FLINK-13794
 URL: https://issues.apache.org/jira/browse/FLINK-13794
 Project: Flink
  Issue Type: Improvement
  Components: Command Line Client
Reporter: TisonKun
 Fix For: 1.10.0


{{printStatusDuringExecution}} in {{ClusterClient}} is only set but never used. 
It is itself a legacy field. Remove it as cleanup.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Upgrade kinesis connector to Apache 2.0 License and include it in official release

2019-08-20 Thread Becket Qin
I agree with Stephan. It will be good to see if we can align those two
efforts so that we don't write code that is soon to be refactored again.

Thanks,

Jiangjie (Becket) Qin

On Tue, Aug 20, 2019 at 10:50 AM Stephan Ewen  wrote:

> Just FYI - Becket, Aljoscha, and me are working on fleshing out the
> remaining details of FLIP-27 (new source API).
> We will share this as soon as we have made some progress on some of the
> details.
>
> The Kinesis connector would be one of the first that we would try to also
> implement in that new API, as a validation that it is powerful and flexible
> enough.
>
> If the upgrade involved major refactoring, would it make sense to combine
> these efforts?
>
> Best,
> Stephan
>
>
> On Tue, Aug 20, 2019 at 9:12 AM Dyana Rose 
> wrote:
>
> > Ahh, brilliant, I had myself on notifications for the streams adapter
> > releases, but must have missed it. That's great news.
> >
> > I've got the branch prepped for moving over to Apache 2.0, but staying on
> > KCL 1.x, which requires the least amount of change.
> >
> > Considering the large amount of change required to update to KCL/SDK 2.x,
> > I would recommend that be done as a parallel task, making both connectors,
> > 1.x and 2.x, available for usage, if that makes sense.
> >
> > The branch I push will have the English-language documents updated, but
> > not the Chinese-language documents. Is there a process for this?
> >
> > Thanks,
> > Dyana
> >
> > On Mon, 19 Aug 2019 at 19:08, Bowen Li  wrote:
> >
> > > Hi all,
> > >
> > > A while back we discussed upgrading flink-connector-kinesis module to
> > > Apache 2.0 license so that its jar can be published as part of official
> > > Flink releases. Given we have a large user base using Flink with
> > > kinesis/dynamodb streams, it'll free users from building and
> maintaining
> > > the module themselves, and improve user and developer experience. A
> > ticket
> > > was created [1] but has been idle mainly due to new releases of some
> aws
> > > libs are not available yet then.
> > >
> > > As of today I see that all flink-connector-kinesis's aws dependencies
> > have
> > > been updated to Apache 2.0 license and are released. They include:
> > >
> > > - aws-java-sdk-kinesis
> > > - aws-java-sdk-sts
> > > - amazon-kinesis-client
> > > - amazon-kinesis-producer (Apache 2.0 from 0.13.1, released 18 days
> ago)
> > > [2]
> > > - dynamodb-streams-kinesis-adapter (Apache 2.0 from 1.5.0, released 7
> > days
> > > ago) [3]
> > >
> > > Therefore, I'd suggest we kick off the initiative and aim for release
> > 1.10
> > > which is roughly 3 months away, leaving us plenty of time to finish.
> > > According to @Dyana 's comment in the ticket [1], it seems some large
> > > chunks of changes need to be split into multiple parts rather than
> > > simply upgrading lib versions, so we can further break the JIRA down
> > > into sub-tasks to limit the scope of each change for easier code review.
> > >
> > > @Dyana would you still be interested in carrying the responsibility and
> > > forwarding the effort?
> > >
> > > Thanks,
> > > Bowen
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-12847
> > > [2] https://github.com/awslabs/amazon-kinesis-producer/releases
> > > [3]
> https://github.com/awslabs/dynamodb-streams-kinesis-adapter/releases
> > >
> > >
> > >
> >
> > --
> >
> > Dyana Rose
> > Software Engineer
> >
> >
> > W: www.salecycle.com 
> >
>


[jira] [Created] (FLINK-13795) Web UI logs errors when selecting Checkpoint Tab for Batch Jobs

2019-08-20 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-13795:


 Summary: Web UI logs errors when selecting Checkpoint Tab for 
Batch Jobs
 Key: FLINK-13795
 URL: https://issues.apache.org/jira/browse/FLINK-13795
 Project: Flink
  Issue Type: Bug
  Components: Runtime / REST
Affects Versions: 1.9.0
Reporter: Stephan Ewen


The logs of the REST endpoint print errors if you run a batch job and then 
select the "Checkpoints" tab.

I would expect that this simply shows "no checkpoints available for this job" 
and not that an {{ERROR}} level entry appears in the log.

{code}
2019-08-20 12:04:54,195 ERROR 
org.apache.flink.runtime.rest.handler.job.checkpoints.CheckpointingStatisticsHandler
  - Exception occurred in REST handler: Checkpointing has not been enabled.
{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13796) Remove unused variable

2019-08-20 Thread Fokko Driesprong (Jira)
Fokko Driesprong created FLINK-13796:


 Summary: Remove unused variable
 Key: FLINK-13796
 URL: https://issues.apache.org/jira/browse/FLINK-13796
 Project: Flink
  Issue Type: Task
  Components: Deployment / YARN
Affects Versions: 1.8.1
Reporter: Fokko Driesprong






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13797) Add missing format argument

2019-08-20 Thread Fokko Driesprong (Jira)
Fokko Driesprong created FLINK-13797:


 Summary: Add missing format argument
 Key: FLINK-13797
 URL: https://issues.apache.org/jira/browse/FLINK-13797
 Project: Flink
  Issue Type: Task
  Components: Deployment / Mesos
Affects Versions: 1.8.1
Reporter: Fokko Driesprong






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13798) Refactor the process of checking stream status while emitting watermark in source

2019-08-20 Thread zhijiang (Jira)
zhijiang created FLINK-13798:


 Summary: Refactor the process of checking stream status while 
emitting watermark in source
 Key: FLINK-13798
 URL: https://issues.apache.org/jira/browse/FLINK-13798
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Task
Reporter: zhijiang
Assignee: zhijiang


As we know, the watermark can be emitted downstream only when the stream 
status is active. For the downstream task we already have the 
StatusWatermarkValve component in the StreamInputProcessor to handle this 
logic, but for the source task the current implementation of this logic seems 
a bit tricky. There are two scenarios for the source case:
 * Emit watermark via source context: in the specific WatermarkContext, it 
toggles the stream status to active before collecting/emitting 
records/watermarks. The implementation of RecordWriterOutput then 
checks that the status is active before actually emitting the watermark.
 * TimestampsAndPeriodicWatermarksOperator: the watermark is triggered by a 
timer at a regular interval. When the timer fires, the operator calls the 
output stack to emit the watermark, and RecordWriterOutput takes the role of 
checking the status before actually emitting it.

So the status-checking logic in RecordWriterOutput only works for the second 
scenario above, and it seems redundant for the first scenario because 
WatermarkContext always toggles the status to active before emitting. 
Even worse, the logic in RecordWriterOutput brings a cyclic dependency on 
StreamStatusMaintainer, which is a blocker for the following work of 
integrating source processing on the runtime side.

The solution is to migrate the checking logic from RecordWriterOutput to 
TimestampsAndPeriodicWatermarksOperator, and to remove the toggle-active 
logic in the existing WatermarkContext.
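
For illustration, a minimal sketch of where the check could move; the method 
and field names are approximate and not a final API:

{code:java}
// inside TimestampsAndPeriodicWatermarksOperator
public void onProcessingTime(long timestamp) throws Exception {
	Watermark newWatermark = userFunction.getCurrentWatermark();
	if (newWatermark != null && newWatermark.getTimestamp() > currentWatermark) {
		currentWatermark = newWatermark.getTimestamp();
		// check the stream status here instead of in RecordWriterOutput,
		// removing the cyclic dependency on StreamStatusMaintainer
		if (streamStatusProvider.getStreamStatus().isActive()) {
			output.emitWatermark(newWatermark);
		}
	}
	// register the next periodic trigger
	getProcessingTimeService().registerTimer(timestamp + watermarkInterval, this);
}
{code}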



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

2019-08-20 Thread Yu Li
Sorry for the lag, but since we reached a consensus days ago, I started a
vote thread, which will have a result by EOD; thus I'm closing this
discussion thread. Thanks all for the participation and
comments/suggestions!

Best Regards,
Yu


On Fri, 16 Aug 2019 at 09:09, Till Rohrmann  wrote:

> +1 for this FLIP and the feature. I think this feature will be super
> helpful for many Flink users.
>
> Once the SpillableHeapKeyedStateBackend has proven to be superior to the
> HeapKeyedStateBackend we should think about removing the latter completely
> to reduce maintenance burden.
>
> Cheers,
> Till
>
> On Fri, Aug 16, 2019 at 4:06 AM Congxian Qiu 
> wrote:
>
> > Big +1 for this feature.
> >
> > This FLIP can help improve at least the following two scenarios:
> > - Temporary data peaks when using the heap state backend
> > - The heap state backend has better performance than RocksDBStateBackend,
> > especially on SATA disks. Some users have told me that they increased the
> > parallelism of operators (and used HeapStateBackend) rather than use
> > RocksDBStateBackend to get better performance. But increasing parallelism
> > brings some other problems; after this FLIP, we can run a Flink job with
> > the same parallelism as with RocksDBStateBackend and still get better
> > performance.
> >
> > Best,
> > Congxian
> >
> >
> > Yu Li wrote on Fri, Aug 16, 2019 at 12:14 AM:
> >
> > > Thanks all for the reviews and comments!
> > >
> > > bq. From the implementation plan, it looks like this exists purely in a
> > new
> > > module and does not require any changes in other parts of Flink's code.
> > Can
> > > you confirm that?
> > > Confirmed, thanks!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Thu, 15 Aug 2019 at 18:04, Tzu-Li (Gordon) Tai  >
> > > wrote:
> > >
> > > > +1 to start a VOTE for this FLIP.
> > > >
> > > > Given the properties of this new state backend and that it will exist
> > as
> > > a
> > > > new module without touching the original heap backend, I don't see a
> > harm
> > > > in including this.
> > > > Regarding design of the feature, I've already mentioned my comments
> in
> > > the
> > > > original discussion thread.
> > > >
> > > > Cheers,
> > > > Gordon
> > > >
> > > > On Thu, Aug 15, 2019 at 5:53 PM Yun Tang  wrote:
> > > >
> > > > > Big +1 for this feature.
> > > > >
> > > > > Our customers, including me, have met the dilemma where we have to
> > > > > use windows to aggregate events in applications like real-time
> > > > > monitoring. The larger the timer and window state, the poorer the
> > > > > performance of RocksDB. However, switching to the FsStateBackend
> > > > > always made me fear OOM errors.
> > > > >
> > > > > Looking forward to more powerful enrichment of the state backends,
> > > > > helping Flink achieve better performance together.
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > From: Stephan Ewen 
> > > > > Sent: Thursday, August 15, 2019 23:07
> > > > > To: dev 
> > > > > Subject: Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend
> > > > >
> > > > > +1 for this feature. I think this will be appreciated by users, as
> a
> > > way
> > > > to
> > > > > use the HeapStateBackend with a safety-net against OOM errors.
> > > > > And having had major production exposure is great.
> > > > >
> > > > > From the implementation plan, it looks like this exists purely in a
> > new
> > > > > module and does not require any changes in other parts of Flink's
> > code.
> > > > Can
> > > > > you confirm that?
> > > > >
> > > > > Other that that, I have no further questions and we could proceed
> to
> > > vote
> > > > > on this FLIP, from my side.
> > > > >
> > > > > Best,
> > > > > Stephan
> > > > >
> > > > >
> > > > > On Tue, Aug 13, 2019 at 10:00 PM Yu Li  wrote:
> > > > >
> > > > > > Sorry for forgetting to give the link of the FLIP, here it is:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > Best Regards,
> > > > > > Yu
> > > > > >
> > > > > >
> > > > > > On Tue, 13 Aug 2019 at 18:06, Yu Li  wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > We held a discussion about this feature before [1] but are now
> > > > > > > opening another thread because, on second thought, introducing a
> > > > > > > new backend instead of modifying the existing heap backend is a
> > > > > > > better option to prevent any regression or surprise for existing
> > > > > > > in-production usage.
> > > > > > > And since introducing a new backend is a relatively big change,
> > > > > > > we regard it as a FLIP and need another discussion and voting
> > > > > > > process according to our newly drafted bylaws [2].
> > > > > > > newly drafted bylaw [2].
> > > > > > >
> > > > > > > Please allow me to quote the brief descript

Re: [DISCUSS] Upgrade kinesis connector to Apache 2.0 License and include it in official release

2019-08-20 Thread Tzu-Li (Gordon) Tai
Hi Dyana,

Regarding your question on the Chinese docs:
Since the Chinese counterpart of the Kinesis connector documentation
isn't translated yet (see docs/dev/connectors/kinesis.zh.md), for now you
can simply sync whatever changes you made to the English doc to the
Chinese one as well.

Cheers,
Gordon

On Tue, Aug 20, 2019 at 1:22 PM Becket Qin  wrote:

> I agree with Stephan. It will be good to see if we can align those two
> efforts so that we don't write code that is soon to be refactored again.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Tue, Aug 20, 2019 at 10:50 AM Stephan Ewen  wrote:
>
> > Just FYI - Becket, Aljoscha, and me are working on fleshing out the
> > remaining details of FLIP-27 (new source API).
> > We will share this as soon as we have made some progress on some of the
> > details.
> >
> > The Kinesis connector would be one of the first that we would try to also
> > implement in that new API, as a validation that it is powerful and
> flexible
> > enough.
> >
> > If the upgrade involved major refactoring, would it make sense to combine
> > these efforts?
> >
> > Best,
> > Stephan
> >
> >
> > On Tue, Aug 20, 2019 at 9:12 AM Dyana Rose 
> > wrote:
> >
> > > Ahh, brilliant, I had myself on notifications for the streams adapter
> > > releases, but must have missed it. That's great news.
> > >
> > > I've got the branch prepped for moving over to Apache 2.0, but staying
> on
> > > KCL 1.x, which requires the least amount of change.
> > >
> > > Considering the large amount of change required to update to KCL/SDK
> > > 2.x, I would recommend that be done as a parallel task, making both
> > > connectors, 1.x and 2.x, available for usage, if that makes sense.
> > >
> > > The branch I push will have the English-language documents updated, but
> > > not the Chinese-language documents. Is there a process for this?
> > >
> > > Thanks,
> > > Dyana
> > >
> > > On Mon, 19 Aug 2019 at 19:08, Bowen Li  wrote:
> > >
> > > > Hi all,
> > > >
> > > > A while back we discussed upgrading flink-connector-kinesis module to
> > > > Apache 2.0 license so that its jar can be published as part of
> official
> > > > Flink releases. Given we have a large user base using Flink with
> > > > kinesis/dynamodb streams, it'll free users from building and
> > maintaining
> > > > the module themselves, and improve user and developer experience. A
> > > ticket
> > > > was created [1] but has been idle, mainly because new releases of some
> > > > AWS libs were not yet available at the time.
> > > >
> > > > As of today I see that all flink-connector-kinesis's aws dependencies
> > > have
> > > > been updated to Apache 2.0 license and are released. They include:
> > > >
> > > > - aws-java-sdk-kinesis
> > > > - aws-java-sdk-sts
> > > > - amazon-kinesis-client
> > > > - amazon-kinesis-producer (Apache 2.0 from 0.13.1, released 18 days
> > ago)
> > > > [2]
> > > > - dynamodb-streams-kinesis-adapter (Apache 2.0 from 1.5.0, released 7
> > > days
> > > > ago) [3]
> > > >
> > > > Therefore, I'd suggest we kick off the initiative and aim for release
> > > 1.10
> > > > which is roughly 3 months away, leaving us plenty of time to finish.
> > > > According to @Dyana 's comment in the ticket [1], it seems some large
> > > > chunks of changes need to be split into multiple parts rather than
> > > > simply upgrading lib versions, so we can further break the JIRA down
> > > > into sub-tasks to limit the scope of each change for easier code review.
> > > >
> > > > @Dyana would you still be interested in carrying the responsibility
> and
> > > > forwarding the effort?
> > > >
> > > > Thanks,
> > > > Bowen
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-12847
> > > > [2] https://github.com/awslabs/amazon-kinesis-producer/releases
> > > > [3]
> > https://github.com/awslabs/dynamodb-streams-kinesis-adapter/releases
> > > >
> > > >
> > > >
> > >
> > > --
> > >
> > > Dyana Rose
> > > Software Engineer
> > >
> > >
> > > W: www.salecycle.com 
> > >
> >
>


[jira] [Created] (FLINK-13799) Web Job Submit Page displays a stream of error messages when web submit is disabled in the config

2019-08-20 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-13799:


 Summary: Web Job Submit Page displays a stream of error messages when 
web submit is disabled in the config
 Key: FLINK-13799
 URL: https://issues.apache.org/jira/browse/FLINK-13799
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Affects Versions: 1.9.0
Reporter: Stephan Ewen


If you put {{web.submit.enable: false}} into the configuration, the web UI will 
still display the "SubmitJob" page, but errors will continuously pop up, stating

"Unable to load requested file /jars."



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13800) Create a module for spill-able heap backend

2019-08-20 Thread Yu Li (Jira)
Yu Li created FLINK-13800:
-

 Summary: Create a module for spill-able heap backend
 Key: FLINK-13800
 URL: https://issues.apache.org/jira/browse/FLINK-13800
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends
Reporter: Yu Li


According to further discussions in 
[ML|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-50-Spill-able-Heap-Keyed-State-Backend-td31680.html]
 and 
[FLIP-50|https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend],
 we will create a new backend instead of modifying the existing 
{{HeapKeyedStateBackend}}, to prevent any surprise for existing in-production 
usage (and only make the spill-able one the default in the future if it proves 
to be stable).

This JIRA aims at creating a sub-module under the {{flink-state-backends}} 
module for the spill-able heap backend.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Upgrade kinesis connector to Apache 2.0 License and include it in official release

2019-08-20 Thread Dyana Rose
ok great,

that's done, the PR is rebased and squashed on top of master and is running
through Travis

https://github.com/apache/flink/pull/9494

Dyana

On Tue, 20 Aug 2019 at 15:32, Tzu-Li (Gordon) Tai 
wrote:

> Hi Dyana,
>
> Regarding your question on the Chinese docs:
> Since the Chinese counterparts for the Kinesis connector documentation
> isn't translated yet (see docs/dev/connectors/kinesis.zh.md), for now you
> can simply just sync whatever changes you made to the English doc to the
> Chinese one as well.
>
> Cheers,
> Gordon
>


[jira] [Created] (FLINK-13801) Introduce a HybridStateTable to combine everything together

2019-08-20 Thread Yu Li (Jira)
Yu Li created FLINK-13801:
-

 Summary: Introduce a HybridStateTable to combine everything 
together
 Key: FLINK-13801
 URL: https://issues.apache.org/jira/browse/FLINK-13801
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends
Reporter: Yu Li


This JIRA aims at introducing a {{HybridStateTable}} which combines everything 
together: checking the heap usage through {{HeapAccountingManager}} and the GC 
status through {{HeapStatusMonitor}} to decide whether to spill/load key 
groups, triggering the spill/load action through {{SpillLoadManager}}, and 
recording all metadata (which key group is on heap and which is on disk), etc.
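
A rough structural sketch of how these parts could fit together; every name 
and signature below is hypothetical and only illustrates the responsibilities:

{code:java}
// Illustrative skeleton only; not the actual API.
public class HybridStateTable {
	interface HeapAccountingManager { boolean usageAboveThreshold(); }
	interface HeapStatusMonitor { boolean underGcPressure(); }
	interface SpillLoadManager { void spill(int keyGroup); void load(int keyGroup); }
	enum Location { ON_HEAP, ON_DISK }

	private HeapAccountingManager heapAccounting;
	private HeapStatusMonitor heapStatusMonitor;
	private SpillLoadManager spillLoadManager;
	// metadata about where each key group currently lives
	private final java.util.Map<Integer, Location> locations = new java.util.HashMap<>();

	void checkAndSpill(int candidateKeyGroup) {
		// combine heap accounting and GC status to decide whether to spill
		if (heapStatusMonitor.underGcPressure() || heapAccounting.usageAboveThreshold()) {
			spillLoadManager.spill(candidateKeyGroup);
			locations.put(candidateKeyGroup, Location.ON_DISK);
		}
	}
}
{code}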



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13802) Flink code style guide

2019-08-20 Thread Andrey Zagrebin (Jira)
Andrey Zagrebin created FLINK-13802:
---

 Summary: Flink code style guide
 Key: FLINK-13802
 URL: https://issues.apache.org/jira/browse/FLINK-13802
 Project: Flink
  Issue Type: Task
  Components: Documentation, Project Website
Reporter: Andrey Zagrebin


This is an umbrella issue to introduce and improve Flink code style guide.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13803) Introduce SpillableHeapKeyedStateBackend and all necessities

2019-08-20 Thread Yu Li (Jira)
Yu Li created FLINK-13803:
-

 Summary: Introduce SpillableHeapKeyedStateBackend and all 
necessities
 Key: FLINK-13803
 URL: https://issues.apache.org/jira/browse/FLINK-13803
 Project: Flink
  Issue Type: Sub-task
Reporter: Yu Li


This JIRA aims at introducing a new {{SpillableHeapKeyedStateBackend}} which 
will reuse most code of the {{HeapKeyedStateBackend}} (probably the only 
difference is the spill-able one will register a {{HybridStateTable}}), and 
allow using it in {{FsStateBackend}} and {{MemoryStateBackend}} (only as an 
option, by default still {{HeapKeyedStateBackend}}) through configuration.

The related necessities include but are not limited to:
* A relative backend builder class
* Relative restore operation classes
* Necessary configurations for using spill-able backend

This should be the last JIRA, after which the spill-able heap backend feature 
will become runnable, regardless of stability and performance.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13804) Collections initial capacity

2019-08-20 Thread Andrey Zagrebin (Jira)
Andrey Zagrebin created FLINK-13804:
---

 Summary: Collections initial capacity
 Key: FLINK-13804
 URL: https://issues.apache.org/jira/browse/FLINK-13804
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Project Website
Reporter: Andrey Zagrebin
Assignee: Andrey Zagrebin


The code style conclusion to add to the web site:
 
Set the initial capacity only if there is a good proven reason to do it. 
Otherwise do not clutter the code with it.
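
For illustration (the collection types are from the JDK; the surrounding 
names are made up):

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InitialCapacityExample {
	List<String> copyFieldNames(String[] fieldNames) {
		// justified: the exact size is known and this runs per record
		List<String> copy = new ArrayList<>(fieldNames.length);
		for (String name : fieldNames) {
			copy.add(name);
		}
		return copy;
	}

	Map<String, String> newProperties() {
		// no proven reason for a capacity: rely on the JVM default
		return new HashMap<>();
	}
}
{code}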



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS][CODE STYLE] Create collections always with initial capacity

2019-08-20 Thread Andrey Zagrebin
I created an umbrella issue for the code style guide effort and a subtask
for this discussion:
https://issues.apache.org/jira/browse/FLINK-13804
I will also submit a PR to flink-web based on the conclusion.

On Mon, Aug 19, 2019 at 6:15 PM Stephan Ewen  wrote:

> @Andrey Will you open a PR to add this to the code style?
>
> On Mon, Aug 19, 2019 at 11:51 AM Andrey Zagrebin 
> wrote:
>
> > Hi All,
> >
> > It looks like this proposal has approval and we can conclude this
> > discussion.
> > Additionally, I agree with Piotr that we should really require proven
> > good reasoning for setting the capacity, to avoid confusion, redundancy
> > and the other already mentioned issues while reading and maintaining the
> > code.
> > Ideally the need for setting the capacity should be either immediately
> > clear (e.g. perf) or explained in comments if it is non-trivial.
> > It can easily enter a grey zone, though, so I would not strictly demand
> > performance measurement proof, e.g. if the size is known and it is "per
> > record" code.
> > At the end of the day it is a decision of the code developer and
> > reviewer.
> >
> > The conclusion is then:
> > Set the initial capacity only if there is a good proven reason to do it.
> > Otherwise do not clutter the code with it.
> >
> > Best,
> > Andrey
> >
> > On Thu, Aug 1, 2019 at 5:10 PM Piotr Nowojski 
> wrote:
> >
> > > Hi,
> > >
> > > > - a bit more code, increases maintenance burden.
> > >
> > > I think there is even more to that. It's almost like code duplication,
> > > albeit expressed in a very different way, with all of the drawbacks of
> > > duplicated code: the initial capacity can drift out of sync, causing
> > > confusion.
> > > Also, it's not "a bit more code"; it might be non-trivial
> > > reasoning/calculation to set the initial value. Whenever we change
> > > something/refactor the code, the "maintenance burden" will mostly come
> > > from that.
> > >
> > > Also, I think this usually just falls under the premature optimisation
> > > rule.
> > >
> > > Besides:
> > >
> > > > The conclusion is the following at the moment:
> > > > Only set the initial capacity if you have a good idea about the
> > expected
> > > size.
> > >
> > > I would add a clause to set the initial capacity “only for good proven
> > > reasons”. It’s not about whether we can set it, but whether it makes
> > sense
> > > to do so (to avoid the before mentioned "maintenance burden”).
> > >
> > > Piotrek
> > >
> > > > On 1 Aug 2019, at 14:41, Xintong Song  wrote:
> > > >
> > > > +1 on setting initial capacity only when have good expectation on the
> > > > collection size.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Thu, Aug 1, 2019 at 2:32 PM Andrey Zagrebin  >
> > > wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> As you probably already noticed, Stephan has triggered a discussion
> > > thread
> > > >> about code style guide for Flink [1]. Recently we were discussing
> > > >> internally some smaller concerns and I would like start separate
> > threads
> > > >> for them.
> > > >>
> > > >> This thread is about creating collections always with initial
> > capacity.
> > > As
> > > >> you might have seen, some parts of our code base always initialise
> > > >> collections with some non-default capacity. You can even activate a
> > > check
> > > >> in IntelliJ Idea that can monitor and highlight creation of
> collection
> > > >> without initial capacity.
> > > >>
> > > >> Pros:
> > > >> - performance gain if there is a good reasoning about initial
> capacity
> > > >> - the capacity is always deterministic and does not depend on any
> > > changes
> > > >> of its default value in Java
> > > >> - easy to follow: always initialise, has IDE support for detection
> > > >>
> > > >> Cons (for initialising w/o good reasoning):
> > > >> - We are trying to outsmart the JVM. When there is no good reasoning
> > > >> about the initial capacity, we can rely on the JVM default value.
> > > >> - It is even confusing e.g. for hash maps as the real size depends
> on
> > > the
> > > >> load factor.
> > > >> - It would only add minor performance gain.
> > > >> - a bit more code, increases maintenance burden.
> > > >>
> > > >> The conclusion is the following at the moment:
> > > >> Only set the initial capacity if you have a good idea about the
> > expected
> > > >> size.
> > > >>
> > > >> Please, feel free to share you thoughts.
> > > >>
> > > >> Best,
> > > >> Andrey
> > > >>
> > > >> [1]
> > > >>
> > > >>
> > >
> >
> http://mail-archives.apache.org/mod_mbox/flink-dev/201906.mbox/%3ced91df4b-7cab-4547-a430-85bc710fd...@apache.org%3E
> > > >>
> > >
> > >
> >
>


Re: [DISCUSS] Upgrade kinesis connector to Apache 2.0 License and include it in official release

2019-08-20 Thread Bowen Li
@Stephan @Becket kinesis connector currently is using KCL 1.9. Mass changes
are needed if we switch to KCL 2.x. I agree with Dyana that, since KCL 1.x
is also updated to Apache 2.0, we can just focus on upgrading to a newer
KCL 1.x minor version for now.

On Tue, Aug 20, 2019 at 7:52 AM Dyana Rose  wrote:

> ok great,
>
> that's done, the PR is rebased and squashed on top of master and is running
> through Travis
>
> https://github.com/apache/flink/pull/9494
>
> Dyana
>
> On Tue, 20 Aug 2019 at 15:32, Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi Dyana,
> >
> > Regarding your question on the Chinese docs:
> > Since the Chinese counterparts for the Kinesis connector documentation
> > isn't translated yet (see docs/dev/connectors/kinesis.zh.md), for now
> you
> > can simply just sync whatever changes you made to the English doc to the
> > Chinese one as well.
> >
> > Cheers,
> > Gordon
> >
>


[jira] [Created] (FLINK-13805) Bad Error Message when TaskManager is lost

2019-08-20 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-13805:


 Summary: Bad Error Message when TaskManager is lost
 Key: FLINK-13805
 URL: https://issues.apache.org/jira/browse/FLINK-13805
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.9.0
Reporter: Stephan Ewen
 Fix For: 1.10.0, 1.9.1


When a TaskManager is lost, the job reports as the failure cause
{code}
org.apache.flink.util.FlinkException: The assigned slot 
6d0e469d55a2630871f43ad0f89c786c_0 was removed.
{code}

That is a pretty bad error message; as a user I don't know what it means. 
It sounds like it could simply refer to internal bookkeeping, maybe some 
rebalancing or so.
You need to know a lot about Flink to understand that this actually means 
"TaskManager failure".




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Upgrade kinesis connector to Apache 2.0 License and include it in official release

2019-08-20 Thread Thomas Weise
+1 for KCL 1.x changes only.

I also think it would make sense to align FLIP-27 work and KCL 2.x related
changes, since these will require a hardening cycle with extensive testing
that is probably not practical to repeat.



On Tue, Aug 20, 2019 at 10:57 AM Bowen Li  wrote:

> @Stephan @Becket kinesis connector currently is using KCL 1.9. Mass changes
> are needed if we switch to KCL 2.x. I agree with Dyana that, since KCL 1.x
> is also updated to Apache 2.0, we can just focus on upgrading to a newer
> KCL 1.x minor version for now.
>
> On Tue, Aug 20, 2019 at 7:52 AM Dyana Rose 
> wrote:
>
> > ok great,
> >
> > that's done, the PR is rebased and squashed on top of master and is
> running
> > through Travis
> >
> > https://github.com/apache/flink/pull/9494
> >
> > Dyana
> >
> > On Tue, 20 Aug 2019 at 15:32, Tzu-Li (Gordon) Tai 
> > wrote:
> >
> > > Hi Dyana,
> > >
> > > Regarding your question on the Chinese docs:
> > > Since the Chinese counterpart of the Kinesis connector documentation
> > > isn't translated yet (see docs/dev/connectors/kinesis.zh.md), for now
> > > you can simply sync whatever changes you made to the English doc to
> > > the Chinese one as well.
> > >
> > > Cheers,
> > > Gordon
> > >
> >
>


[jira] [Created] (FLINK-13806) Metric Fetcher floods the JM log with errors when TM is lost

2019-08-20 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-13806:


 Summary: Metric Fetcher floods the JM log with errors when TM is 
lost
 Key: FLINK-13806
 URL: https://issues.apache.org/jira/browse/FLINK-13806
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Metrics
Affects Versions: 1.9.0
Reporter: Stephan Ewen
 Fix For: 1.10.0, 1.9.1


When a task manager is lost, the log contains a series of exceptions from the 
metrics fetcher, making it hard to identify the exceptions from the actual job 
failure.

The exception below is contained multiple times (in my example eight times) in 
a simple 4 TM setup after one TM failure.

I would suggest suppressing "failed asks" (timeouts) in the metrics fetcher 
service, because the fetcher does not have enough information to distinguish 
between root-cause exceptions and follow-up exceptions. In most cases, these 
exceptions are follow-ups to a failure that is already handled in the 
scheduler/ExecutionGraph, and the additional exception logging only adds 
noise to the log.

{code}
2019-08-20 22:00:09,865 WARN  
org.apache.flink.runtime.rest.handler.legacy.metrics.MetricFetcherImpl  - 
Requesting TaskManager's path for query services failed.
java.util.concurrent.CompletionException: akka.pattern.AskTimeoutException: Ask 
timed out on [Actor[akka://flink/user/dispatcher#-1834666306]] after [1 
ms]. Message of type 
[org.apache.flink.runtime.rpc.messages.LocalFencedMessage]. A typical reason 
for `AskTimeoutException` is that the recipient actor didn't send a reply.
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at 
org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:871)
at akka.dispatch.OnComplete.internal(Future.scala:263)
at akka.dispatch.OnComplete.internal(Future.scala:261)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at 
org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
at 
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
at 
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
at 
akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:644)
at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:205)
at 
scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
at 
scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
at 
scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
at 
akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:328)
at 
akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:279)
at 
akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:283)
at 
akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:235)
at java.lang.Thread.run(Thread.java:748)
Caused by: akka.pattern.AskTimeoutException: Ask timed out on 
[Actor[akka://flink/user/dispatcher#-1834666306]] after [1 ms]. Message of 
type [org.apache.flink.runtime.rpc.messages.LocalFencedMessage]. A typical 
reason for `AskTimeoutException` is that the recipient actor didn't send a 
reply.
at akka.pattern.PromiseActorRef$$anonfun$2.apply(AskSupport.scala:635)
at akka.pattern.PromiseActorRef$$anonfun$2.apply(AskSupport.scala:635)
at 
akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:648)
... 9 more
{code}




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [VOTE] Apache Flink 1.9.0, release candidate #3

2019-08-20 Thread Stephan Ewen
+1 (binding)

 - Downloaded the binary release tarball
 - started a standalone cluster with four nodes
 - ran some examples through the Web UI
 - checked the logs
 - created a project from the Java quickstarts maven archetype
 - ran a multi-stage DataSet job in batch mode
 - killed a TaskManager and verified correct restart behavior, including
failover region backtracking


I found a few issues, and a common theme here is confusing error reporting
and logging.

(1) When testing batch failover and killing a TaskManager, the job reports
as the failure cause "org.apache.flink.util.FlinkException: The assigned
slot 6d0e469d55a2630871f43ad0f89c786c_0 was removed."
I think that is a pretty bad error message; as a user I don't know what
it means. Some internal bookkeeping thing?
You need to know a lot about Flink to understand that this means
"TaskManager failure".
https://issues.apache.org/jira/browse/FLINK-13805
I would not block the release on this, but think this should get pretty
urgent attention.

(2) The Metric Fetcher floods the log with error messages when a
TaskManager is lost.
 There are many exceptions being logged by the Metrics Fetcher due to
not reaching the TM any more.
 This pollutes the log and drowns out the original exception and the
meaningful logs from the scheduler/execution graph.
 https://issues.apache.org/jira/browse/FLINK-13806
 Again, I would not block the release on this, but think this should
get pretty urgent attention.

(3) If you put "web.submit.enable: false" into the configuration, the web
UI will still display the "SubmitJob" page, but errors will
continuously pop up, stating "Unable to load requested file /jars."
https://issues.apache.org/jira/browse/FLINK-13799

(4) REST endpoint logs ERROR level messages when selecting the
"Checkpoints" tab for batch jobs. That does not seem correct.
 https://issues.apache.org/jira/browse/FLINK-13795

Best,
Stephan




On Tue, Aug 20, 2019 at 11:32 AM Tzu-Li (Gordon) Tai 
wrote:

> +1
>
> Legal checks:
> - verified signatures and hashes
> - New bundled Javascript dependencies for flink-runtime-web are correctly
> reflected under licenses-binary and NOTICE file.
> - locally built from source (Scala 2.12, without Hadoop)
> - No missing artifacts in staging repo
> - No binaries in source release
>
> Functional checks:
> - Quickstart working (both in IDE + job submission)
> - Simple State Processor API program that performs offline key schema
> migration (RocksDB backend). Generated savepoint is valid to restore from.
> - All E2E tests pass locally
> - Didn’t notice any issues with the new WebUI
>
> Cheers,
> Gordon
>
> On Tue, Aug 20, 2019 at 3:53 AM Zili Chen  wrote:
>
> > +1 (non-binding)
> >
> > - build from source: OK(8u212)
> > - check local setup tutorial works as expected
> >
> > Best,
> > tison.
> >
> >
> > > Yu Li wrote on Tue, Aug 20, 2019 at 8:24 AM:
> >
> > > +1 (non-binding)
> > >
> > > - checked release notes: OK
> > > - checked sums and signatures: OK
> > > - repository appears to contain all expected artifacts
> > > - source release
> > >  - contains no binaries: OK
> > >  - contains no 1.9-SNAPSHOT references: OK
> > >  - build from source: OK (8u102)
> > > - binary release
> > >  - no examples appear to be missing
> > >  - started a cluster; WebUI reachable, example ran successfully
> > > - checked README.md file and found nothing unexpected
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Tue, 20 Aug 2019 at 01:16, Tzu-Li (Gordon) Tai  >
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > Release candidate #3 for Apache Flink 1.9.0 is now ready for your
> > review.
> > > >
> > > > Please review and vote on release candidate #3 for version 1.9.0, as
> > > > follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > > The complete staging area is available for your review, which
> includes:
> > > > * JIRA release notes [1],
> > > > * the official Apache source release and binary convenience releases
> to
> > > be
> > > > deployed to dist.apache.org [2], which are signed with the key with
> > > > fingerprint 1C1E2394D3194E1944613488F320986D35C33D6A [3],
> > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > * source code tag “release-1.9.0-rc3” [5].
> > > > * pull requests for the release note documentation [6] and
> announcement
> > > > blog post [7].
> > > >
> > > > As proposed in the RC2 vote thread [8], for RC3 we are only
> > > cherry-picking
> > > > minimal specific changes on top of RC2 to be able to reasonably carry
> > > over
> > > > previous testing efforts and effectively require a shorter voting
> time.
> > > >
> > > > The only extra commits in this RC, compared to RC2, are the
> following:
> > > > - c2d9aeac [FLINK-13231] [pubsub] Replace Max outstanding
> > acknowledgement
> > > > ids limit with a FlinkConnectorRateLimiter
> > > > - d8941711 [FLINK-1369

[jira] [Created] (FLINK-13807) Flink-avro unit tests fail if the character encoding in the environment does not default to UTF-8

2019-08-20 Thread Ethan Li (Jira)
Ethan Li created FLINK-13807:


 Summary: Flink-avro unit tests fail if the character encoding in 
the environment does not default to UTF-8
 Key: FLINK-13807
 URL: https://issues.apache.org/jira/browse/FLINK-13807
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.8.0
Reporter: Ethan Li



{code:java}
[ERROR] Tests run: 12, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 4.81 s 
<<< FAILURE! - in org.apache.flink.formats.avro.typeutils.AvroTypeExtractionTest
[ERROR] testSimpleAvroRead[Execution mode = 
CLUSTER](org.apache.flink.formats.avro.typeutils.AvroTypeExtractionTest)  Time 
elapsed: 0.438 s  <<< FAILURE!
java.lang.AssertionError: 
Different elements in arrays: expected 2 elements and received 2
files: [/tmp/junit5386344396421857812/junit6023978980792200274.tmp/4, 
/tmp/junit5386344396421857812/junit6023978980792200274.tmp/2, 
/tmp/junit5386344396421857812/junit6023978980792200274.tmp/1, 
/tmp/junit5386344396421857812/junit6023978980792200274.tmp/3]
 expected: [{"name": "Alyssa", "favorite_number": 256, "favorite_color": null, 
"type_long_test": null, "type_double_test": 123.45, "type_null_test": null, 
"type_bool_test": true, "type_array_string": ["ELEMENT 1", "ELEMENT 2"], 
"type_array_boolean": [true, false], "type_nullable_array": null, "type_enum": 
"GREEN", "type_map": {"KEY 2": 17554, "KEY 1": 8546456}, "type_fixed": null, 
"type_union": null, "type_nested": {"num": 239, "street": "Baker Street", 
"city": "London", "state": "London", "zip": "NW1 6XE"}, "type_bytes": {"bytes": 
"\u\u\u\u\u\u\u\u\u\u"}, "type_date": 
2014-03-01, "type_time_millis": 12:12:12.000, "type_time_micros": 123456, 
"type_timestamp_millis": 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 
123456, "type_decimal_bytes": {"bytes": "\u0007?"}, "type_decimal_fixed": [7, 
-48]}, {"name": "Charlie", "favorite_number": null, "favorite_color": "blue", 
"type_long_test": 1337, "type_double_test": 1.337, "type_null_test": null, 
"type_bool_test": false, "type_array_string": [], "type_array_boolean": [], 
"type_nullable_array": null, "type_enum": "RED", "type_map": {}, "type_fixed": 
null, "type_union": null, "type_nested": {"num": 239, "street": "Baker Street", 
"city": "London", "state": "London", "zip": "NW1 6XE"}, "type_bytes": {"bytes": 
"\u\u\u\u\u\u\u\u\u\u"}, "type_date": 
2014-03-01, "type_time_millis": 12:12:12.000, "type_time_micros": 123456, 
"type_timestamp_millis": 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 
123456, "type_decimal_bytes": {"bytes": "\u0007?"}, "type_decimal_fixed": [7, 
-48]}]
 received: [{"name": "Alyssa", "favorite_number": 256, "favorite_color": null, 
"type_long_test": null, "type_double_test": 123.45, "type_null_test": null, 
"type_bool_test": true, "type_array_string": ["ELEMENT 1", "ELEMENT 2"], 
"type_array_boolean": [true, false], "type_nullable_array": null, "type_enum": 
"GREEN", "type_map": {"KEY 2": 17554, "KEY 1": 8546456}, "type_fixed": null, 
"type_union": null, "type_nested": {"num": 239, "street": "Baker Street", 
"city": "London", "state": "London", "zip": "NW1 6XE"}, "type_bytes": {"bytes": 
"\u\u\u\u\u\u\u\u\u\u"}, "type_date": 
2014-03-01, "type_time_millis": 12:12:12.000, "type_time_micros": 123456, 
"type_timestamp_millis": 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 
123456, "type_decimal_bytes": {"bytes": "\u0007??"}, "type_decimal_fixed": [7, 
-48]}, {"name": "Charlie", "favorite_number": null, "favorite_color": "blue", 
"type_long_test": 1337, "type_double_test": 1.337, "type_null_test": null, 
"type_bool_test": false, "type_array_string": [], "type_array_boolean": [], 
"type_nullable_array": null, "type_enum": "RED", "type_map": {}, "type_fixed": 
null, "type_union": null, "type_nested": {"num": 239, "street": "Baker Street", 
"city": "London", "state": "London", "zip": "NW1 6XE"}, "type_bytes": {"bytes": 
"\u\u\u\u\u\u\u\u\u\u"}, "type_date": 
2014-03-01, "type_time_millis": 12:12:12.000, "type_time_micros": 123456, 
"type_timestamp_millis": 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 
123456, "type_decimal_bytes": {"bytes": "\u0007??"}, "type_decimal_fixed": [7, 
-48]}]
at 
org.apache.flink.formats.avro.typeutils.AvroTypeExtractionTest.after(AvroTypeExtractionTest.java:76)
{code}

Comparing “expected” with “received”, the difference comes down to some
question-mark characters.

For example, in “expected”, it’s
{code:java}
"type_decimal_bytes": {"bytes": "\u0007?”}
{code}

While in “received”, it’s 
{code:java}
"type_decimal_bytes": {"bytes": "\u0007??"}
{code}

The environment I ran the unit tests on uses ANSI_X3.4-1968 (plain ASCII) as
its character encoding.

I changed it to "en_US.UTF-8" and the unit tests passed.
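
For illustration, here is a minimal standalone sketch (not taken from the
test code; the class name is made up) of why decoding the same bytes
differs between an ASCII default charset and an explicit UTF-8 one:

{code:java}
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {

    public static void main(String[] args) {
        // Two bytes that form one non-ASCII character (U+00E9) in UTF-8.
        byte[] utf8Bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoded with the platform default charset. Under ANSI_X3.4-1968
        // (plain ASCII) both bytes are unmappable, each becomes the
        // replacement character, and a single two-byte character prints
        // as two question marks.
        String platformDependent = new String(utf8Bytes);

        // Decoded with an explicit charset: identical in every environment.
        String explicit = new String(utf8Bytes, StandardCharsets.UTF_8);

        System.out.println("default charset:    " + Charset.defaultCharset());
        System.out.println("platform-dependent: " + platformDependent);
        System.out.println("explicit UTF-8:     " + explicit);
    }
}
{code}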



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: flink release-1.8.0 Flink-avro unit tests failed

2019-08-20 Thread Ethan Li
I filed a jira, https://issues.apache.org/jira/browse/FLINK-13807, for this.
Not sure if I will be able to get to it in the near future, so I didn't
assign it to myself. Anyone should feel free to pick it up.

I changed my environment to get past this for now.
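
For whoever picks it up: one possible direction, sketched very roughly
below. The class and method names are made up, and I'm assuming the test
currently decodes its output files with the platform default charset; the
idea is simply to pass an explicit charset instead.

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class EncodingStableIo {

    private EncodingStableIo() {
    }

    // Decode file contents with an explicit charset so the string being
    // compared no longer depends on the JVM's locale-derived default
    // encoding (file.encoding).
    static String readUtf8(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }
}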

On Tue, Aug 20, 2019 at 4:11 AM Stephan Ewen  wrote:

> Thanks, looks like you diagnosed it correctly: environment-specific
> encoding settings.
>
> Could you open a ticket (maybe a PR) to set the encoding and make the test
> stable across environments?
>
> On Mon, Aug 19, 2019 at 9:46 PM Ethan Li 
> wrote:
>
> > It’s probably the encoding problem. The environment I ran the unit tests
> > on uses ANSI_X3.4-1968
> >
> > It looks like we have to use "en_US.UTF-8"
> >
> >
> > > On Aug 19, 2019, at 1:44 PM, Ethan Li 
> wrote:
> > >
> > > Hello,
> > >
> > > Not sure if anyone encountered this issue before.  I tried to run “mvn
> > clean install” on flink release-1.8, but some unit tests in the
> > flink-avro module failed:
> > >
> > >
> > > [ERROR] Tests run: 12, Failures: 4, Errors: 0, Skipped: 0, Time
> elapsed:
> > 4.81 s <<< FAILURE! - in
> > org.apache.flink.formats.avro.typeutils.AvroTypeExtractionTest
> > > [ERROR] testSimpleAvroRead[Execution mode =
> > CLUSTER](org.apache.flink.formats.avro.typeutils.AvroTypeExtractionTest)
> > Time elapsed: 0.438 s  <<< FAILURE!
> > > java.lang.AssertionError:
> > > Different elements in arrays: expected 2 elements and received 2
> > > files: [/tmp/junit5386344396421857812/junit6023978980792200274.tmp/4,
> > /tmp/junit5386344396421857812/junit6023978980792200274.tmp/2,
> > /tmp/junit5386344396421857812/junit6023978980792200274.tmp/1,
> > /tmp/junit5386344396421857812/junit6023978980792200274.tmp/3]
> > >  expected: [{"name": "Alyssa", "favorite_number": 256,
> "favorite_color":
> > null, "type_long_test": null, "type_double_test": 123.45,
> "type_null_test":
> > null, "type_bool_test": true, "type_array_string": ["ELEMENT 1", "ELEMENT
> > 2"], "type_array_boolean": [true, false], "type_nullable_array": null,
> > "type_enum": "GREEN", "type_map": {"KEY 2": 17554, "KEY 1": 8546456},
> > "type_fixed": null, "type_union": null, "type_nested": {"num": 239,
> > "street": "Baker Street", "city": "London", "state": "London", "zip":
> "NW1
> > 6XE"}, "type_bytes": {"bytes":
> > "\u\u\u\u\u\u\u\u\u\u"},
> > "type_date": 2014-03-01, "type_time_millis": 12:12:12.000,
> > "type_time_micros": 123456, "type_timestamp_millis":
> > 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 123456,
> > "type_decimal_bytes": {"bytes": "\u0007?"}, "type_decimal_fixed": [7,
> > -48]}, {"name": "Charlie", "favorite_number": null, "favorite_color":
> > "blue", "type_long_test": 1337, "type_double_test": 1.337,
> > "type_null_test": null, "type_bool_test": false, "type_array_string": [],
> > "type_array_boolean": [], "type_nullable_array": null, "type_enum":
> "RED",
> > "type_map": {}, "type_fixed": null, "type_union": null, "type_nested":
> > {"num": 239, "street": "Baker Street", "city": "London", "state":
> "London",
> > "zip": "NW1 6XE"}, "type_bytes": {"bytes":
> > "\u\u\u\u\u\u\u\u\u\u"},
> > "type_date": 2014-03-01, "type_time_millis": 12:12:12.000,
> > "type_time_micros": 123456, "type_timestamp_millis":
> > 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 123456,
> > "type_decimal_bytes": {"bytes": "\u0007?"}, "type_decimal_fixed": [7,
> -48]}]
> > >  received: [{"name": "Alyssa", "favorite_number": 256,
> "favorite_color":
> > null, "type_long_test": null, "type_double_test": 123.45,
> "type_null_test":
> > null, "type_bool_test": true, "type_array_string": ["ELEMENT 1", "ELEMENT
> > 2"], "type_array_boolean": [true, false], "type_nullable_array": null,
> > "type_enum": "GREEN", "type_map": {"KEY 2": 17554, "KEY 1": 8546456},
> > "type_fixed": null, "type_union": null, "type_nested": {"num": 239,
> > "street": "Baker Street", "city": "London", "state": "London", "zip":
> "NW1
> > 6XE"}, "type_bytes": {"bytes":
> > "\u\u\u\u\u\u\u\u\u\u"},
> > "type_date": 2014-03-01, "type_time_millis": 12:12:12.000,
> > "type_time_micros": 123456, "type_timestamp_millis":
> > 2014-03-01T12:12:12.321Z, "type_timestamp_micros": 123456,
> > "type_decimal_bytes": {"bytes": "\u0007??"}, "type_decimal_fixed": [7,
> > -48]}, {"name": "Charlie", "favorite_number": null, "favorite_color":
> > "blue", "type_long_test": 1337, "type_double_test": 1.337,
> > "type_null_test": null, "type_bool_test": false, "type_array_string": [],
> > "type_array_boolean": [], "type_nullable_array": null, "type_enum":
> "RED",
> > "type_map": {}, "type_fixed": null, "type_union": null, "type_nested":
> > {"num": 239, "street": "Baker Street", "city": "London", "state":
> "London",
> > "zip": "NW1 6XE"}, "type_bytes": {"bytes":
> > "\u\u\u\u\u\u\u\u\u\u"},
> > "type_date": 2014-03-01, "type_time_millis": 12:12:12.000,
> > "type_time_micros": 12

Re: [VOTE] Apache Flink 1.9.0, release candidate #3

2019-08-20 Thread Jark Wu
+1 (non-binding)

- build the source release with Scala 2.12 and Scala 2.11 successfully
- checked/verified signatures and hashes
- checked that all POM files point to the same version
- started a cluster, ran a SQL query that temporal-joins a kafka source with
a mysql jdbc table, and writes the results back to kafka.
  Used DDL (with timestamp type) to create the source and sinks. Looks good;
no errors in the logs.
- started a cluster, ran a SQL query that reads from a kafka source, applies
a group aggregation, and writes into a mysql jdbc table.
  Used DDL (with timestamp type) to create the source and sink. Looks good
too; no errors in the logs.

Cheers,
Jark

On Wed, 21 Aug 2019 at 04:20, Stephan Ewen  wrote:

> +1 (binding)
>
>  - Downloaded the binary release tarball
>  - started a standalone cluster with four nodes
>  - ran some examples through the Web UI
>  - checked the logs
>  - created a project from the Java quickstarts maven archetype
>  - ran a multi-stage DataSet job in batch mode
>  - killed a TaskManager and verified correct restart behavior, including
> failover region backtracking
>
>
> I found a few issues, and a common theme here is confusing error reporting
> and logging.
>
> (1) When testing batch failover and killing a TaskManager, the job reports
> as the failure cause "org.apache.flink.util.FlinkException: The assigned
> slot 6d0e469d55a2630871f43ad0f89c786c_0 was removed."
> I think that is a pretty bad error message, as a user I don't know what
> that means. Some internal bookkeeping thing?
> You need to know a lot about Flink to understand that this means
> "TaskManager failure".
> https://issues.apache.org/jira/browse/FLINK-13805
> I would not block the release on this, but think this should get pretty
> urgent attention.
>
> (2) The Metric Fetcher floods the log with error messages when a
> TaskManager is lost.
>  There are many exceptions being logged by the Metrics Fetcher due to
> not reaching the TM any more.
>  This pollutes the log and drowns out the original exception and the
> meaningful logs from the scheduler/execution graph.
>  https://issues.apache.org/jira/browse/FLINK-13806
>  Again, I would not block the release on this, but think this should
> get pretty urgent attention.
>
> (3) If you put "web.submit.enable: false" into the configuration, the web
> UI will still display the "SubmitJob" page, but errors will
> continuously pop up, stating "Unable to load requested file /jars."
> https://issues.apache.org/jira/browse/FLINK-13799
>
> (4) REST endpoint logs ERROR level messages when selecting the
> "Checkpoints" tab for batch jobs. That does not seem correct.
>  https://issues.apache.org/jira/browse/FLINK-13795
>
> Best,
> Stephan
>
>
>
>
> On Tue, Aug 20, 2019 at 11:32 AM Tzu-Li (Gordon) Tai 
> wrote:
>
> > +1
> >
> > Legal checks:
> > - verified signatures and hashes
> > - New bundled Javascript dependencies for flink-runtime-web are correctly
> > reflected under licenses-binary and NOTICE file.
> > - locally built from source (Scala 2.12, without Hadoop)
> > - No missing artifacts in staging repo
> > - No binaries in source release
> >
> > Functional checks:
> > - Quickstart working (both in IDE + job submission)
> > - Simple State Processor API program that performs offline key schema
> > migration (RocksDB backend). Generated savepoint is valid to restore
> from.
> > - All E2E tests pass locally
> > - Didn’t notice any issues with the new WebUI
> >
> > Cheers,
> > Gordon
> >
> > On Tue, Aug 20, 2019 at 3:53 AM Zili Chen  wrote:
> >
> > > +1 (non-binding)
> > >
> > > - build from source: OK(8u212)
> > > - check local setup tutorial works as expected
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Yu Li  于2019年8月20日周二 上午8:24写道:
> > >
> > > > +1 (non-binding)
> > > >
> > > > - checked release notes: OK
> > > > - checked sums and signatures: OK
> > > > - repository appears to contain all expected artifacts
> > > > - source release
> > > >  - contains no binaries: OK
> > > >  - contains no 1.9-SNAPSHOT references: OK
> > > >  - build from source: OK (8u102)
> > > > - binary release
> > > >  - no examples appear to be missing
> > > >  - started a cluster; WebUI reachable, example ran successfully
> > > > - checked README.md file and found nothing unexpected
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > > On Tue, 20 Aug 2019 at 01:16, Tzu-Li (Gordon) Tai <
> tzuli...@apache.org
> > >
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Release candidate #3 for Apache Flink 1.9.0 is now ready for your
> > > review.
> > > > >
> > > > > Please review and vote on release candidate #3 for version 1.9.0,
> as
> > > > > follows:
> > > > > [ ] +1, Approve the release
> > > > > [ ] -1, Do not approve the release (please provide specific
> comments)
> > > > >
> > > > > The complete staging area is available for your review, which
> > includes:
> > > > > * JIRA release notes [1],
> > > > > * the offic

Re: [VOTE] Apache Flink 1.9.0, release candidate #3

2019-08-20 Thread vino yang
+1 (non-binding)

- checked out the source code and built it successfully
- started a local cluster and ran some example jobs successfully
- verified signatures and hashes
- checked the release notes and announcement post

Best,
Vino

Stephan Ewen  于2019年8月21日周三 上午4:20写道:

> +1 (binding)
>
>  - Downloaded the binary release tarball
>  - started a standalone cluster with four nodes
>  - ran some examples through the Web UI
>  - checked the logs
>  - created a project from the Java quickstarts maven archetype
>  - ran a multi-stage DataSet job in batch mode
>  - killed a TaskManager and verified correct restart behavior, including
> failover region backtracking
>
>
> I found a few issues, and a common theme here is confusing error reporting
> and logging.
>
> (1) When testing batch failover and killing a TaskManager, the job reports
> as the failure cause "org.apache.flink.util.FlinkException: The assigned
> slot 6d0e469d55a2630871f43ad0f89c786c_0 was removed."
> I think that is a pretty bad error message, as a user I don't know what
> that means. Some internal bookkeeping thing?
> You need to know a lot about Flink to understand that this means
> "TaskManager failure".
> https://issues.apache.org/jira/browse/FLINK-13805
> I would not block the release on this, but think this should get pretty
> urgent attention.
>
> (2) The Metric Fetcher floods the log with error messages when a
> TaskManager is lost.
>  There are many exceptions being logged by the Metrics Fetcher due to
> not reaching the TM any more.
>  This pollutes the log and drowns out the original exception and the
> meaningful logs from the scheduler/execution graph.
>  https://issues.apache.org/jira/browse/FLINK-13806
>  Again, I would not block the release on this, but think this should
> get pretty urgent attention.
>
> (3) If you put "web.submit.enable: false" into the configuration, the web
> UI will still display the "SubmitJob" page, but errors will
> continuously pop up, stating "Unable to load requested file /jars."
> https://issues.apache.org/jira/browse/FLINK-13799
>
> (4) REST endpoint logs ERROR level messages when selecting the
> "Checkpoints" tab for batch jobs. That does not seem correct.
>  https://issues.apache.org/jira/browse/FLINK-13795
>
> Best,
> Stephan
>
>
>
>
> On Tue, Aug 20, 2019 at 11:32 AM Tzu-Li (Gordon) Tai 
> wrote:
>
> > +1
> >
> > Legal checks:
> > - verified signatures and hashes
> > - New bundled Javascript dependencies for flink-runtime-web are correctly
> > reflected under licenses-binary and NOTICE file.
> > - locally built from source (Scala 2.12, without Hadoop)
> > - No missing artifacts in staging repo
> > - No binaries in source release
> >
> > Functional checks:
> > - Quickstart working (both in IDE + job submission)
> > - Simple State Processor API program that performs offline key schema
> > migration (RocksDB backend). Generated savepoint is valid to restore
> from.
> > - All E2E tests pass locally
> > - Didn’t notice any issues with the new WebUI
> >
> > Cheers,
> > Gordon
> >
> > On Tue, Aug 20, 2019 at 3:53 AM Zili Chen  wrote:
> >
> > > +1 (non-binding)
> > >
> > > - build from source: OK(8u212)
> > > - check local setup tutorial works as expected
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Yu Li  于2019年8月20日周二 上午8:24写道:
> > >
> > > > +1 (non-binding)
> > > >
> > > > - checked release notes: OK
> > > > - checked sums and signatures: OK
> > > > - repository appears to contain all expected artifacts
> > > > - source release
> > > >  - contains no binaries: OK
> > > >  - contains no 1.9-SNAPSHOT references: OK
> > > >  - build from source: OK (8u102)
> > > > - binary release
> > > >  - no examples appear to be missing
> > > >  - started a cluster; WebUI reachable, example ran successfully
> > > > - checked README.md file and found nothing unexpected
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > > On Tue, 20 Aug 2019 at 01:16, Tzu-Li (Gordon) Tai <
> tzuli...@apache.org
> > >
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Release candidate #3 for Apache Flink 1.9.0 is now ready for your
> > > review.
> > > > >
> > > > > Please review and vote on release candidate #3 for version 1.9.0,
> as
> > > > > follows:
> > > > > [ ] +1, Approve the release
> > > > > [ ] -1, Do not approve the release (please provide specific
> comments)
> > > > >
> > > > > The complete staging area is available for your review, which
> > includes:
> > > > > * JIRA release notes [1],
> > > > > * the official Apache source release and binary convenience
> releases
> > to
> > > > be
> > > > > deployed to dist.apache.org [2], which are signed with the key
> with
> > > > > fingerprint 1C1E2394D3194E1944613488F320986D35C33D6A [3],
> > > > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > > > * source code tag “release-1.9.0-rc3” [5].
> > > > > * pull requests for the release note documentation [6] and
> > announcement
> > > > > blog p

Re: [VOTE] Apache Flink 1.9.0, release candidate #3

2019-08-20 Thread Gary Yao
+1 (non-binding)

Reran Jepsen tests 10 times.

On Wed, Aug 21, 2019 at 5:35 AM vino yang  wrote:

> +1 (non-binding)
>
> - checked out the source code and built it successfully
> - started a local cluster and ran some example jobs successfully
> - verified signatures and hashes
> - checked the release notes and announcement post
>
> Best,
> Vino
>
> Stephan Ewen  于2019年8月21日周三 上午4:20写道:
>
> > +1 (binding)
> >
> >  - Downloaded the binary release tarball
> >  - started a standalone cluster with four nodes
> >  - ran some examples through the Web UI
> >  - checked the logs
> >  - created a project from the Java quickstarts maven archetype
> >  - ran a multi-stage DataSet job in batch mode
> >  - killed a TaskManager and verified correct restart behavior, including
> > failover region backtracking
> >
> >
> > I found a few issues, and a common theme here is confusing error
> reporting
> > and logging.
> >
> > (1) When testing batch failover and killing a TaskManager, the job
> reports
> > as the failure cause "org.apache.flink.util.FlinkException: The assigned
> > slot 6d0e469d55a2630871f43ad0f89c786c_0 was removed."
> > I think that is a pretty bad error message, as a user I don't know
> what
> > that means. Some internal bookkeeping thing?
> > You need to know a lot about Flink to understand that this means
> > "TaskManager failure".
> > https://issues.apache.org/jira/browse/FLINK-13805
> > I would not block the release on this, but think this should get
> pretty
> > urgent attention.
> >
> > (2) The Metric Fetcher floods the log with error messages when a
> > TaskManager is lost.
> >  There are many exceptions being logged by the Metrics Fetcher due to
> > not reaching the TM any more.
> >  This pollutes the log and drowns out the original exception and the
> > meaningful logs from the scheduler/execution graph.
> >  https://issues.apache.org/jira/browse/FLINK-13806
> >  Again, I would not block the release on this, but think this should
> > get pretty urgent attention.
> >
> > (3) If you put "web.submit.enable: false" into the configuration, the web
> > UI will still display the "SubmitJob" page, but errors will
> > continuously pop up, stating "Unable to load requested file /jars."
> > https://issues.apache.org/jira/browse/FLINK-13799
> >
> > (4) REST endpoint logs ERROR level messages when selecting the
> > "Checkpoints" tab for batch jobs. That does not seem correct.
> >  https://issues.apache.org/jira/browse/FLINK-13795
> >
> > Best,
> > Stephan
> >
> >
> >
> >
> > On Tue, Aug 20, 2019 at 11:32 AM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org>
> > wrote:
> >
> > > +1
> > >
> > > Legal checks:
> > > - verified signatures and hashes
> > > - New bundled Javascript dependencies for flink-runtime-web are
> correctly
> > > reflected under licenses-binary and NOTICE file.
> > > - locally built from source (Scala 2.12, without Hadoop)
> > > - No missing artifacts in staging repo
> > > - No binaries in source release
> > >
> > > Functional checks:
> > > - Quickstart working (both in IDE + job submission)
> > > - Simple State Processor API program that performs offline key schema
> > > migration (RocksDB backend). Generated savepoint is valid to restore
> > from.
> > > - All E2E tests pass locally
> > > - Didn’t notice any issues with the new WebUI
> > >
> > > Cheers,
> > > Gordon
> > >
> > > On Tue, Aug 20, 2019 at 3:53 AM Zili Chen 
> wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > - build from source: OK(8u212)
> > > > - check local setup tutorial works as expected
> > > >
> > > > Best,
> > > > tison.
> > > >
> > > >
> > > > Yu Li  于2019年8月20日周二 上午8:24写道:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > - checked release notes: OK
> > > > > - checked sums and signatures: OK
> > > > > - repository appears to contain all expected artifacts
> > > > > - source release
> > > > >  - contains no binaries: OK
> > > > >  - contains no 1.9-SNAPSHOT references: OK
> > > > >  - build from source: OK (8u102)
> > > > > - binary release
> > > > >  - no examples appear to be missing
> > > > >  - started a cluster; WebUI reachable, example ran successfully
> > > > > - checked README.md file and found nothing unexpected
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > > >
> > > > > On Tue, 20 Aug 2019 at 01:16, Tzu-Li (Gordon) Tai <
> > tzuli...@apache.org
> > > >
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Release candidate #3 for Apache Flink 1.9.0 is now ready for your
> > > > review.
> > > > > >
> > > > > > Please review and vote on release candidate #3 for version 1.9.0,
> > as
> > > > > > follows:
> > > > > > [ ] +1, Approve the release
> > > > > > [ ] -1, Do not approve the release (please provide specific
> > comments)
> > > > > >
> > > > > > The complete staging area is available for your review, which
> > > includes:
> > > > > > * JIRA release notes [1],
> > > > > > * the official Apache source release and binary convenience
>

Re: [VOTE] Apache Flink 1.9.0, release candidate #3

2019-08-20 Thread Bowen Li
+1 non-binding

- built from source with the default profile
- manually ran SQL and Table API tests for Flink's metadata integration
with Hive Metastore in a local cluster
- manually ran SQL tests for batch capability with the Blink planner and
Hive integration (source/sink/udf) in a local cluster
- file formats covered: csv, orc, parquet


On Tue, Aug 20, 2019 at 10:23 PM Gary Yao  wrote:

> +1 (non-binding)
>
> Reran Jepsen tests 10 times.
>
> On Wed, Aug 21, 2019 at 5:35 AM vino yang  wrote:
>
> > +1 (non-binding)
> >
> > - checked out the source code and built it successfully
> > - started a local cluster and ran some example jobs successfully
> > - verified signatures and hashes
> > - checked the release notes and announcement post
> >
> > Best,
> > Vino
> >
> > Stephan Ewen  于2019年8月21日周三 上午4:20写道:
> >
> > > +1 (binding)
> > >
> > >  - Downloaded the binary release tarball
> > >  - started a standalone cluster with four nodes
> > >  - ran some examples through the Web UI
> > >  - checked the logs
> > >  - created a project from the Java quickstarts maven archetype
> > >  - ran a multi-stage DataSet job in batch mode
> > >  - killed a TaskManager and verified correct restart behavior,
> including
> > > failover region backtracking
> > >
> > >
> > > I found a few issues, and a common theme here is confusing error
> > reporting
> > > and logging.
> > >
> > > (1) When testing batch failover and killing a TaskManager, the job
> > reports
> > > as the failure cause "org.apache.flink.util.FlinkException: The
> assigned
> > > slot 6d0e469d55a2630871f43ad0f89c786c_0 was removed."
> > > I think that is a pretty bad error message, as a user I don't know
> > what
> > > that means. Some internal bookkeeping thing?
> > > You need to know a lot about Flink to understand that this means
> > > "TaskManager failure".
> > > https://issues.apache.org/jira/browse/FLINK-13805
> > > I would not block the release on this, but think this should get
> > pretty
> > > urgent attention.
> > >
> > > (2) The Metric Fetcher floods the log with error messages when a
> > > TaskManager is lost.
> > >  There are many exceptions being logged by the Metrics Fetcher due
> to
> > > not reaching the TM any more.
> > >  This pollutes the log and drowns out the original exception and
> the
> > > meaningful logs from the scheduler/execution graph.
> > >  https://issues.apache.org/jira/browse/FLINK-13806
> > >  Again, I would not block the release on this, but think this
> should
> > > get pretty urgent attention.
> > >
> > > (3) If you put "web.submit.enable: false" into the configuration, the
> web
> > > UI will still display the "SubmitJob" page, but errors will
> > > continuously pop up, stating "Unable to load requested file /jars."
> > > https://issues.apache.org/jira/browse/FLINK-13799
> > >
> > > (4) REST endpoint logs ERROR level messages when selecting the
> > > "Checkpoints" tab for batch jobs. That does not seem correct.
> > >  https://issues.apache.org/jira/browse/FLINK-13795
> > >
> > > Best,
> > > Stephan
> > >
> > >
> > >
> > >
> > > On Tue, Aug 20, 2019 at 11:32 AM Tzu-Li (Gordon) Tai <
> > tzuli...@apache.org>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > Legal checks:
> > > > - verified signatures and hashes
> > > > - New bundled Javascript dependencies for flink-runtime-web are
> > correctly
> > > > reflected under licenses-binary and NOTICE file.
> > > > - locally built from source (Scala 2.12, without Hadoop)
> > > > - No missing artifacts in staging repo
> > > > - No binaries in source release
> > > >
> > > > Functional checks:
> > > > - Quickstart working (both in IDE + job submission)
> > > > - Simple State Processor API program that performs offline key schema
> > > > migration (RocksDB backend). Generated savepoint is valid to restore
> > > from.
> > > > - All E2E tests pass locally
> > > > - Didn’t notice any issues with the new WebUI
> > > >
> > > > Cheers,
> > > > Gordon
> > > >
> > > > On Tue, Aug 20, 2019 at 3:53 AM Zili Chen 
> > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > - build from source: OK(8u212)
> > > > > - check local setup tutorial works as expected
> > > > >
> > > > > Best,
> > > > > tison.
> > > > >
> > > > >
> > > > > Yu Li  于2019年8月20日周二 上午8:24写道:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > - checked release notes: OK
> > > > > > - checked sums and signatures: OK
> > > > > > - repository appears to contain all expected artifacts
> > > > > > - source release
> > > > > >  - contains no binaries: OK
> > > > > >  - contains no 1.9-SNAPSHOT references: OK
> > > > > >  - build from source: OK (8u102)
> > > > > > - binary release
> > > > > >  - no examples appear to be missing
> > > > > >  - started a cluster; WebUI reachable, example ran
> successfully
> > > > > > - checked README.md file and found nothing unexpected
> > > > > >
> > > > > > Best Regards,
> > > > > > Yu
> > > > > >
> > > > > >
> > > > > > On Tue, 20 Aug 2019 at 01:16, Tzu

Re: [DISCUSS][CODE STYLE] Breaking long function argument lists and chained method calls

2019-08-20 Thread Zili Chen
Implementation question: how do we apply the line length rules?

If we just turn on the checkstyle rule "LineLength", then a huge
effort is required to re-break all the lines that currently violate
it. If we use an auto-formatter instead, it may break lines "just at
the limit", which reads awfully (a rough illustration follows below).

Is it possible to require conformance only for the files a pull
request touches, fixing them on the fly?
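
For concreteness, a rough illustration (all names made up) of the "awful"
automatic break versus the consistent breaking style discussed here:

class LineBreakingExample {

    // What a formatter that only enforces the limit tends to produce:
    // a break wherever the limit happens to fall.
    static String awkward(String jobName, String clusterId, int parallelism,
            boolean detachedMode) {
        return jobName + "@" + clusterId + ":" + parallelism + ":" + detachedMode;
    }

    // The consistent alternative: once the limit is exceeded, break all
    // arguments, one per line.
    static String consistent(
            String jobName,
            String clusterId,
            int parallelism,
            boolean detachedMode) {
        return jobName + "@" + clusterId + ":" + parallelism + ":" + detachedMode;
    }

    // Chained method calls: with more than one call, each call goes on
    // its own line, starting with the dot.
    static String chained(java.util.List<String> values) {
        return values.stream()
            .filter(s -> !s.isEmpty())
            .map(String::trim)
            .reduce("", String::concat);
    }
}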

Best,
tison.


Yu Li  于2019年8月20日周二 下午5:22写道:

> I second Stephan's summary, and to be more explicit, +1 on:
> - Set a hard line length limit
> - Allow arguments on the same line if below length limit
> - With consistent argument breaking when that length is exceeded
> - Developers can break before that if they feel it helps with readability
>
> FWIW, the hbase project also sets the line length limit to 100 [1]
> (personally I don't have a strong preference on this, so JFYI).
>
> [1]
>
> https://github.com/apache/hbase/blob/a59f7d4ffc27ea23b9822c3c26d6aeb76ccdf9aa/hbase-checkstyle/src/main/resources/hbase/checkstyle.xml#L128
>
> Best Regards,
> Yu
>
>
> On Mon, 19 Aug 2019 at 18:22, Stephan Ewen  wrote:
>
> > I personally prefer not to break lines with few parameters.
> > It just feels unnecessarily clumsy to parse the breaks if there are only
> > two or three arguments with short names.
> >
> > So +1
> >   - for a hard line length limit
> >   - allowing arguments on the same line if below that limit
> >   - with consistent argument breaking when that length is exceeded
> >   - developers can break before that if they feel it helps with
> > readability.
> >
> > This should be similar to what we have, except for enforcing the line
> > length limit.
> >
> > I think our Java guide originally suggested a 120-character line length;
> > we can reduce that to 100 if a majority argues for it, but that is a
> > separate discussion.
> > We use shorter lines in Scala (100 chars) because Scala code becomes
> > hard to read more quickly with long lines.
> >
> >
> > On Mon, Aug 19, 2019 at 10:45 AM Andrey Zagrebin 
> > wrote:
> >
> > > Hi Everybody,
> > >
> > > Thanks for your feedback guys and sorry for not getting back to the
> > > discussion for some time.
> > >
> > > @SHI Xiaogang
> > > About breaking lines for thrown exceptions:
> > > Indeed that would prevent growing the throw clause indefinitely.
> > > I am a bit concerned about putting the right parenthesis and/or throws
> > > clause on the next line, because in general we do not do it, and there
> > > are many variations of how and what to put on the next line, so it
> > > needs explicit memorising.
> > > Also, we do not have many checked exceptions and usually avoid them.
> > > Although I am not a big fan of many function arguments either, that
> > > seems to be a bigger problem in the code base.
> > > I would be ok to not enforce anything for exceptions atm.
> > >
> > > @Chesnay Schepler 
> > > Thanks for mentioning automatic checks.
> > > Indeed, pointing out these kinds of style issues during PR reviews is
> > > very tedious, and we cannot really enforce the style without automated
> > > tools.
> > > I would still consider the outcome of this discussion as a soft
> > > recommendation atm (which we also have for some other things in the
> code
> > > style draft).
> > > We need more investigation about how to enforce things. I am not so
> > > knowledgeable about code style/IDE checks.
> > > From the first glance I also do not see a simple way. If somebody has
> > more
> > > insight please share your experience.
> > >
> > > @Biao Liu 
> > > Line length limitation:
> > > I do not see anything for Java, only for Scala: 100 (also enforced by
> > build
> > > AFAIK).
> > > From what I heard, there has already been some discussion about a hard
> > > limit for the line length.
> > > Although quite some people are in favour of it (including me), and it
> > > seems to be a nice limitation, there are some practical implications.
> > > Historically, Flink did not have any code style checks, so huge chunks
> > > of the code base would have to be reformatted, destroying the commit
> > > history.
> > > Another question is the value for the limit. Nowadays we have wide
> > > screens and do not often even need to scroll.
> > > Nevertheless, we can kick off another discussion about the line length
> > > limit and enforcing it.
> > > Atm I see people adhering to a soft recommendation of a 120-character
> > > line length for Java, because it is usually a bit more verbose compared
> > > to Scala.
> > >
> > > *Question 1*:
> > > I would be ok to always break the line if there is more than one
> > > chained call.
> > > There are a lot of places with only one short call; I would not break
> > > the line in that case.
> > > If that is too confusing, I would be ok to stick to the rule of
> > > breaking either all or none.
> > > Thanks for pointing this out explicitly: for chained method calls, each
> > > new line should start with the dot.
> > > I think that should also be part of the rule if it is enforced.
> > >
> > > *Question 2:*
> > > The indent of new line should be 1 ta