Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Zhu Zhu
Thanks Yun for starting this discussion.
I think the multicasting can be very helpful in certain cases.

I have received requirements from users who want to do a broadcast
join, while the data set to broadcast is too large to fit in one task.
Thus the requirement turned out to be supporting the Cartesian product of two
data sets (one of which can be an infinite stream).
For example, A (parallelism=2) broadcast joins B (parallelism=2) in JobVertex
C.
The idea is to have 4 C subtasks deal with different combinations of A/B
partitions, like C1(A1,B1), C2(A1,B2), C3(A2,B1), C4(A2,B2).
This requires one record to be sent to multiple downstream subtasks, but
not to all subtasks.

With the current interface this is not supported, as one record can only be
sent to one subtask, or to all subtasks of a JobVertex.
The user thus had to split the broadcast data set manually into several
different JobVertices, which is hard to maintain and extend.

Thanks,
Zhu Zhu

Yun Gao wrote on Thu, Aug 22, 2019 at 8:42 PM:

> Hi everyone,
>   In some scenarios we have met a requirement that some operators want to
> send records to their downstream operators with a multicast communication
> pattern. In detail, for some records, the operators want to send them
> according to the partitioner (for example, Rebalance), and for some other
> records, the operators want to send them to all the connected operators and
> tasks. Such a communication pattern could be viewed as a kind of multicast:
> it does not broadcast every record, but some records will indeed be sent to
> multiple downstream operators.
>
> However, we found that this kind of communication pattern cannot
> be implemented correctly if the operators have multiple consumers with
> different parallelism, using a customized partitioner. To solve the above
> problem, we propose to enhance the support for such a kind of irregular
> communication pattern. We think there may be two options:
>
>  1. Support a kind of customized operator event, which shares much
> similarity with Watermark; these events can be broadcast to the
> downstream operators separately.
>  2. Let the channel selector support multicast, and also add a
> separate RecordWriter implementation to avoid impacting the performance of
> the channel selectors that do not need multicast.
>
> The problem and options are detailed in
> https://docs.google.com/document/d/1npi5c_SeP68KuT2lNdKd8G7toGR_lxQCGOnZm_hVMks/edit?usp=sharing
>
> We are also wondering if there are other methods to implement this
> requirement with or without changing the Runtime. Many thanks for any
> feedback!
>
>
> Best,
> Yun
>
>


Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Piotr Nowojski
Hi,

Yun:

Thanks for proposing the idea. I have checked the document and left a couple of 
questions there, but it might be better to answer them here.

What is the exact motivation and what problems do you want to solve? We have 
dropped multicast support from the network stack [1] for two reasons:
1. Performance 
2. Code simplicity 

The proposal to re-introduce `int[] ChannelSelector#selectChannels()` would 
revert those changes. At that time we were thinking about a way to keep the 
multicast support on the network level, while keeping the performance and 
simplicity for non-multicast cases, and there are ways to achieve that. However, 
they would add extra complexity to Flink, which would be better to avoid.

On the other hand, supporting the dual pattern of standard partitioning plus 
broadcasting is easy to do, as LatencyMarkers are doing exactly that. It would 
be just a matter of exposing this to the user in some way. So before we go any 
further, can you describe your use cases/motivation? Isn't a mix of standard 
partitioning and broadcasting enough? Do we need multicasting?
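
(To illustrate the dual pattern: a minimal sketch in plain Java, with hypothetical
names; `DualPatternWriter` is not an existing Flink class, although Flink's
RecordWriter exposes similarly named emit methods.)

```java
// Data records follow the partitioner, control events go everywhere,
// similar to how Watermarks and LatencyMarkers are propagated.
interface DualPatternWriter<T> {
    void emit(T record);          // routed to one channel via the partitioner
    void broadcastEmit(T record); // sent to every downstream channel
}

class DualPatternExample {
    static <T> void send(DualPatternWriter<T> writer, T element, boolean isControlEvent) {
        if (isControlEvent) {
            writer.broadcastEmit(element);
        } else {
            writer.emit(element);
        }
    }
}
```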

Zhu:

Could you rephrase your example? I didn’t quite understand it.

Piotrek

[1] https://issues.apache.org/jira/browse/FLINK-10662 


> On 23 Aug 2019, at 09:17, Zhu Zhu  wrote:
> 
> Thanks Yun for starting this discussion.
> I think the multicasting can be very helpful in certain cases.
> 
> I have received requirements from users who want to do a broadcast
> join, while the data set to broadcast is too large to fit in one task.
> Thus the requirement turned out to be supporting the Cartesian product of two
> data sets (one of which can be an infinite stream).
> For example, A (parallelism=2) broadcast joins B (parallelism=2) in JobVertex
> C.
> The idea is to have 4 C subtasks deal with different combinations of A/B
> partitions, like C1(A1,B1), C2(A1,B2), C3(A2,B1), C4(A2,B2).
> This requires one record to be sent to multiple downstream subtasks, but
> not to all subtasks.
> 
> With the current interface this is not supported, as one record can only be
> sent to one subtask, or to all subtasks of a JobVertex.
> The user thus had to split the broadcast data set manually into several
> different JobVertices, which is hard to maintain and extend.
> 
> Thanks,
> Zhu Zhu
> 
Yun Gao wrote on Thu, Aug 22, 2019 at 8:42 PM:
> 
>> Hi everyone,
>>  In some scenarios we have met a requirement that some operators want to
>> send records to their downstream operators with a multicast communication
>> pattern. In detail, for some records, the operators want to send them
>> according to the partitioner (for example, Rebalance), and for some other
>> records, the operators want to send them to all the connected operators and
>> tasks. Such a communication pattern could be viewed as a kind of multicast:
>> it does not broadcast every record, but some records will indeed be sent to
>> multiple downstream operators.
>> 
>> However, we found that this kind of communication pattern cannot
>> be implemented correctly if the operators have multiple consumers with
>> different parallelism, using a customized partitioner. To solve the above
>> problem, we propose to enhance the support for such a kind of irregular
>> communication pattern. We think there may be two options:
>> 
>> 1. Support a kind of customized operator event, which shares much
>> similarity with Watermark; these events can be broadcast to the
>> downstream operators separately.
>> 2. Let the channel selector support multicast, and also add a
>> separate RecordWriter implementation to avoid impacting the performance of
>> the channel selectors that do not need multicast.
>> 
>> The problem and options are detailed in
>> https://docs.google.com/document/d/1npi5c_SeP68KuT2lNdKd8G7toGR_lxQCGOnZm_hVMks/edit?usp=sharing
>> 
>> We are also wondering if there are other methods to implement this
>> requirement with or without changing the Runtime. Many thanks for any
>> feedback!
>> 
>> 
>> Best,
>> Yun
>> 
>> 



Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Gavin Lee
Why does bin/start-scala-shell.sh local return the following error?

bin/start-scala-shell.sh local

Error: Could not find or load main class
org.apache.flink.api.scala.FlinkShell
For Flink 1.8.1 and previous versions, there were no such issues.

On Fri, Aug 23, 2019 at 2:05 PM qi luo  wrote:

> Congratulations and thanks for the hard work!
>
> Qi
>
> On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai 
> wrote:
>
> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.9.0, which is the latest major release.
>
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data streaming
> applications.
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> Please check out the release blog post for an overview of the improvements
> for this new major release:
> https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>
> We would like to thank all contributors of the Apache Flink community who
> made this release possible!
>
> Cheers,
> Gordon
>
>
>

-- 
Gavin


[jira] [Created] (FLINK-13823) Incorrect debug log in CompileUtils

2019-08-23 Thread wangsan (Jira)
wangsan created FLINK-13823:
---

 Summary: Incorrect debug log in CompileUtils
 Key: FLINK-13823
 URL: https://issues.apache.org/jira/browse/FLINK-13823
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.9.0
Reporter: wangsan
 Fix For: 1.9.1


There is a typo in `CompileUtils`:

```java
CODE_LOG.debug("Compiling: %s \n\n Code:\n%s", name, code);
```

The placeholder should be `{}` instead of `%s`.
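
For reference, the corrected call with SLF4J-style placeholders would be:

```java
// '{}' is the SLF4J placeholder syntax; '%s' would be printed literally.
CODE_LOG.debug("Compiling: {} \n\n Code:\n{}", name, code);
```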



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13824) Code duplication in tools/travis_watchdog.sh to launch watchdog process

2019-08-23 Thread Alex (Jira)
Alex created FLINK-13824:


 Summary: Code duplication in tools/travis_watchdog.sh to launch 
watchdog process
 Key: FLINK-13824
 URL: https://issues.apache.org/jira/browse/FLINK-13824
 Project: Flink
  Issue Type: Bug
  Components: Travis
Reporter: Alex


The {{travis_watchdog.sh}} script launches monitoring processes for the Java processes 
(spawned by {{mvn install}} and {{mvn verify}}).
The launch logic is copy-pasted twice and prints the wrong pid in the second use 
case.




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13825) The original plugins dir is not restored after e2e test run

2019-08-23 Thread Alex (Jira)
Alex created FLINK-13825:


 Summary: The original plugins dir is not restored after e2e test 
run
 Key: FLINK-13825
 URL: https://issues.apache.org/jira/browse/FLINK-13825
 Project: Flink
  Issue Type: Bug
Reporter: Alex


Previously, the result of the Flink distribution build didn't contain a {{plugins}} dir.
Instead, for some e2e tests, the directory was created (and removed) by a 
test's setup steps.

FLINK-12868 has added a pre-created {{plugins}} dir to the Flink distribution 
build, but without adjusting the e2e tests. As a result, after some e2e tests 
run, the original directory is removed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: CiBot Update

2019-08-23 Thread Chesnay Schepler
@Ethan Li The source for the CiBot is available here. The implementation of this 
command is tightly connected to how the CiBot works, but conceptually it 
looks at a PR, finds the most recent build that ran, and uses the Travis 
REST API to restart the build.
Additionally, it keeps track of which comments have been processed by 
storing the comment ID in the CI report.

If you have further questions, feel free to ping me directly.

@Dian Fu I agree, we should include it somewhere in either the flinkbot 
template or the CI report.


On 23/08/2019 03:35, Dian Fu wrote:

Thanks Chesnay for your great work! A very useful feature!

Just one minor suggestion: It will be better if we could add this command to the section 
"Bot commands" in the flinkbot template.

Regards,
Dian


On Aug 23, 2019, at 2:06 AM, Ethan Li wrote:

My question is specifically about implementation of "@flinkbot run travis"


On Aug 22, 2019, at 1:06 PM, Ethan Li  wrote:

Hi Chesnay,

This is really nice feature!

Can I ask how this is implemented? Do you have related Jira issues/PRs/docs that I 
can take a look at? I'd like to introduce it to another project if applicable. 
Thank you very much!

Best,
Ethan


On Aug 22, 2019, at 8:34 AM, Biao Liu wrote:

Thanks Chesnay a lot,

I love this feature!

Thanks,
Biao /'bɪ.aʊ/



On Thu, 22 Aug 2019 at 20:55, Hequn Cheng wrote:


Cool, thanks Chesnay a lot for the improvement!

Best, Hequn

On Thu, Aug 22, 2019 at 5:02 PM Zhu Zhu wrote:


Thanks Chesnay for the CI improvement!
It is very helpful.

Thanks,
Zhu Zhu

zhijiang wrote on Thu, Aug 22, 2019 at 4:18 PM:


It is really very convenient now. Valuable work, Chesnay!

Best,
Zhijiang
--
From: Till Rohrmann
Send Time: Thu, Aug 22, 2019, 10:13
To: dev
Subject:Re: CiBot Update

Thanks for the continuous work on the CiBot Chesnay!

Cheers,
Till

On Thu, Aug 22, 2019 at 9:47 AM Jark Wu wrote:


Great work! Thanks Chesnay!



On Thu, 22 Aug 2019 at 15:42, Xintong Song wrote:

The re-triggering travis feature is so convenient. Thanks Chesnay~!

Thank you~

Xintong Song



On Thu, Aug 22, 2019 at 9:26 AM Stephan Ewen wrote:

Nice, thanks!

On Thu, Aug 22, 2019 at 3:59 AM Zili Chen wrote:

Thanks for your announcement. Nice work!

Best,
tison.


vino yang wrote on Thu, Aug 22, 2019 at 8:14 AM:


+1 for "@flinkbot run travis", it is very convenient.

Chesnay Schepler wrote on Wed, Aug 21, 2019 at 9:12 PM:

Hi everyone,

this is an update on recent changes to the CI bot.


The bot now cancels builds if a new commit was added to a PR,
and cancels all builds if the PR was closed.
(This was implemented a while ago; I'm just mentioning it again
for discoverability)

Additionally, starting today you can now re-trigger a Travis run by
writing a comment "@flinkbot run travis"; this means you no longer have
to commit an empty commit or do other shenanigans to get another build
running.
Note that this will /not/ work if the PR was re-opened, until at least
1 new build was triggered by a push.









Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Zhu Zhu
Hi Piotr,

The case is about a broadcast join:
A--\
 +--(join)--> C
B--/

Usually we can broadcast A(the result that JobVertex A produces) to all
subtasks of C.
But in this case the size of A is too large to fit in one subtask of C.
Thus we have to partition A into (A_0, A_1, A_2, ..., A_m-1).
The throughput of B is too large to handle in one subtask as well, so we
partition B into (B_0, B_1, B_2, ..., B_n-1).

Now if we want to join A and B, the basic idea is to set the parallelism of C
to m*n, and subtask C_kn+l should deal with the join work of (A_k, B_l).
To achieve this,
each record in partition A_k should be sent to *n* downstream subtasks:
{C_kn, C_kn+1, C_kn+2, ..., C_kn+n-1}
each record in partition B_l should be sent to *m* downstream
subtasks: {C_l, C_n+l, C_2n+l, ..., C_(m-1)n+l}

This is different from the current single-cast or broadcast patterns.
That's why I think multicast can help with this case.
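
To make the routing concrete, here is a minimal Java sketch of the two
selectors (the `MulticastChannelSelector` interface and the `Record` type are
hypothetical, shown for illustration only; they are not an existing Flink API):

```java
/** Illustrative record type carrying the index of the input partition it came from. */
class Record {
    final int partition;
    Record(int partition) { this.partition = partition; }
}

/** Hypothetical selector contract returning several target channels per record. */
interface MulticastChannelSelector {
    int[] selectChannels(Record record);
}

/** Sends records of partition A_k to the n channels {k*n, ..., k*n + n - 1}. */
class ASideSelector implements MulticastChannelSelector {
    private final int n; // number of B partitions
    ASideSelector(int n) { this.n = n; }

    @Override
    public int[] selectChannels(Record record) {
        int k = record.partition;
        int[] channels = new int[n];
        for (int i = 0; i < n; i++) {
            channels[i] = k * n + i;
        }
        return channels;
    }
}

/** Sends records of partition B_l to the m channels {l, n + l, ..., (m-1)*n + l}. */
class BSideSelector implements MulticastChannelSelector {
    private final int m; // number of A partitions
    private final int n; // number of B partitions
    BSideSelector(int m, int n) { this.m = m; this.n = n; }

    @Override
    public int[] selectChannels(Record record) {
        int l = record.partition;
        int[] channels = new int[m];
        for (int i = 0; i < m; i++) {
            channels[i] = i * n + l;
        }
        return channels;
    }
}
```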

Thanks,
Zhu Zhu

Piotr Nowojski wrote on Fri, Aug 23, 2019 at 3:20 PM:

> Hi,
>
> Yun:
>
> Thanks for proposing the idea. I have checked the document and left a couple
> of questions there, but it might be better to answer them here.
>
> What is the exact motivation and what problems do you want to solve? We
> have dropped multicast support from the network stack [1] for two reasons:
> 1. Performance
> 2. Code simplicity
>
> The proposal to re-introduce `int[] ChannelSelector#selectChannels()`
> would revert those changes. At that time we were thinking about a way
> to keep the multicast support on the network level, while keeping the
> performance and simplicity for non-multicast cases, and there are ways to
> achieve that. However, they would add extra complexity to Flink, which
> would be better to avoid.
>
> On the other hand, supporting the dual pattern of standard partitioning plus
> broadcasting is easy to do, as LatencyMarkers are doing exactly that. It
> would be just a matter of exposing this to the user in some way. So before
> we go any further, can you describe your use cases/motivation? Isn't a mix of
> standard partitioning and broadcasting enough? Do we need multicasting?
>
> Zhu:
>
> Could you rephrase your example? I didn’t quite understand it.
>
> Piotrek
>
> [1] https://issues.apache.org/jira/browse/FLINK-10662
>
> > On 23 Aug 2019, at 09:17, Zhu Zhu  wrote:
> >
> > Thanks Yun for starting this discussion.
> > I think the multicasting can be very helpful in certain cases.
> >
> > I have received requirements from users who want to do a broadcast
> > join, while the data set to broadcast is too large to fit in one task.
> > Thus the requirement turned out to be supporting the Cartesian product of two
> > data sets (one of which can be an infinite stream).
> > For example, A (parallelism=2) broadcast joins B (parallelism=2) in JobVertex
> > C.
> > The idea is to have 4 C subtasks deal with different combinations of A/B
> > partitions, like C1(A1,B1), C2(A1,B2), C3(A2,B1), C4(A2,B2).
> > This requires one record to be sent to multiple downstream subtasks, but
> > not to all subtasks.
> >
> > With the current interface this is not supported, as one record can only be
> > sent to one subtask, or to all subtasks of a JobVertex.
> > The user thus had to split the broadcast data set manually into several
> > different JobVertices, which is hard to maintain and extend.
> >
> > Thanks,
> > Zhu Zhu
> >
> > Yun Gao wrote on Thu, Aug 22, 2019 at 8:42 PM:
> >
> >> Hi everyone,
> >>  In some scenarios we have met a requirement that some operators want to
> >> send records to their downstream operators with a multicast communication
> >> pattern. In detail, for some records, the operators want to send them
> >> according to the partitioner (for example, Rebalance), and for some other
> >> records, the operators want to send them to all the connected operators and
> >> tasks. Such a communication pattern could be viewed as a kind of multicast:
> >> it does not broadcast every record, but some records will indeed be sent to
> >> multiple downstream operators.
> >>
> >> However, we found that this kind of communication pattern cannot
> >> be implemented correctly if the operators have multiple consumers with
> >> different parallelism, using a customized partitioner. To solve the above
> >> problem, we propose to enhance the support for such a kind of irregular
> >> communication pattern. We think there may be two options:
> >>
> >> 1. Support a kind of customized operator event, which shares much
> >> similarity with Watermark; these events can be broadcast to the
> >> downstream operators separately.
> >> 2. Let the channel selector support multicast, and also add a
> >> separate RecordWriter implementation to avoid impacting the performance of
> >> the channel selectors that do not need multicast.
> >>
> >> The problem and options are detailed in
> >> https://docs.google.com/document/d/1npi5c_SeP68KuT2lNdKd8G7toGR_lxQCGOnZm_hVMks/edit?usp=sharing

[jira] [Created] (FLINK-13826) Support INSERT OVERWRITE for Hive connector

2019-08-23 Thread Rui Li (Jira)
Rui Li created FLINK-13826:
--

 Summary: Support INSERT OVERWRITE for Hive connector
 Key: FLINK-13826
 URL: https://issues.apache.org/jira/browse/FLINK-13826
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive, Table SQL / Planner
Reporter: Rui Li






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13827) shell variable should be escaped in start-scala-shell.sh

2019-08-23 Thread TisonKun (Jira)
TisonKun created FLINK-13827:


 Summary: shell variable should be escaped in start-scala-shell.sh
 Key: FLINK-13827
 URL: https://issues.apache.org/jira/browse/FLINK-13827
 Project: Flink
  Issue Type: Bug
  Components: Scala Shell
Affects Versions: 1.9.0, 1.10.0
Reporter: TisonKun
 Fix For: 1.10.0, 1.9.1



{code:java}
diff --git a/flink-scala-shell/start-script/start-scala-shell.sh b/flink-scala-shell/start-script/start-scala-shell.sh
index b6da81af72..65b9045584 100644
--- a/flink-scala-shell/start-script/start-scala-shell.sh
+++ b/flink-scala-shell/start-script/start-scala-shell.sh
@@ -97,9 +97,9 @@ log_setting="-Dlog.file="$LOG" -Dlog4j.configuration=file:"$FLINK_CONF_DIR"/$LOG
 
 if ${EXTERNAL_LIB_FOUND}
 then
-$JAVA_RUN -Dscala.color -cp "$FLINK_CLASSPATH" $log_setting org.apache.flink.api.scala.FlinkShell $@ --addclasspath "$EXT_CLASSPATH"
+$JAVA_RUN -Dscala.color -cp "$FLINK_CLASSPATH" "$log_setting" org.apache.flink.api.scala.FlinkShell $@ --addclasspath "$EXT_CLASSPATH"
 else
-$JAVA_RUN -Dscala.color -cp "$FLINK_CLASSPATH" $log_setting org.apache.flink.api.scala.FlinkShell $@
+$JAVA_RUN -Dscala.color -cp "$FLINK_CLASSPATH" "$log_setting" org.apache.flink.api.scala.FlinkShell $@
 fi
 
 #restore echo
{code}

Otherwise it is error-prone when {{$log_setting}} contains arbitrary content. 




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Zili Chen
Hi Gavin,

I also found a problem in the shell: if the directory contains whitespace,
then the final command to run is incorrect. Could you check the
final command to be executed?

FYI, here is the ticket[1].

Best,
tison.

[1] https://issues.apache.org/jira/browse/FLINK-13827


Gavin Lee wrote on Fri, Aug 23, 2019 at 3:36 PM:

> Why does bin/start-scala-shell.sh local return the following error?
>
> bin/start-scala-shell.sh local
>
> Error: Could not find or load main class
> org.apache.flink.api.scala.FlinkShell
> For Flink 1.8.1 and previous versions, there were no such issues.
>
> On Fri, Aug 23, 2019 at 2:05 PM qi luo  wrote:
>
>> Congratulations and thanks for the hard work!
>>
>> Qi
>>
>> On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai 
>> wrote:
>>
>> The Apache Flink community is very happy to announce the release of
>> Apache Flink 1.9.0, which is the latest major release.
>>
>> Apache Flink® is an open-source stream processing framework for
>> distributed, high-performing, always-available, and accurate data streaming
>> applications.
>>
>> The release is available for download at:
>> https://flink.apache.org/downloads.html
>>
>> Please check out the release blog post for an overview of the
>> improvements for this new major release:
>> https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>>
>> The full release notes are available in Jira:
>>
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>>
>> We would like to thank all contributors of the Apache Flink community who
>> made this release possible!
>>
>> Cheers,
>> Gordon
>>
>>
>>
>
> --
> Gavin
>


Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Yun Gao
   Hi Piotr,

Thanks a lot for the suggestions!

The core motivation of this discussion is to implement a new iteration 
library on the DataStream API, and it requires inserting special records into the 
stream to notify the progress of the iteration. The mechanism of such records 
is very similar to the current Watermark, and we meet the problem of sending 
normal records according to the partitioner (Rebalance, etc.) while also being able 
to broadcast the inserted progress records to all the connected tasks. I have 
read the notes in the Google doc and I totally agree that exposing the 
broadcast interface of the RecordWriter in some way is able to solve this issue. 

   Regarding `int[] ChannelSelector#selectChannels()`, I'm wondering: if 
we introduce a new MulticastRecordWriter and leave the current RecordWriter 
untouched, could we avoid the performance degradation? Since with such a 
modification the normal RecordWriter does not need to iterate over the array 
returned by the ChannelSelector; the only difference will be returning an array 
instead of an integer, and accessing the first element of the returned array instead of 
reading the integer directly.

Best,
Yun


--
From:Piotr Nowojski 
Send Time:2019 Aug. 23 (Fri.) 15:20
To:dev 
Cc:Yun Gao 
Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

Hi,

Yun:

Thanks for proposing the idea. I have checked the document and left a couple of 
questions there, but it might be better to answer them here.

What is the exact motivation and what problems do you want to solve? We have 
dropped multicast support from the network stack [1] for two reasons:
1. Performance 
2. Code simplicity 

The proposal to re-introduce `int[] ChannelSelector#selectChannels()` would 
revert those changes. At that time we were thinking about a way to keep the 
multicast support on the network level, while keeping the performance and 
simplicity for non-multicast cases, and there are ways to achieve that. However, 
they would add extra complexity to Flink, which would be better to avoid.

On the other hand, supporting the dual pattern of standard partitioning plus 
broadcasting is easy to do, as LatencyMarkers are doing exactly that. It would 
be just a matter of exposing this to the user in some way. So before we go any 
further, can you describe your use cases/motivation? Isn't a mix of standard 
partitioning and broadcasting enough? Do we need multicasting?

Zhu:

Could you rephrase your example? I didn’t quite understand it.

Piotrek

[1] https://issues.apache.org/jira/browse/FLINK-10662


On 23 Aug 2019, at 09:17, Zhu Zhu  wrote:
Thanks Yun for starting this discussion.
I think the multicasting can be very helpful in certain cases.

I have received requirements from users who want to do a broadcast
join, while the data set to broadcast is too large to fit in one task.
Thus the requirement turned out to be supporting the Cartesian product of two
data sets (one of which can be an infinite stream).
For example, A (parallelism=2) broadcast joins B (parallelism=2) in JobVertex
C.
The idea is to have 4 C subtasks deal with different combinations of A/B
partitions, like C1(A1,B1), C2(A1,B2), C3(A2,B1), C4(A2,B2).
This requires one record to be sent to multiple downstream subtasks, but
not to all subtasks.

With the current interface this is not supported, as one record can only be
sent to one subtask, or to all subtasks of a JobVertex.
The user thus had to split the broadcast data set manually into several
different JobVertices, which is hard to maintain and extend.

Thanks,
Zhu Zhu

Yun Gao wrote on Thu, Aug 22, 2019 at 8:42 PM:

Hi everyone,
  In some scenarios we have met a requirement that some operators want to
send records to their downstream operators with a multicast communication
pattern. In detail, for some records, the operators want to send them
according to the partitioner (for example, Rebalance), and for some other
records, the operators want to send them to all the connected operators and
tasks. Such a communication pattern could be viewed as a kind of multicast:
it does not broadcast every record, but some records will indeed be sent to
multiple downstream operators.

However, we found that this kind of communication pattern cannot
be implemented correctly if the operators have multiple consumers with
different parallelism, using a customized partitioner. To solve the above
problem, we propose to enhance the support for such a kind of irregular
communication pattern. We think there may be two options:

 1. Support a kind of customized operator event, which shares much
similarity with Watermark; these events can be broadcast to the
downstream operators separately.
 2. Let the channel selector support multicast, and also add a
separate RecordWriter implementation to avoid impacting the performance of
the channel selectors that do not need multicast.

The problem and options are detailed in
https://docs.google.com/document/d/1npi5c_SeP68KuT2lNdKd8G7toGR_lxQCGOnZm_hVMks/edit?usp=sharing

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Gavin Lee
Thanks for your reply @Zili.
I'm afraid it's not the same issue.
I found that FlinkShell.class is not included in the flink-dist jar file
in the 1.9.0 version.
This class file cannot be found inside any jar, either in the opt or lib directory
under the root folder of the Flink distribution.


On Fri, Aug 23, 2019 at 4:10 PM Zili Chen  wrote:

> Hi Gavin,
>
> I also found a problem in the shell: if the directory contains whitespace,
> then the final command to run is incorrect. Could you check the
> final command to be executed?
>
> FYI, here is the ticket[1].
>
> Best,
> tison.
>
> [1] https://issues.apache.org/jira/browse/FLINK-13827
>
>
> Gavin Lee wrote on Fri, Aug 23, 2019 at 3:36 PM:
>
>> Why does bin/start-scala-shell.sh local return the following error?
>>
>> bin/start-scala-shell.sh local
>>
>> Error: Could not find or load main class
>> org.apache.flink.api.scala.FlinkShell
>> For Flink 1.8.1 and previous versions, there were no such issues.
>>
>> On Fri, Aug 23, 2019 at 2:05 PM qi luo  wrote:
>>
>>> Congratulations and thanks for the hard work!
>>>
>>> Qi
>>>
>>> On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai 
>>> wrote:
>>>
>>> The Apache Flink community is very happy to announce the release of
>>> Apache Flink 1.9.0, which is the latest major release.
>>>
>>> Apache Flink® is an open-source stream processing framework for
>>> distributed, high-performing, always-available, and accurate data streaming
>>> applications.
>>>
>>> The release is available for download at:
>>> https://flink.apache.org/downloads.html
>>>
>>> Please check out the release blog post for an overview of the
>>> improvements for this new major release:
>>> https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>>>
>>> The full release notes are available in Jira:
>>>
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>>>
>>> We would like to thank all contributors of the Apache Flink community
>>> who made this release possible!
>>>
>>> Cheers,
>>> Gordon
>>>
>>>
>>>
>>
>> --
>> Gavin
>>
>

-- 
Gavin


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Zili Chen
I downloaded the 1.9.0 dist from here[1] and didn't see the issue. Where did you
download it? Could you try to download the dist from [1] and see whether
the problem persists?

Best,
tison.

[1]
http://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.11.tgz


Gavin Lee wrote on Fri, Aug 23, 2019 at 4:34 PM:

> Thanks for your reply @Zili.
> I'm afraid it's not the same issue.
> I found that FlinkShell.class is not included in the flink-dist jar file
> in the 1.9.0 version.
> This class file cannot be found inside any jar, either in the opt or lib
> directory under the root folder of the Flink distribution.
>
>
> On Fri, Aug 23, 2019 at 4:10 PM Zili Chen  wrote:
>
>> Hi Gavin,
>>
>> I also found a problem in the shell: if the directory contains whitespace,
>> then the final command to run is incorrect. Could you check the
>> final command to be executed?
>>
>> FYI, here is the ticket[1].
>>
>> Best,
>> tison.
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-13827
>>
>>
>> Gavin Lee wrote on Fri, Aug 23, 2019 at 3:36 PM:
>>
>>> Why does bin/start-scala-shell.sh local return the following error?
>>>
>>> bin/start-scala-shell.sh local
>>>
>>> Error: Could not find or load main class
>>> org.apache.flink.api.scala.FlinkShell
>>> For Flink 1.8.1 and previous versions, there were no such issues.
>>>
>>> On Fri, Aug 23, 2019 at 2:05 PM qi luo  wrote:
>>>
 Congratulations and thanks for the hard work!

 Qi

 On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai 
 wrote:

 The Apache Flink community is very happy to announce the release of
 Apache Flink 1.9.0, which is the latest major release.

 Apache Flink® is an open-source stream processing framework for
 distributed, high-performing, always-available, and accurate data streaming
 applications.

 The release is available for download at:
 https://flink.apache.org/downloads.html

 Please check out the release blog post for an overview of the
 improvements for this new major release:
 https://flink.apache.org/news/2019/08/22/release-1.9.0.html

 The full release notes are available in Jira:

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601

 We would like to thank all contributors of the Apache Flink community
 who made this release possible!

 Cheers,
 Gordon



>>>
>>> --
>>> Gavin
>>>
>>
>
> --
> Gavin
>


[jira] [Created] (FLINK-13828) Deprecate ConfigConstants.LOCAL_START_WEBSERVER

2019-08-23 Thread TisonKun (Jira)
TisonKun created FLINK-13828:


 Summary: Deprecate ConfigConstants.LOCAL_START_WEBSERVER
 Key: FLINK-13828
 URL: https://issues.apache.org/jira/browse/FLINK-13828
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Configuration
Affects Versions: 1.10.0
Reporter: TisonKun
 Fix For: 1.10.0


Maybe backport to 1.9.1, because I heard about this issue from a user of 1.9.0.

{{ConfigConstants.LOCAL_START_WEBSERVER}} has no effect any more, but we 
neither mark it as deprecated nor investigate reviving it. Since it is part of the 
public interface we cannot remove it in a minor version, but deprecating it is 
necessary.
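
A minimal sketch of the proposed change (the key string shown is an assumption 
for illustration, inside {{ConfigConstants}}):

{code:java}
/**
 * @deprecated This option has no effect any more.
 */
@Deprecated
public static final String LOCAL_START_WEBSERVER = "local.start-webserver"; // illustrative key
{code}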



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Piotr Nowojski
Hi Yun and Zhu Zhu,

Thanks for the more detailed example Zhu Zhu.

As far as I understand, for the iterations example we do not need multicasting. 
Regarding the join example, I don't fully understand it. The example that Zhu 
Zhu presented has a drawback of sending both tables to multiple nodes. What's 
the benefit of using a broadcast join over a hash join in such a case? As far as I 
know, the biggest benefit of using a broadcast join instead of a hash join is that 
we can avoid sending the larger table over the network, because we can perform 
the join locally. In this example we are sending both of the tables to multiple 
nodes, which should defeat the purpose.

Is it about implementing cross joins or near-cross joins in a distributed 
fashion? 

> if we introduce a new MulticastRecordWriter

That's one of the solutions. It might have a drawback of the 3-class virtualisation 
problem (we have RecordWriter and BroadcastRecordWriter already). With up to 
two implementations, the JVM is able to devirtualise the calls.

Previously I was also thinking about just providing two different 
ChannelSelector interfaces: one with `int[]` and a `SingleChannelSelector` with a 
plain `int`, and based on that, RecordWriter could perform some magic (worst 
case scenario, `instanceof` checks).

Another solution might be to change the `ChannelSelector` interface into an 
iterator.
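
(A rough sketch of that two-interface idea, with illustrative names rather than 
existing Flink classes; treat it as one possible shape, assuming the writer can 
branch once on the selector type:)

```java
/** Common single-channel contract; stays cheap for the non-multicast path. */
interface SingleChannelSelector<T> {
    int selectChannel(T record, int numChannels);
}

/** Multicast contract returning several target channels per record. */
interface MultiChannelSelector<T> {
    int[] selectChannels(T record, int numChannels);
}

/** Writer-side dispatch: a worst-case 'instanceof' check picks the path. */
abstract class SketchRecordWriter<T> {
    @SuppressWarnings("unchecked")
    void route(Object selector, T record, int numChannels) {
        if (selector instanceof SingleChannelSelector) {
            sendTo(((SingleChannelSelector<T>) selector).selectChannel(record, numChannels), record);
        } else {
            for (int channel : ((MultiChannelSelector<T>) selector).selectChannels(record, numChannels)) {
                sendTo(channel, record);
            }
        }
    }

    abstract void sendTo(int channel, T record); // hand the record to one channel
}
```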

But let's discuss the details after we agree on implementing this.

Piotrek

> On 23 Aug 2019, at 10:20, Yun Gao  wrote:
> 
>Hi Piotr,
> 
> Thanks a lot for the suggestions!
> 
> The core motivation of this discussion is to implement a new 
> iteration library on the DataStream API, and it requires inserting special 
> records into the stream to notify the progress of the iteration. The mechanism 
> of such records is very similar to the current Watermark, and we meet the 
> problem of sending normal records according to the partitioner (Rebalance, 
> etc.) while also being able to broadcast the inserted progress records to all the 
> connected tasks. I have read the notes in the Google doc and I totally 
> agree that exposing the broadcast interface of the RecordWriter in some way 
> is able to solve this issue. 
> 
>    Regarding `int[] ChannelSelector#selectChannels()`, I'm wondering: 
> if we introduce a new MulticastRecordWriter and leave the current RecordWriter 
> untouched, could we avoid the performance degradation? Since with such a 
> modification the normal RecordWriter does not need to iterate over the array 
> returned by the ChannelSelector; the only difference will be returning an array 
> instead of an integer, and accessing the first element of the returned array 
> instead of reading the integer directly.
> 
> Best,
> Yun
> 
> --
> From:Piotr Nowojski 
> Send Time:2019 Aug. 23 (Fri.) 15:20
> To:dev 
> Cc:Yun Gao 
> Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern
> 
> Hi,
> 
> Yun:
> 
> Thanks for proposing the idea. I have checked the document and left a couple of 
> questions there, but it might be better to answer them here.
> 
> What is the exact motivation and what problems do you want to solve? We have 
> dropped multicast support from the network stack [1] for two reasons:
> 1. Performance 
> 2. Code simplicity 
> 
> The proposal to re-introduce `int[] ChannelSelector#selectChannels()` would 
> revert those changes. At that time we were thinking about a way to keep 
> the multicast support on the network level, while keeping the performance and 
> simplicity for non-multicast cases, and there are ways to achieve that. 
> However, they would add extra complexity to Flink, which would be better to 
> avoid.
> 
> On the other hand, supporting the dual pattern of standard partitioning plus 
> broadcasting is easy to do, as LatencyMarkers are doing exactly that. It 
> would be just a matter of exposing this to the user in some way. So before we 
> go any further, can you describe your use cases/motivation? Isn't a mix of 
> standard partitioning and broadcasting enough? Do we need multicasting?
> 
> Zhu:
> 
> Could you rephrase your example? I didn’t quite understand it.
> 
> Piotrek
> 
> [1] https://issues.apache.org/jira/browse/FLINK-10662 
> 
> 
> On 23 Aug 2019, at 09:17, Zhu Zhu wrote:
> 
> Thanks Yun for starting this discussion.
> I think the multicasting can be very helpful in certain cases.
> 
> I have received requirements from users who want to do a broadcast
> join, while the data set to broadcast is too large to fit in one task.
> Thus the requirement turned out to be supporting the Cartesian product of two
> data sets (one of which can be an infinite stream).
> For example, A (parallelism=2) broadcast joins B (parallelism=2) in JobVertex
> C.
> The idea is to have 4 C subtasks deal with different combinations of A/B
> partitions, like C1(A1,B1), C2(A1,B2), C3(A2,B1), C4(A2,B2).

1.9 release uses docs of master branch

2019-08-23 Thread Paul Lam
Hi devs,

I've just noticed that the documentation of the 1.9 release links to the docs of the 
master branch. It would be appreciated if someone could fix this. Thanks a lot!

Best,
Paul Lam



Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Zhu Zhu
Hi Piotr,

Yes, you are right, it's a distributed cross join requirement.
A broadcast join can help with cross join cases. But users cannot use it if
the data set to join is too large to fit into one subtask.

Sorry for leaving some details out.

Thanks,
Zhu Zhu

Piotr Nowojski wrote on Fri, Aug 23, 2019 at 4:57 PM:

> Hi Yun and Zhu Zhu,
>
> Thanks for the more detailed example Zhu Zhu.
>
> As far as I understand, for the iterations example we do not need
> multicasting. Regarding the join example, I don't fully understand it. The
> example that Zhu Zhu presented has a drawback of sending both tables to
> multiple nodes. What's the benefit of using a broadcast join over a hash join
> in such a case? As far as I know, the biggest benefit of using a broadcast join
> instead of a hash join is that we can avoid sending the larger table over the
> network, because we can perform the join locally. In this example we are
> sending both of the tables to multiple nodes, which should defeat the
> purpose.
>
> Is it about implementing cross joins or near-cross joins in a distributed
> fashion?
>
> > if we introduce a new MulticastRecordWriter
>
> That's one of the solutions. It might have a drawback of the 3-class
> virtualisation problem (we have RecordWriter and BroadcastRecordWriter
> already). With up to two implementations, the JVM is able to devirtualise the
> calls.
>
> Previously I was also thinking about just providing two different
> ChannelSelector interfaces: one with `int[]` and a `SingleChannelSelector`
> with a plain `int`, and based on that, RecordWriter could perform some magic
> (worst case scenario, `instanceof` checks).
>
> Another solution might be to change the `ChannelSelector` interface into an
> iterator.
>
> But let's discuss the details after we agree on implementing this.
>
> Piotrek
>
> > On 23 Aug 2019, at 10:20, Yun Gao  wrote:
> >
> >Hi Piotr,
> >
> > Thanks a lot for the suggestions!
> >
> > The core motivation of this discussion is to implement a new
> > iteration library on the DataStream API, and it requires inserting special
> > records into the stream to notify the progress of the iteration. The
> > mechanism of such records is very similar to the current Watermark, and we
> > meet the problem of sending normal records according to the partitioner
> > (Rebalance, etc.) while also being able to broadcast the inserted progress
> > records to all the connected tasks. I have read the notes in the Google
> > doc and I totally agree that exposing the broadcast interface of the
> > RecordWriter in some way is able to solve this issue.
> >
> >    Regarding `int[] ChannelSelector#selectChannels()`, I'm
> > wondering: if we introduce a new MulticastRecordWriter and leave the current
> > RecordWriter untouched, could we avoid the performance degradation? Since
> > with such a modification the normal RecordWriter does not need to iterate
> > over the array returned by the ChannelSelector; the only difference will be
> > returning an array instead of an integer, and accessing the first element
> > of the returned array instead of reading the integer directly.
> >
> > Best,
> > Yun
> >
> > --
> > From:Piotr Nowojski 
> > Send Time:2019 Aug. 23 (Fri.) 15:20
> > To:dev 
> > Cc:Yun Gao 
> > Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern
> >
> > Hi,
> >
> > Yun:
> >
> > Thanks for proposing the idea. I have checked the document and left
> > a couple of questions there, but it might be better to answer them here.
> >
> > What is the exact motivation and what problems do you want to solve? We
> > have dropped multicast support from the network stack [1] for two reasons:
> > 1. Performance
> > 2. Code simplicity
> >
> > The proposal to re-introduce `int[] ChannelSelector#selectChannels()`
> > would revert those changes. At that time we were thinking about a way
> > to keep the multicast support on the network level, while keeping the
> > performance and simplicity for non-multicast cases, and there are ways to
> > achieve that. However, they would add extra complexity to Flink, which
> > would be better to avoid.
> >
> > On the other hand, supporting the dual pattern of standard partitioning plus
> > broadcasting is easy to do, as LatencyMarkers are doing exactly that. It
> > would be just a matter of exposing this to the user in some way. So before
> > we go any further, can you describe your use cases/motivation? Isn't a mix of
> > standard partitioning and broadcasting enough? Do we need multicasting?
> >
> > Zhu:
> >
> > Could you rephrase your example? I didn’t quite understand it.
> >
> > Piotrek
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-10662
> >
> > On 23 Aug 2019, at 09:17, Zhu Zhu wrote:
> >
> > Thanks Yun for starting this discussion.
> > I think the multicasting can be very helpful in certain cases.
> >
> > I have received requirements from users who want to do a broadcast
> > join, while the data set to broadcast is too large to fit in one task.

Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Yun Gao
 Hi Piotr,

Thanks a lot for sharing the thoughts! 

For the iteration, agree that multicasting is not necessary. 
Exposing the broadcast interface to the Output of the operators in some way should 
also solve this issue, and I think it would be even more convenient to have 
the broadcast method for the iteration. 

Also thanks Zhu Zhu for the cross join case!
  Best, 
   Yun



--
From:Zhu Zhu 
Send Time:2019 Aug. 23 (Fri.) 17:25
To:dev 
Cc:Yun Gao 
Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

Hi Piotr,

Yes, you are right, it's a distributed cross join requirement.
A broadcast join can help with cross join cases. But users cannot use it if the 
data set to join is too large to fit into one subtask.

Sorry for leaving some details out.

Thanks,
Zhu Zhu
Piotr Nowojski wrote on Fri, Aug 23, 2019 at 4:57 PM:
Hi Yun and Zhu Zhu,

 Thanks for the more detailed example Zhu Zhu.

 As far as I understand, for the iterations example we do not need multicasting. 
Regarding the join example, I don't fully understand it. The example that Zhu 
Zhu presented has a drawback of sending both tables to multiple nodes. What's 
the benefit of using a broadcast join over a hash join in such a case? As far as I 
know, the biggest benefit of using a broadcast join instead of a hash join is that 
we can avoid sending the larger table over the network, because we can perform 
the join locally. In this example we are sending both of the tables to multiple 
nodes, which should defeat the purpose.

 Is it about implementing cross joins or near-cross joins in a distributed 
fashion? 

 > if we introduce a new MulticastRecordWriter

 That's one of the solutions. It might have a drawback of the 3-class 
virtualisation problem (we have RecordWriter and BroadcastRecordWriter 
already). With up to two implementations, the JVM is able to devirtualise the calls.

 Previously I was also thinking about just providing two different 
ChannelSelector interfaces: one with `int[]` and a `SingleChannelSelector` with a 
plain `int`, and based on that, RecordWriter could perform some magic (worst 
case scenario, `instanceof` checks).

 Another solution might be to change the `ChannelSelector` interface into an 
iterator.

 But let's discuss the details after we agree on implementing this.

 Piotrek

 > On 23 Aug 2019, at 10:20, Yun Gao  wrote:
 > 
 >Hi Piotr,
 > 
 > Thanks a lot for the suggestions!
 > 
 > The core motivation of this discussion is to implement a new 
 > iteration library on the DataStream API, and it requires inserting special 
 > records into the stream to notify the progress of the iteration. The mechanism 
 > of such records is very similar to the current Watermark, and we meet the 
 > problem of sending normal records according to the partitioner (Rebalance, 
 > etc.) while also being able to broadcast the inserted progress records to all 
 > the connected tasks. I have read the notes in the Google doc and I totally 
 > agree that exposing the broadcast interface of the RecordWriter in some 
 > way is able to solve this issue. 
 > 
 >    Regarding `int[] ChannelSelector#selectChannels()`, I'm wondering: 
 > if we introduce a new MulticastRecordWriter and leave the current 
 > RecordWriter untouched, could we avoid the performance degradation? Since 
 > with such a modification the normal RecordWriter does not need to iterate 
 > over the array returned by the ChannelSelector; the only difference will be 
 > returning an array instead of an integer, and accessing the first element of 
 > the returned array instead of reading the integer directly.
 > 
 > Best,
 > Yun
 > 
 > --
 > From:Piotr Nowojski 
 > Send Time:2019 Aug. 23 (Fri.) 15:20
 > To:dev 
 > Cc:Yun Gao 
 > Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern
 > 
 > Hi,
 > 
 > Yun:
 > 
 > Thanks for proposing the idea. I have checked the document and left a couple 
 > of questions there, but it might be better to answer them here.
 > 
 > What is the exact motivation and what problems do you want to solve? We have 
 > dropped multicast support from the network stack [1] for two reasons:
 > 1. Performance 
 > 2. Code simplicity 
 > 
 > The proposal to re-introduce `int[] ChannelSelector#selectChannels()` would 
 > revert those changes. At that time we were thinking about a way to keep 
 > the multicast support on the network level, while keeping the performance 
 > and simplicity for non-multicast cases, and there are ways to achieve that. 
 > However, they would add extra complexity to Flink, which would be better 
 > to avoid.
 > 
 > On the other hand, supporting the dual pattern of standard partitioning plus 
 > broadcasting is easy to do, as LatencyMarkers are doing exactly that. It 
 > would be just a matter of exposing this to the user in some way. So before 
 > we go any further, can you describe your use cases/motivation?

[jira] [Created] (FLINK-13829) Incorrect doc version of 1.9 docs

2019-08-23 Thread Paul Lin (Jira)
Paul Lin created FLINK-13829:


 Summary: Incorrect doc version of 1.9 docs
 Key: FLINK-13829
 URL: https://issues.apache.org/jira/browse/FLINK-13829
 Project: Flink
  Issue Type: Task
  Components: Documentation
Affects Versions: 1.9.0
Reporter: Paul Lin


The documentation of Flink 1.9 links to the docs of the master branch; it should 
link to the release-1.9 branch.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [VOTE] Flink Project Bylaws

2019-08-23 Thread Kostas Tzoumas
+1

On Thu, Aug 22, 2019 at 5:29 PM jincheng sun wrote:

> +1
>
> Becket Qin wrote on Thu, Aug 22, 2019 at 16:22:
>
> > Hi All, so far the votes count as follows:
> >
> > +1 (Binding): 13 (Aljoscha, Fabian, Kurt, Till, Timo, Max, Stephan,
> > Gordon, Robert, Ufuk, Chesnay, Shaoxuan, Henry)
> > +0 (Binding): 1 (Thomas)
> >
> > +1 (Non-Binding): 10 (Hequn, Vino, Piotr, Dawid, Xintong, Yu, Jingsong,
> > Yun, Jark, Biao)
> >
> > Given that more than 6 days have passed and there are not yet sufficient +1s
> > to pass the vote, I am reaching out to the binding voters who have not
> > voted yet here.
> >
> >
> > @Greg, @Gyula, @Kostas, @Alan, @jincheng, @Marton, @Sebastian,
> @Vasiliki, @Daniel
> >
> > Would you have time to check the Flink bylaws proposal and vote on it?
> The
> > bylaws wiki is following:
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> >
> > We are following the 2/3 majority voting process. This is the first
> > attempt at reaching out to the PMCs that have not voted yet. We will make
> > another attempt after 7 days if the result of the vote is still not
> > determined by then.
> >
> > Also CCing private@ in case one did not set up the Apache email
> forwarding.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> >
> > On Thu, Aug 22, 2019 at 8:03 AM Henry Saputra 
> > wrote:
> >
> >> Oh yeah,  +1 LGTM
> >>
> >> Thanks for working on this.
> >>
> >> - Henry
> >>
> >> On Tue, Aug 20, 2019 at 2:17 AM Becket Qin wrote:
> >>
> >> > Thanks for sharing your thoughts, Thomas, Henry and Stephan. I also think
> >> > the committers are supposed to be mature enough to know when a review on
> >> > their own patch is needed.
> >> >
> >> > @Henry, just want to confirm, are you +1 on the proposed bylaws?
> >> >
> >> > Thanks,
> >> >
> >> > Jiangjie (Becket) Qin
> >> >
> >> > > On Tue, Aug 20, 2019 at 10:54 AM Stephan Ewen wrote:
> >> >
> >> > > I see it somewhat similarly to Henry.
> >> > >
> >> > > Generally, all committers should go for a review by another committer,
> >> > > unless it is a trivial comment or style fix. I personally do that, even
> >> > > though I am one of the committers that have been with the project longest.
> >> > >
> >> > > For now, I was hoping though that we have a mature enough community that
> >> > > this "soft rule" is enough. Whenever possible, working based on trust with
> >> > > soft processes beats working with hard processes. We can still revisit this
> >> > > in case we see that it does not work out.
> >> > >
> >> > >
> >> > > > On Mon, Aug 19, 2019 at 10:21 PM Henry Saputra wrote:
> >> > >
> >> > > > One of the perks of being a committer is being able to commit code
> >> > > > without asking another committer. Having said that, I think we rely on
> >> > > > the maturity of the committers to know when to ask for reviews and
> >> > > > when to commit directly.
> >> > > >
> >> > > > For example, if someone just changes typos in comments or simply
> >> > > > renames internal variables, I think we could trust the committer to
> >> > > > safely commit the changes. When the changes alter behavior or
> >> > > > introduce new code flows, that's when reviews are needed and strongly
> >> > > > encouraged. I think balance is needed for this.
> >> > > >
> >> > > > PMCs have the ability and right to revert changes in the source repo
> >> > > > as necessary.
> >> > > >
> >> > > > - Henry
> >> > > >
> >> > > > On Sun, Aug 18, 2019 at 9:23 PM Thomas Weise wrote:
> >> > > >
> >> > > > > +0 (binding)
> >> > > > >
> >> > > > > I don't think committers should be allowed to approve their own
> >> > > > > changes. I would prefer if non-committer contributors could approve
> >> > > > > committer PRs, as that would encourage more participation in code
> >> > > > > review and the ability to contribute.
> >> > > > >
> >> > > > >
> >> > > > > On Fri, Aug 16, 2019 at 9:02 PM Shaoxuan Wang wrote:
> >> > > > >
> >> > > > > > +1 (binding)
> >> > > > > >
> >> > > > > > On Fri, Aug 16, 2019 at 7:48 PM Chesnay Schepler wrote:
> >> > > > > >
> >> > > > > > > +1 (binding)
> >> > > > > > >
> >> > > > > > > Although I think it would be a good idea to always cc
> >> > > > > > > priv...@flink.apache.org when modifying bylaws, if anything to
> >> > > > > > > speed up the voting process.
> >> > > > > > >
> >> > > > > > > On 16/08/2019 11:26, Ufuk Celebi wrote:
> >> > > > > > > > +1 (binding)
> >> > > > > > > >
> >> > > > > > > > – Ufuk
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > On Wed, Aug 14, 2019 at 4:50 AM Biao Liu wrote:
> >> > > > > > > >
> >> > > > > > > >> +1 (non-binding)
> >> > > > > > > >>
> >> > > > > > > >> Thanks

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Gavin Lee
I used the package from the Apache official site, with mirror [1]; the difference is
that I used the Scala 2.12 version.
I also tried to build from source for both Scala 2.11 and 2.12. When building for
2.12, FlinkShell.class is in the flink-dist jar file, but after running mvn
clean package -Dscala-2.12, this class is removed from the flink-dist_2.12-1.9 jar
file.
Seems broken here for Scala 2.12, right?

[1]
http://mirror.bit.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.12.tgz

On Fri, Aug 23, 2019 at 4:37 PM Zili Chen  wrote:

> I downloaded the 1.9.0 dist from here[1] and didn't see the issue. Where did you
> download it? Could you try to download the dist from [1] and see whether
> the problem persists?
>
> Best,
> tison.
>
> [1]
> http://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.11.tgz
>
>
>> Gavin Lee wrote on Fri, Aug 23, 2019 at 4:34 PM:
>
>> Thanks for your reply @Zili.
>> I'm afraid it's not the same issue.
>> I found that FlinkShell.class is not included in the flink-dist jar file
>> in the 1.9.0 version.
>> This class file cannot be found inside any jar, either in the opt or lib
>> directory under the root folder of the Flink distribution.
>>
>>
>> On Fri, Aug 23, 2019 at 4:10 PM Zili Chen  wrote:
>>
>>> Hi Gavin,
>>>
>>> I also found a problem in the shell: if the directory contains whitespace,
>>> then the final command to run is incorrect. Could you check the
>>> final command to be executed?
>>>
>>> FYI, here is the ticket[1].
>>>
>>> Best,
>>> tison.
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-13827
>>>
>>>
>>> Gavin Lee wrote on Fri, Aug 23, 2019 at 3:36 PM:
>>>
 Why does bin/start-scala-shell.sh local return the following error?

 bin/start-scala-shell.sh local

 Error: Could not find or load main class
 org.apache.flink.api.scala.FlinkShell
 For Flink 1.8.1 and previous versions, there were no such issues.

 On Fri, Aug 23, 2019 at 2:05 PM qi luo  wrote:

> Congratulations and thanks for the hard work!
>
> Qi
>
> On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai 
> wrote:
>
> The Apache Flink community is very happy to announce the release of
> Apache Flink 1.9.0, which is the latest major release.
>
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data 
> streaming
> applications.
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> Please check out the release blog post for an overview of the
> improvements for this new major release:
> https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>
> We would like to thank all contributors of the Apache Flink community
> who made this release possible!
>
> Cheers,
> Gordon
>
>
>

 --
 Gavin

>>>
>>
>> --
>> Gavin
>>
>

-- 
Gavin


Re: 1.9 release uses docs of master branch

2019-08-23 Thread Paul Lam
I’ve filed a ticket to track this.

[1] https://issues.apache.org/jira/browse/FLINK-13829 


Best,
Paul Lam

> On Aug 23, 2019, at 16:58, Paul Lam wrote:
> 
> Hi devs,
> 
> I've just noticed that the documentation of the 1.9 release links to the docs of 
> the master branch. It would be appreciated if someone could fix this. Thanks a lot!
> 
> Best,
> Paul Lam
> 



[jira] [Created] (FLINK-13830) The documentation about clusters on YARN has some problems

2019-08-23 Thread zhangmeng (Jira)
zhangmeng created FLINK-13830:
-

 Summary: The documentation about clusters on YARN has some problems
 Key: FLINK-13830
 URL: https://issues.apache.org/jira/browse/FLINK-13830
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: zhangmeng
 Attachments: image-2019-08-23-17-48-22-123.png

Reading the Flink 1.9 documentation, YARN Setup section, there are some issues 
with downloading the installation package: there is no 
flink-1.10-SNAPSHOT-bin-hadoop2.tgz package on the 
https://flink.apache.org/downloads.html download page.

!image-2019-08-23-17-48-22-123.png!



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13831) slots total display error

2019-08-23 Thread Yu Wang (Jira)
Yu Wang created FLINK-13831:
---

 Summary: slots total display error
 Key: FLINK-13831
 URL: https://issues.apache.org/jira/browse/FLINK-13831
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Affects Versions: 1.9.0
Reporter: Yu Wang
 Attachments: slots.png

The total number of slots is displayed incorrectly in the web frontend (see the attached slots.png).



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Piotr Nowojski
Hi,

Thanks for the answers :) Ok I understand the full picture now. +1 from my side 
on solving this issue somehow. But before we start discussing how to solve it 
one last control question:

I guess this multicast is intended to be used in the Blink planner, right? Assuming 
that we implement the multicast support now, when would it be used by 
Blink? I would like to avoid a scenario where we implement an unused feature 
and keep maintaining it for a long period of time.

Piotrek

PS, try to include motivating examples, including concrete ones in the 
proposals/design docs, for example in the very first paragraph. Especially if 
it’s a commonly known feature like cross join :)

> On 23 Aug 2019, at 11:38, Yun Gao  wrote:
> 
> Hi Piotr,
> 
>Thanks a lot for sharing the thoughts! 
> 
>For the iteration, agree with that multicasting is not necessary. 
> Exploring the broadcast interface to Output of the operators in some way 
> should also solve this issue, and I think it should be even more convenient 
> to have the broadcast method for the iteration. 
> 
>Also thanks Zhu Zhu for the cross join case!
>  Best, 
>   Yun
> 
> 
> 
> --
> From:Zhu Zhu 
> Send Time:2019 Aug. 23 (Fri.) 17:25
> To:dev 
> Cc:Yun Gao 
> Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern
> 
> Hi Piotr,
> 
> Yes you are right it's a distributed cross join requirement.
> Broadcast join can help with cross join cases. But users cannot use it if the 
> data set to join is too large to fit into one subtask.
> 
> Sorry for left some details behind.
> 
> Thanks,
> Zhu Zhu
> Piotr Nowojski  于2019年8月23日周五 下午4:57写道:
> Hi Yun and Zhu Zhu,
> 
> Thanks for the more detailed example Zhu Zhu.
> 
> As far as I understand for the iterations example we do not need 
> multicasting. Regarding the Join example, I don’t fully understand it. The 
> example that Zhu Zhu presented has a drawback of sending both tables to 
> multiple nodes. What’s the benefit of using broadcast join over a hash join 
> in such case? As far as I know, the biggest benefit of using broadcast join 
> instead of hash join is that we can avoid sending the larger table over the 
> network, because we can perform the join locally. In this example we are 
> sending both of the tables to multiple nodes, which should defeat the purpose.
> 
> Is it about implementing cross join or near cross joins in a distributed 
> fashion? 
> 
>> if we introduce a new MulticastRecordWriter
> 
> That’s one of the solutions. It might have a drawback of 3 class 
> virtualisation problem (We have RecordWriter and BroadcastRecordWriter 
> already). With up to two implementations, JVM is able to devirtualise the 
> calls.
> 
> Previously I was also thinking about just providing two different 
> ChannelSelector interfaces. One with `int[]` and `SingleChannelSelector` with 
> plain `int` and based on that, RecordWriter could perform some magic (worst 
> case scenario `instaceof` checks).
> 
> Another solution might be to change `ChannelSelector` interface into an 
> iterator.
> 
> But let's discuss the details after we agree on implementing this.
> 
> Piotrek
> 
>> On 23 Aug 2019, at 10:20, Yun Gao  wrote:
>> 
>>   Hi Piotr,
>> 
>>Thanks a lot for the suggestions!
>> 
>>The core motivation of this discussion is to implement a new 
>> iteration library on the DataStream, and it requires to insert special 
>> records in the stream to notify the progress of the iteration. The mechanism 
>> of such records is very similar to the current Watermark, and we meet the 
>> problem of sending normal records according to the partition (Rebalance, 
>> etc..) and also be able to broadcast the inserted progress records to all 
>> the connected records. I have read the notes in the google doc and I totally 
>> agree with that exploring the broadcast interface in RecordWriter in some 
>> way is able to solve this issue. 
>> 
>>   Regarding to `int[] ChannelSelector#selectChannels()`, I'm wondering 
>> if we introduce a new MulticastRecordWriter and left the current 
>> RecordWriter untouched, could we avoid the performance degradation ? Since 
>> with such a modification the normal RecordWriter does not need to iterate 
>> the return array by ChannelSelector, and the only difference will be 
>> returning an array instead of an integer, and accessing the first element of 
>> the returned array instead of reading the integer directly.
>> 
>> Best,
>> Yun
>> 
>> --
>> From:Piotr Nowojski 
>> Send Time:2019 Aug. 23 (Fri.) 15:20
>> To:dev 
>> Cc:Yun Gao 
>> Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern
>> 
>> Hi,
>> 
>> Yun:
>> 
>> Thanks for proposing the idea. I have checked the document and left couple 
>> of questions there, but it might be better to answer them here.
>> 
>> What is the exact motivat

[jira] [Created] (FLINK-13832) DefaultRollingPolicy create() method should be renamed to builder()

2019-08-23 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-13832:
--

 Summary: DefaultRollingPolicy create() method should be renamed to 
builder()
 Key: FLINK-13832
 URL: https://issues.apache.org/jira/browse/FLINK-13832
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / FileSystem
Affects Versions: 1.9.0
Reporter: Gyula Fora


The DefaultRollingPolicy.create() method returns an instance of PolicyBuilder, 
not a DefaultRollingPolicy. Therefore we should add a new method named 
builder() and deprecate create().

Right now if we want to create a new instance with the default settings we have 
to call:

DefaultRollingPolicy.create().build()


This nicely illustrates the problem with the naming of this method.
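To make the intent concrete, here is a minimal sketch of the proposed change 
(PolicyBuilder is a simplified stand-in for the existing builder type; exact 
signatures are assumptions):

{code:java}
public class DefaultRollingPolicySketch {

    /** The preferred entry point: the name matches what is returned. */
    public static PolicyBuilder builder() {
        return new PolicyBuilder();
    }

    /** @deprecated Use {@link #builder()} instead. */
    @Deprecated
    public static PolicyBuilder create() {
        return builder();
    }

    /** Simplified stand-in for the real PolicyBuilder. */
    public static class PolicyBuilder {
        public DefaultRollingPolicySketch build() {
            return new DefaultRollingPolicySketch();
        }
    }
}
{code}

The call site then reads DefaultRollingPolicy.builder().build(), which matches 
what the method actually returns.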



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Zhu Zhu
Thanks Piotr,

Users asked for this feature some time ago when they were migrating batch jobs
to Flink (Blink).
It's not very urgent, as they have taken some workarounds to solve it (like
partitioning the data set across different job vertices),
so it's fine to not make it a top priority.

Anyway, as a commonly known scenario, I think users can benefit from cross
join sooner or later.

Thanks,
Zhu Zhu

Piotr Nowojski  于2019年8月23日周五 下午6:19写道:

> Hi,
>
> Thanks for the answers :) Ok I understand the full picture now. +1 from my
> side on solving this issue somehow. But before we start discussing how to
> solve it one last control question:
>
> I guess this multicast is intended to be used in blink planner, right?
> Assuming that we implement the multicast support now, when would it be used
> by the blink? I would like to avoid a scenario, where we implement an
> unused feature and we keep maintaining it for a long period of time.
>
> Piotrek
>
> PS, try to include motivating examples, including concrete ones in the
> proposals/design docs, for example in the very first paragraph. Especially
> if it’s a commonly known feature like cross join :)
>
> > On 23 Aug 2019, at 11:38, Yun Gao  wrote:
> >
> > Hi Piotr,
> >
> >Thanks a lot for sharing the thoughts!
> >
> >For the iteration, agree with that multicasting is not necessary.
> Exploring the broadcast interface to Output of the operators in some way
> should also solve this issue, and I think it should be even more convenient
> to have the broadcast method for the iteration.
> >
> >Also thanks Zhu Zhu for the cross join case!
> >  Best,
> >   Yun
> >
> >
> >
> > --
> > From:Zhu Zhu 
> > Send Time:2019 Aug. 23 (Fri.) 17:25
> > To:dev 
> > Cc:Yun Gao 
> > Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern
> >
> > Hi Piotr,
> >
> > Yes you are right it's a distributed cross join requirement.
> > Broadcast join can help with cross join cases. But users cannot use it
> if the data set to join is too large to fit into one subtask.
> >
> > Sorry for left some details behind.
> >
> > Thanks,
> > Zhu Zhu
> > Piotr Nowojski  于2019年8月23日周五 下午4:57写道:
> > Hi Yun and Zhu Zhu,
> >
> > Thanks for the more detailed example Zhu Zhu.
> >
> > As far as I understand for the iterations example we do not need
> multicasting. Regarding the Join example, I don’t fully understand it. The
> example that Zhu Zhu presented has a drawback of sending both tables to
> multiple nodes. What’s the benefit of using broadcast join over a hash join
> in such case? As far as I know, the biggest benefit of using broadcast join
> instead of hash join is that we can avoid sending the larger table over the
> network, because we can perform the join locally. In this example we are
> sending both of the tables to multiple nodes, which should defeat the
> purpose.
> >
> > Is it about implementing cross join or near cross joins in a distributed
> fashion?
> >
> >> if we introduce a new MulticastRecordWriter
> >
> > That’s one of the solutions. It might have a drawback of 3 class
> virtualisation problem (We have RecordWriter and BroadcastRecordWriter
> already). With up to two implementations, JVM is able to devirtualise the
> calls.
> >
> > Previously I was also thinking about just providing two different
> ChannelSelector interfaces. One with `int[]` and `SingleChannelSelector`
> with plain `int` and based on that, RecordWriter could perform some magic
> (worst case scenario `instaceof` checks).
> >
> > Another solution might be to change `ChannelSelector` interface into an
> iterator.
> >
> > But let's discuss the details after we agree on implementing this.
> >
> > Piotrek
> >
> >> On 23 Aug 2019, at 10:20, Yun Gao  wrote:
> >>
> >>   Hi Piotr,
> >>
> >>Thanks a lot for the suggestions!
> >>
> >>The core motivation of this discussion is to implement a new
> iteration library on the DataStream, and it requires to insert special
> records in the stream to notify the progress of the iteration. The
> mechanism of such records is very similar to the current Watermark, and we
> meet the problem of sending normal records according to the partition
> (Rebalance, etc..) and also be able to broadcast the inserted progress
> records to all the connected records. I have read the notes in the google
> doc and I totally agree with that exploring the broadcast interface in
> RecordWriter in some way is able to solve this issue.
> >>
> >>   Regarding to `int[] ChannelSelector#selectChannels()`, I'm
> wondering if we introduce a new MulticastRecordWriter and left the current
> RecordWriter untouched, could we avoid the performance degradation ? Since
> with such a modification the normal RecordWriter does not need to iterate
> the return array by ChannelSelector, and the only difference will be
> returning an array instead of an integer, and accessing the first element
> of the returned array instead of

[jira] [Created] (FLINK-13833) Incorrect Maven dependencies for Flink-Hive connector

2019-08-23 Thread Qi Kang (Jira)
Qi Kang created FLINK-13833:
---

 Summary: Incorrect Maven dependencies for Flink-Hive connector
 Key: FLINK-13833
 URL: https://issues.apache.org/jira/browse/FLINK-13833
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.9.0
Reporter: Qi Kang


See 
[https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/hive/]

The Maven artifact IDs in the dependencies snippet are wrong and thus cannot be 
resolved. According to my findings in the repository, they should be corrected 
as follows.
{code:java}
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-hive_2.11</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-hadoop-compatibility_2.11</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-shaded-hadoop-2-uber</artifactId>
  <version>2.6.5-7.0</version>
  <scope>provided</scope>
</dependency>
{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13834) Add class for PolynomialExpansionMapper

2019-08-23 Thread Xu Yang (Jira)
Xu Yang created FLINK-13834:
---

 Summary: Add class for PolynomialExpansionMapper
 Key: FLINK-13834
 URL: https://issues.apache.org/jira/browse/FLINK-13834
 Project: Flink
  Issue Type: Sub-task
  Components: Library / Machine Learning
Reporter: Xu Yang


Polynomial expansion is the process of expanding your features into a 
polynomial space, which is formulated by an n-degree combination of original 
dimensions. Take a 2-variable feature vector as an example: (x, y), if we want 
to expand it with degree 2, then we get (x, x * x, y, x * y, y * y).
 * Add PolynomialExpansionMapper for the vector polynomial expansion operation.
 * Add VectorPolynomialExpandParams for the parameters of PolynomialExpansionMapper.
 * Add PolynomialExpansionMapperTest for the test example.
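For intuition, a small sketch of the degree-2 expansion described above 
(illustrative only, not the actual mapper implementation; the term ordering 
follows the (x, x*x, y, x*y, y*y) example):

{code:java}
import java.util.ArrayList;
import java.util.List;

class PolynomialExpansionSketch {
    // For each dimension, emit the dimension itself followed by its
    // products with all dimensions up to and including itself, so that
    // (x, y) expands to (x, x*x, y, x*y, y*y).
    static double[] expandDegree2(double[] v) {
        List<Double> out = new ArrayList<>();
        for (int i = 0; i < v.length; i++) {
            out.add(v[i]);
            for (int j = 0; j <= i; j++) {
                out.add(v[i] * v[j]);
            }
        }
        double[] result = new double[out.size()];
        for (int k = 0; k < result.length; k++) {
            result[k] = out.get(k);
        }
        return result;
    }
}
{code}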



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13835) Add class for FeatureHasherMapper.

2019-08-23 Thread Xu Yang (Jira)
Xu Yang created FLINK-13835:
---

 Summary: Add class for FeatureHasherMapper.
 Key: FLINK-13835
 URL: https://issues.apache.org/jira/browse/FLINK-13835
 Project: Flink
  Issue Type: Sub-task
  Components: Library / Machine Learning
Reporter: Xu Yang


FeatureHasherMapper is a transformer to project a number of categorical or 
numerical features into a feature vector of a specified dimension.
 * Add FeatureHasherMapper for the feature hashing operation.
 * Add FeatureHasherParams for the parameters of FeatureHasherMapper.
 * Add FeatureHasherMapperTest for the test example.
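For intuition, a minimal sketch of the underlying hashing trick (illustrative 
only, not the actual mapper implementation): numerical features add their value 
at a hashed index, while categorical features hash "name=value" and add 1.0.

{code:java}
import java.util.Map;

class FeatureHashingSketch {
    static double[] hashFeatures(Map<String, Object> features, int dimension) {
        double[] vector = new double[dimension];
        for (Map.Entry<String, Object> e : features.entrySet()) {
            if (e.getValue() instanceof Number) {
                // Numerical feature: hash the name, add the value.
                int idx = Math.floorMod(e.getKey().hashCode(), dimension);
                vector[idx] += ((Number) e.getValue()).doubleValue();
            } else {
                // Categorical feature: hash "name=value", add 1.0.
                int idx = Math.floorMod(
                        (e.getKey() + "=" + e.getValue()).hashCode(), dimension);
                vector[idx] += 1.0;
            }
        }
        return vector;
    }
}
{code}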



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13836) Improve support of java.util.UUID for JDBCTypeUtil

2019-08-23 Thread François Lacombe (Jira)
François Lacombe created FLINK-13836:


 Summary: Improve support of java.util.UUID for JDBCTypeUtil
 Key: FLINK-13836
 URL: https://issues.apache.org/jira/browse/FLINK-13836
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Affects Versions: 1.8.0
Reporter: François Lacombe


Currently, JDBCTypeUtil, as used by JDBCAppendTableSinkBuilder, doesn't support 
UUID types via java.util.UUID.

Could it be possible to handle that, so as to allow writing UUIDs directly to 
PostgreSQL?
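Until such support exists, a common workaround is to bind the UUID as a generic 
object through plain JDBC (standard behaviour of the PostgreSQL driver; shown 
here for context only, this is not part of JDBCTypeUtil):

{code:java}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.UUID;

class UuidInsertWorkaround {
    static void insertRow(Connection conn, UUID id) throws SQLException {
        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO t (id) VALUES (?)")) {
            // The PostgreSQL driver maps java.util.UUID to the uuid column type.
            ps.setObject(1, id);
            ps.executeUpdate();
        }
    }
}
{code}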



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Flink client api enhancement for downstream project

2019-08-23 Thread Kostas Kloudas
Hi all,

On the topic of web submission, I agree with Till that it only seems
to complicate things.
It is bad for security, job isolation (anybody can submit/cancel jobs), and its
implementation complicates some parts of the code. So, if we were to
redesign the
WebUI, maybe this part could be left out. In addition, I would say
that the ability to cancel
jobs could also be left out.

I would also be in favour of removing the "detached" mode, for
the reasons mentioned
above (i.e. because now we will have a future representing the result
on which the user
can choose to wait or not).

Now, for separating job submission and cluster creation, I am in
favour of keeping both.
Once again, the reasons are mentioned above by Stephan, Till, Aljoscha
and also Zili seems
to agree. They mainly have to do with security, isolation and ease of
resource management
for the user as he knows that "when my job is done, everything will be
cleared up". This is
also the experience you get when launching a process on your local OS.

On excluding the per-job mode from returning a JobClient or not, I
believe that eventually
it would be nice to allow users to get back a jobClient. The reason is
that 1) I cannot
find any objective reason why the user-experience should diverge, and
2) this will be the
way that the user will be able to interact with his running job.
Assuming that the necessary
ports are open for the REST API to work, then I think that the
JobClient can run against the
REST API without problems. If the needed ports are not open, then we
are safe to not return
a JobClient, as the user explicitly chose to close all points of
communication to his running job.
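For illustration, the submission pattern this discussion converges on could
look as follows; this is a sketch using hypothetical interface shapes taken
from the thread's proposal, not an existing API:

import java.util.concurrent.CompletableFuture;

// Hypothetical shapes matching the proposal discussed in this thread.
interface JobClient {
    CompletableFuture<Void> cancel();
}

interface EnvironmentLike {
    CompletableFuture<JobClient> execute();
}

class SubmissionPattern {
    static void run(EnvironmentLike env) throws Exception {
        CompletableFuture<JobClient> submission = env.execute();

        // "Attached": wait for the handle and interact with the running job.
        JobClient client = submission.get();

        // "Detached" is simply not waiting on the future --
        // no separate detached mode is required.
    }
}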

On the topic of not hijacking the "env.execute()" in order to get the
Plan, I definitely agree but
for the proposal of having a "compile()" method in the env, I would
like to have a better look at
the existing code.

Cheers,
Kostas

On Fri, Aug 23, 2019 at 5:52 AM Zili Chen  wrote:
>
> Hi Yang,
>
> It would be helpful if you check Stephan's last comment,
> which states that isolation is important.
>
> For per-job mode, we run a dedicated cluster(maybe it
> should have been a couple of JM and TMs during FLIP-6
> design) for a specific job. Thus the process is prevented
> from other jobs.
>
> In our cases there was a time we suffered from multi
> jobs submitted by different users and they affected
> each other so that all ran into an error state. Also,
> run the client inside the cluster could save client
> resource at some points.
>
> However, we also face several issues as you mentioned,
> that in per-job mode it always uses parent classloader
> thus classloading issues occur.
>
> BTW, one can makes an analogy between session/per-job mode
> in  Flink, and client/cluster mode in Spark.
>
> Best,
> tison.
>
>
> Yang Wang  于2019年8月22日周四 上午11:25写道:
>
> > From the user's perspective, it is really confused about the scope of
> > per-job cluster.
> >
> >
> > If it means a flink cluster with single job, so that we could get better
> > isolation.
> >
> > Now it does not matter how we deploy the cluster, directly deploy(mode1)
> >
> > or start a flink cluster and then submit job through cluster client(mode2).
> >
> >
> > Otherwise, if it just means directly deploy, how should we name the mode2,
> >
> > session with job or something else?
> >
> > We could also benefit from the mode2. Users could get the same isolation
> > with mode1.
> >
> > The user code and dependencies will be loaded by user class loader
> >
> > to avoid class conflict with framework.
> >
> >
> >
> > Anyway, both of the two submission modes are useful.
> >
> > We just need to clarify the concepts.
> >
> >
> >
> >
> > Best,
> >
> > Yang
> >
> > Zili Chen  于2019年8月20日周二 下午5:58写道:
> >
> > > Thanks for the clarification.
> > >
> > > The idea JobDeployer ever came into my mind when I was muddled with
> > > how to execute per-job mode and session mode with the same user code
> > > and framework codepath.
> > >
> > > With the concept JobDeployer we back to the statement that environment
> > > knows every configs of cluster deployment and job submission. We
> > > configure or generate from configuration a specific JobDeployer in
> > > environment and then code align on
> > >
> > > *JobClient client = env.execute().get();*
> > >
> > > which in session mode returned by clusterClient.submitJob and in per-job
> > > mode returned by clusterDescriptor.deployJobCluster.
> > >
> > > Here comes a problem that currently we directly run ClusterEntrypoint
> > > with extracted job graph. Follow the JobDeployer way we'd better
> > > align entry point of per-job deployment at JobDeployer. Users run
> > > their main method or by a Cli(finally call main method) to deploy the
> > > job cluster.
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Stephan Ewen  于2019年8月20日周二 下午4:40写道:
> > >
> > > > Till has made some good comments here.
> > > >
> > > > Two things to add:
> > > >
> > > >   - The job mode is very nice in the way that i

[jira] [Created] (FLINK-13837) Support --files and --libjars arguments in flink run command line

2019-08-23 Thread Yang Wang (Jira)
Yang Wang created FLINK-13837:
-

 Summary: Support --files and --libjars arguments in flink run 
command line
 Key: FLINK-13837
 URL: https://issues.apache.org/jira/browse/FLINK-13837
 Project: Flink
  Issue Type: New Feature
  Components: Command Line Client
Reporter: Yang Wang


Currently we can use the following code to register a cached file and then get 
it in the task. We hope this could be done more easily via a --files command 
option, such as --files file:///tmp/test_data.

{code:java}
final StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment();

env.registerCachedFile(inputFile.toString(), "test_data", false);
{code}
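On the task side, the registered file is then retrieved through the existing 
DistributedCache API, e.g. inside a rich function:

{code:java}
import java.io.File;
import org.apache.flink.api.common.functions.RichMapFunction;

class ReadCachedFile extends RichMapFunction<String, String> {
    @Override
    public String map(String value) throws Exception {
        // Look up the file registered under the key "test_data".
        File testData = getRuntimeContext()
                .getDistributedCache()
                .getFile("test_data");
        return value + " (cached file: " + testData.getAbsolutePath() + ")";
    }
}
{code}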

 

For a jar, we could build a fat jar including our code and all dependencies. 
It would be better to add a --libjars command option to support transferring 
dependencies.

What's the difference between --files/--libjars and -yt?
 * Option -yt is used when submitting a job to a YARN cluster, and all files 
will be distributed by the YARN distributed cache. They will be shared by all 
jobs in the Flink cluster.
 * Option --libjars is used for a Flink job, and all files will be distributed 
by the blob server. They are only accessible to the specific job.

 

The newly added command options are as follows.

--files      Attach custom files for the job. Directories are not
             supported. Use ',' to separate multiple files. The files
             could be in the local file system or a distributed file
             system. Use a URI scheme to specify which file system a
             file belongs to; if the scheme is missing, the file is
             looked up in the local file system. Use '#' after the
             file path to specify the retrieval key at runtime.
             (eg: --files file:///tmp/a.txt#file_key,hdfs:///$namenode_address/tmp/b.txt)

--libjars    Attach custom library jars for the job. Directories are
             not supported. Use ',' to separate multiple jars. The
             jars could be in the local file system or a distributed
             file system. Use a URI scheme to specify which file
             system a jar belongs to; if the scheme is missing, the
             jars are looked up in the local file system.
             (eg: --libjars file:///tmp/dependency1.jar,hdfs:///$namenode_address/tmp/dependency2.jar)



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13838) Support -yta(--yarnshipArchives) arguments in flink run command line

2019-08-23 Thread Yang Wang (Jira)
Yang Wang created FLINK-13838:
-

 Summary: Support -yta(--yarnshipArchives) arguments in flink run 
command line
 Key: FLINK-13838
 URL: https://issues.apache.org/jira/browse/FLINK-13838
 Project: Flink
  Issue Type: New Feature
  Components: Command Line Client
Reporter: Yang Wang


Currently we can use --yarnship to transfer jars, files and directories for 
the cluster and add them to the classpath. However, compressed packages are not 
supported. If we have a compressed package including some config files, .so 
files and jars, --yarnshipArchives would be very useful.

What's the difference between -yt and -yta?

-yt file:///tmp/a.tar.gz : the file will be transferred by YARN and kept as the 
original compressed file (not unpacked) in the workdir of the 
jobmanager/taskmanager container.

-yta file:///tmp/a.tar.gz#dict1 : the file will be transferred by YARN and 
unpacked into a new directory named dict1 in the workdir.

 

-yta,--yarnshipArchives    Ship archives for the cluster (t for transfer).
                           Use ',' to separate multiple files. The archives
                           could be in the local file system or a distributed
                           file system. Use a URI scheme to specify which file
                           system a file belongs to; if the scheme is missing,
                           the archives are looked up in the local file
                           system. Use '#' after the file path to specify a
                           new name in the workdir.
                           (eg: -yta file:///tmp/a.tar.gz#dict1,hdfs:///$namenode_address/tmp/b.tar.gz)



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13839) Support to set yarn node label for flink jobmanager and taskmanager container

2019-08-23 Thread Yang Wang (Jira)
Yang Wang created FLINK-13839:
-

 Summary: Support to set yarn node label for flink jobmanager and 
taskmanager container
 Key: FLINK-13839
 URL: https://issues.apache.org/jira/browse/FLINK-13839
 Project: Flink
  Issue Type: New Feature
  Components: Deployment / YARN
Reporter: Yang Wang


The YARN node label feature was introduced in Hadoop 2.6. It is a way to group 
nodes with similar characteristics, and applications can specify where to run. 
In a production or cloud environment, we want the jobmanager to run on more 
stable machines. Node labels could help us achieve that.

However, ResourceRequest.setNodeLabelExpression is not supported by the current 
Hadoop version dependency (2.4.1), so we need to bump the Hadoop version to 
2.6.5.
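For reference, a sketch of the YARN-side API this feature relies on (available 
from Hadoop 2.6; the "stable" label is only an example value):

{code:java}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

class NodeLabelSketch {
    static ResourceRequest requestOnLabel(Priority priority, Resource capability) {
        ResourceRequest request =
                ResourceRequest.newInstance(priority, ResourceRequest.ANY, capability, 1);
        // Only nodes carrying the "stable" label will satisfy this request.
        request.setNodeLabelExpression("stable");
        return request;
    }
}
{code}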



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Till Rohrmann
Hi Gavin,

if I'm not mistaken, the community has excluded the Scala FlinkShell for
Scala 2.12 for a couple of versions now. The problem seems to be that some of
the tests failed. See [1] for more information.

[1] https://issues.apache.org/jira/browse/FLINK-10911

Cheers,
Till

On Fri, Aug 23, 2019 at 11:44 AM Gavin Lee  wrote:

> I used package on apache official site, with mirror [1], the difference is
> I used scala 2.12 version.
> I also tried to build from source for both scala 2.11 and 2.12, when build
> 2.12 the FlinkShell.class is in flink-dist jar file but after running mvn
> clean package -Dscala-2.12, this class was removed in flink-dist_2.12-1.9
> jar
> file.
> Seems broken here for scala 2.12, right?
>
> [1]
>
> http://mirror.bit.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.12.tgz
>
> On Fri, Aug 23, 2019 at 4:37 PM Zili Chen  wrote:
>
> > I downloaded 1.9.0 dist from here[1] and didn't see the issue. Where do
> you
> > download it? Could you try to download the dist from [1] and see whether
> > the problem last?
> >
> > Best,
> > tison.
> >
> > [1]
> >
> http://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.11.tgz
> >
> >
> > Gavin Lee  于2019年8月23日周五 下午4:34写道:
> >
> >> Thanks for your reply @Zili.
> >> I'm afraid it's not the same issue.
> >> I found that the FlinkShell.class was not included in flink dist jar
> file
> >> in 1.9.0 version.
> >> Nowhere can find this class file inside jar, either in opt or lib
> >> directory under root folder of flink distribution.
> >>
> >>
> >> On Fri, Aug 23, 2019 at 4:10 PM Zili Chen  wrote:
> >>
> >>> Hi Gavin,
> >>>
> >>> I also find a problem in shell if the directory contain whitespace
> >>> then the final command to run is incorrect. Could you check the
> >>> final command to be executed?
> >>>
> >>> FYI, here is the ticket[1].
> >>>
> >>> Best,
> >>> tison.
> >>>
> >>> [1] https://issues.apache.org/jira/browse/FLINK-13827
> >>>
> >>>
> >>> Gavin Lee  于2019年8月23日周五 下午3:36写道:
> >>>
>  Why bin/start-scala-shell.sh local return following error?
> 
>  bin/start-scala-shell.sh local
> 
>  Error: Could not find or load main class
>  org.apache.flink.api.scala.FlinkShell
>  For flink 1.8.1 and previous ones, no such issues.
> 
>  On Fri, Aug 23, 2019 at 2:05 PM qi luo  wrote:
> 
> > Congratulations and thanks for the hard work!
> >
> > Qi
> >
> > On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai <
> tzuli...@apache.org>
> > wrote:
> >
> > The Apache Flink community is very happy to announce the release of
> > Apache Flink 1.9.0, which is the latest major release.
> >
> > Apache Flink® is an open-source stream processing framework for
> > distributed, high-performing, always-available, and accurate data
> streaming
> > applications.
> >
> > The release is available for download at:
> > https://flink.apache.org/downloads.html
> >
> > Please check out the release blog post for an overview of the
> > improvements for this new major release:
> > https://flink.apache.org/news/2019/08/22/release-1.9.0.html
> >
> > The full release notes are available in Jira:
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
> >
> > We would like to thank all contributors of the Apache Flink community
> > who made this release possible!
> >
> > Cheers,
> > Gordon
> >
> >
> >
> 
>  --
>  Gavin
> 
> >>>
> >>
> >> --
> >> Gavin
> >>
> >
>
> --
> Gavin
>


Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Piotr Nowojski
Hi,

If the primary motivation is broadcasting (for the iterations) and we have no 
immediate need for multicast (cross join), I would prefer to first expose 
broadcast via the DataStream API and only later, once we finally need it, 
support multicast. As I wrote, multicast would be more challenging to 
implement, with a more complicated runtime and API. And re-using multicast just 
to support broadcast doesn't make much sense:

1. It’s a bit obfuscated. It’s easier to understand collectBroadcast(record) or 
broadcastEmit(record) compared to some multicast channel selector that just 
happens to return all of the channels.
2. There are performance benefits of explicitly calling 
`RecordWriter#broadcastEmit`.
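
For illustration, a minimal sketch contrasting the two styles
(MulticastChannelSelector is a hypothetical name; broadcastEmit refers to the
existing RecordWriter method from point 2):

// Hypothetical multicast-style selector used only to emulate broadcast.
interface MulticastChannelSelector<T> {
    int[] selectChannels(T record, int numberOfChannels);
}

class BroadcastViaMulticast<T> implements MulticastChannelSelector<T> {
    @Override
    public int[] selectChannels(T record, int numberOfChannels) {
        int[] all = new int[numberOfChannels];
        for (int i = 0; i < numberOfChannels; i++) {
            all[i] = i; // every record goes to every channel
        }
        return all;     // obfuscated broadcast, allocating per record
    }
}

// The explicit form argued for here: no selector indirection, and the
// writer can serialize the record once for all channels:
//
//     recordWriter.broadcastEmit(record);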


On a different note, what would be the semantics of such a broadcast emit on a 
KeyedStream? Would it be supported? Or would we limit support to non-keyed 
streams only?

Piotrek

> On 23 Aug 2019, at 12:48, Zhu Zhu  wrote:
> 
> Thanks Piotr,
> 
> Users asked for this feature sometimes ago when they migrating batch jobs to 
> Flink(Blink). 
> It's not very urgent as they have taken some workarounds to solve it.(like 
> partitioning data set to different job vertices)
> So it's fine to not make it top priority.
> 
> Anyway, as a commonly known scenario, I think users can benefit from cross 
> join sooner or later.
> 
> Thanks,
> Zhu Zhu
> 
> Piotr Nowojski <pi...@ververica.com> 
> 于2019年8月23日周五 下午6:19写道:
> Hi,
> 
> Thanks for the answers :) Ok I understand the full picture now. +1 from my 
> side on solving this issue somehow. But before we start discussing how to 
> solve it one last control question:
> 
> I guess this multicast is intended to be used in blink planner, right? 
> Assuming that we implement the multicast support now, when would it be used 
> by the blink? I would like to avoid a scenario, where we implement an unused 
> feature and we keep maintaining it for a long period of time.
> 
> Piotrek
> 
> PS, try to include motivating examples, including concrete ones in the 
> proposals/design docs, for example in the very first paragraph. Especially if 
> it’s a commonly known feature like cross join :)
> 
> > On 23 Aug 2019, at 11:38, Yun Gao  wrote:
> > 
> > Hi Piotr,
> > 
> >Thanks a lot for sharing the thoughts! 
> > 
> >For the iteration, agree with that multicasting is not necessary. 
> > Exploring the broadcast interface to Output of the operators in some way 
> > should also solve this issue, and I think it should be even more convenient 
> > to have the broadcast method for the iteration. 
> > 
> >Also thanks Zhu Zhu for the cross join case!
> >  Best, 
> >   Yun
> > 
> > 
> > 
> > --
> > From: Zhu Zhu <reed...@gmail.com>
> > Send Time: 2019 Aug. 23 (Fri.) 17:25
> > To: dev <dev@flink.apache.org>
> > Cc: Yun Gao <yungao...@aliyun.com>
> > Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern
> > 
> > Hi Piotr,
> > 
> > Yes you are right it's a distributed cross join requirement.
> > Broadcast join can help with cross join cases. But users cannot use it if 
> > the data set to join is too large to fit into one subtask.
> > 
> > Sorry for left some details behind.
> > 
> > Thanks,
> > Zhu Zhu
> > Piotr Nowojski <pi...@ververica.com> 
> > 于2019年8月23日周五 下午4:57写道:
> > Hi Yun and Zhu Zhu,
> > 
> > Thanks for the more detailed example Zhu Zhu.
> > 
> > As far as I understand for the iterations example we do not need 
> > multicasting. Regarding the Join example, I don’t fully understand it. The 
> > example that Zhu Zhu presented has a drawback of sending both tables to 
> > multiple nodes. What’s the benefit of using broadcast join over a hash join 
> > in such case? As far as I know, the biggest benefit of using broadcast join 
> > instead of hash join is that we can avoid sending the larger table over the 
> > network, because we can perform the join locally. In this example we are 
> > sending both of the tables to multiple nodes, which should defeat the 
> > purpose.
> > 
> > Is it about implementing cross join or near cross joins in a distributed 
> > fashion? 
> > 
> >> if we introduce a new MulticastRecordWriter
> > 
> > That’s one of the solutions. It might have a drawback of 3 class 
> > virtualisation problem (We have RecordWriter and BroadcastRecordWriter 
> > already). With up to two implementations, JVM is able to devirtualise the 
> > calls.
> > 
> > Previously I was also thinking about just providing two different 
> > ChannelSelector interfaces. One with `int[]` and `SingleChannelSelector` 
> > with plain `int` and based on that, RecordWriter could perform some magic 
> > (worst case scenario `instaceof` checks).
> > 
> > Another solution might be to change `ChannelSelector` interface into an 
> > iterator.
> > 
> > But let's discuss the details after we agree on implementing this.
> > 
> > Piotrek
> > 
> >> On 23 Aug 2019, at 10:20, Y

[jira] [Created] (FLINK-13840) Let StandaloneJobClusterEntryPoint use user code class loader

2019-08-23 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-13840:
-

 Summary: Let StandaloneJobClusterEntryPoint use user code class 
loader
 Key: FLINK-13840
 URL: https://issues.apache.org/jira/browse/FLINK-13840
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.9.0, 1.10.0
Reporter: Till Rohrmann
 Fix For: 1.10.0


In order to resolve class loading issues when using the 
{{StandaloneJobClusterEntryPoint}}, it would be better to run the user code in 
the user code class loader, which supports child-first class loading. At the 
moment, the user code jar is part of the system class path and is therefore 
loaded by the system class loader.

An easy way to solve this problem would be to place the user code in a 
different directory than {{lib}} and then specify this path as an additional 
classpath when creating the {{PackagedProgram}}.
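A sketch of the idea (the usrlib path and construction details are assumptions, 
not the final implementation):

{code:java}
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

class UserCodeClasspath {
    // Instead of placing the user jar in lib/ (system classpath), expose it
    // as an additional classpath entry for the PackagedProgram so that it is
    // loaded by the child-first user code class loader.
    static List<URL> additionalClasspath() throws MalformedURLException {
        return Collections.singletonList(
                new File("/opt/flink/usrlib/job.jar").toURI().toURL());
    }
}
{code}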



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Flink client api enhancement for downstream project

2019-08-23 Thread Till Rohrmann
Hi Tison,

just a quick comment concerning the class loading issues when using the per
job mode. The community wants to change it so that the
StandaloneJobClusterEntryPoint actually uses the user code class loader
with child first class loading [1]. Hence, I hope that this problem will be
resolved soon.

[1] https://issues.apache.org/jira/browse/FLINK-13840

Cheers,
Till

On Fri, Aug 23, 2019 at 2:47 PM Kostas Kloudas  wrote:

> Hi all,
>
> On the topic of web submission, I agree with Till that it only seems
> to complicate things.
> It is bad for security, job isolation (anybody can submit/cancel jobs),
> and its
> implementation complicates some parts of the code. So, if it were to
> redesign the
> WebUI, maybe this part could be left out. In addition, I would say
> that the ability to cancel
> jobs could also be left out.
>
> Also I would also be in favour of removing the "detached" mode, for
> the reasons mentioned
> above (i.e. because now we will have a future representing the result
> on which the user
> can choose to wait or not).
>
> Now for the separating job submission and cluster creation, I am in
> favour of keeping both.
> Once again, the reasons are mentioned above by Stephan, Till, Aljoscha
> and also Zili seems
> to agree. They mainly have to do with security, isolation and ease of
> resource management
> for the user as he knows that "when my job is done, everything will be
> cleared up". This is
> also the experience you get when launching a process on your local OS.
>
> On excluding the per-job mode from returning a JobClient or not, I
> believe that eventually
> it would be nice to allow users to get back a jobClient. The reason is
> that 1) I cannot
> find any objective reason why the user-experience should diverge, and
> 2) this will be the
> way that the user will be able to interact with his running job.
> Assuming that the necessary
> ports are open for the REST API to work, then I think that the
> JobClient can run against the
> REST API without problems. If the needed ports are not open, then we
> are safe to not return
> a JobClient, as the user explicitly chose to close all points of
> communication to his running job.
>
> On the topic of not hijacking the "env.execute()" in order to get the
> Plan, I definitely agree but
> for the proposal of having a "compile()" method in the env, I would
> like to have a better look at
> the existing code.
>
> Cheers,
> Kostas
>
> On Fri, Aug 23, 2019 at 5:52 AM Zili Chen  wrote:
> >
> > Hi Yang,
> >
> > It would be helpful if you check Stephan's last comment,
> > which states that isolation is important.
> >
> > For per-job mode, we run a dedicated cluster(maybe it
> > should have been a couple of JM and TMs during FLIP-6
> > design) for a specific job. Thus the process is prevented
> > from other jobs.
> >
> > In our cases there was a time we suffered from multi
> > jobs submitted by different users and they affected
> > each other so that all ran into an error state. Also,
> > run the client inside the cluster could save client
> > resource at some points.
> >
> > However, we also face several issues as you mentioned,
> > that in per-job mode it always uses parent classloader
> > thus classloading issues occur.
> >
> > BTW, one can makes an analogy between session/per-job mode
> > in  Flink, and client/cluster mode in Spark.
> >
> > Best,
> > tison.
> >
> >
> > Yang Wang  于2019年8月22日周四 上午11:25写道:
> >
> > > From the user's perspective, it is really confused about the scope of
> > > per-job cluster.
> > >
> > >
> > > If it means a flink cluster with single job, so that we could get
> better
> > > isolation.
> > >
> > > Now it does not matter how we deploy the cluster, directly
> deploy(mode1)
> > >
> > > or start a flink cluster and then submit job through cluster
> client(mode2).
> > >
> > >
> > > Otherwise, if it just means directly deploy, how should we name the
> mode2,
> > >
> > > session with job or something else?
> > >
> > > We could also benefit from the mode2. Users could get the same
> isolation
> > > with mode1.
> > >
> > > The user code and dependencies will be loaded by user class loader
> > >
> > > to avoid class conflict with framework.
> > >
> > >
> > >
> > > Anyway, both of the two submission modes are useful.
> > >
> > > We just need to clarify the concepts.
> > >
> > >
> > >
> > >
> > > Best,
> > >
> > > Yang
> > >
> > > Zili Chen  于2019年8月20日周二 下午5:58写道:
> > >
> > > > Thanks for the clarification.
> > > >
> > > > The idea JobDeployer ever came into my mind when I was muddled with
> > > > how to execute per-job mode and session mode with the same user code
> > > > and framework codepath.
> > > >
> > > > With the concept JobDeployer we back to the statement that
> environment
> > > > knows every configs of cluster deployment and job submission. We
> > > > configure or generate from configuration a specific JobDeployer in
> > > > environment and then code align on
> > > >
> > > > *JobClient client = env.execute().get();*

[DISCUSS] Use Java's Duration instead of Flink's Time

2019-08-23 Thread Stephan Ewen
Hi all!

Many parts of the code use Flink's "Time" class. The Time really is a "time
interval" or a "Duration".

Since Java 8, there is a Java class "Duration" that is nice and flexible to
use.
I would suggest we start using Java Duration instead and drop Time as much
as possible in the runtime from now on.

Maybe even drop that class from the API in Flink 2.0.

Best,
Stephan


Re: [DISCUSS] Flink client api enhancement for downstream project

2019-08-23 Thread Zili Chen
Hi Till,

Thanks for your update. Nice to hear :-)

Best,
tison.


Till Rohrmann  于2019年8月23日周五 下午10:39写道:

> Hi Tison,
>
> just a quick comment concerning the class loading issues when using the per
> job mode. The community wants to change it so that the
> StandaloneJobClusterEntryPoint actually uses the user code class loader
> with child first class loading [1]. Hence, I hope that this problem will be
> resolved soon.
>
> [1] https://issues.apache.org/jira/browse/FLINK-13840
>
> Cheers,
> Till
>
> On Fri, Aug 23, 2019 at 2:47 PM Kostas Kloudas  wrote:
>
> > Hi all,
> >
> > On the topic of web submission, I agree with Till that it only seems
> > to complicate things.
> > It is bad for security, job isolation (anybody can submit/cancel jobs),
> > and its
> > implementation complicates some parts of the code. So, if it were to
> > redesign the
> > WebUI, maybe this part could be left out. In addition, I would say
> > that the ability to cancel
> > jobs could also be left out.
> >
> > Also I would also be in favour of removing the "detached" mode, for
> > the reasons mentioned
> > above (i.e. because now we will have a future representing the result
> > on which the user
> > can choose to wait or not).
> >
> > Now for the separating job submission and cluster creation, I am in
> > favour of keeping both.
> > Once again, the reasons are mentioned above by Stephan, Till, Aljoscha
> > and also Zili seems
> > to agree. They mainly have to do with security, isolation and ease of
> > resource management
> > for the user as he knows that "when my job is done, everything will be
> > cleared up". This is
> > also the experience you get when launching a process on your local OS.
> >
> > On excluding the per-job mode from returning a JobClient or not, I
> > believe that eventually
> > it would be nice to allow users to get back a jobClient. The reason is
> > that 1) I cannot
> > find any objective reason why the user-experience should diverge, and
> > 2) this will be the
> > way that the user will be able to interact with his running job.
> > Assuming that the necessary
> > ports are open for the REST API to work, then I think that the
> > JobClient can run against the
> > REST API without problems. If the needed ports are not open, then we
> > are safe to not return
> > a JobClient, as the user explicitly chose to close all points of
> > communication to his running job.
> >
> > On the topic of not hijacking the "env.execute()" in order to get the
> > Plan, I definitely agree but
> > for the proposal of having a "compile()" method in the env, I would
> > like to have a better look at
> > the existing code.
> >
> > Cheers,
> > Kostas
> >
> > On Fri, Aug 23, 2019 at 5:52 AM Zili Chen  wrote:
> > >
> > > Hi Yang,
> > >
> > > It would be helpful if you check Stephan's last comment,
> > > which states that isolation is important.
> > >
> > > For per-job mode, we run a dedicated cluster(maybe it
> > > should have been a couple of JM and TMs during FLIP-6
> > > design) for a specific job. Thus the process is prevented
> > > from other jobs.
> > >
> > > In our cases there was a time we suffered from multi
> > > jobs submitted by different users and they affected
> > > each other so that all ran into an error state. Also,
> > > run the client inside the cluster could save client
> > > resource at some points.
> > >
> > > However, we also face several issues as you mentioned,
> > > that in per-job mode it always uses parent classloader
> > > thus classloading issues occur.
> > >
> > > BTW, one can makes an analogy between session/per-job mode
> > > in  Flink, and client/cluster mode in Spark.
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Yang Wang  于2019年8月22日周四 上午11:25写道:
> > >
> > > > From the user's perspective, it is really confused about the scope of
> > > > per-job cluster.
> > > >
> > > >
> > > > If it means a flink cluster with single job, so that we could get
> > better
> > > > isolation.
> > > >
> > > > Now it does not matter how we deploy the cluster, directly
> > deploy(mode1)
> > > >
> > > > or start a flink cluster and then submit job through cluster
> > client(mode2).
> > > >
> > > >
> > > > Otherwise, if it just means directly deploy, how should we name the
> > mode2,
> > > >
> > > > session with job or something else?
> > > >
> > > > We could also benefit from the mode2. Users could get the same
> > isolation
> > > > with mode1.
> > > >
> > > > The user code and dependencies will be loaded by user class loader
> > > >
> > > > to avoid class conflict with framework.
> > > >
> > > >
> > > >
> > > > Anyway, both of the two submission modes are useful.
> > > >
> > > > We just need to clarify the concepts.
> > > >
> > > >
> > > >
> > > >
> > > > Best,
> > > >
> > > > Yang
> > > >
> > > > Zili Chen  于2019年8月20日周二 下午5:58写道:
> > > >
> > > > > Thanks for the clarification.
> > > > >
> > > > > The idea JobDeployer ever came into my mind when I was muddled with
> > > > > how to execute per-job mode and session mode w

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Zili Chen
Hi Till,

Did we mention this in the release notes (or maybe in a previous release note,
where we did the exclusion)?

Best,
tison.


Till Rohrmann  于2019年8月23日周五 下午10:28写道:

> Hi Gavin,
>
> if I'm not mistaken, then the community excluded the Scala FlinkShell
> since a couple of versions for Scala 2.12. The problem seems to be that
> some of the tests failed. See here [1] for more information.
>
> [1] https://issues.apache.org/jira/browse/FLINK-10911
>
> Cheers,
> Till
>
> On Fri, Aug 23, 2019 at 11:44 AM Gavin Lee  wrote:
>
>> I used package on apache official site, with mirror [1], the difference is
>> I used scala 2.12 version.
>> I also tried to build from source for both scala 2.11 and 2.12, when build
>> 2.12 the FlinkShell.class is in flink-dist jar file but after running mvn
>> clean package -Dscala-2.12, this class was removed in flink-dist_2.12-1.9
>> jar
>> file.
>> Seems broken here for scala 2.12, right?
>>
>> [1]
>>
>> http://mirror.bit.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.12.tgz
>>
>> On Fri, Aug 23, 2019 at 4:37 PM Zili Chen  wrote:
>>
>> > I downloaded 1.9.0 dist from here[1] and didn't see the issue. Where do
>> you
>> > download it? Could you try to download the dist from [1] and see whether
>> > the problem last?
>> >
>> > Best,
>> > tison.
>> >
>> > [1]
>> >
>> http://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.11.tgz
>> >
>> >
>> > Gavin Lee  于2019年8月23日周五 下午4:34写道:
>> >
>> >> Thanks for your reply @Zili.
>> >> I'm afraid it's not the same issue.
>> >> I found that the FlinkShell.class was not included in flink dist jar
>> file
>> >> in 1.9.0 version.
>> >> Nowhere can find this class file inside jar, either in opt or lib
>> >> directory under root folder of flink distribution.
>> >>
>> >>
>> >> On Fri, Aug 23, 2019 at 4:10 PM Zili Chen 
>> wrote:
>> >>
>> >>> Hi Gavin,
>> >>>
>> >>> I also find a problem in shell if the directory contain whitespace
>> >>> then the final command to run is incorrect. Could you check the
>> >>> final command to be executed?
>> >>>
>> >>> FYI, here is the ticket[1].
>> >>>
>> >>> Best,
>> >>> tison.
>> >>>
>> >>> [1] https://issues.apache.org/jira/browse/FLINK-13827
>> >>>
>> >>>
>> >>> Gavin Lee  于2019年8月23日周五 下午3:36写道:
>> >>>
>>  Why bin/start-scala-shell.sh local return following error?
>> 
>>  bin/start-scala-shell.sh local
>> 
>>  Error: Could not find or load main class
>>  org.apache.flink.api.scala.FlinkShell
>>  For flink 1.8.1 and previous ones, no such issues.
>> 
>>  On Fri, Aug 23, 2019 at 2:05 PM qi luo  wrote:
>> 
>> > Congratulations and thanks for the hard work!
>> >
>> > Qi
>> >
>> > On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai <
>> tzuli...@apache.org>
>> > wrote:
>> >
>> > The Apache Flink community is very happy to announce the release of
>> > Apache Flink 1.9.0, which is the latest major release.
>> >
>> > Apache Flink® is an open-source stream processing framework for
>> > distributed, high-performing, always-available, and accurate data
>> streaming
>> > applications.
>> >
>> > The release is available for download at:
>> > https://flink.apache.org/downloads.html
>> >
>> > Please check out the release blog post for an overview of the
>> > improvements for this new major release:
>> > https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>> >
>> > The full release notes are available in Jira:
>> >
>> >
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>> >
>> > We would like to thank all contributors of the Apache Flink
>> community
>> > who made this release possible!
>> >
>> > Cheers,
>> > Gordon
>> >
>> >
>> >
>> 
>>  --
>>  Gavin
>> 
>> >>>
>> >>
>> >> --
>> >> Gavin
>> >>
>> >
>>
>> --
>> Gavin
>>
>


Re: [DISCUSS] Use Java's Duration instead of Flink's Time

2019-08-23 Thread Zili Chen
Hi Stephan,

I like the idea of unifying the usage of the time/duration API. We actually
use at least five different classes for this purpose (see below).

One thing I'd like to pick up is that duration configuration
in Flink is almost always in a pattern like "60 s", which fits the pattern
parsed by scala.concurrent.duration.Duration. AFAIK Duration
in Java 8 doesn't support this pattern. However, we can solve
it by introducing a DurationUtils.
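
For example, a minimal sketch of such a helper (a hypothetical class; the set
of supported units is an assumption):

import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DurationUtils {

    private static final Pattern FORMAT =
            Pattern.compile("(\\d+)\\s*(ms|s|min|h|d)");

    // Parse strings like "60 s" or "500 ms" into java.time.Duration.
    static Duration parse(String text) {
        Matcher m = FORMAT.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid duration: " + text);
        }
        long value = Long.parseLong(m.group(1));
        switch (m.group(2)) {
            case "ms":  return Duration.ofMillis(value);
            case "s":   return Duration.ofSeconds(value);
            case "min": return Duration.ofMinutes(value);
            case "h":   return Duration.ofHours(value);
            default:    return Duration.ofDays(value);
        }
    }
}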

Also, to clarify, we now have (correct me if there are any others)

java.time.Duration
scala.concurrent.duration.Duration
scala.concurrent.duration.FiniteDuration
org.apache.flink.api.common.time.Time
org.apache.flink.streaming.api.windowing.time.Time

in use. If we'd prefer java.time.Duration, it is worth considering
whether we unify all of them into Java's Duration, i.e., make Java's
Duration the first-class time/duration API, while the others are
converted into or out of it.

Best,
tison.


Stephan Ewen  于2019年8月23日周五 下午10:45写道:

> Hi all!
>
> Many parts of the code use Flink's "Time" class. The Time really is a "time
> interval" or a "Duration".
>
> Since Java 8, there is a Java class "Duration" that is nice and flexible to
> use.
> I would suggest we start using Java Duration instead and drop Time as much
> as possible in the runtime from now on.
>
> Maybe even drop that class from the API in Flink 2.0.
>
> Best,
> Stephan
>


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Oytun Tez
Hi all,

We also had to roll back our upgrade effort for 2 reasons:

- Official Docker container is not ready yet
- This artefact is not published with Scala:
org.apache.flink:flink-queryable-state-client-java_2.11:jar:1.9.0






---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Aug 23, 2019 at 10:48 AM Zili Chen  wrote:

> Hi Till,
>
> Did we mention this in release note(or maybe previous release note where
> we did the exclusion)?
>
> Best,
> tison.
>
>
> Till Rohrmann  于2019年8月23日周五 下午10:28写道:
>
>> Hi Gavin,
>>
>> if I'm not mistaken, then the community excluded the Scala FlinkShell
>> since a couple of versions for Scala 2.12. The problem seems to be that
>> some of the tests failed. See here [1] for more information.
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-10911
>>
>> Cheers,
>> Till
>>
>> On Fri, Aug 23, 2019 at 11:44 AM Gavin Lee  wrote:
>>
>>> I used package on apache official site, with mirror [1], the difference
>>> is
>>> I used scala 2.12 version.
>>> I also tried to build from source for both scala 2.11 and 2.12, when
>>> build
>>> 2.12 the FlinkShell.class is in flink-dist jar file but after running mvn
>>> clean package -Dscala-2.12, this class was removed in
>>> flink-dist_2.12-1.9 jar
>>> file.
>>> Seems broken here for scala 2.12, right?
>>>
>>> [1]
>>>
>>> http://mirror.bit.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.12.tgz
>>>
>>> On Fri, Aug 23, 2019 at 4:37 PM Zili Chen  wrote:
>>>
>>> > I downloaded 1.9.0 dist from here[1] and didn't see the issue. Where
>>> do you
>>> > download it? Could you try to download the dist from [1] and see
>>> whether
>>> > the problem last?
>>> >
>>> > Best,
>>> > tison.
>>> >
>>> > [1]
>>> >
>>> http://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.11.tgz
>>> >
>>> >
>>> > Gavin Lee  于2019年8月23日周五 下午4:34写道:
>>> >
>>> >> Thanks for your reply @Zili.
>>> >> I'm afraid it's not the same issue.
>>> >> I found that the FlinkShell.class was not included in flink dist jar
>>> file
>>> >> in 1.9.0 version.
>>> >> Nowhere can find this class file inside jar, either in opt or lib
>>> >> directory under root folder of flink distribution.
>>> >>
>>> >>
>>> >> On Fri, Aug 23, 2019 at 4:10 PM Zili Chen 
>>> wrote:
>>> >>
>>> >>> Hi Gavin,
>>> >>>
>>> >>> I also find a problem in shell if the directory contain whitespace
>>> >>> then the final command to run is incorrect. Could you check the
>>> >>> final command to be executed?
>>> >>>
>>> >>> FYI, here is the ticket[1].
>>> >>>
>>> >>> Best,
>>> >>> tison.
>>> >>>
>>> >>> [1] https://issues.apache.org/jira/browse/FLINK-13827
>>> >>>
>>> >>>
>>> >>> Gavin Lee  于2019年8月23日周五 下午3:36写道:
>>> >>>
>>>  Why bin/start-scala-shell.sh local return following error?
>>> 
>>>  bin/start-scala-shell.sh local
>>> 
>>>  Error: Could not find or load main class
>>>  org.apache.flink.api.scala.FlinkShell
>>>  For flink 1.8.1 and previous ones, no such issues.
>>> 
>>>  On Fri, Aug 23, 2019 at 2:05 PM qi luo  wrote:
>>> 
>>> > Congratulations and thanks for the hard work!
>>> >
>>> > Qi
>>> >
>>> > On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai <
>>> tzuli...@apache.org>
>>> > wrote:
>>> >
>>> > The Apache Flink community is very happy to announce the release of
>>> > Apache Flink 1.9.0, which is the latest major release.
>>> >
>>> > Apache Flink® is an open-source stream processing framework for
>>> > distributed, high-performing, always-available, and accurate data
>>> streaming
>>> > applications.
>>> >
>>> > The release is available for download at:
>>> > https://flink.apache.org/downloads.html
>>> >
>>> > Please check out the release blog post for an overview of the
>>> > improvements for this new major release:
>>> > https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>>> >
>>> > The full release notes are available in Jira:
>>> >
>>> >
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>>> >
>>> > We would like to thank all contributors of the Apache Flink
>>> community
>>> > who made this release possible!
>>> >
>>> > Cheers,
>>> > Gordon
>>> >
>>> >
>>> >
>>> 
>>>  --
>>>  Gavin
>>> 
>>> >>>
>>> >>
>>> >> --
>>> >> Gavin
>>> >>
>>> >
>>>
>>> --
>>> Gavin
>>>
>>


Re: CiBot Update

2019-08-23 Thread Ethan Li
Thank you very much Chesnay! This is helpful

> On Aug 23, 2019, at 2:58 AM, Chesnay Schepler  wrote:
> 
> @Ethan Li The source for the CiBot is available here 
> . The implementation of this command is 
> tightly connected to how the CiBot works; but conceptually it looks at a PR, 
> finds the most recent build that ran, and uses the Travis REST API to restart 
> the build.
> Additionally, it keeps track of which comments have been processed by storing 
> the comment ID in the CI report.
> If you have further questions, feel free to ping me directly.
> 
> @Dianfu I agree, we should include it somewhere in either the flinkbot 
> template or the CI report.
> 
> On 23/08/2019 03:35, Dian Fu wrote:
>> Thanks Chesnay for your great work! A very useful feature!
>> 
>> Just one minor suggestion: it would be better if we could add this command to 
>> the "Bot commands" section of the flinkbot template.
>> 
>> Regards,
>> Dian
>> 
>>> On Aug 23, 2019, at 2:06 AM, Ethan Li  wrote:
>>> 
>>> My question is specifically about the implementation of "@flinkbot run travis".
>>> 
 On Aug 22, 2019, at 1:06 PM, Ethan Li  wrote:
 
 Hi Chesnay,
 
 This is a really nice feature!
 
 Can I ask how this is implemented? Do you have related Jira tickets/PRs/docs 
 that I can take a look at? I'd like to introduce it to another project if 
 applicable. Thank you very much!
 
 Best,
 Ethan
 
> On Aug 22, 2019, at 8:34 AM, Biao Liu  wrote:
> 
> Thanks Chesnay a lot,
> 
> I love this feature!
> 
> Thanks,
> Biao /'bɪ.aʊ/
> 
> 
> 
> On Thu, 22 Aug 2019 at 20:55, Hequn Cheng  wrote:
> 
>> Cool, thanks Chesnay a lot for the improvement!
>> 
>> Best, Hequn
>> 
>> On Thu, Aug 22, 2019 at 5:02 PM Zhu Zhu  wrote:
>>> Thanks Chesnay for the CI improvement!
>>> It is very helpful.
>>> 
>>> Thanks,
>>> Zhu Zhu
>>> 
>>> zhijiang  wrote on Thu, Aug 22, 2019 at 4:18 PM:
>>> 
 It is really very convenient now. Valuable work, Chesnay!
 
 Best,
 Zhijiang
 --
 From: Till Rohrmann <trohrm...@apache.org>
 Send Time: Aug 22, 2019 (Thu.) 10:13
 To: dev <dev@flink.apache.org>
 Subject: Re: CiBot Update
 
 Thanks for the continuous work on the CiBot Chesnay!
 
 Cheers,
 Till
 
 On Thu, Aug 22, 2019 at 9:47 AM Jark Wu  wrote:
 
> Great work! Thanks Chesnay!
> 
> 
> 
> On Thu, 22 Aug 2019 at 15:42, Xintong Song  wrote:
>> The re-triggering travis feature is so convenient. Thanks Chesnay~!
>> 
>> Thank you~
>> 
>> Xintong Song
>> 
>> 
>> 
>> On Thu, Aug 22, 2019 at 9:26 AM Stephan Ewen  wrote:
>>> Nice, thanks!
>>> 
>>> On Thu, Aug 22, 2019 at 3:59 AM Zili Chen  wrote:
 Thanks for your announcement. Nice work!
 
 Best,
 tison.
 
 
 vino yang <yanghua1...@gmail.com> wrote on Thu, Aug 22, 2019 at 8:14 AM:
 
> +1 for "@flinkbot run travis", it is very convenient.
> 
> Chesnay Schepler <ches...@apache.org> wrote on Wed, Aug 21, 2019 at 9:12 PM:
>> Hi everyone,
>> 
>> this is an update on recent changes to the CI bot.
>> 
>> 
>> The bot now cancels builds if a new commit was added to a PR,
>> and cancels all builds if the PR was closed.
>> (This was implemented a while ago; I'm just mentioning it again
>> for discoverability)
>> 
>> 
>> Additionally, starting today you can now re-trigger a Travis run
>> by writing a comment "@flinkbot run travis"; this means you no longer
>> have to commit an empty commit or do other shenanigans to get
>> another build running.
>> Note that this will /not/ work if the PR was re-opened, until at
>> least 1 new build was triggered by a push.



Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-23 Thread Yun Gao
 Hi Piotr,

  Many thanks for the suggestions!

 Totally agree that we could first focus on the broadcast scenarios 
and expose the broadcastEmit method first, considering the semantics and 
performance. 

 For the keyed stream, I also agree that broadcasting keyed records to 
all the tasks may be confusing, given the semantics of the keyed partitioner. 
However, in the iteration case, supporting broadcast over a keyed partitioner 
is required, since users may create any subgraph for the iteration body, 
including operators with keys. I think a possible solution to this issue is 
to introduce another data type for 'broadcastEmit'. For example, an 
operator Operator<T> may broadcast-emit another type E instead of T, and 
records of type E will bypass the partitioner and the setting of the keyed 
context. This leads to the design that introduces a customized operator event 
(option 1 in the document). The cost of this method is that we need to 
introduce a new type of StreamElement and a new interface for it, but it 
should be suitable for both keyed and non-keyed partitioners. A rough sketch 
of this idea follows below.

Best,
Yun 
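
(To make the option-1 idea above concrete, a minimal sketch; the interface and
method names are hypothetical, not existing Flink classes.)

    // Hypothetical sketch of option 1: an operator emits T through the
    // configured (possibly keyed) partitioner, but can also broadcast-emit a
    // separate event type E that bypasses the partitioner and the keyed context.
    public interface BroadcastCapableOutput<T, E> {

        // Regular emission: routed by the configured partitioner.
        void collect(T record);

        // Broadcast emission: sent to every downstream subtask,
        // ignoring the partitioner.
        void broadcastEmit(E event);
    }
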



--
From: Piotr Nowojski 
Send Time: 2019 Aug. 23 (Fri.) 22:29
To: Zhu Zhu 
Cc: dev ; Yun Gao 
Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

Hi,

If the primary motivation is broadcasting (for the iterations) and we have no 
immediate need for multicast (cross join), I would prefer to first expose 
broadcast via the DataStream API and only later, once we finally need it, 
support multicast. As I wrote, multicast would be more challenging to 
implement, with more complicated runtime and API. And re-using multicast just 
to support broadcast doesn’t make much sense:

1. It’s a bit obfuscated. It’s easier to understand collectBroadcast(record) or 
broadcastEmit(record) compared to some multicast channel selector that just 
happens to return all of the channels.
2. There are performance benefits of explicitly calling 
`RecordWriter#broadcastEmit` (see the sketch below).
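
(For illustration, simplified stand-ins rather than Flink's actual interfaces;
the class names below are made up for the example.)

    // "Broadcast" expressed through a multicast-style selector: a per-record
    // array must be allocated and filled just to say "all channels".
    interface MulticastChannelSelector<T> {
        int[] selectChannels(T record, int numChannels);
    }

    class BroadcastViaMulticast<T> implements MulticastChannelSelector<T> {
        @Override
        public int[] selectChannels(T record, int numChannels) {
            int[] all = new int[numChannels];
            for (int i = 0; i < numChannels; i++) {
                all[i] = i;
            }
            return all;
        }
    }

    // A dedicated broadcast path needs no channel selection at all: the
    // serialized record can be copied to every channel in one pass.
    class SketchRecordWriter<T> {
        void broadcastEmit(T record) {
            // write the record's buffer to all channels directly
        }
    }
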


On a different note, what would be the semantics of such a broadcast emit on a 
KeyedStream? Would it be supported? Or would we limit support only to the 
non-keyed streams?

Piotrek

> On 23 Aug 2019, at 12:48, Zhu Zhu  wrote:
> 
> Thanks Piotr,
> 
> Users asked for this feature some time ago when they were migrating batch jobs to 
> Flink (Blink). 
> It's not very urgent, as they have adopted some workarounds to solve it (like 
> partitioning the data set across different job vertices),
> so it's fine not to make it a top priority.
> 
> Anyway, as a commonly known scenario, I think users can benefit from cross 
> join sooner or later.
> 
> Thanks,
> Zhu Zhu
> 
> Piotr Nowojski <pi...@ververica.com> wrote on Fri, Aug 23, 2019 at 6:19 PM:
> Hi,
> 
> Thanks for the answers :) Ok I understand the full picture now. +1 from my 
> side on solving this issue somehow. But before we start discussing how to 
> solve it, one last control question:
> 
> I guess this multicast is intended to be used in the Blink planner, right? 
> Assuming that we implement the multicast support now, when would it be used 
> by Blink? I would like to avoid a scenario where we implement an unused 
> feature and keep maintaining it for a long period of time.
> 
> Piotrek
> 
> PS: try to include motivating examples, including concrete ones, in the 
> proposals/design docs, for example in the very first paragraph. Especially if 
> it’s a commonly known feature like cross join :)
> 
> > On 23 Aug 2019, at 11:38, Yun Gao  wrote:
> > 
> > Hi Piotr,
> > 
> >Thanks a lot for sharing the thoughts! 
> > 
> >For the iteration, I agree that multicasting is not necessary. 
> > Exposing the broadcast interface on the Output of the operators in some way 
> > should also solve this issue, and I think it would be even more convenient 
> > to have the broadcast method for the iteration. 
> > 
> >Also thanks Zhu Zhu for the cross join case!
> >  Best, 
> >   Yun
> > 
> > 
> > 
> > --
> > From: Zhu Zhu <reed...@gmail.com>
> > Send Time: 2019 Aug. 23 (Fri.) 17:25
> > To: dev <dev@flink.apache.org>
> > Cc: Yun Gao <yungao...@aliyun.com>
> > Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern
> > 
> > Hi Piotr,
> > 
> > Yes, you are right, it's a distributed cross join requirement.
> > Broadcast join can help with cross join cases. But users cannot use it if 
> > the data set to join is too large to fit into one subtask.
> > 
> > Sorry for leaving some details out.
> > 
> > Thanks,
> > Zhu Zhu
> > Piotr Nowojski <pi...@ververica.com> wrote on Fri, Aug 23, 2019 at 4:57 PM:
> > Hi Yun and Zhu Zhu,
> > 
> > Thanks for the more detailed example Zhu Zhu.
> > 
> > As far as I understand, for the iterations example we do not need 
> > multicasting. Regarding the Join example, I don’t fully understand it. The 
> > exampl

[DISCUSS] Add ARM CI build to Flink (information-only)

2019-08-23 Thread Stephan Ewen
Hi all!

As part of the Flink on ARM effort, there is a pull request that triggers a
build on OpenLabs CI for each push and runs tests on ARM machines.

Currently that build is roughly equivalent to what the "core" and "tests"
profiles do on Travis.
The result will be posted to the PR comments, similar to the Flink Bot's
Travis build result.
The build currently passes :-) so Flink seems to be okay on ARM.

My suggestion would be to try and add this and gather some experience with
it.
The Travis build results should be our "ground truth" and the ARM CI
(OpenLabs CI) would be "informational only" at the beginning, helping
us understand when we break ARM support.

You can see this in the PR that adds the openlabs CI config:
https://github.com/apache/flink/pull/9416

Any objections?

Best,
Stephan


Re: [DISCUSS] Add ARM CI build to Flink (information-only)

2019-08-23 Thread Chesnay Schepler

I'm wondering what we are supposed to do if the build fails.
We aren't providing any guides on setting up an ARM dev environment, so 
reproducing it locally isn't possible.


On 23/08/2019 17:55, Stephan Ewen wrote:

Hi all!

As part of the Flink on ARM effort, there is a pull request that triggers a
build on OpenLabs CI for each push and runs tests on ARM machines.

Currently that build is roughly equivalent to what the "core" and "tests"
profiles do on Travis.
The result will be posted to the PR comments, similar to the Flink Bot's
Travis build result.
The build currently passes :-) so Flink seems to be okay on ARM.

My suggestion would be to try and add this and gather some experience with
it.
The Travis build results should be our "ground truth" and the ARM CI
(OpenLabs CI) would be "informational only" at the beginning, helping
us understand when we break ARM support.

You can see this in the PR that adds the openlabs CI config:
https://github.com/apache/flink/pull/9416

Any objections?

Best,
Stephan





[VOTE] Release flink-shaded 8.0, release candidate #1

2019-08-23 Thread Chesnay Schepler

Hi everyone,
Please review and vote on the release candidate #1 for the version 8.0, 
as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org 
[2], which is signed with the key with fingerprint 11d464BA [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-8.0-rc1" [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.


Thanks,
Chesnay

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345488

[2] https://dist.apache.org/repos/dist/dev/flink/flink-shaded-8.0-rc1/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1237
[5] https://github.com/apache/flink-shaded/tree/release-8.0-rc1
[6] https://github.com/apache/flink-web/pull/255



[jira] [Created] (FLINK-13841) Extend Hive version support to all 1.2 and 2.3 versions

2019-08-23 Thread Xuefu Zhang (Jira)
Xuefu Zhang created FLINK-13841:
---

 Summary: Extend Hive version support to all 1.2 and 2.3 versions
 Key: FLINK-13841
 URL: https://issues.apache.org/jira/browse/FLINK-13841
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Xuefu Zhang
 Fix For: 1.10.0


This is to support all 1.2 (1.2.0, 1.2.1, 1.2.2) and 2.3 (2.3.0-5) versions.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Use Java's Duration instead of Flink's Time

2019-08-23 Thread vino yang
+1 to replace the Time class provided by Flink with Java's Duration:


   - Java's Duration has a better representation than Flink's Time class;
   - As a built-in Java class, Duration has a clear advantage over
   Flink's Time class when interacting with other Java APIs and third-party
   libraries.


But I have reservations about replacing the Duration and FiniteDuration
classes in Scala with the Duration class in Java. Java and Scala have
different type systems. Currently, Duration (Scala) and FiniteDuration
(Scala) work well. In addition, this work would bring additional complexity and
cost compared to the gains obtained.

Best,
Vino

Zili Chen  wrote on Fri, Aug 23, 2019 at 11:14 PM:

> Hi Stephan,
>
> I like the idea of unifying the usage of the time/duration API. We actually
> use at least five different classes for this purpose (see below).
>
> One thing I'd like to pick up is that duration configuration
> in Flink is mostly written in patterns like "60 s", which
> scala.concurrent.duration.Duration can parse. AFAIK, Duration
> in Java 8 doesn't support this pattern. However, we can solve
> it by introducing a DurationUtils.
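
(A possible shape for such a DurationUtils, sketched under the assumption that
only a handful of unit suffixes need support; the class and its API are
illustrative, not existing Flink code.)

    import java.time.Duration;
    import java.util.Locale;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Sketch: parse Flink-style duration strings like "60 s" or "100 ms"
    // into java.time.Duration. The accepted unit suffixes are illustrative.
    public final class DurationUtils {

        private static final Pattern FORMAT = Pattern.compile("(\\d+)\\s*(ms|s|min|h|d)");

        public static Duration parse(String text) {
            Matcher m = FORMAT.matcher(text.trim().toLowerCase(Locale.ROOT));
            if (!m.matches()) {
                throw new IllegalArgumentException("Cannot parse duration: " + text);
            }
            long value = Long.parseLong(m.group(1));
            switch (m.group(2)) {
                case "ms":  return Duration.ofMillis(value);
                case "s":   return Duration.ofSeconds(value);
                case "min": return Duration.ofMinutes(value);
                case "h":   return Duration.ofHours(value);
                default:    return Duration.ofDays(value); // "d"
            }
        }
    }

    // Usage: DurationUtils.parse("60 s") yields a one-minute Duration.
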
>
> Also, to clarify, we now have (correct me if I missed any):
>
> java.time.Duration
> scala.concurrent.duration.Duration
> scala.concurrent.duration.FiniteDuration
> org.apache.flink.api.common.time.Time
> org.apache.flink.streaming.api.windowing.time.Time
>
> in use. If we prefer java.time.Duration, it is worth considering
> whether we unify all of them into Java's Duration, i.e., Java's
> Duration becomes the first-class time/duration API, while the others
> should be converted into or out of it.
>
> Best,
> tison.
>
>
> Stephan Ewen  wrote on Fri, Aug 23, 2019 at 10:45 PM:
>
> > Hi all!
> >
> > Many parts of the code use Flink's "Time" class. The Time really is a
> > "time interval" or a "Duration".
> >
> > Since Java 8, there is a Java class "Duration" that is nice and flexible
> > to use.
> > I would suggest we start using Java Duration instead and drop Time as
> > much as possible in the runtime from now on.
> >
> > Maybe even drop that class from the API in Flink 2.0.
> >
> > Best,
> > Stephan
> >
>


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Gavin Lee
Got it.
Thanks Till & Zili.
+1 for having the release notes cover such issues.


On Fri, Aug 23, 2019 at 11:01 PM Oytun Tez  wrote:

> Hi all,
>
> We also had to roll back our upgrade effort for 2 reasons:
>
> - Official Docker container is not ready yet
> - This artefact is not published with
> scala: org.apache.flink:flink-queryable-state-client-java_2.11:jar:1.9.0
>
>
>
>
>
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Fri, Aug 23, 2019 at 10:48 AM Zili Chen  wrote:
>
>> Hi Till,
>>
>> Did we mention this in the release notes (or maybe in a previous release
>> note, where we did the exclusion)?
>>
>> Best,
>> tison.
>>
>>
>> Till Rohrmann  wrote on Fri, Aug 23, 2019 at 10:28 PM:
>>
>>> Hi Gavin,
>>>
>>> if I'm not mistaken, the community has excluded the Scala FlinkShell
>>> for Scala 2.12 for the last couple of releases. The problem seems to be that
>>> some of the tests failed. See here [1] for more information.
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-10911
>>>
>>> Cheers,
>>> Till
>>>
>>> On Fri, Aug 23, 2019 at 11:44 AM Gavin Lee  wrote:
>>>
 I used the package from the Apache official site, via mirror [1]; the
 difference is that I used the Scala 2.12 version.
 I also tried to build from source for both Scala 2.11 and 2.12. When
 building for 2.12, FlinkShell.class is present in the flink-dist jar, but
 after running mvn clean package -Dscala-2.12 the class was missing from the
 flink-dist_2.12-1.9 jar file.
 Seems broken for Scala 2.12, right?

 [1]

 http://mirror.bit.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.12.tgz

 On Fri, Aug 23, 2019 at 4:37 PM Zili Chen  wrote:

 > I downloaded the 1.9.0 dist from here [1] and didn't see the issue. Where
 > did you download it? Could you try to download the dist from [1] and see
 > whether the problem persists?
 >
 > Best,
 > tison.
 >
 > [1]
 >
 http://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.11.tgz
 >
 >
 > Gavin Lee  wrote on Fri, Aug 23, 2019 at 4:34 PM:
 >
 >> Thanks for your reply @Zili.
 >> I'm afraid it's not the same issue.
 >> I found that FlinkShell.class was not included in the flink-dist jar
 >> file in the 1.9.0 version.
 >> The class file can't be found anywhere inside the jar, nor in the opt or
 >> lib directory under the root folder of the Flink distribution.
 >>
 >>
 >> On Fri, Aug 23, 2019 at 4:10 PM Zili Chen  wrote:
 >>
 >>> Hi Gavin,
 >>>
 >>> I also found a problem in the shell: if the directory contains whitespace,
 >>> then the final command to run is incorrect. Could you check the
 >>> final command to be executed?
 >>>
 >>> FYI, here is the ticket[1].
 >>>
 >>> Best,
 >>> tison.
 >>>
 >>> [1] https://issues.apache.org/jira/browse/FLINK-13827
 >>>
 >>>
 >>> Gavin Lee  wrote on Fri, Aug 23, 2019 at 3:36 PM:
 >>>
  Why does bin/start-scala-shell.sh local return the following error?
 
  bin/start-scala-shell.sh local
 
  Error: Could not find or load main class
  org.apache.flink.api.scala.FlinkShell
  For Flink 1.8.1 and previous versions, no such issues.
 
  On Fri, Aug 23, 2019 at 2:05 PM qi luo  wrote:
 
 > Congratulations and thanks for the hard work!
 >
 > Qi
 >
 > On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai <
 tzuli...@apache.org>
 > wrote:
 >
 > The Apache Flink community is very happy to announce the release
 of
 > Apache Flink 1.9.0, which is the latest major release.
 >
 > Apache Flink® is an open-source stream processing framework for
 > distributed, high-performing, always-available, and accurate data
 streaming
 > applications.
 >
 > The release is available for download at:
 > https://flink.apache.org/downloads.html
 >
 > Please check out the release blog post for an overview of the
 > improvements for this new major release:
 > https://flink.apache.org/news/2019/08/22/release-1.9.0.html
 >
 > The full release notes are available in Jira:
 >
 >
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
 >
 > We would like to thank all contributors of the Apache Flink
 community
 > who made this release possible!
 >
 > Cheers,
 > Gordon
 >
 >
 >
 
  --
  Gavin
 
 >>>
 >>
 >> --
 >> Gavin
 >>
 >

 --
 Gavin

>>>

-- 
Gavin


Re: [DISCUSS] Use Java's Duration instead of Flink's Time

2019-08-23 Thread Zili Chen
Hi vino,

I agree that it introduces extra complexity to replace Duration (Scala)
with Duration (Java) *in Scala code*. We could separate the usage for each
language and use a bridge when necessary.

As a matter of fact, the Scala concurrent APIs (including Duration) are used
more than necessary, at least in flink-runtime. Also, we are even trying to
make flink-runtime Scala-free.

Best,
tison.


vino yang  wrote on Sat, Aug 24, 2019 at 10:05 AM:

> +1 to replace the Time class provided by Flink with Java's Duration:
>
>
>    - Java's Duration has a better representation than Flink's Time class;
>    - As a built-in Java class, Duration has a clear advantage over
>    Flink's Time class when interacting with other Java APIs and third-party
>    libraries.
>
>
> But I have reservations about replacing the Duration and FiniteDuration
> classes in Scala with the Duration class in Java. Java and Scala have
> different type systems. Currently, Duration (Scala) and FiniteDuration
> (Scala) work well. In addition, this work would bring additional complexity
> and cost compared to the gains obtained.
>
> Best,
> Vino
>
> Zili Chen  wrote on Fri, Aug 23, 2019 at 11:14 PM:
>
> > Hi Stephan,
> >
> > I like the idea of unifying the usage of the time/duration API. We actually
> > use at least five different classes for this purpose (see below).
> >
> > One thing I'd like to pick up is that duration configuration
> > in Flink is mostly written in patterns like "60 s", which
> > scala.concurrent.duration.Duration can parse. AFAIK, Duration
> > in Java 8 doesn't support this pattern. However, we can solve
> > it by introducing a DurationUtils.
> >
> > Also, to clarify, we now have (correct me if I missed any):
> >
> > java.time.Duration
> > scala.concurrent.duration.Duration
> > scala.concurrent.duration.FiniteDuration
> > org.apache.flink.api.common.time.Time
> > org.apache.flink.streaming.api.windowing.time.Time
> >
> > in use. If we prefer java.time.Duration, it is worth considering
> > whether we unify all of them into Java's Duration, i.e., Java's
> > Duration becomes the first-class time/duration API, while the others
> > should be converted into or out of it.
> >
> > Best,
> > tison.
> >
> >
> > Stephan Ewen  wrote on Fri, Aug 23, 2019 at 10:45 PM:
> >
> > > Hi all!
> > >
> > > Many parts of the code use Flink's "Time" class. The Time really is a
> > > "time interval" or a "Duration".
> > >
> > > Since Java 8, there is a Java class "Duration" that is nice and flexible
> > > to use.
> > > I would suggest we start using Java Duration instead and drop Time as
> > > much as possible in the runtime from now on.
> > >
> > > Maybe even drop that class from the API in Flink 2.0.
> > >
> > > Best,
> > > Stephan
> > >
> >
>


Re: [DISCUSS] Use Java's Duration instead of Flink's Time

2019-08-23 Thread SHI Xiaogang
+1 to replace Flink's time with Java's Duration.

Besides, I also suggest using Java's Instant for "point-in-time".
It takes care of time units when we calculate the Duration between different
instants (see the small sketch below).

Regards,
Xiaogang
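
(A small sketch of that with the Java 8 time API:)

    import java.time.Duration;
    import java.time.Instant;

    // Instant marks points in time; Duration.between computes the distance
    // without any manual unit bookkeeping.
    public class InstantSketch {
        public static void main(String[] args) throws InterruptedException {
            Instant start = Instant.now();
            Thread.sleep(50);
            Instant end = Instant.now();

            Duration elapsed = Duration.between(start, end);
            System.out.println(elapsed.toMillis() + " ms elapsed");
        }
    }
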

Zili Chen  wrote on Sat, Aug 24, 2019 at 10:45 AM:

> Hi vino,
>
> I agree that it introduces extra complexity to replace Duration (Scala)
> with Duration (Java) *in Scala code*. We could separate the usage for each
> language and use a bridge when necessary.
>
> As a matter of fact, the Scala concurrent APIs (including Duration) are used
> more than necessary, at least in flink-runtime. Also, we are even trying to
> make flink-runtime Scala-free.
>
> Best,
> tison.
>
>
> vino yang  wrote on Sat, Aug 24, 2019 at 10:05 AM:
>
> > +1 to replace the Time class provided by Flink with Java's Duration:
> >
> >
> >    - Java's Duration has a better representation than Flink's Time class;
> >    - As a built-in Java class, Duration has a clear advantage over
> >    Flink's Time class when interacting with other Java APIs and
> >    third-party libraries.
> >
> >
> > But I have reservations about replacing the Duration and FiniteDuration
> > classes in Scala with the Duration class in Java. Java and Scala have
> > different type systems. Currently, Duration (Scala) and FiniteDuration
> > (Scala) work well. In addition, this work would bring additional complexity
> > and cost compared to the gains obtained.
> >
> > Best,
> > Vino
> >
> > Zili Chen  wrote on Fri, Aug 23, 2019 at 11:14 PM:
> >
> > > Hi Stephan,
> > >
> > > I like the idea of unifying the usage of the time/duration API. We actually
> > > use at least five different classes for this purpose (see below).
> > >
> > > One thing I'd like to pick up is that duration configuration
> > > in Flink is mostly written in patterns like "60 s", which
> > > scala.concurrent.duration.Duration can parse. AFAIK, Duration
> > > in Java 8 doesn't support this pattern. However, we can solve
> > > it by introducing a DurationUtils.
> > >
> > > Also, to clarify, we now have (correct me if I missed any):
> > >
> > > java.time.Duration
> > > scala.concurrent.duration.Duration
> > > scala.concurrent.duration.FiniteDuration
> > > org.apache.flink.api.common.time.Time
> > > org.apache.flink.streaming.api.windowing.time.Time
> > >
> > > in use. If we prefer java.time.Duration, it is worth considering
> > > whether we unify all of them into Java's Duration, i.e., Java's
> > > Duration becomes the first-class time/duration API, while the others
> > > should be converted into or out of it.
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Stephan Ewen  wrote on Fri, Aug 23, 2019 at 10:45 PM:
> > >
> > > > Hi all!
> > > >
> > > > Many parts of the code use Flink's "Time" class. The Time really is a
> > > > "time interval" or a "Duration".
> > > >
> > > > Since Java 8, there is a Java class "Duration" that is nice and flexible
> > > > to use.
> > > > I would suggest we start using Java Duration instead and drop Time as
> > > > much as possible in the runtime from now on.
> > > >
> > > > Maybe even drop that class from the API in Flink 2.0.
> > > >
> > > > Best,
> > > > Stephan
> > > >
> > >
> >
>