Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-26 Thread Kurt Young
From SQL's perspective, a distributed cross join is a valid feature but not a
very urgent one. Actually, this discussion reminds me of another useful
feature (sorry for the distraction):

when doing a broadcast in batch shuffle mode, we can make each producer write
only one copy of the output data, rather than one copy per consumer. Broadcast
join is much more useful, and this is a very important optimization. Not sure
if we have already considered this.
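Kurt's write-once broadcast can be illustrated with a toy cost model (a hypothetical class, not Flink's actual runtime): the naive scheme duplicates the producer's output once per consumer subpartition, while the optimized scheme writes a single shared copy that every consumer reads.

```java
// Toy model of a broadcast shuffle write (not Flink's real runtime classes).
// Naive: the producer duplicates its output once per consumer subpartition.
// Optimized: the producer writes one copy; every consumer reads that copy.
public class BroadcastWriteSketch {

    // Bytes written when each of the consumers gets its own copy of the data.
    static long naiveBytesWritten(long recordBytes, int numConsumers) {
        return recordBytes * numConsumers;
    }

    // Bytes written when a single shared copy is produced for all consumers.
    static long sharedCopyBytesWritten(long recordBytes, int numConsumers) {
        return recordBytes; // independent of the consumer count
    }

    public static void main(String[] args) {
        long data = 1_000_000; // 1 MB of broadcast data
        System.out.println("naive:  " + naiveBytesWritten(data, 100));
        System.out.println("shared: " + sharedCopyBytesWritten(data, 100));
    }
}
```

The write volume of the naive scheme grows linearly with the parallelism of the consumer side, which is why this matters for large broadcast joins.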

Best,
Kurt


On Mon, Aug 26, 2019 at 12:16 PM Guowei Ma  wrote:

> Thanks Yun for bringing up this discussion, and many thanks for all the deep
> thoughts!
>
> For now, I think this discussion covers two scenarios: one is iteration
> library support and the other is SQL join support. I think both scenarios
> are useful, but they seem to have different best-suited solutions. To make
> the discussion clearer, I would suggest splitting it into two threads.
>
> And I agree with Piotr that it is very tricky for a keyed stream to receive
> a "broadcast element". So we may add some new interfaces which could
> broadcast or process some special "broadcast event". That way, the
> "broadcast event" will not be sent through the normal process.
>
> Best,
> Guowei
>
>
> SHI Xiaogang  于2019年8月26日周一 上午9:27写道:
>
> > Hi all,
> >
> > I also think that multicasting is a necessity in Flink, but more details
> > need to be considered.
> >
> > Currently, the network is tightly coupled with state in Flink to achieve
> > automatic scaling. We can only access keyed state in keyed streams and
> > operator state in all streams.
> > In the concrete example of theta-joins implemented with multicasting, the
> > following questions arise:
> >
> >- In which type of state will the data be stored? Do we need another
> >type of state that is coupled with multicast streams?
> >- How do we ensure consistency between the network and state when jobs
> >scale out or scale in?
> >
> > Regards,
> > Xiaogang
> >
> > Xingcan Cui  于2019年8月25日周日 上午10:03写道:
> >
> > > Hi all,
> > >
> > > Sorry for joining this thread late. Basically, I think enabling the
> > > multicast pattern could be the right direction, but more detailed
> > > implementation policies need to be discussed.
> > >
> > > Two years ago, I filed an issue [1] about the multicast API. However,
> > > for some reasons, it was laid aside. Later, when I tried to cherry-pick
> > > the change for experimental use, I found the return type of the
> > > `selectChannels()` method had changed from `int[]` to `int`, which means
> > > the old implementation no longer works.
> > >
> > > On my side, multicast has always been used for theta-joins. As far as I
> > > know, it's an essential requirement for some sophisticated join
> > > algorithms. Until now, Flink's non-equi joins can still only be executed
> > > single-threaded. If we'd like to make some improvements here, we should
> > > first take some measures to support the multicast pattern.
> > >
> > > Best,
> > > Xingcan
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-6936
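The `selectChannels()` change Xingcan describes can be illustrated with simplified stand-ins for Flink's ChannelSelector (the real interfaces and signatures differ): an `int[]` return type lets one record be routed to several channels, which multicast needs, while an `int` return forces exactly one target.

```java
// Simplified stand-ins for Flink's ChannelSelector; not the actual interfaces.
interface MulticastSelector<T> {
    int[] selectChannels(T record, int numChannels); // old style: many targets
}

interface UnicastSelector<T> {
    int selectChannels(T record, int numChannels);   // current: one target only
}

public class SelectorSketch {
    // A multicast selector that sends every record to all channels,
    // as the build side of a theta-join might require.
    static int[] broadcastAll(int numChannels) {
        int[] channels = new int[numChannels];
        for (int i = 0; i < numChannels; i++) {
            channels[i] = i;
        }
        return channels;
    }

    public static void main(String[] args) {
        MulticastSelector<String> all = (record, n) -> broadcastAll(n);
        System.out.println(all.selectChannels("r", 3).length); // number of targets
    }
}
```

With the current single-`int` contract, the only way to reach several downstream channels is to emit the record several times upstream, which is what the multicast proposals try to avoid.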
> > >
> > > > On Aug 24, 2019, at 5:54 AM, Zhu Zhu  wrote:
> > > >
> > > > Hi Piotr,
> > > >
> > > > Thanks for the explanation.
> > > > Agreed that broadcastEmit(record) is a better choice for broadcasting
> > > > in the iterations.
> > > > As broadcasting for the iterations is the first motivation, let's
> > > > support it first.
> > > >
> > > > Thanks,
> > > > Zhu Zhu
> > > >
> > > > Yun Gao  于2019年8月23日周五 下午11:56写道:
> > > >
> > > >> Hi Piotr,
> > > >>
> > > >>  Many thanks for the suggestions!
> > > >>
> > > >> I totally agree that we could first focus on the broadcast
> > > >> scenarios and expose the broadcastEmit method first, considering the
> > > >> semantics and performance.
> > > >>
> > > >> For keyed streams, I also agree that broadcasting keyed
> > > >> records to all the tasks may be confusing, considering the semantics
> > > >> of the keyed partitioner. However, in the iteration case, supporting
> > > >> broadcast over a keyed partitioner should be required, since users may
> > > >> create any subgraph for the iteration body, including operators with
> > > >> keys. I think a possible solution to this issue is to introduce another
> > > >> data type for 'broadcastEmit'. For example, an operator with output
> > > >> type T may broadcast emit another type E instead of T, and the
> > > >> transmitted E will bypass the partitioner and the setting of the keyed
> > > >> context. This leads to the design of introducing a customized operator
> > > >> event (option 1 in the document). The cost of this method is that we
> > > >> need to introduce a new type of StreamElement and a new interface for
> > > >> it, but it should be suitable for both keyed and non-keyed partitioners.
> > > >>
> > > >> Best,
> > > >> Yun
> > > >>
> > > >>
> > > >>
> > > >> --
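Yun's idea of a separate event type that bypasses the keyed partitioner could be sketched roughly as below. The `Output` interface, its method names, and the recording implementation are illustrative assumptions, not Flink's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the proposal above: an operator with output type T
// can also broadcast a separate event type E that bypasses the keyed
// partitioner. These interfaces are hypothetical, not Flink's actual API.
public class BroadcastEventSketch {

    interface Output<T, E> {
        void collect(T record);      // goes through the (keyed) partitioner
        void broadcastEmit(E event); // bypasses partitioning, sent to all tasks
    }

    // A trivial in-memory output that records which path each element took.
    static class RecordingOutput implements Output<String, String> {
        final List<String> partitioned = new ArrayList<>();
        final List<String> broadcast = new ArrayList<>();

        @Override
        public void collect(String record) {
            partitioned.add(record);
        }

        @Override
        public void broadcastEmit(String event) {
            broadcast.add(event);
        }
    }

    public static void main(String[] args) {
        RecordingOutput out = new RecordingOutput();
        out.collect("keyed-record");
        out.broadcastEmit("end-of-superstep"); // e.g. an iteration control event
        System.out.println(out.partitioned.size() + " / " + out.broadcast.size());
    }
}
```

Separating the event type keeps the keyed-stream semantics intact: only values of type E ever skip the partitioner, so regular records can never land on the wrong key group.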

[jira] [Created] (FLINK-13850) Refactor part file configuration into a single method

2019-08-26 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-13850:
--

 Summary: Refactor part file configuration into a single method
 Key: FLINK-13850
 URL: https://issues.apache.org/jira/browse/FLINK-13850
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Reporter: Gyula Fora
 Fix For: 1.10.0


Currently there are only two methods on both format builders,
withPartFilePrefix and withPartFileSuffix, for configuring the part files, but
the number of settings is likely to grow in the future.
 * More settings: different directories for pending / in-progress files, etc.

I suggest we remove these two methods and replace them with a single
withPartFileConfig(..) method that takes an extensible config class.

This should be fixed before 1.10 so that we don't release the other methods.
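One possible shape for such an extensible config, as a hypothetical sketch (the class and method names below are illustrative, not the API that was eventually merged):

```java
// Hypothetical sketch of the proposed extensible part-file configuration;
// names are illustrative, not Flink's merged StreamingFileSink API.
public class PartFileConfigSketch {

    // Holds all part-file settings; new fields can be added without new
    // builder methods on the format builders.
    static class PartFileConfig {
        private final String prefix;
        private final String suffix;

        PartFileConfig(String prefix, String suffix) {
            this.prefix = prefix;
            this.suffix = suffix;
        }

        String partFileName(long partCounter) {
            return prefix + "-" + partCounter + suffix;
        }
    }

    // A single entry point replaces withPartFilePrefix / withPartFileSuffix.
    static class FormatBuilder {
        private PartFileConfig partFileConfig = new PartFileConfig("part", "");

        FormatBuilder withPartFileConfig(PartFileConfig config) {
            this.partFileConfig = config;
            return this;
        }

        String nextFileName(long counter) {
            return partFileConfig.partFileName(counter);
        }
    }

    public static void main(String[] args) {
        FormatBuilder builder = new FormatBuilder()
            .withPartFileConfig(new PartFileConfig("data", ".parquet"));
        System.out.println(builder.nextFileName(0));
    }
}
```

The design choice is the usual one for growing configuration surfaces: a value object keeps the builder API stable while the set of settings evolves.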



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-26 Thread Till Rohrmann
The missing support for the Scala shell with Scala 2.12 was documented in
the 1.7 release notes [1].

@Oytun, the docker image should be updated in a bit. Sorry for the
inconvenience. Thanks for the pointer that
flink-queryable-state-client-java_2.11 hasn't been published. We'll upload
it in a bit.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/release-notes/flink-1.7.html#scala-shell-does-not-work-with-scala-212

Cheers,
Till

On Sat, Aug 24, 2019 at 12:14 PM chaojianok  wrote:

> Congratulations and thanks!
> At 2019-08-22 20:03:26, "Tzu-Li (Gordon) Tai"  wrote:
> >The Apache Flink community is very happy to announce the release of Apache
> >Flink 1.9.0, which is the latest major release.
> >
> >Apache Flink® is an open-source stream processing framework for
> >distributed, high-performing, always-available, and accurate data
> streaming
> >applications.
> >
> >The release is available for download at:
> >https://flink.apache.org/downloads.html
> >
> >Please check out the release blog post for an overview of the improvements
> >for this new major release:
> >https://flink.apache.org/news/2019/08/22/release-1.9.0.html
> >
> >The full release notes are available in Jira:
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
> >
> >We would like to thank all contributors of the Apache Flink community who
> >made this release possible!
> >
> >Cheers,
> >Gordon
>


[jira] [Created] (FLINK-13851) Time unit of garbage collection in Flink Web UI is not displayed

2019-08-26 Thread Jeff Zhang (Jira)
Jeff Zhang created FLINK-13851:
--

 Summary: Time unit of garbage collection in Flink Web UI is not 
displayed
 Key: FLINK-13851
 URL: https://issues.apache.org/jira/browse/FLINK-13851
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.9.0
Reporter: Jeff Zhang
 Attachments: image-2019-08-26-16-03-45-809.png

!image-2019-08-26-16-03-45-809.png!





Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-26 Thread Zili Chen
Hi Oytun,

I think the intent is to publish flink-queryable-state-client-java
without a Scala suffix, since it is Scala-free. An artifact without
a Scala suffix has been published [2].
See also [1].

Best,
tison.

[1] https://issues.apache.org/jira/browse/FLINK-12602
[2]
https://mvnrepository.com/artifact/org.apache.flink/flink-queryable-state-client-java/1.9.0





[jira] [Created] (FLINK-13852) Support storing in-progress/pending files in different directories (StreamingFileSink)

2019-08-26 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-13852:
--

 Summary: Support storing in-progress/pending files in different 
directories (StreamingFileSink)
 Key: FLINK-13852
 URL: https://issues.apache.org/jira/browse/FLINK-13852
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / FileSystem
Reporter: Gyula Fora


Currently in-progress and pending files are stored in the same directory as the
final output file. This can be problematic depending on the usage of the final
output files. One example would be loading the data into Hive, where we can
only load all files in a certain directory.

I suggest we allow specifying a pending/in-progress base path under which we
create the same bucketing structure as for the final files, to store only the
non-final files.

To support this we need to extend the RecoverableWriter interface with a new
open method, for example:

RecoverableFsDataOutputStream open(Path path, Path tmpPath) throws IOException;
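A sketch of how the mirrored bucketing structure under a separate base path could be computed (an illustrative helper only; the real RecoverableWriter contract has more methods and filesystem-specific types):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch of the proposal: in-progress files live under a
// separate base path that mirrors the final bucket structure. Not the real
// RecoverableWriter contract.
public class TmpPathSketch {

    // Mirror a file's bucket-relative location under the temporary base path.
    static Path inProgressPath(Path tmpBase, Path finalBase, Path finalFile) {
        Path relative = finalBase.relativize(finalFile);
        return tmpBase.resolve(relative);
    }

    public static void main(String[] args) {
        Path finalBase = Paths.get("/out");
        Path tmpBase = Paths.get("/tmp-out");
        Path file = Paths.get("/out/2019-08-26/part-0-0");
        // The in-progress copy sits under /tmp-out with the same bucket path.
        System.out.println(inProgressPath(tmpBase, finalBase, file));
    }
}
```

Keeping the same relative structure makes the final commit a simple rename from the temporary tree into the final tree, bucket by bucket.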





Re: Watermarks not propagated to WebUI?

2019-08-26 Thread Robert Metzger
Jan, will you be able to test this issue on the now-released Flink 1.9 with
the new UI?

What parallelism is needed to reproduce the issue?


On Thu, Aug 15, 2019 at 1:59 PM Chesnay Schepler  wrote:

> I remember an issue regarding the watermark fetch request from the WebUI
> exceeding some HTTP size limit, since it tries to fetch all watermarks
> at once, and the format of this request isn't exactly efficient.
>
> Querying metrics for individual operators still works since the request
> is small enough.
>
> Not sure whether we ever fixed that.
>
> On 15/08/2019 12:01, Jan Lukavský wrote:
> > Hi,
> >
> > Thomas, thanks for confirming this. I have noticed that in 1.9 the
> > WebUI has been reworked a lot; does anyone know if this is still an
> > issue? I currently cannot easily try 1.9, so I cannot confirm or
> > disprove it.
> >
> > Jan
> >
> > On 8/14/19 6:25 PM, Thomas Weise wrote:
> >> I have also noticed this issue (Flink 1.5, Flink 1.8), and it appears
> >> with
> >> higher parallelism.
> >>
> >> This can be confusing to the user when watermarks actually work and
> >> can be
> >> observed using the metrics.
> >>
> >> On Wed, Aug 14, 2019 at 7:36 AM Jan Lukavský  wrote:
> >>
> >>> Hi,
> >>>
> >>> is it possible that watermarks are sometimes not propagated to the
> >>> WebUI, although they are internally advancing as normal? In the WebUI,
> >>> every operator shows "No Watermark", but outputs seem to be propagated
> >>> to the sink (and there are watermark-sensitive operations involved -
> >>> e.g. reductions on fixed windows without early emitting). More
> >>> strangely, this happens when I increase parallelism above some
> >>> threshold. If I use a parallelism of N, watermarks are shown; when I
> >>> increase it above some number (which seems not to be exactly
> >>> deterministic), the watermarks seem to disappear.
> >>>
> >>> I'm using Flink 1.8.1.
> >>>
> >>> Did anyone experience something like this before?
> >>>
> >>> Jan
> >>>
> >>>
> >
>
>


[jira] [Created] (FLINK-13853) Running HA (file, async) end-to-end test failed on Travis

2019-08-26 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-13853:
-

 Summary: Running HA (file, async) end-to-end test failed on Travis
 Key: FLINK-13853
 URL: https://issues.apache.org/jira/browse/FLINK-13853
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Till Rohrmann
 Fix For: 1.10.0


{{Running HA (file, async) end-to-end test}} failed on Travis:

https://api.travis-ci.org/v3/job/576002743/log.txt
https://api.travis-ci.org/v3/job/576002736/log.txt
https://api.travis-ci.org/v3/job/576002730/log.txt
https://api.travis-ci.org/v3/job/576002724/log.txt





Re: Watermarks not propagated to WebUI?

2019-08-26 Thread Jan Lukavský

Hi Robert,

I'd very much love to, but because I run my pipeline with Beam, I'm
afraid I will have to wait a little longer, until Beam has a runner for
1.9 [1]. I'm pretty sure that the watermarks disappeared with an overall
parallelism (over all operators) somewhere above 2000. There were quite a
lot of operators (shuffling), so the individual parallelism of each
operator was about 200. The pipeline was spread over 50 taskmanagers
(each having 4 slots).


Jan

[1] https://github.com/apache/beam/pull/9296/




Re: [DISCUSS] Add ARM CI build to Flink (information-only)

2019-08-26 Thread Xiyuan Wang
Sorry, maybe my words were misleading.

We are just starting to add ARM support, so the CI is non-voting at this
moment to avoid blocking normal Flink development.

But once the ARM CI works well and is stable enough, we should mark it as
voting. It means that in the future, if the ARM test fails in a PR, the
PR cannot be merged. The test log may tell developers what the error is.
If a developer needs to debug the details on an ARM VM, OpenLab can
provide it.

Adding ARM CI can make sure Flink supports ARM natively.

I left a workflow in the PR; I'd like to repeat it here:

   1. Add the basic build script to ensure the CI system and build job
   work as expected. The job should be marked as non-voting first, meaning a
   CI test failure won't block a Flink PR from being merged.
   2. Add the test script to run unit/integration tests. At this step the
   --fn parameter will be added to mvn test. It will run the full test suite
   in Flink, so that we can find which tests fail on ARM.
   3. Fix the test failures one by one.
   4. Once all the tests pass, remove the --fn parameter and keep watching
   the CI's status for some days. If bugs arise then, fix them as we
   usually do for travis-ci.
   5. Once the CI is stable enough, remove the non-voting tag, so that the
   ARM CI will be the same as travis-ci: one of the gates for Flink PRs.
   6. Finally, the Flink community can announce and release a Flink ARM version.


Chesnay Schepler  于2019年8月26日周一 下午2:25写道:

> I'm sorry, but if these issues are only fixed later anyway I see no
> reason to run these tests on each PR. We're just adding noise to each PR
> that everyone will just ignore.
>
> I'm curious as to the benefit of having this directly in Flink; why
> aren't the ARM builds run outside of the Flink project, and fixes for it
> provided?
>
> It seems to me like nothing about these arm builds is actually handled
> by the Flink project.
>
> On 26/08/2019 03:43, Xiyuan Wang wrote:
> > Thanks Stephan for bringing up this topic.
> >
> > The package build jobs work well now. I have a simple online demo which is
> > built and runs on an ARM VM. Feel free to have a try [1].
> >
> > As the first step for ARM support, maybe it's good to add them now.
> >
> > As for the next step, the test part is still broken. It relates to some
> > points we found:
> >
> > 1. Some unit tests fail [2] due to Java coding issues. These kinds of
> > failures can be fixed easily.
> > 2. Some tests fail because they depend on third-party libraries [3]. This
> > includes frocksdb, MapR Client and Netty, which don't have ARM releases.
> >  a. Frocksdb: I'm testing it locally now with `make check_some` and `make
> > jtest`, similar to its travis job. There are 3 tests failed by `make
> > check_some`. Please see the ticket for more details. Once the tests pass,
> > frocksdb can release an ARM package.
> >  b. MapR Client: this belongs to the MapR company. At this moment, maybe
> > we should skip MapR support for Flink on ARM.
> >  c. Netty: actually, Netty runs well on our ARM machine. We will ask the
> > Netty community to release ARM support. If they do not want to, OpenLab
> > will maintain a Maven repository for some common libraries on ARM.
> >
> >
> > For Chesnay's concern:
> >
> > Firstly, the OpenLab team will keep maintaining and fixing the ARM CI.
> > This means that once a build or test fails, we'll fix it at once.
> > Secondly, OpenLab can provide ARM VMs to everyone for reproducing and
> > testing. You just need to create a Test Request issue in OpenLab [4]. Then
> > we'll create ARM VMs for you, and you can log in and do what you want.
> >
> > Does it make sense?
> >
> > [1]: http://114.115.168.52:8081/#/overview
> > [2]: https://issues.apache.org/jira/browse/FLINK-13449
> >https://issues.apache.org/jira/browse/FLINK-13450
> > [3]: https://issues.apache.org/jira/browse/FLINK-13598
> > [4]: https://github.com/theopenlab/openlab/issues/new/choose
> >
> >
> >
> >
> > Chesnay Schepler  于2019年8月24日周六 上午12:10写道:
> >
> >> I'm wondering what we are supposed to do if the build fails?
> >> We aren't providing any guides on setting up an ARM dev environment, so
> >> reproducing failures locally isn't possible.
> >>
> >> On 23/08/2019 17:55, Stephan Ewen wrote:
> >>> Hi all!
> >>>
> >>> As part of the Flink on ARM effort, there is a pull request that
> >> triggers a
> >>> build on OpenLabs CI for each push and runs tests on ARM machines.
> >>>
> >>> Currently that build is roughly equivalent to what the "core" and
> "tests"
> >>> profiles do on Travis.
> >>> The result will be posted to the PR comments, similar to the Flink
> Bot's
> >>> Travis build result.
> >>> The build currently passes :-) so Flink seems to be okay on ARM.
> >>>
> >>> My suggestion would be to try and add this and gather some experience
> >> with
> >>> it.
> >>> The Travis build results should be our "ground truth" and the ARM CI
> >>> (openlabs CI) would be "informational only" at the beginning, but
> helping
> >>> us under

[jira] [Created] (FLINK-13854) Support Aggregating in Join and CoGroup

2019-08-26 Thread Jiayi Liao (Jira)
Jiayi Liao created FLINK-13854:
--

 Summary: Support Aggregating in Join and CoGroup
 Key: FLINK-13854
 URL: https://issues.apache.org/jira/browse/FLINK-13854
 Project: Flink
  Issue Type: New Feature
Affects Versions: 1.9.0
Reporter: Jiayi Liao


In WindowedStream we can use windowStream.aggregate(AggregateFunction,
WindowFunction) to aggregate input records in real time.

I think we should support a similar API in JoinedStreams and CoGroupedStreams,
because storing the full record log in the state backend is very costly,
especially when we don't need the specific records.
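The state-size argument can be illustrated with a simplified stand-in for Flink's AggregateFunction shape (not the proposed JoinedStreams API): each joined record is folded into a small accumulator instead of being retained in state.

```java
// Sketch of the incremental aggregation the ticket asks for: instead of
// buffering every joined record in state, fold each record into a small
// accumulator. The interface mimics the AggregateFunction shape but is a
// simplified stand-in, not the proposed JoinedStreams/CoGroupedStreams API.
public class JoinAggregateSketch {

    interface Agg<IN, ACC, OUT> {
        ACC createAccumulator();
        ACC add(IN value, ACC acc);
        OUT getResult(ACC acc);
    }

    // Count matching pairs without retaining them: state is a single long,
    // regardless of how many records the join produces.
    static Agg<long[], Long, Long> pairCount() {
        return new Agg<long[], Long, Long>() {
            public Long createAccumulator() { return 0L; }
            public Long add(long[] pair, Long acc) { return acc + 1; }
            public Long getResult(Long acc) { return acc; }
        };
    }

    static <IN, ACC, OUT> OUT aggregate(Agg<IN, ACC, OUT> agg, IN[] values) {
        ACC acc = agg.createAccumulator();
        for (IN v : values) {
            acc = agg.add(v, acc);
        }
        return agg.getResult(acc);
    }

    public static void main(String[] args) {
        long[][] joinedPairs = { {1, 10}, {2, 20}, {3, 30} };
        System.out.println(aggregate(pairCount(), joinedPairs));
    }
}
```

The accumulator stays constant-size while a record log grows with the input, which is exactly the cost the ticket wants to avoid.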





[jira] [Created] (FLINK-13855) Keep Travis build in sync with module structure

2019-08-26 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-13855:
-

 Summary: Keep Travis build in sync with module structure
 Key: FLINK-13855
 URL: https://issues.apache.org/jira/browse/FLINK-13855
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.9.0, 1.10.0
Reporter: Till Rohrmann
 Fix For: 1.10.0


We currently run all Travis profiles with {{-Pinclude-kinesis}} even though 
some of the build profiles don't contain the Kinesis connector module. Moreover, 
we run every build profile with {{-Pskip-webui-build}} even though 
{{flink-runtime-web}} is not built by every profile. This causes Maven to log:

{code}
19:15:47.404 [WARNING] The requested profile "skip-webui-build" could not be 
activated because it does not exist.
19:15:47.404 [WARNING] The requested profile "include-kinesis" could not be 
activated because it does not exist.
{code}

I think it would be good to keep the Travis build in sync with the actual 
module structure and not specify options for modules where they are 
superfluous. This might prevent the accidental shadowing/inclusion of modules 
if the profiles change in the future.





[jira] [Created] (FLINK-13856) Reduce the delete file api when the checkpoint is completed

2019-08-26 Thread andrew.D.lin (Jira)
andrew.D.lin created FLINK-13856:


 Summary: Reduce the delete file api when the checkpoint is 
completed
 Key: FLINK-13856
 URL: https://issues.apache.org/jira/browse/FLINK-13856
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Affects Versions: 1.9.0, 1.8.1
Reporter: andrew.D.lin
 Attachments: f6cc56b7-2c74-4f4b-bb6a-476d28a22096.png

When a new checkpoint is completed, an old checkpoint will be deleted by 
calling CompletedCheckpoint.discardOnSubsume().

When deleting old checkpoints, the following steps are taken:
1. drop the metadata
2. discard private state objects
3. discard the location as a whole

In some cases, is it possible to delete the checkpoint folder recursively
with one call?

As far as I know, for full checkpoints it should be possible to delete
the folder directly.
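The difference in filesystem calls can be sketched with a toy model (hypothetical names, not CompletedCheckpoint's internals), which shows why this matters for checkpoints with many state files:

```java
// Toy model of the delete-call count described above (hypothetical names,
// not Flink's CompletedCheckpoint internals).
public class CheckpointDeleteSketch {

    // Step-by-step discard: one delete for the metadata file, one per state
    // object, plus one for the (now empty) checkpoint directory.
    static int deleteCallsStepwise(int numStateObjects) {
        return 1 /* metadata */ + numStateObjects + 1 /* directory */;
    }

    // Recursive discard: a single delete of the whole checkpoint folder.
    // Only valid when no state outside the folder is referenced, e.g. a
    // full (non-incremental) checkpoint.
    static int deleteCallsRecursive(int numStateObjects) {
        return 1;
    }

    public static void main(String[] args) {
        System.out.println(deleteCallsStepwise(1000));  // 1002 delete calls
        System.out.println(deleteCallsRecursive(1000)); // 1 delete call
    }
}
```

On object stores where each delete is a billed, rate-limited API request, collapsing per-file deletes into one recursive call is a meaningful saving.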





[jira] [Created] (FLINK-13857) Remove remaining UdfAnalyzer configurations

2019-08-26 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-13857:


 Summary: Remove remaining UdfAnalyzer configurations
 Key: FLINK-13857
 URL: https://issues.apache.org/jira/browse/FLINK-13857
 Project: Flink
  Issue Type: Improvement
  Components: API / DataSet, API / DataStream
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz
 Fix For: 1.10.0


The UdfAnalyzer code was dropped in the 1.9 release. A few configuration 
classes/options were marked as deprecated as part of this effort. Given that 
they have no effect at all and were deprecated in the 1.9 release, I suggest 
dropping them in the 1.10 release.

This also does not break binary compatibility, as all the classes were marked 
PublicEvolving from the very beginning.

I suggest dropping:
* CodeAnalysisMode
* ExecutionConfig#get/setCodeAnalysisMode
* SkipCodeAnalysis





Re: [DISCUSS] Add ARM CI build to Flink (information-only)

2019-08-26 Thread Stephan Ewen
Adding CI builds for ARM only makes sense when we actually take them into
account as "blocking a merge"; otherwise there is no point in having them.
So we would need to be prepared to do that.

The cases where something runs on UNIX/x64 but fails on ARM are few, and so
far they seem to have been related to libraries or some magic that tries
to do system-dependent actions outside Java.

One worthwhile discussion could be whether to run the ARM CI builds as part
of the nightly tests, rather than on every commit.
There are a lot of nightly tests, for example for different Java / Scala /
Hadoop versions.


Re: [DISCUSS] Setup a bui...@flink.apache.org mailing list for travis builds

2019-08-26 Thread Kurt Young
Thanks for the updates, Jark! I have subscribed to the ML and everything
looks good now.

Best,
Kurt


On Mon, Aug 26, 2019 at 11:17 AM Jark Wu  wrote:

> Hi all,
>
> Sorry it took so long to get back. I have some good news.
>
> After some investigation and development, and with help from Chesnay, we
> finally integrated Travis build notifications with the bui...@flink.apache.org
> mailing list while keeping the beautiful formatting!
> Currently, only failure and failure->success builds are notified, only
> builds (including CRON) on apache/flink branches are notified, and pull
> request builds are not notified.
>
> The builds mailing list is also available in Flink website community page
> [1]
>
> I would encourage devs to subscribe to the builds mailing list and help the
> community pay more attention to the build status, especially the CRON
> builds.
>
> Feel free to leave your suggestions and feedback here!
>
> 
>
> # The implementation detail:
>
> I implemented a flink-notification-bot[2] to receive Travis webhook[3]
> payload and generate an HTML email and send the email to
> bui...@flink.apache.org.
> The flink-notification-bot is deployed on my own VM on DigitalOcean. You
> can refer to the GitHub page [2] of the project for more details about
> the implementation and deployment.
> Btw, I'd be glad to contribute the project to https://github.com/flink-ci or
> https://github.com/flinkbot if the community accepts it.
>
> With the flink-notification-bot, we can easily integrate it with other CI
> services or our own CI, and we can also integrate it with other
> applications (e.g. DingTalk).
>
> # Rejected Alternative:
>
> Option#1: Sending email notifications via "Travis Email Notification"[4].
> Reasons:
>  - If the email notification is set, Travis CI only sends an email to
> the addresses specified there, rather than to the committer and author.
>  - We would lose the nice email formatting when Travis sends email to the
> builds ML.
>  - The return-path of emails from Travis CI is not constant, which makes it
> difficult for the mailing list to accept them.
>
> Cheers,
> Jark
>
> [1]: https://flink.apache.org/community.html#mailing-lists
> [2]: https://github.com/wuchong/flink-notification-bot
> [3]:
>
> https://docs.travis-ci.com/user/notifications/#configuring-webhook-notifications
> [4]:
>
> https://docs.travis-ci.com/user/notifications/#configuring-email-notifications
>
>
>
>
> On Tue, 30 Jul 2019 at 18:35, Jark Wu  wrote:
>
> > Hi all,
> >
> > Progress updates:
> > 1. The bui...@flink.apache.org list can now be subscribed to (thanks @Robert);
> > you can send an email to builds-subscr...@flink.apache.org to subscribe.
> > 2. We have a pull request [1] to send only apache/flink builds
> > notifications and it works well.
> > 3. However, all the notifications are rejected by the builds mailing list
> > (the MODERATE mails).
> > I added & checked bui...@travis-ci.org to the subscriber/allow list,
> > but it still doesn't work. It might be recognized as spam by the mailing
> list.
> > We are still trying to figure it out, and will update here if we have
> > some progress.
> >
> >
> > Thanks,
> > Jark
> >
> >
> >
> > [1]: https://github.com/apache/flink/pull/9230
> >
> >
> > On Thu, 25 Jul 2019 at 22:59, Robert Metzger 
> wrote:
> >
> >> The mailing list has been created, you can now subscribe to it.
> >>
> >> On Wed, Jul 24, 2019 at 1:43 PM Jark Wu  wrote:
> >>
> >> > Thanks Robert for helping out that.
> >> >
> >> > Best,
> >> > Jark
> >> >
> >> > On Wed, 24 Jul 2019 at 19:16, Robert Metzger 
> >> wrote:
> >> >
> >> > > I've requested the creation of the list, and made Jark, Chesnay and
> me
> >> > > moderators of it.
> >> > >
> >> > > On Wed, Jul 24, 2019 at 1:12 PM Robert Metzger  >
> >> > > wrote:
> >> > >
> >> > > > @Jark: Yes, I will request the creation of a mailing list!
> >> > > >
> >> > > > On Tue, Jul 23, 2019 at 4:48 PM Hugo Louro 
> >> wrote:
> >> > > >
> >> > > >> +1
> >> > > >>
> >> > > >> > On Jul 23, 2019, at 6:15 AM, Till Rohrmann <
> trohrm...@apache.org
> >> >
> >> > > >> wrote:
> >> > > >> >
> >> > > >> > Good idea Jark. +1 for the proposal.
> >> > > >> >
> >> > > >> > Cheers,
> >> > > >> > Till
> >> > > >> >
> >> > > >> >> On Tue, Jul 23, 2019 at 1:59 PM Hequn Cheng <
> >> chenghe...@gmail.com>
> >> > > >> wrote:
> >> > > >> >>
> >> > > >> >> Hi Jark,
> >> > > >> >>
> >> > > >> >> Good idea. +1!
> >> > > >> >>
> >> > > >> >>> On Tue, Jul 23, 2019 at 6:23 PM Jark Wu 
> >> wrote:
> >> > > >> >>>
> >> > > >> >>> Thank you all for your positive feedback.
> >> > > >> >>>
> >> > > >> >>> We have three binding +1s, so I think, we can proceed with
> >> this.
> >> > > >> >>>
> >> > > >> >>> Hi @Robert Metzger  , could you create
> a
> >> > > >> request to
> >> > > >> >>> INFRA for the mailing list?
> >> > > >> >>> I'm not sure if this needs a PMC permission.
> >> > > >> >>>
> >> > > >> >>> Thanks,
> >> > > >> >>> Jar

Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-26 Thread Piotr Nowojski
Hi,

Xiaogang, those things worry me the most.

1. Regarding the broadcasting, doesn’t the BroadcastState [1] cover our issues? 
Can't we construct a job graph where one operator has two outputs, one keyed and
the other broadcast, which are wired back together into a
KeyedBroadcastProcessFunction or BroadcastProcessFunction?

2. Multicast on keyed streams might be done by iterating over all of the keys.
However, I have a feeling that might not be the feature that distributed
cross/theta joins would want, since they would probably need a guarantee to
have only a single key per operator instance.

Kurt, by broadcast optimisation do you mean [2]?

I’m not sure if we should split the discussion yet. Most of the changes 
required by either multicast or broadcast will be in the API/state layers. 
Runtime changes for broadcast would be almost none (just exposing existing
features) and for multicast they shouldn't be huge either. However, maybe we
should consider those two things together at the API level, so that we do not 
make wrong decisions when just looking at the simpler/more narrow broadcast 
support?

Piotrek

[1] 
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
 

[2] https://github.com/apache/flink/pull/7713 


> On 26 Aug 2019, at 09:35, Kurt Young  wrote:
> 
> From SQL's perspective, distributed cross join is a valid feature but not
> very
> urgent. Actually this discuss reminds me about another useful feature
> (sorry
> for the distraction):
> 
> when doing broadcast in batch shuffle mode, we can make each producer only
> write one copy of the output data, but not for every consumer. Broadcast
> join
> is much more useful, and this is a very important optimization. Not sure if
> we
> have already consider this.
> 
> Best,
> Kurt
> 
> 
> On Mon, Aug 26, 2019 at 12:16 PM Guowei Ma  wrote:
> 
>> Thanks Yun for bringing up this discussion and very thanks for all the deep
>> thoughts!
>> 
>> For now, I think this discussion contains two scenarios: one is for
>> iteration library support and the other is for SQL join support. I think
>> both of the two scenarios are useful but they seem to have different best
>> suitable solutions. To make the discussion clearer, I would suggest
>> splitting the discussion into two threads.
>> 
>> And I agree with Piotr that it is very tricky that a keyed stream received
>> a "broadcast element". So we may add some new interfaces, which could
>> broadcast or process some special "broadcast event". In that way "broadcast
>> event" will not be sent with the normal process.
>> 
>> Best,
>> Guowei
>> 
>> 
>> SHI Xiaogang  于2019年8月26日周一 上午9:27写道:
>> 
>>> Hi all,
>>> 
>>> I also think that multicasting is a necessity in Flink, but more details
>>> are needed to be considered.
>>> 
>>> Currently network is tightly coupled with states in Flink to achieve
>>> automatic scaling. We can only access keyed states in keyed streams and
>>> operator states in all streams.
>>> In the concrete example of theta-joins implemented with multicasting,
>> the
>>> following questions exist:
>>> 
>>>   - In which type of states will the data be stored? Do we need another
>>>   type of states which is coupled with multicasting streams?
>>>   - How to ensure the consistency between network and states when jobs
>>>   scale out or scale in?
>>> 
>>> Regards,
>>> Xiaogang
>>> 
>>> Xingcan Cui  于2019年8月25日周日 上午10:03写道:
>>> 
 Hi all,
 
 Sorry for joining this thread late. Basically, I think enabling
>> multicast
 pattern could be the right direction, but more detailed implementation
 policies need to be discussed.
 
 Two years ago, I filed an issue [1] about the multicast API. However,
>> due
 to some reasons, it was laid aside. After that, when I tried to
>>> cherry-pick
 the change for experimental use, I found the return type of
 `selectChannels()` method had changed from `int[]` to `int`, which
>> makes
 the old implementation not work anymore.
 
 From my side, the multicast has always been used for theta-join. As far
>>> as
 I know, it’s an essential requirement for some sophisticated joining
 algorithms. Until now, the Flink non-equi joins can still only be
>>> executed
 single-threaded. If we'd like to make some improvements on this, we
>>> should
 first take some measures to support multicast pattern.
 
 Best,
 Xingcan
 
 [1] https://issues.apache.org/jira/browse/FLINK-6936
 
> On Aug 24, 2019, at 5:54 AM, Zhu Zhu  wrote:
> 
> Hi Piotr,
> 
> Thanks for the explanation.
> Agreed that the broadcastEmit(record) is a better choice for
>>> broadcasting
> for the iterations.
> As broadcasting for the iterations is the first motivation, let's
>>> support
> it first.
> 
> Thanks,

Re: CiBot Update

2019-08-26 Thread Terry Wang
Very helpful! Thanks Chesnay!
Best,
Terry Wang



> 在 2019年8月23日,下午11:47,Ethan Li  写道:
> 
> Thank you very much Chesnay! This is helpful
> 
>> On Aug 23, 2019, at 2:58 AM, Chesnay Schepler  wrote:
>> 
>> @Ethan Li The source for the CiBot is available here
>> <https://github.com/flink-ci/ci-bot/>. The implementation of this command is
>> tightly connected to how the CiBot works; but conceptually it looks at a PR, 
>> finds the most recent build that ran, and uses the Travis REST API to 
>> restart the build.
>> Additionally, it keeps track of which comments have been processed by 
>> storing the comment ID in the CI report.
>> If you have further questions, feel free to ping me directly.
>> 
>> @Dianfu I agree, we should include it somewhere in either the flinkbot 
>> template or the CI report.
>> 
>> On 23/08/2019 03:35, Dian Fu wrote:
>>> Thanks Chesnay for your great work! A very useful feature!
>>> 
>>> Just one minor suggestion: It will be better if we could add this command 
>>> to the section "Bot commands" in the flinkbot template.
>>> 
>>> Regards,
>>> Dian
>>> 
 在 2019年8月23日,上午2:06,Ethan Li  写道:
 
 My question is specifically about implementation of "@flinkbot run travis"
 
> On Aug 22, 2019, at 1:06 PM, Ethan Li  wrote:
> 
> Hi Chesnay,
> 
> This is really nice feature!
> 
> Can I ask how is this implemented? Do you have the related Jira/PR/docs 
> that I can take a look? I’d like to introduce it to another project if 
> applicable. Thank you very much!
> 
> Best,
> Ethan
> 
>> On Aug 22, 2019, at 8:34 AM, Biao Liu > > wrote:
>> 
>> Thanks Chesnay a lot,
>> 
>> I love this feature!
>> 
>> Thanks,
>> Biao /'bɪ.aʊ/
>> 
>> 
>> 
>> On Thu, 22 Aug 2019 at 20:55, Hequn Cheng > > wrote:
>> 
>>> Cool, thanks Chesnay a lot for the improvement!
>>> 
>>> Best, Hequn
>>> 
>>> On Thu, Aug 22, 2019 at 5:02 PM Zhu Zhu >> > wrote:
>>> 
 Thanks Chesnay for the CI improvement!
 It is very helpful.
 
 Thanks,
 Zhu Zhu
 
 zhijiang >>> > 于2019年8月22日周四 下午4:18写道:
 
> It is really very convenient now. Valuable work, Chesnay!
> 
> Best,
> Zhijiang
> --
> From:Till Rohrmann  >
> Send Time:2019年8月22日(星期四) 10:13
> To:dev mailto:dev@flink.apache.org>>
> Subject:Re: CiBot Update
> 
> Thanks for the continuous work on the CiBot Chesnay!
> 
> Cheers,
> Till
> 
> On Thu, Aug 22, 2019 at 9:47 AM Jark Wu  > wrote:
> 
>> Great work! Thanks Chesnay!
>> 
>> 
>> 
>> On Thu, 22 Aug 2019 at 15:42, Xintong Song > >
> wrote:
>>> The re-triggering travis feature is so convenient. Thanks Chesnay~!
>>> 
>>> Thank you~
>>> 
>>> Xintong Song
>>> 
>>> 
>>> 
>>> On Thu, Aug 22, 2019 at 9:26 AM Stephan Ewen >> >
 wrote:
 Nice, thanks!
 
 On Thu, Aug 22, 2019 at 3:59 AM Zili Chen >>> >
>> wrote:
> Thanks for your announcement. Nice work!
> 
> Best,
> tison.
> 
> 
> vino yang mailto:yanghua1...@gmail.com>> 
> 于2019年8月22日周四 上午8:14写道:
> 
>> +1 for "@flinkbot run travis", it is very convenient.
>> 
>> Chesnay Schepler > > 于2019年8月21日周三
>>> 下午9:12写道:
>>> Hi everyone,
>>> 
>>> this is an update on recent changes to the CI bot.
>>> 
>>> 
>>> The bot now cancels builds if a new commit was added to a
>>> PR,
> and
>>> cancels all builds if the PR was closed.
>>> (This was implemented a while ago; I'm just mentioning it
 again
>> for
>>> discoverability)
>>> 
>>> 
>>> Additionally, starting today you can now re-trigger a
>>> Travis
> run
>> by
>>> writing a comment "@flinkbot run travis"; this means you no
>> longer
 have
>>> to commit an empty commit or do other shenanigans to get
> another
 build
>>> running.
>>> Note that this will /not/ work if the PR was re-opened,
>>> until
> at
 least

[jira] [Created] (FLINK-13858) Add flink-connector-elasticsearch6 Specify the field as the primary key id

2019-08-26 Thread hubin (Jira)
hubin created FLINK-13858:
-

 Summary: Add flink-connector-elasticsearch6 Specify the field as 
the primary key id
 Key: FLINK-13858
 URL: https://issues.apache.org/jira/browse/FLINK-13858
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / ElasticSearch
Affects Versions: 1.9.0, 1.8.1, 1.8.0, 1.7.2, 1.6.4, 1.6.3
Reporter: hubin


For example, when syncing an order table sink to ES6 where order_id is the primary key:
`INSERT INTO es6_table SELECT order_id, order_name FROM source_table`

However, the primary key in es6 is randomly generated, and the order_id cannot 
be used to find the record.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Add ARM CI build to Flink (information-only)

2019-08-26 Thread Xiyuan Wang
Before ARM CI is ready, I can disable the CI test for each PR and let it only
be triggered by a PR comment. It's quite easy for OpenLab to do this.

OpenLab has many job pipelines[1]. Right now I use the `check` pipeline in
https://github.com/apache/flink/pull/9416. The job trigger contains
github_action and github_comment[2]. I can create a new pipeline for Flink
whose trigger only contains github_comment, like:

trigger:
  github:
 - event: pull_request
   action: comment
   comment: (?i)^\s*recheck_arm_build\s*$

So the ARM job will not run for every PR. It will only run for PRs that
have a `recheck_arm_build` comment.

Then once ARM CI is ready, I can add it back.


Nightly tests can be added as well, of course. There is a kind of job in
OpenLab called a `periodic job`. We can use it for Flink's daily nightly tests.
If any error occurs, the report can be sent to bui...@flink.apache.org as
well.

[1]:
https://github.com/theopenlab/openlab-zuul-jobs/blob/master/zuul.d/pipelines.yaml
[2]:
https://github.com/theopenlab/openlab-zuul-jobs/blob/master/zuul.d/pipelines.yaml#L10-L19

Stephan Ewen  于2019年8月26日周一 下午6:13写道:

> Adding CI builds for ARM only makes sense when we actually take them into
> account as "blocking a merge", otherwise there is no point in having them.
> So we would need to be prepared to do that.
>
> The cases where something runs on UNIX/x64 but fails on ARM are few,
> and so far seem to have been related to libraries or some magic that tries
> to do system dependent actions outside Java.
>
> One worthwhile discussion could be whether to run the ARM CI builds as part
> of the nightly tests, not on every commit.
> There are a lot of nightly tests, for example for different Java / Scala /
> Hadoop versions.
>
> On Mon, Aug 26, 2019 at 10:46 AM Xiyuan Wang 
> wrote:
>
> > Sorry, maybe my words is misleading.
> >
> > We are just starting adding ARM support. So the CI is non-voting at this
> > moment to avoid blocking normal Flink development.
> >
> > But once the ARM CI works well and stable enough. We should mark it as
> > voting. It means that in the future, if the ARM test fails in a PR,
> the
> > PR cannot be merged. The test log may tell developers what the error
> > is. If a developer needs to debug the details on an ARM VM, OpenLab can
> > provide one.
> >
> > Adding ARM CI can make sure Flink support ARM originally
> >
> > I left a workflow in the PR, I'd like to print it here:
> >
> >1. Add the basic build script to ensure the CI system and build job
> >works as expect. The job should be marked as non-voting first, it
> means the
> >CI test failure won't block Flink PR to be merged.
> >2. Add the test script to run unit/integration tests. At this step the
> >--fn parameter will be added to mvn test. It will run the full test
> cases
> >in Flink, so that we can find which tests fail on ARM.
> >3. Fix the test failure one by one.
> >4. Once all the tests are passed, remove the --fn parameter and keep
> >watch the CI's status for some days. If any bugs arise, fix
> them as
> >we usually do for travis-ci.
> >5. Once the CI is stable enough, remove the non-voting tag, so that
> >the ARM CI will be the same as travis-ci, to be one of the gate for
> Flink
> >PR.
> >6. Finally, Flink community can announce and release Flink ARM
> version.
> >
> >
> > Chesnay Schepler  于2019年8月26日周一 下午2:25写道:
> >
> >> I'm sorry, but if these issues are only fixed later anyway I see no
> >> reason to run these tests on each PR. We're just adding noise to each PR
> >> that everyone will just ignore.
> >>
> >> I'm curious as to the benefit of having this directly in Flink; why
> >> aren't the ARM builds run outside of the Flink project, and fixes for it
> >> provided?
> >>
> >> It seems to me like nothing about these arm builds is actually handled
> >> by the Flink project.
> >>
> >> On 26/08/2019 03:43, Xiyuan Wang wrote:
> >> > Thanks for Stephan to bring up this topic.
> >> >
> >> > The package build jobs work well now. I have a simple online demo
> which
> >> is
> >> > built and ran on a ARM VM. Feel free to have a try[1].
> >> >
> >> > As the first step for ARM support, maybe it's good to add them now.
> >> >
> >> > While for the next step, the test part is still broken. It relates to
> >> some
> >> > points we find:
> >> >
> >> > 1. Some unit tests fail[1] due to Java coding issues. This kind of failure
> >> can
> >> > be fixed easily.
> >> > 2. Some tests fail due to dependencies on third-party libraries[2]. It
> >> > includes frocksdb, MapR Client and Netty. They don't have ARM release.
> >> >  a. Frocksdb: I'm testing it locally now by `make check_some` and
> >> `make
> >> > jtest` similar to its travis job. There are 3 tests failing under `make
> >> > check_some`. Please see the ticket for more details. Once the tests
> pass,
> >> > frocksdb can then release an ARM package.
> >> >  b. MapR Client. This belongs to MapR company. At thi

Re: CiBot Update

2019-08-26 Thread Congxian Qiu
Thanks Chesnay for the nice work, it's very helpful

Best,
Congxian


Terry Wang  于2019年8月26日周一 下午6:59写道:

> Very helpful! Thanks Chesnay!
> Best,
> Terry Wang
>
>
>
> > 在 2019年8月23日,下午11:47,Ethan Li  写道:
> >
> > Thank you very much Chesnay! This is helpful
> >
> >> On Aug 23, 2019, at 2:58 AM, Chesnay Schepler 
> wrote:
> >>
> >> @Ethan Li The source for the CiBot is available here <
> https://github.com/flink-ci/ci-bot/>. The implementation of this command
> is tightly connected to how the CiBot works; but conceptually it looks at a
> PR, finds the most recent build that ran, and uses the Travis REST API to
> restart the build.
> >> Additionally, it keeps track of which comments have been processed by
> storing the comment ID in the CI report.
> >> If you have further questions, feel free to ping me directly.
> >>
> >> @Dianfu I agree, we should include it somewhere in either the flinkbot
> template or the CI report.
> >>
> >> On 23/08/2019 03:35, Dian Fu wrote:
> >>> Thanks Chesnay for your great work! A very useful feature!
> >>>
> >>> Just one minor suggestion: It will be better if we could add this
> command to the section "Bot commands" in the flinkbot template.
> >>>
> >>> Regards,
> >>> Dian
> >>>
>  在 2019年8月23日,上午2:06,Ethan Li  写道:
> 
>  My question is specifically about implementation of "@flinkbot run
> travis"
> 
> > On Aug 22, 2019, at 1:06 PM, Ethan Li 
> wrote:
> >
> > Hi Chesnay,
> >
> > This is really nice feature!
> >
> > Can I ask how is this implemented? Do you have the related
> Jira/PR/docs that I can take a look? I’d like to introduce it to another
> project if applicable. Thank you very much!
> >
> > Best,
> > Ethan
> >
> >> On Aug 22, 2019, at 8:34 AM, Biao Liu  mmyy1...@gmail.com>> wrote:
> >>
> >> Thanks Chesnay a lot,
> >>
> >> I love this feature!
> >>
> >> Thanks,
> >> Biao /'bɪ.aʊ/
> >>
> >>
> >>
> >> On Thu, 22 Aug 2019 at 20:55, Hequn Cheng  > wrote:
> >>
> >>> Cool, thanks Chesnay a lot for the improvement!
> >>>
> >>> Best, Hequn
> >>>
> >>> On Thu, Aug 22, 2019 at 5:02 PM Zhu Zhu  > wrote:
> >>>
>  Thanks Chesnay for the CI improvement!
>  It is very helpful.
> 
>  Thanks,
>  Zhu Zhu
> 
>  zhijiang  wangzhijiang...@aliyun.com.invalid>> 于2019年8月22日周四 下午4:18写道:
> 
> > It is really very convenient now. Valuable work, Chesnay!
> >
> > Best,
> > Zhijiang
> >
> --
> > From:Till Rohrmann  trohrm...@apache.org>>
> > Send Time:2019年8月22日(星期四) 10:13
> > To:dev mailto:dev@flink.apache.org>>
> > Subject:Re: CiBot Update
> >
> > Thanks for the continuous work on the CiBot Chesnay!
> >
> > Cheers,
> > Till
> >
> > On Thu, Aug 22, 2019 at 9:47 AM Jark Wu  > wrote:
> >
> >> Great work! Thanks Chesnay!
> >>
> >>
> >>
> >> On Thu, 22 Aug 2019 at 15:42, Xintong Song <
> tonysong...@gmail.com >
> > wrote:
> >>> The re-triggering travis feature is so convenient. Thanks
> Chesnay~!
> >>>
> >>> Thank you~
> >>>
> >>> Xintong Song
> >>>
> >>>
> >>>
> >>> On Thu, Aug 22, 2019 at 9:26 AM Stephan Ewen  >
>  wrote:
>  Nice, thanks!
> 
>  On Thu, Aug 22, 2019 at 3:59 AM Zili Chen <
> wander4...@gmail.com >
> >> wrote:
> > Thanks for your announcement. Nice work!
> >
> > Best,
> > tison.
> >
> >
> > vino yang  yanghua1...@gmail.com>> 于2019年8月22日周四 上午8:14写道:
> >
> >> +1 for "@flinkbot run travis", it is very convenient.
> >>
> >> Chesnay Schepler  ches...@apache.org>> 于2019年8月21日周三
> >>> 下午9:12写道:
> >>> Hi everyone,
> >>>
> >>> this is an update on recent changes to the CI bot.
> >>>
> >>>
> >>> The bot now cancels builds if a new commit was added to a
> >>> PR,
> > and
> >>> cancels all builds if the PR was closed.
> >>> (This was implemented a while ago; I'm just mentioning it
>  again
> >> for
> >>> discoverability)
> >>>
> >>>
> >>> Additionally, starting today you can now re-trigger a
> >>> Travis
> > run
> >> by
> >>> writing a comment "@flinkbot run travis"; this means you no
> >> longer
>  have
> >>> to c

[jira] [Created] (FLINK-13859) JSONDeserializationSchema spell error

2019-08-26 Thread limbo (Jira)
limbo created FLINK-13859:
-

 Summary: JSONDeserializationSchema spell error
 Key: FLINK-13859
 URL: https://issues.apache.org/jira/browse/FLINK-13859
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.9.0
Reporter: limbo
 Attachments: image-2019-08-26-20-14-20-075.png

On the Kafka page, JsonDeserializationSchema should be JSONDeserializationSchema:

!image-2019-08-26-20-14-20-075.png!





[CODE-STYLE] Builder pattern

2019-08-26 Thread Gyula Fóra
Hi All!

I would like to start a code-style related discussion regarding how we
implement the builder pattern in the Flink project.

It would be best to have a common approach. There are some aspects of
the pattern that come to my mind; please feel free to add anything I missed:

1. Creating the builder objects:

a) Always create using static method in "built" class:
   Here we should have naming guidelines: .builder(..) or
.xyzBuilder(...)
b) Always use builder class constructor
c) Mix: Again we should have some guidelines when to use which

I personally prefer option a) to always have a static method to create the
builder with static method names that end in builder.

2. Setting properties on the builder:

 a) withSomething(...)
 b) setSomething(...)
 c) other

I don't really have a preference but either a or b for consistency.


3. Implementing the builder object:

 a) Immutable -> Creates a new builder object after setting a property
 b) Mutable -> Returns (this) after setting the property

I personally prefer the mutable version as it keeps the builder
implementation much simpler and it seems to be a very common way of doing
it.

What do you all think?

Regards,
Gyula
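The options above could be combined in code roughly as in the following minimal Java sketch. The `Job` class and its fields are hypothetical (not Flink code); it illustrates option 1a (static factory method on the built class), option 2a (`withXXX` setters), and option 3b (a mutable builder returning `this`):

```java
// Hypothetical "built" class illustrating options 1a, 2a, and 3b discussed above.
final class Job {
    private final String name;
    private final int parallelism;

    private Job(String name, int parallelism) {
        this.name = name;
        this.parallelism = parallelism;
    }

    // Option 1a: static factory method on the built class creates the builder.
    static Builder builder(String name) {
        return new Builder(name);
    }

    String getName() { return name; }
    int getParallelism() { return parallelism; }

    // Option 3b: mutable builder that returns `this` from each setter.
    static final class Builder {
        private final String name;   // required, no reasonable default
        private int parallelism = 1; // optional, with a default

        private Builder(String name) {
            this.name = name;
        }

        // Option 2a: `withXXX` naming for property setters.
        Builder withParallelism(int parallelism) {
            this.parallelism = parallelism;
            return this;
        }

        Job build() {
            return new Job(name, parallelism);
        }
    }
}
```

Usage then reads fluently, e.g. `Job.builder("wordcount").withParallelism(4).build()`, and the private constructor keeps the builder as the only entry point.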


[jira] [Created] (FLINK-13860) Flink Apache Kudu Connector

2019-08-26 Thread Joao Boto (Jira)
Joao Boto created FLINK-13860:
-

 Summary: Flink Apache Kudu Connector
 Key: FLINK-13860
 URL: https://issues.apache.org/jira/browse/FLINK-13860
 Project: Flink
  Issue Type: New Feature
Reporter: Joao Boto


Hi..

I'm the contributor and maintainer of this connector on Bahir-Flink project

[https://github.com/apache/bahir-flink/tree/master/flink-connector-kudu]

 

but it seems that the flink connectors on that project are less maintained and it's
difficult to keep the code up to date, as PRs are not merged and no version has ever
been released, which makes it difficult to use.

 

I would like to contribute that code to Flink, allowing others to contribute to
and use that connector.

 

[~fhueske] what do you think?





Re: [CODE-STYLE] Builder pattern

2019-08-26 Thread Dawid Wysakowicz
Hi Gyula,

A few comments from my side.

Ad. 1 Personally I also prefer a static method in the "built" class. Not
sure if I would be that strict about the "Builder" suffix, though. It is
usually quite easy to realize the method returns a builder rather than
the object itself. In my opinion the suffix might be redundant at times.

Ad. 2 Here I also don't have a strict preference. I like the b) approach
the least though. Whenever I see setXXX I expect this is an old style
setter without return type. For that reason for a generic name I always
prefer withXXX. I see though lots of benefits for the option c), as this
might be the most descriptive option. Some examples where I like the
usage of option c) are:

* org.apache.flink.api.common.state.StateTtlConfig.Builder#useProcessingTime

*
org.apache.flink.api.common.state.StateTtlConfig.Builder#cleanupInBackground

*
org.apache.flink.api.common.state.StateTtlConfig.Builder#cleanupInRocksdbCompactFilter()

To sum up on this topic I would not enforce a strict policy on this
topic. But I am ok with one, if the community prefers it.

Ad. 3 I agree that mutable builders are much easier to implement, and
usually it does no harm to have them mutable as long as they are not
passed around.


Some other topics that I think are worth considering:

4. Setting the default values:

a) The static method/constructor should accept all the required
parameters (those that don't have reasonable defaults), so that
newBuilder(...).build() constructs a valid object.

b) Validation in the build() method. newBuilder().build() might throw an
Exception if some values were not set.

Personally I think option a) should be strongly preferred. However I
believe there are cases where b) could be acceptable.
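To sketch the contrast, here is a minimal hypothetical `Config` class (not Flink code) built in the style of option b), where a missing required value only surfaces when build() is called:

```java
// Hypothetical class illustrating option b): validation deferred to build(),
// so newBuilder().build() throws if the required `host` was never set.
final class Config {
    private final String host;

    private Config(String host) {
        this.host = host;
    }

    static Builder newBuilder() {
        return new Builder();
    }

    String getHost() { return host; }

    static final class Builder {
        private String host; // required, but only checked at build() time

        Builder withHost(String host) {
            this.host = host;
            return this;
        }

        Config build() {
            if (host == null) {
                throw new IllegalStateException("host is required");
            }
            return new Config(host);
        }
    }
}
```

Under option a), `host` would instead be a parameter of `newBuilder(...)`, so the missing value would be a compile-time error rather than a runtime exception.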

5. Building the end object:

a) Always with build() method

b) Allow building object with arbitrary methods

I think the option a) is the most common approach. I think though option
b) should also be allowed if there is a good reason for that. What I can
imagine is when we need some additional information from the build
method (e.g. generic type) or if the method modifies the internal state
of the Builder in a way that it is unsafe to continue setting values on
the Builder.

An overall comment: I think it is good to share opinions on this topic,
but I am afraid there are too many sides to the issue to standardize it
very strictly. I might be wrong though. Really looking forward to the
outcome of this discussion.

Best,

Dawid

On 26/08/2019 14:18, Gyula Fóra wrote:
> Hi All!
>
> I would like to start a code-style related discussion regarding how we
> implement the builder pattern in the Flink project.
>
> It would be the best to have a common approach, there are some aspects of
> the pattern that come to my mind please feel free to add anything I missed:
>
> 1. Creating the builder objects:
>
> a) Always create using static method in "built" class:
>Here we should have naming guidelines: .builder(..) or
> .xyzBuilder(...)
> b) Always use builder class constructor
> c) Mix: Again we should have some guidelines when to use which
>
> I personally prefer option a) to always have a static method to create the
> builder with static method names that end in builder.
>
> 2. Setting properties on the builder:
>
>  a) withSomething(...)
>  b) setSomething(...)
>  c) other
>
> I don't really have a preference but either a or b for consistency.
>
>
> 3. Implementing the builder object:
>
>  a) Immutable -> Creates a new builder object after setting a property
>  b) Mutable -> Returns (this) after setting the property
>
> I personally prefer the mutable version as it keeps the builder
> implementation much simpler and it seems to be a very common way of doing
> it.
>
> What do you all think?
>
> Regards,
> Gyula
>





Re: [CODE-STYLE] Builder pattern

2019-08-26 Thread Jark Wu
Hi Gyula,

Thanks for bringing this. I think it would be nice if we have a common
approach to create builder pattern.
Currently, we have a lot of builders but with different tastes.

 > 1. Creating the builder objects:
I prefer option a) too. It would be easier for users to get the builder
instance.

> 2. Setting properties on the builder:
I don't have a preference for it. But I think there is another option might
be more concise, i.e. "something()" without `with` or `set` prefix.
For example:

CsvTableSource source = CsvTableSource.builder()
.path("/path/to/your/file.csv")
.field("myfield", Types.STRING)
.field("myfield2", Types.INT)
.build();

This pattern is heavily used in flink-table, e.g. `TableSchema`,
`TypeInference`, `BuiltInFunctionDefinition`.

> 3. Implementing the builder object:
I prefer  b) Mutable approach which is simpler from the implementation part.


Besides that, I think maybe we can add some other aspects:

4. Constructor of the main class.
 a) private constructor
 b) public constructor

5. setXXX methods of the main class
 a) setXXX methods are not allowed
 b) setXXX methods are allowed.

I prefer both option a). Because I think one of the reason to have the
builder is that we don't want the constructor public.
A public constructor makes it hard to maintain and evolve compatibly when
adding new parameters, FlinkKafkaProducer is a good example.
For set methods, I think in most cases we want users to set the fields
eagerly (through the builder), and `setXXX` methods on the main class
duplicate the methods on the builder. We should avoid that.


Regards,
Jark


On Mon, 26 Aug 2019 at 20:18, Gyula Fóra  wrote:

> Hi All!
>
> I would like to start a code-style related discussion regarding how we
> implement the builder pattern in the Flink project.
>
> It would be the best to have a common approach, there are some aspects of
> the pattern that come to my mind please feel free to add anything I missed:
>
> 1. Creating the builder objects:
>
> a) Always create using static method in "built" class:
>Here we should have naming guidelines: .builder(..) or
> .xyzBuilder(...)
> b) Always use builder class constructor
> c) Mix: Again we should have some guidelines when to use which
>
> I personally prefer option a) to always have a static method to create the
> builder with static method names that end in builder.
>
> 2. Setting properties on the builder:
>
>  a) withSomething(...)
>  b) setSomething(...)
>  c) other
>
> I don't really have a preference but either a or b for consistency.
>
>
> 3. Implementing the builder object:
>
>  a) Immutable -> Creates a new builder object after setting a property
>  b) Mutable -> Returns (this) after setting the property
>
> I personally prefer the mutable version as it keeps the builder
> implementation much simpler and it seems to be a very common way of doing
> it.
>
> What do you all think?
>
> Regards,
> Gyula
>


[DISCUSS] Builder dedicated for testing

2019-08-26 Thread Zili Chen
Hi devs,

I'd like to share an observation that we have too many
@VisibleForTesting constructors that are only used in test scope, such
as in ExecutionGraph and RestClusterClient.

It would be helpful to introduce test-scoped builders for building such
instances, keeping only the necessary constructors in the production
code.

Otherwise, code becomes in mess and contributors might be confused by
a series constructors but some of them are just for testing. Note that
@VisibleForTesting doesn't mean a method is *only* for testing.

Best,
tison.


[jira] [Created] (FLINK-13861) No new checkpoint triggered when canceling an expired checkpoint failed

2019-08-26 Thread Congxian Qiu(klion26) (Jira)
Congxian Qiu(klion26) created FLINK-13861:
-

 Summary: No new checkpoint triggered when canceling an expired 
checkpoint failed
 Key: FLINK-13861
 URL: https://issues.apache.org/jira/browse/FLINK-13861
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.9.0, 1.8.1, 1.7.2
Reporter: Congxian Qiu(klion26)
 Fix For: 1.10.0


I encountered this problem in our private fork of Flink; after taking a look at 
the current master branch of Apache Flink, I think the problem exists there as well.

Problem detail:
 1. A checkpoint is canceled because of expiration, which invokes the canceller 
shown below:
{code:java}
final Runnable canceller = () -> {
   synchronized (lock) {
      // only do the work if the checkpoint is not discarded anyways
      // note that checkpoint completion discards the pending checkpoint object
      if (!checkpoint.isDiscarded()) {
         LOG.info("Checkpoint {} of job {} expired before completing.", checkpointID, job);

         failPendingCheckpoint(checkpoint, CheckpointFailureReason.CHECKPOINT_EXPIRED);
         pendingCheckpoints.remove(checkpointID);
         rememberRecentCheckpointId(checkpointID);

         triggerQueuedRequests();
      }
   }
};
{code}
 
 But failPendingCheckpoint may throw an exception, because it calls

{{CheckpointCoordinator#failPendingCheckpoint}}

-> {{PendingCheckpoint#abort}}

-> {{PendingCheckpoint#reportFailedCheckpoint}}

-> which initializes a FailedCheckpointStats and may throw an exception from 
{{checkArgument}}.

I did not find out why the {{checkArgument}} ever fails (the problem does not 
reproduce frequently); I will create an issue for that if I have more findings.

 

2. When triggering the next checkpoint, we first check whether there are already 
too many pending checkpoints, as below:
{code:java}
private void checkConcurrentCheckpoints() throws CheckpointException {
   if (pendingCheckpoints.size() >= maxConcurrentCheckpointAttempts) {
      triggerRequestQueued = true;
      if (currentPeriodicTrigger != null) {
         currentPeriodicTrigger.cancel(false);
         currentPeriodicTrigger = null;
      }
      throw new CheckpointException(CheckpointFailureReason.TOO_MANY_CONCURRENT_CHECKPOINTS);
   }
}
{code}
Since the expired checkpoint was never removed, the check 
{{pendingCheckpoints.size() >= maxConcurrentCheckpointAttempts}} will always be 
true.

3. No checkpoint will ever be triggered from then on.

 Because {{failPendingCheckpoint}} may throw an exception, we should place the 
pending-checkpoint removal logic in a finally block.

I'd like to file a PR for this if it really needs to be fixed.
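A minimal sketch of the suggested fix, with stand-in classes simulating the coordinator (the names below are illustrative only, not the actual CheckpointCoordinator code):

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in simulation of the coordinator, only to show the try/finally shape.
class CancellerSketch {
    static final Map<Long, Object> pendingCheckpoints = new HashMap<>();

    // Stand-in for CheckpointCoordinator#failPendingCheckpoint, which may throw
    // (e.g. from a checkArgument inside reportFailedCheckpoint).
    static void failPendingCheckpoint(long checkpointId) {
        throw new IllegalArgumentException("simulated checkArgument failure");
    }

    static void cancelExpiredCheckpoint(long checkpointId) {
        try {
            failPendingCheckpoint(checkpointId);
        } finally {
            // the cleanup runs even if failPendingCheckpoint throws, so
            // checkConcurrentCheckpoints() will not block all future triggers
            pendingCheckpoints.remove(checkpointId);
        }
    }

    public static void main(String[] args) {
        pendingCheckpoints.put(42L, new Object());
        try {
            cancelExpiredCheckpoint(42L);
        } catch (IllegalArgumentException expected) {
            // the failure still surfaces, but the pending map is clean
        }
        System.out.println(pendingCheckpoints.isEmpty()); // prints "true"
    }
}
```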



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13862) Remove or rewrite Execution Plan docs

2019-08-26 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-13862:


 Summary: Remove or rewrite Execution Plan docs
 Key: FLINK-13862
 URL: https://issues.apache.org/jira/browse/FLINK-13862
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.9.0
Reporter: Stephan Ewen
 Fix For: 1.10.0, 1.9.1


The *Execution Plans* section is totally outdated and refers to the old 
{{tools/planVisualizer.html}} file that was removed two years ago.

https://ci.apache.org/projects/flink/flink-docs-master/dev/execution_plans.html



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-26 Thread Kurt Young
Yes, glad to see that there is already a PR for such optimization.

Best,
Kurt


On Mon, Aug 26, 2019 at 6:59 PM Piotr Nowojski  wrote:

> Hi,
>
> Xiaogang, those things worry me the most.
>
> 1. Regarding the broadcasting, doesn’t the BroadcastState [1] cover our
> issues? Can not we construct a job graph, where one operator has two
> outputs, one keyed another broadcasted, which are wired together back to
> the KeyedBroadcastProcessFunction or BroadcastProcessFunction?
>
> 2. Multicast on keyed streams, might be done by iterating over all of the
> keys. However I have a feeling that might not be the feature which
> distributed cross/theta joins would want, since they would probably need a
> guarantee to have only a single key per operator instance.
>
> Kurt, by broadcast optimisation do you mean [2]?
>
> I’m not sure if we should split the discussion yet. Most of the changes
> required by either multicast or broadcast will be in the API/state layers.
> Runtime changes for broadcast would be almost none (just exposing existing
> features) and for multicast they shouldn't be huge as well. However maybe
> we should consider those two things together at the API level, so that we
> do not make wrong decisions when just looking at the simpler/more narrow
> broadcast support?
>
> Piotrek
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
> <
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
> >
> [2] https://github.com/apache/flink/pull/7713 <
> https://github.com/apache/flink/pull/7713>
>
> > On 26 Aug 2019, at 09:35, Kurt Young  wrote:
> >
> > From SQL's perspective, distributed cross join is a valid feature but not
> > very
> > urgent. Actually this discuss reminds me about another useful feature
> > (sorry
> > for the distraction):
> >
> > when doing broadcast in batch shuffle mode, we can make each producer
> only
> > write one copy of the output data, but not for every consumer. Broadcast
> > join
> > is much more useful, and this is a very important optimization. Not sure
> if
> > we
> > have already consider this.
> >
> > Best,
> > Kurt
> >
> >
> > On Mon, Aug 26, 2019 at 12:16 PM Guowei Ma  wrote:
> >
> >> Thanks Yun for bringing up this discussion and very thanks for all the
> deep
> >> thoughts!
> >>
> >> For now, I think this discussion contains two scenarios: one if for
> >> iteration library support and the other is for SQL join support. I think
> >> both of the two scenarios are useful but they seem to have different
> best
> >> suitable solutions. For making the discussion more clear, I would
> suggest
> >> to split the discussion into two threads.
> >>
> >> And I agree with Piotr that it is very tricky that a keyed stream
> received
> >> a "broadcast element". So we may add some new interfaces, which could
> >> broadcast or process some special "broadcast event". In that way
> "broadcast
> >> event" will not be sent with the normal process.
> >>
> >> Best,
> >> Guowei
> >>
> >>
> >> SHI Xiaogang  于2019年8月26日周一 上午9:27写道:
> >>
> >>> Hi all,
> >>>
> >>> I also think that multicasting is a necessity in Flink, but more
> details
> >>> are needed to be considered.
> >>>
> >>> Currently network is tightly coupled with states in Flink to achieve
> >>> automatic scaling. We can only access keyed states in keyed streams and
> >>> operator states in all streams.
> >>> In the concrete example of theta-joins implemented with mutlticasting,
> >> the
> >>> following questions exist:
> >>>
> >>>   - In which type of states will the data be stored? Do we need another
> >>>   type of states which is coupled with multicasting streams?
> >>>   - How to ensure the consistency between network and states when jobs
> >>>   scale out or scale in?
> >>>
> >>> Regards,
> >>> Xiaogang
> >>>
> >>> Xingcan Cui  于2019年8月25日周日 上午10:03写道:
> >>>
>  Hi all,
> 
>  Sorry for joining this thread late. Basically, I think enabling
> >> multicast
>  pattern could be the right direction, but more detailed implementation
>  policies need to be discussed.
> 
>  Two years ago, I filed an issue [1] about the multicast API. However,
> >> due
>  to some reasons, it was laid aside. After that, when I tried to
> >>> cherry-pick
>  the change for experimental use, I found the return type of
>  `selectChannels()` method had changed from `int[]` to `int`, which
> >> makes
>  the old implementation not work anymore.
> 
>  From my side, the multicast has always been used for theta-join. As
> far
> >>> as
>  I know, it’s an essential requirement for some sophisticated joining
>  algorithms. Until now, the Flink non-equi joins can still only be
> >>> executed
>  single-threaded. If we'd like to make some improvements on this, we
> >>> should
>  first take some measures to support multicast pattern.
> 
>  Best,
>  Xingcan
> 
>  [1] https://issues.apache.org/jira/browse/

Re: [CODE-STYLE] Builder pattern

2019-08-26 Thread Piotr Nowojski
Hi,

I agree with Dawid, modulo that I don’t have any preference about point 2 - I’m 
ok even with not enforcing this. 

One side note about point 4. There are use cases where passing obligatory 
parameters in the build method itself might make sense:

I. - when those parameters can not be or can not be easily passed via the 
constructor. Good example of that is “builder” pattern for the StreamOperators 
(StreamOperatorFactory), where factory is constructed on the API level in the 
client, then it’s being serialised and sent over the network and reconstructed 
on the TaskManager, where StreamOperator is finally constructed. The issue is 
that some of the obligatory parameters are only available on the TaskManager, 
so they can not be passed on a DataStream level in the client.
II. - when builder might be used to create multiple instances of the object 
with different values.

Piotrek

> On 26 Aug 2019, at 15:12, Jark Wu  wrote:
> 
> Hi Gyula,
> 
> Thanks for bringing this. I think it would be nice if we have a common
> approach to create builder pattern.
> Currently, we have a lot of builders but with different tastes.
> 
>> 1. Creating the builder objects:
> I prefer option a) too. It would be easier for users to get the builder
> instance.
> 
>> 2. Setting properties on the builder:
> I don't have a preference for it. But I think there is another option that
> might be more concise, i.e. "something()" without a `with` or `set` prefix.
> For example:
> 
> CsvTableSource source = CsvTableSource.builder()
>.path("/path/to/your/file.csv")
>.field("myfield", Types.STRING)
>.field("myfield2", Types.INT)
>.build();
> 
> This pattern is heavily used in flink-table, e.g. `TableSchema`,
> `TypeInference`, `BuiltInFunctionDefinition`.
> 
>> 3. Implementing the builder object:
> I prefer  b) Mutable approach which is simpler from the implementation part.
> 
> 
> Besides that, I think maybe we can add some other aspects:
> 
> 4. Constructor of the main class.
> a) private constructor
> b) public constructor
> 
> 5. setXXX methods of the main class
> a) setXXX methods are not allowed
> b) setXXX methods are allowed.
> 
> I prefer both option a). Because I think one of the reason to have the
> builder is that we don't want the constructor public.
> A public constructor makes it hard to maintain and evolve compatibly when
> adding new parameters, FlinkKafkaProducer is a good example.
> For set methods, I think in most cases, we want users to set the fields
> eagerly (through the builder) and `setXXX` methods on the main class
> is duplicate with the methods on the builder. We should avoid that.
> 
> 
> Regards,
> Jark
> 
> 
> On Mon, 26 Aug 2019 at 20:18, Gyula Fóra  wrote:
> 
>> Hi All!
>> 
>> I would like to start a code-style related discussion regarding how we
>> implement the builder pattern in the Flink project.
>> 
>> It would be the best to have a common approach, there are some aspects of
>> the pattern that come to my mind please feel free to add anything I missed:
>> 
>> 1. Creating the builder objects:
>> 
>> a) Always create using static method in "built" class:
>>   Here we should have naming guidelines: .builder(..) or
>> .xyzBuilder(...)
>> b) Always use builder class constructor
>> c) Mix: Again we should have some guidelines when to use which
>> 
>> I personally prefer option a) to always have a static method to create the
>> builder with static method names that end in builder.
>> 
>> 2. Setting properties on the builder:
>> 
>> a) withSomething(...)
>> b) setSomething(...)
>> c) other
>> 
>> I don't really have a preference but either a or b for consistency.
>> 
>> 
>> 3. Implementing the builder object:
>> 
>> a) Immutable -> Creates a new builder object after setting a property
>> b) Mutable -> Returns (this) after setting the property
>> 
>> I personally prefer the mutable version as it keeps the builder
>> implementation much simpler and it seems to be a very common way of doing
>> it.
>> 
>> What do you all think?
>> 
>> Regards,
>> Gyula
>> 



[jira] [Created] (FLINK-13863) Update Operations Playground to Flink 1.9.0

2019-08-26 Thread Fabian Hueske (Jira)
Fabian Hueske created FLINK-13863:
-

 Summary: Update Operations Playground to Flink 1.9.0
 Key: FLINK-13863
 URL: https://issues.apache.org/jira/browse/FLINK-13863
 Project: Flink
  Issue Type: Task
  Components: Documentation
Affects Versions: 1.9.0
Reporter: Fabian Hueske
Assignee: Fabian Hueske


Update the operations playground to Flink 1.9.0
This includes:
* Updating the flink-playgrounds repository
* Updating the "Getting Started/Docker Playgrounds" section in the 
documentation.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-26 Thread Yun Gao

Hi,

Many thanks for all the points raised!

@Piotr For using another edge to broadcast the event, I think it may not be able 
to address the iteration case. The primary problem is that with two edges we 
cannot ensure the order of records. However, in the iteration case the 
broadcasted event is used to mark the progress of the iteration and works like a 
watermark, so its position relative to the normal records cannot change.

And @Piotr, @Xiaogang, regarding the requirements on state, the different 
options seem to vary. The first option is to allow an operator to broadcast a 
separate event and to have a separate process method for this event. In detail, 
we may add a new type of StreamElement called Event and allow an operator to 
broadcastEmit an Event. Then on the receiving side, we could add a new 
`processEvent` method to the (Keyed)ProcessFunction. Similar to the broadcast 
side of KeyedBroadcastProcessFunction, in this new method users cannot access 
keyed state for a specific key, but can register a state function that touches 
all the elements in the keyed state. This option requires modifying the runtime 
to support the new type of StreamElement, but it does not affect the semantics 
of states and thus has no requirements on state.

The second option is to allow an operator to broadcastEmit T, and on the 
receiver side users process the broadcast element with the existing process 
method. This option is consistent with OperatorState, but for keyed state we may 
send a record to tasks that do not contain the corresponding keyed state, so it 
requires some changes to state.

The third option is to support generic multicast. For keyed state it also meets 
the problem of inconsistency between the network partitioner and the keyed-state 
partitioner, and if we want to rely on it to implement the non-keyed join, it 
also meets the problem of not being able to control the partitioning of operator 
state. Therefore it too requires some changes to state.

Then, for the different scenarios proposed, the iteration case in fact requires 
exactly the ability to broadcast a different event type. In the iteration the 
fields of the progress event differ from those of normal records: it does not 
contain an actual value, but contains some fields for the downstream operators 
to align the events and track the progress. Therefore, broadcasting a different 
event type is able to solve the iteration case without any requirements on 
state. Besides, allowing an operator to broadcast a separate event may also 
facilitate other use cases; for example, users may notify downstream operators 
to change their logic when some patterns are matched. The notification might 
differ from the normal records, and users would not need to unify them into a 
wrapper type manually if operators could broadcast a separate event. However, it 
truly cannot address the non-keyed join scenarios.

Since allowing the broadcast of a separate event seems able to serve as a 
standalone functionality, and it does not require changes to state, I am 
wondering whether we could partition the work into multiple steps and support 
broadcasting events first. At the same time we could continue working on the 
other options to support more scenarios like non-keyed joins, which seem to 
require more thought.
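To make the first option a bit more concrete, the new `processEvent` method could be shaped roughly as in the sketch below. All type names and signatures here are only an illustration of the proposal, not an existing Flink API:

```java
// Hypothetical marker type for broadcast events (the proposed StreamElement kind).
interface Event {}

// Sketch of a process function extended with a separate broadcast-event callback.
// Output handling is omitted; only the shape of the API is shown.
abstract class SketchProcessFunction<I, O> {

    /** Normal per-record processing, as today. */
    public abstract void processElement(I value);

    /**
     * Called for events emitted via the proposed broadcastEmit(Event).
     * Here users would not access keyed state for a specific key, but could
     * register a function applied to all elements of the keyed state.
     */
    public void processEvent(Event event) {
        // default: ignore broadcast events
    }
}

// Example event: iteration progress, aligned like a watermark, carrying no value.
final class IterationProgressEvent implements Event {
    final long epoch;
    IterationProgressEvent(long epoch) { this.epoch = epoch; }
}
```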

Best,
Yun



--
From:Piotr Nowojski 
Send Time:2019 Aug. 26 (Mon.) 18:59
To:dev 
Cc:Yun Gao 
Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

Hi,

Xiaogang, those things worry me the most.
1. Regarding the broadcasting, doesn’t the BroadcastState [1] cover our issues? 
Can not we construct a job graph, where one operator has two outputs, one keyed 
another broadcasted, which are wired together back to the 
KeyedBroadcastProcessFunction or BroadcastProcessFunction? 

2. Multicast on keyed streams, might be done by iterating over all of the keys. 
However I have a feeling that might not be the feature which distributed 
cross/theta joins would want, since they would probably need a guarantee to 
have only a single key per operator instance.

Kurt, by broadcast optimisation do you mean [2]?

I’m not sure if we should split the discussion yet. Most of the changes 
required by either multicast or broadcast will be in the API/state layers. 
Runtime changes for broadcast would be almost none (just exposing existing 
features) and for multicast they shouldn't be huge either. However maybe we 
should consider those two things together at the API level, so that we do not 
make wrong decisions when just looking at the simpler/more narrow broadcast 
support?

Piotrek

[1] 
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
[2] https://github.com/apache/flink/pull/7713


On 26 Aug 2019, at 09:35, Kurt Young  wrote:
From SQL's perspective, distributed cross join is a valid featur

Re: [DISCUSS] Use Java's Duration instead of Flink's Time

2019-08-26 Thread Stephan Ewen
Seems everyone is in favor in principle.

  - For public APIs, I would keep Time for now (to not break the API).
Maybe add a Duration variant and deprecate the Time variant, but not remove
it before Flink 1.0
  - For all runtime Java code, switch to Java's Duration now
  - For all Scala code let's see how much we can switch to Java Durations
without blowing up stuff. After all, like Tison said, we want to get the
runtime Scala free in the future.
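As noted further down in the thread, Flink's duration options use patterns like "60 s", which java.time.Duration.parse (ISO-8601 only) does not accept, hence the suggested DurationUtils. A rough sketch of what such a helper could look like (the method name and supported units are assumptions, not the actual Flink TimeUtils API):

```java
import java.time.Duration;
import java.util.Locale;

// Sketch of a "DurationUtils" helper; the real TimeUtils in Flink may differ.
final class DurationUtils {

    /** Parses strings like "60 s", "100ms" or "1 h" into a java.time.Duration. */
    static Duration parse(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        // split the numeric prefix from the unit suffix
        int i = 0;
        while (i < trimmed.length() && Character.isDigit(trimmed.charAt(i))) {
            i++;
        }
        long value = Long.parseLong(trimmed.substring(0, i));
        String unit = trimmed.substring(i).trim();
        switch (unit) {
            case "ms":          return Duration.ofMillis(value);
            case "": case "s":  return Duration.ofSeconds(value);
            case "min":         return Duration.ofMinutes(value);
            case "h":           return Duration.ofHours(value);
            case "d":           return Duration.ofDays(value);
            default: throw new IllegalArgumentException("Unknown time unit: " + unit);
        }
    }
}
```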

On Mon, Aug 26, 2019 at 3:45 AM Jark Wu  wrote:

> +1 to use Java's Duration instead of Flink's Time.
>
> Regarding to the Duration parsing, we have mentioned this in FLIP-54[1] to
> use `org.apache.flink.util.TimeUtils` for the parsing.
>
> Best,
> Jark
>
> [1]:
>
> https://docs.google.com/document/d/1IQ7nwXqmhCy900t2vQLEL3N2HIdMg-JO8vTzo1BtyKU/edit#heading=h.egdwkc93dn1k
>
> On Sat, 24 Aug 2019 at 18:24, Zhu Zhu  wrote:
>
> > +1 since Java Duration is more common and powerful than Flink Time.
> >
> > For whether to drop scala Duration for parsing duration OptionConfig, I
> > think it's another question and should be discussed in another thread.
> >
> > Thanks,
> > Zhu Zhu
> >
> > Becket Qin  于2019年8月24日周六 下午4:16写道:
> >
> > > +1, makes sense. BTW, we probably need a FLIP as this is a public API
> > > change.
> > >
> > > On Sat, Aug 24, 2019 at 8:11 AM SHI Xiaogang 
> > > wrote:
> > >
> > > > +1 to replace Flink's time with Java's Duration.
> > > >
> > > > Besides, i also suggest to use Java's Instant for "point-in-time".
> > > > It can take care of time units when we calculate Duration between
> > > different
> > > > instants.
> > > >
> > > > Regards,
> > > > Xiaogang
> > > >
> > > > Zili Chen  于2019年8月24日周六 上午10:45写道:
> > > >
> > > > > Hi vino,
> > > > >
> > > > > I agree that it introduces extra complexity to replace
> > Duration(Scala)
> > > > > with Duration(Java) *in Scala code*. We could separate the usage
> for
> > > each
> > > > > language and use a bridge when necessary.
> > > > >
> > > > > As a matter of fact, Scala concurrent APIs(including Duration) are
> > used
> > > > > more than necessary at least in flink-runtime. Also we even try to
> > make
> > > > > flink-runtime scala free.
> > > > >
> > > > > Best,
> > > > > tison.
> > > > >
> > > > >
> > > > > vino yang  于2019年8月24日周六 上午10:05写道:
> > > > >
> > > > > > +1 to replace the Time class provided by Flink with Java's
> > Duration:
> > > > > >
> > > > > >
> > > > > >- Java's Duration has better representation than the Flink's
> > Time
> > > > > class;
> > > > > >- As a built-in Java class, Duration class has a clear
> advantage
> > > > over
> > > > > >Java's Time class when interacting with other Java APIs and
> > > > > third-party
> > > > > >libraries;
> > > > > >
> > > > > >
> > > > > > But I have reservations about replacing the Duration and
> > FineDuration
> > > > > > classes in scala with the Duration class in Java. Java and Scala
> > have
> > > > > > different types of systems. Currently, Duration (scala) and
> > > > FineDuration
> > > > > > (scala) work well.  In addition, this work brings additional
> > > complexity
> > > > > and
> > > > > > cost compared to the gains obtained.
> > > > > >
> > > > > > Best,
> > > > > > Vino
> > > > > >
> > > > > > Zili Chen  于2019年8月23日周五 下午11:14写道:
> > > > > >
> > > > > > > Hi Stephan,
> > > > > > >
> > > > > > > I like the idea unify usage of time/duration api. We actually
> > > > > > > use at least five different classes for this purposes(see
> below).
> > > > > > >
> > > > > > > One thing I'd like to pick up is that duration configuration
> > > > > > > in Flink is almost in pattern as "60 s" that fits in the
> pattern
> > > > > > > parsed by scala.concurrent.duration.Duration. AFAIK Duration
> > > > > > > in Java 8 doesn't support this pattern. However, we can solve
> > > > > > > it by introduce a DurationUtils.
> > > > > > >
> > > > > > > Also to clarify, we now have (correct me if any other)
> > > > > > >
> > > > > > > java.time.Duration
> > > > > > > scala.concurrent.duration.Duration
> > > > > > > scala.concurrent.duration.FiniteDuration
> > > > > > > org.apache.flink.api.common.time.Time
> > > > > > > org.apache.flink.streaming.api.windowing.time.Time
> > > > > > >
> > > > > > > in use. If we'd prefer java.time.Duration, it is worth to
> > consider
> > > > > > > whether we unify all of them into Java's Duration, i.e., Java's
> > > > > > > Duration is the first class time/duration api, while others
> > should
> > > > > > > be converted into or out from it.
> > > > > > >
> > > > > > > Best,
> > > > > > > tison.
> > > > > > >
> > > > > > >
> > > > > > > Stephan Ewen  于2019年8月23日周五 下午10:45写道:
> > > > > > >
> > > > > > > > Hi all!
> > > > > > > >
> > > > > > > > Many parts of the code use Flink's "Time" class. The Time
> > really
> > > > is a
> > > > > > > "time
> > > > > > > > interval" or a "Duration".
> > > > > > > >
> > > > > > > > Since Java 8, there is a Java class "Duration" that is nice
> and
> > > > > > flexible
> > > > > > > to

[jira] [Created] (FLINK-13864) StreamingFileSink: Allow inherited classes to extend StreamingFileSink correctly

2019-08-26 Thread Kailash Hassan Dayanand (Jira)
Kailash Hassan Dayanand created FLINK-13864:
---

 Summary: StreamingFileSink: Allow inherited classes to extend 
StreamingFileSink correctly
 Key: FLINK-13864
 URL: https://issues.apache.org/jira/browse/FLINK-13864
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / FileSystem
Reporter: Kailash Hassan Dayanand


Currently the StreamingFileSink can't be extended correctly, as there are a few 
issues with the [PR|https://github.com/apache/flink/pull/8469] merged for 
[FLINK-12539|https://issues.apache.org/jira/browse/FLINK-12539].

Mailing list discussion: 
[http://mail-archives.apache.org/mod_mbox/flink-dev/201908.mbox/%3CCACGLQUAxXjr2mBOf-6hbXcwmWoH5ib_0YEy-Vyjj%3DEPyQ25Qiw%40mail.gmail.com%3E]

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13865) Support custom config in Flink docker image

2019-08-26 Thread Dagang Wei (Jira)
Dagang Wei created FLINK-13865:
--

 Summary: Support custom config in Flink docker image
 Key: FLINK-13865
 URL: https://issues.apache.org/jira/browse/FLINK-13865
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Kubernetes
Affects Versions: 1.9.0, 1.8.1, 1.8.0, 1.7.2, 1.6.4, 1.6.3
Reporter: Dagang Wei


[Flink docker images](https://github.com/docker-flink/docker-flink) do not 
support custom config; to customize it, the user has to build their own image 
based on the official image and modify config/flink-conf.yaml. It would be much 
easier if the image accepted custom config through args, and 
[docker-entrypoint.sh](https://github.com/docker-flink/docker-flink/blob/master/docker-entrypoint.sh)
 automatically merged the custom config into config/flink-conf.yaml.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13866) develop testing plan for many Hive versions that we support

2019-08-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-13866:


 Summary: develop testing plan for many Hive versions that we 
support
 Key: FLINK-13866
 URL: https://issues.apache.org/jira/browse/FLINK-13866
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Hive
Reporter: Bowen Li
 Fix For: 1.10.0


With FLINK-13841, we will start to support quite a few Hive versions, not to 
mention other major versions like 1.1, 2.2, and 3.x.

We need to come up with a testing plan covering all these Hive versions to 1) 
help identify and fix breaking changes ASAP, and 2) minimize developers' efforts 
in manually testing and maintaining compatibility with all these Hive versions, 
automating as much as possible.

Set it to 1.10.0 for now.

cc [~xuefuz] [~lirui] [~Terry1897]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-26 Thread SHI Xiaogang
Hi, Yun Gao

The discussion seems to move in a different direction, changing from
supporting multicasting to implementing new iteration libraries on data
streams.

Regarding the broadcast events in iterations, many details of new iteration
libraries are unclear,
1. How the iteration progress is determined and notified? The iterations
are synchronous or asynchronous? As far as I know, progress tracking for
asynchronous iterations is very difficult.
2. Are async I/O operators allowed in the iterations? If so, how are the
broadcast events checkpointed and restored? How are broadcast events
distributed when the degree of parallelism changes?
3. Do the emitted broadcast events carry the sender's index? Will they be
aligned in a similar way to checkpoint barriers in downstream operators?
4. In the case of synchronous iterations, do we need something similar to
barrier buffers to guarantee the correctness of iterations?
5. Will checkpointing be enabled in iterations? If checkpointing is
enabled, how will checkpoint barriers interact with broadcast events?

I think a detailed design document for iterations will help understand
these problems, hence improving the discussion.

I also suggest a new thread for the discussion on iterations.
This thread should focus on multicasting and discuss those problems related
to multicasting, including how data is delivered and states are partitioned.

Regards,
Xiaogang

Yun Gao  于2019年8月26日周一 下午11:35写道:

>
> Hi,
>
> Very thanks for all the points raised !
>
> @Piotr For using another edge to broadcast the event, I think it may not
> be able to address the iteration case. The primary problem is that with
> two edges we cannot ensure the order of records. However, In the iteration
> case, the broadcasted event is used to mark the progress of the iteration
> and it works like watermark, thus its position relative to the normal
> records can not change.
> And @Piotr, @Xiaogang, for the requirements on the state, I think
> different options seems vary. The first option is to allow Operator to
> broadcast a separate event and have a separate process method for this
> event. To be detail, we may add a new type of StreamElement called Event
> and allow Operator to broadcastEmit Event. Then in the received side, we
> could add a new `processEvent` method to the (Keyed)ProcessFunction.
> Similar to the broadcast side of KeyedBroadcastProcessFunction, in this new
> method users cannot access keyed state with specific key, but can register
> a state function to touch all the elements in the keyed state. This option
> needs to modify the runtime to support the new type of StreamElement, but
> it does not affect the semantics of states and thus it has no requirements
> on state.
> The second option is to allow Operator to broadcastEmit T and in the
> receiver side, user can process the broadcast element with the existing
> process method. This option is consistent with the OperatorState, but for
> keyedState we may send a record to tasks that do not containing the
> corresponding keyed state, thus it should require some changes on the State.
> The third option is to support the generic Multicast. For keyedState it
> also meets the problem of inconsistency between network partitioner and
> keyed state partitioner, and if we want to rely on it to implement the
> non-key join, it should be also meet the problem of cannot control the
> partitioning of operator state. Therefore, it should also require some
> changes on the State.
> Then for the different scenarios proposed, the iteration case in fact
> requires exactly the ability to broadcast a different event type. In the
> iteration the fields of the progress event are in fact different from that
> of normal records. It does not contain actual value but contains some
> fields for the downstream operators to align the events and track the
> progress. Therefore, broadcasting a different event type is able to solve
> the iteration case without the requirements on the state. Besides, allowing
> the operator to broadcast a separate event may also facilitate some other
> user cases, for example, users may notify the downstream operators to
> change logic if some patterns are matched. The notification might be
> different from the normal records and users do not need to uniform them
> with a wrapper type manually if the operators are able to broadcast a
> separate event. However, it truly cannot address the non-key join
> scenarios.
> Since allowing broadcasting a separate event seems to be able to serve as
> a standalone functionality, and it does not require change on the state, I
> am thinking that is it possible for us to partition to multiple steps and
> supports broadcasting events first ? At the same time we could also
> continue working on other options to support more scenarios like non-key
> join and they seems to requires more thoughts.
>
> Best,
> Yun
>
>
>
> --
> From:Piotr 

Re: [DISCUSS] Setup a bui...@flink.apache.org mailing list for travis builds

2019-08-26 Thread jincheng sun
Great Job Jark :)
Best, Jincheng

Kurt Young  于2019年8月26日周一 下午6:38写道:

> Thanks for the updates, Jark! I have subscribed the ML and everything
> looks good now.
>
> Best,
> Kurt
>
>
> On Mon, Aug 26, 2019 at 11:17 AM Jark Wu  wrote:
>
> > Hi all,
> >
> > Sorry it took so long to get back. I have some good news.
> >
> > After some investigation and development, and with help from Chesnay, we
> > finally integrated Travis build notifications with the
> > bui...@flink.apache.org mailing list while retaining the beautiful
> > formatting!
> > Currently, only the failure and failure->success builds will be notified,
> > only builds (include CRON) on apache/flink branches will be notified, the
> > pull request builds will not be notified.
> >
> > The builds mailing list is also available on the Flink website community page
> > [1]
> >
> > I would encourage devs to subscribe to the builds mailing list and help the
> > community pay more attention to the build status, especially the CRON
> > builds.
> >
> > Feel free to leave your suggestions and feedback here!
> >
> > 
> >
> > # The implementation detail:
> >
> > I implemented a flink-notification-bot[2] to receive the Travis webhook[3]
> > payload, generate an HTML email, and send it to
> > bui...@flink.apache.org.
> > The flink-notification-bot is deployed on my own VM in DigitalOcean. You
> > can refer to the GitHub page [2] of the project to learn more details about
> > the implementation and deployment.
> > Btw, I'm glad to contribute the project to https://github.com/flink-ci
> or
> > https://github.com/flinkbot if the community accepts.
> >
> > With the flink-notification-bot, we can easily integrate it with other CI
> > services or our own CI, and we can also integrate it with some other
> > applications (e.g. DingTalk).
> >
> > # Rejected Alternative:
> >
> > Option#1: Sending email notifications via "Travis Email Notification"[4].
> > Reasons:
> >  - If email notification is configured, Travis CI only sends emails to
> > the addresses specified there, rather than to the committer and author.
> >  - We would lose the beautiful email formatting when Travis sends emails to
> > the builds ML.
> >  - The return-path of emails from Travis CI is not constant, which makes
> > it difficult for the mailing list to accept them.
> >
> > Cheers,
> > Jark
> >
> > [1]: https://flink.apache.org/community.html#mailing-lists
> > [2]: https://github.com/wuchong/flink-notification-bot
> > [3]:
> >
> >
> https://docs.travis-ci.com/user/notifications/#configuring-webhook-notifications
> > [4]:
> >
> >
> https://docs.travis-ci.com/user/notifications/#configuring-email-notifications
> >
> >
> >
> >
> > On Tue, 30 Jul 2019 at 18:35, Jark Wu  wrote:
> >
> > > Hi all,
> > >
> > > Progress updates:
> > > 1. bui...@flink.apache.org can now be subscribed to (thanks @Robert),
> > > you can send an email to builds-subscr...@flink.apache.org to
> subscribe.
> > > 2. We have a pull request [1] to send only apache/flink builds
> > > notifications and it works well.
> > > 3. However, all the notifications are rejected by the builds mailing
> list
> > > (the MODERATE mails).
> > > I added & checked bui...@travis-ci.org to the subscriber/allow
> list,
> > > but it still doesn't work. It might be recognized as spam by the mailing
> > list.
> > > We are still trying to figure it out, and will update here if we
> have
> > > some progress.
> > >
> > >
> > > Thanks,
> > > Jark
> > >
> > >
> > >
> > > [1]: https://github.com/apache/flink/pull/9230
> > >
> > >
> > > On Thu, 25 Jul 2019 at 22:59, Robert Metzger 
> > wrote:
> > >
> > >> The mailing list has been created, you can now subscribe to it.
> > >>
> > >> On Wed, Jul 24, 2019 at 1:43 PM Jark Wu  wrote:
> > >>
> > >> > Thanks Robert for helping out that.
> > >> >
> > >> > Best,
> > >> > Jark
> > >> >
> > >> > On Wed, 24 Jul 2019 at 19:16, Robert Metzger 
> > >> wrote:
> > >> >
> > >> > > I've requested the creation of the list, and made Jark, Chesnay
> and
> > me
> > >> > > moderators of it.
> > >> > >
> > >> > > On Wed, Jul 24, 2019 at 1:12 PM Robert Metzger <
> rmetz...@apache.org
> > >
> > >> > > wrote:
> > >> > >
> > >> > > > @Jark: Yes, I will request the creation of a mailing list!
> > >> > > >
> > >> > > > On Tue, Jul 23, 2019 at 4:48 PM Hugo Louro 
> > >> wrote:
> > >> > > >
> > >> > > >> +1
> > >> > > >>
> > >> > > >> > On Jul 23, 2019, at 6:15 AM, Till Rohrmann <
> > trohrm...@apache.org
> > >> >
> > >> > > >> wrote:
> > >> > > >> >
> > >> > > >> > Good idea Jark. +1 for the proposal.
> > >> > > >> >
> > >> > > >> > Cheers,
> > >> > > >> > Till
> > >> > > >> >
> > >> > > >> >> On Tue, Jul 23, 2019 at 1:59 PM Hequn Cheng <
> > >> chenghe...@gmail.com>
> > >> > > >> wrote:
> > >> > > >> >>
> > >> > > >> >> Hi Jark,
> > >> > > >> >>
> > >> > > >> >> Good idea. +1!
> > >> > > >> >>
> > >> > > >> >>> On Tue, Jul 23, 2019 at 6:23 PM Jark Wu 
> > >> wrote:
> > >> > > >> >>>
> > >> > > >
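As an illustration of the filtering rule Jark describes above ("only the failure and failure->success builds will be notified"), here is a minimal Python sketch. The payload field names (`state`, `previous_state`, `branch`, `number`, `repository`) are assumptions about the Travis webhook payload, not taken from flink-notification-bot itself:

```python
# Hypothetical sketch of the notification filter and subject formatting.
# Field names are assumed, not copied from the actual bot.

def should_notify(payload: dict) -> bool:
    """Notify on failed builds and on failure -> success transitions."""
    state = payload.get("state")              # e.g. "passed", "failed", "errored"
    previous = payload.get("previous_state")  # state of the previous build
    if state in ("failed", "errored"):
        return True
    return state == "passed" and previous in ("failed", "errored")

def format_subject(payload: dict) -> str:
    """Build a short subject line for the notification email."""
    return "[{state}] {repo} build #{number} ({branch})".format(
        state=payload.get("state", "unknown"),
        repo=payload.get("repository", {}).get("name", "flink"),
        number=payload.get("number", "?"),
        branch=payload.get("branch", "?"),
    )
```

A real deployment would additionally verify the webhook signature and render an HTML body before handing the message to an SMTP client.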

Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-08-26 Thread jincheng sun
Hi Dian, can you check if you have edit access? :)


Dian Fu wrote on Mon, Aug 26, 2019 at 10:52 AM:

> Hi Jincheng,
>
> Thanks for the kind tips and the offer of help. I definitely need it!
> Could you grant me write permission for confluence? My Id: Dian Fu
>
> Thanks,
> Dian
>
> > On Aug 26, 2019, at 9:53 AM, jincheng sun wrote:
> >
> > Thanks for your feedback Hequn & Dian.
> >
> > Dian, I am glad to see that you want to help create the FLIP!
> > Everyone has a first time, and I am very willing to help you complete
> > your first FLIP creation. Here are some tips:
> >
> > - First, I'll give your account write permission for Confluence.
> > - Before creating the FLIP, please have a look at the FLIP Template [1]
> > (it's also worth learning more about FLIPs by reading [2]).
> > - Create the Flink Python UDF related JIRAs after completing the VOTE of
> > the FLIP. (I think you can also bring up the VOTE thread, if you want!)
> >
> > If you encounter any problems during this period, feel free to tell me and
> > we can solve them together. :)
> >
> > Best,
> > Jincheng
> >
> >
> >
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP+Template
> > [2]
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> >
> >
> > Hequn Cheng wrote on Fri, Aug 23, 2019 at 11:54 AM:
> >
> >> +1 for starting the vote.
> >>
> >> Thanks Jincheng a lot for the discussion.
> >>
> >> Best, Hequn
> >>
> >> On Fri, Aug 23, 2019 at 10:06 AM Dian Fu  wrote:
> >>
> >>> Hi Jincheng,
> >>>
> >>> +1 to start the FLIP create and VOTE on this feature. I'm willing to
> help
> >>> on the FLIP create if you don't mind. As I haven't created a FLIP
> before,
> >>> it will be great if you could help on this. :)
> >>>
> >>> Regards,
> >>> Dian
> >>>
>  On Aug 22, 2019, at 11:41 PM, jincheng sun wrote:
> 
>  Hi all,
> 
>  Thanks a lot for your feedback. If there are no more suggestions and
>  comments, I think it's better to initiate a vote to create a FLIP for
>  Apache Flink Python UDFs.
>  What do you think?
> 
>  Best, Jincheng
> 
> jincheng sun wrote on Thu, Aug 15, 2019 at 12:54 AM:
> 
> > Hi Thomas,
> >
> > Thanks for your confirmation and the very important reminder about
> >>> bundle
> > processing.
> >
> > I have added a description of how to perform bundle processing from
> > the perspective of checkpoints and watermarks. Feel free to leave
> > comments if anything is not described clearly.
> >
> > Best,
> > Jincheng
> >
> >
> > Dian Fu wrote on Wed, Aug 14, 2019 at 10:08 AM:
> >
> >> Hi Thomas,
> >>
> >> Thanks a lot for the suggestions.
> >>
> >> Regarding bundle processing, there is a section "Checkpoint"[1]
> in
> >>> the
> >> design doc which talks about how to handle the checkpoint.
> >> However, I think you are right that we should talk more about it,
> >> such
> >>> as
> >> what's bundle processing, how it affects the checkpoint and
> >> watermark,
> >>> how
> >> to handle the checkpoint and watermark, etc.
> >>
> >> [1]
> >>
> >>>
> >>
> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
> >> <
> >>
> >>>
> >>
> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
> >>>
> >>
> >> Regards,
> >> Dian
> >>
> >>> On Aug 14, 2019, at 1:01 AM, Thomas Weise wrote:
> >>>
> >>> Hi Jincheng,
> >>>
> >>> Thanks for putting this together. The proposal is very detailed,
> >> thorough
> >>> and for me as a Beam Flink runner contributor easy to understand :)
> >>>
> >>> One thing that you should probably detail more is the bundle
> >> processing. It
> >>> is critically important for performance that multiple elements are
> >>> processed in a bundle. The default bundle size in the Flink runner
> >> is
> >> 1s or
> >>> 1000 elements, whichever comes first. And for streaming, you can
> >> find
> >> the
> >>> logic necessary to align the bundle processing with watermarks and
> >>> checkpointing here:
> >>>
> >>
> >>>
> >>
> https://github.com/apache/beam/blob/release-2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
> >>>
> >>> Thomas
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Aug 13, 2019 at 7:05 AM jincheng sun <
> >>> sunjincheng...@gmail.com>
> >>> wrote:
> >>>
>  Hi all,
> 
>  The Python Table API(without Python UDF support) has already been
> >> supported
>  and will be available in the coming release 1.9.
>  As Python UDF is very important for Python users, we'd like to
> >> start
> >> the
>  discussion about the Python UDF support in the Python Table API.
>  Aljoscha Krettek, Dian Fu and I have discussed 
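The bundle-processing behaviour Thomas describes above (flush when 1000 elements or 1 second is reached, whichever comes first, and force-finish the bundle before checkpoint barriers and watermarks) could be sketched roughly as follows. This is a hypothetical standalone sketch, not the Beam Flink-runner code; the class and method names are assumptions:

```python
import time

class BundleTrigger:
    """Buffer elements into a bundle and flush when either the element-count
    or the time limit is reached. finish_bundle() would also be called before
    emitting a checkpoint barrier or watermark, so no element stays in flight.
    The 1000-element / 1-second defaults mirror the defaults quoted above."""

    def __init__(self, flush, max_elements=1000, max_age_seconds=1.0,
                 clock=time.monotonic):
        self._flush = flush            # callback receiving a finished bundle
        self._max_elements = max_elements
        self._max_age = max_age_seconds
        self._clock = clock
        self._bundle = []
        self._started = None

    def process(self, element):
        if not self._bundle:
            self._started = self._clock()   # start the bundle timer lazily
        self._bundle.append(element)
        if (len(self._bundle) >= self._max_elements
                or self._clock() - self._started >= self._max_age):
            self.finish_bundle()

    def finish_bundle(self):
        """Flush the current bundle, if any (also used to align with
        checkpoints and watermarks)."""
        if self._bundle:
            self._flush(list(self._bundle))
            self._bundle.clear()
```

In a real runner the time limit would be driven by a timer service rather than checked on each element, but the alignment idea is the same.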

Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-08-26 Thread Dian Fu
Hi Jincheng,

Thanks! It works.

Thanks,
Dian

> On Aug 27, 2019, at 10:55 AM, jincheng sun wrote:
> 
> Hi Dian, can you check if you have edit access? :)
> 
> 
> Dian Fu  于2019年8月26日周一 上午10:52写道:
> 
>> Hi Jincheng,
>> 
>> Appreciated for the kind tips and offering of help. Definitely need it!
>> Could you grant me write permission for confluence? My Id: Dian Fu
>> 
>> Thanks,
>> Dian
>> 
>>> 在 2019年8月26日,上午9:53,jincheng sun  写道:
>>> 
>>> Thanks for your feedback Hequn & Dian.
>>> 
>>> Dian, I am glad to see that you want help to create the FLIP!
>>> Everyone will have first time, and I am very willing to help you complete
>>> your first FLIP creation. Here some tips:
>>> 
>>> - First I'll give your account write permission for confluence.
>>> - Before create the FLIP, please have look at the FLIP Template [1],
>> (It's
>>> better to know more about FLIP by reading [2])
>>> - Create Flink Python UDFs related JIRAs after completing the VOTE of
>>> FLIP.(I think you also can bring up the VOTE thread, if you want! )
>>> 
>>> Any problems you encounter during this period,feel free to tell me that
>> we
>>> can solve them together. :)
>>> 
>>> Best,
>>> Jincheng
>>> 
>>> 
>>> 
>>> 
>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP+Template
>>> [2]
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
>>> 
>>> 
>>> Hequn Cheng  于2019年8月23日周五 上午11:54写道:
>>> 
 +1 for starting the vote.
 
 Thanks Jincheng a lot for the discussion.
 
 Best, Hequn
 
 On Fri, Aug 23, 2019 at 10:06 AM Dian Fu  wrote:
 
> Hi Jincheng,
> 
> +1 to start the FLIP create and VOTE on this feature. I'm willing to
>> help
> on the FLIP create if you don't mind. As I haven't created a FLIP
>> before,
> it will be great if you could help on this. :)
> 
> Regards,
> Dian
> 
>> 在 2019年8月22日,下午11:41,jincheng sun  写道:
>> 
>> Hi all,
>> 
>> Thanks a lot for your feedback. If there are no more suggestions and
>> comments, I think it's better to  initiate a vote to create a FLIP for
>> Apache Flink Python UDFs.
>> What do you think?
>> 
>> Best, Jincheng
>> 
>> jincheng sun  于2019年8月15日周四 上午12:54写道:
>> 
>>> Hi Thomas,
>>> 
>>> Thanks for your confirmation and the very important reminder about
> bundle
>>> processing.
>>> 
>>> I have had add the description about how to perform bundle processing
> from
>>> the perspective of checkpoint and watermark. Feel free to leave
> comments if
>>> there are anything not describe clearly.
>>> 
>>> Best,
>>> Jincheng
>>> 
>>> 
>>> Dian Fu  于2019年8月14日周三 上午10:08写道:
>>> 
 Hi Thomas,
 
 Thanks a lot the suggestions.
 
 Regarding to bundle processing, there is a section "Checkpoint"[1]
>> in
> the
 design doc which talks about how to handle the checkpoint.
 However, I think you are right that we should talk more about it,
 such
> as
 what's bundle processing, how it affects the checkpoint and
 watermark,
> how
 to handle the checkpoint and watermark, etc.
 
 [1]
 
> 
 
>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
 <
 
> 
 
>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
> 
 
 Regards,
 Dian
 
> 在 2019年8月14日,上午1:01,Thomas Weise  写道:
> 
> Hi Jincheng,
> 
> Thanks for putting this together. The proposal is very detailed,
 thorough
> and for me as a Beam Flink runner contributor easy to understand :)
> 
> One thing that you should probably detail more is the bundle
 processing. It
> is critically important for performance that multiple elements are
> processed in a bundle. The default bundle size in the Flink runner
 is
 1s or
> 1000 elements, whichever comes first. And for streaming, you can
 find
 the
> logic necessary to align the bundle processing with watermarks and
> checkpointing here:
> 
 
> 
 
>> https://github.com/apache/beam/blob/release-2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
> 
> Thomas
> 
> 
> 
> 
> 
> 
> 
> On Tue, Aug 13, 2019 at 7:05 AM jincheng sun <
> sunjincheng...@gmail.com>
> wrote:
> 
>> Hi all,
>> 
>> The Python Table API(without Python UDF support) has already been
 supported
>> and will be available in the coming release 1.9.
>> As Python UDF is very important for Pyth

Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-26 Thread Yun Gao
 Hi Xiaogang,

  Many thanks for also considering the iteration case! :) These points are
really important for iteration. As a whole, we are implementing a new iteration
library on top of the Stream API. As a library, most of its implementation does
not need to touch the Runtime layer, but it does have some new requirements on
the API, like being able to broadcast the progress events. In more detail,
these events indeed carry the sender's index, and the downstream operators need
to align the events from all the upstream operators. This works very similarly
to watermarks, so these events do not need to be contained in checkpoints.

Some other points are also under implementation. However, since some parts of
the design are still under discussion internally, we may not be able to start a
new discussion on iteration immediately. Besides, we also need to settle the
problems that impose new requirements on the Runtime, like broadcasting events,
to have a complete design. Therefore, I think we may still first have the
broadcasting problem settled in this thread. Based on the points learned in the
discussion, I now think that we might be able to decouple the
broadcasting-events requirement from the more generalized multicasting
mechanism. :)

Best,
Yun



--
From:SHI Xiaogang 
Send Time:2019 Aug. 27 (Tue.) 09:16
To:dev ; Yun Gao 
Cc:Piotr Nowojski 
Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

Hi, Yun Gao

The discussion seems to move in a different direction, changing from supporting 
multicasting to implementing new iteration libraries on data streams. 

Regarding the broadcast events in iterations, many details of new iteration 
libraries are unclear,
1. How is the iteration progress determined and notified? Are the iterations
synchronous or asynchronous? As far as I know, progress tracking for
asynchronous iterations is very difficult.
2. Are async I/O operators allowed in the iterations? If so, how are the
broadcast events checkpointed and restored? How are broadcast events
distributed when the degree of parallelism changes?
3. Do the emitted broadcast events carry the sender's index? Will they be
aligned in a similar way to checkpoint barriers in downstream operators?
4. In the case of synchronous iterations, do we need something similar to
barrier buffers to guarantee the correctness of iterations?
5. Will checkpointing be enabled in iterations? If checkpointing is enabled,
how will checkpoint barriers interact with broadcast events?

I think a detailed design document for iterations will help in understanding
these problems, hence improving the discussion.

I also suggest a new thread for the discussion on iterations. 
This thread should focus on multicasting and discuss those problems related to 
multicasting, including how data is delivered and states are partitioned.

Regards,
Xiaogang
Yun Gao wrote on Mon, Aug 26, 2019 at 11:35 PM:

 Hi,

 Many thanks for all the points raised!

 @Piotr For using another edge to broadcast the event, I think it may not be
able to address the iteration case. The primary problem is that with two edges
we cannot ensure the order of records. However, in the iteration case, the
broadcasted event is used to mark the progress of the iteration and works like
a watermark, so its position relative to the normal records cannot change.
 And @Piotr, @Xiaogang, for the requirements on the state, I think they vary
across the different options. The first option is to allow an Operator to
broadcast a separate event and to have a separate process method for this
event. In detail, we may add a new type of StreamElement called Event and
allow an Operator to broadcastEmit an Event. Then on the receiving side, we
could add a new `processEvent` method to the (Keyed)ProcessFunction. Similar
to the broadcast side of KeyedBroadcastProcessFunction, in this new method
users cannot access keyed state with a specific key, but can register a state
function to touch all the elements in the keyed state. This option needs a
runtime change to support the new type of StreamElement, but it does not
affect the semantics of states and thus has no requirements on state.
 The second option is to allow an Operator to broadcastEmit T, and on the
receiver side users process the broadcast element with the existing process
method. This option is consistent with OperatorState, but for keyed state we
may send a record to tasks that do not contain the corresponding keyed state,
so it would require some changes to the State.
 The third option is to support generic Multicast. For keyed state it also
runs into the inconsistency between the network partitioner and the
keyed-state partitioner, and if we want to rely on it to implement the
non-key join, we also face the problem of not being able to control the
partitioning of operator state. Therefore, it should also require some
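A recurring idea in this thread is a broadcast progress event that carries the sender's index and is aligned across all upstream channels before the downstream operator reacts, much like a watermark. A minimal, hypothetical sketch of such alignment (not Flink API; all names are assumptions):

```python
class EventAligner:
    """Align a broadcast progress event across upstream channels: the
    downstream callback fires only once the same event has been received
    from every input channel, similar to watermark/barrier alignment."""

    def __init__(self, num_channels, on_aligned):
        self._num_channels = num_channels
        self._on_aligned = on_aligned   # called once per fully aligned event
        self._seen = {}                 # event id -> set of channel indexes

    def on_event(self, event_id, channel):
        """Record that `event_id` arrived from `channel` (the sender's
        index); fire the callback when all channels have delivered it."""
        seen = self._seen.setdefault(event_id, set())
        seen.add(channel)
        if len(seen) == self._num_channels:
            del self._seen[event_id]
            self._on_aligned(event_id)
```

Because alignment only needs the per-event set of channels seen so far, and that set is reconstructed as events flow, such events would not need to be stored in checkpoints, matching the observation above.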

Flink 1.9 build failed

2019-08-26 Thread Simon Su
Hi all,
 I'm trying to build the flink 1.9 release branch, and it raises an error like:


Could not resolve dependencies for project 
org.apache.flink:flink-s3-fs-hadoop:jar:1.9-SNAPSHOT: Could not find artifact 
org.apache.flink:flink-fs-hadoop-shaded:jar:tests:1.9-SNAPSHOT in maven-ali 
(http://maven.aliyun.com/nexus/content/groups/public/)


My maven command:
mvn install   -Dmaven.test.skip=true -Dcheckstyle.skip -Dlicense.skip=true 
-Drat.ignoreErrors=true -DskipTests -Pvendor-repos -DskipTests


Can anyone help me with this?


Thanks,
Simon
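A likely cause (an assumption on my side, not confirmed in this thread): `-Dmaven.test.skip=true` skips compiling test classes entirely, so the `tests`-classifier artifact of `flink-fs-hadoop-shaded` is never built and installed into the local repository, and Maven then tries to download it from the mirror, which does not host these SNAPSHOT artifacts. Keeping `-DskipTests` (which compiles and installs test jars but skips running them) and dropping `-Dmaven.test.skip=true` may help:

```shell
# Sketch of a command to try, under the assumption above: build from the
# repository root so flink-fs-hadoop-shaded (including its test-jar) is
# installed locally first. -DskipTests compiles and attaches test jars but
# does not execute tests; avoid -Dmaven.test.skip=true, which would prevent
# the tests:1.9-SNAPSHOT artifact from being produced at all.
mvn clean install -DskipTests -Dcheckstyle.skip -Dlicense.skip=true \
    -Drat.ignoreErrors=true -Pvendor-repos
```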



[jira] [Created] (FLINK-13867) Write file only once when doing blocking broadcast shuffle

2019-08-26 Thread Kurt Young (Jira)
Kurt Young created FLINK-13867:
--

 Summary: Write file only once when doing blocking broadcast shuffle
 Key: FLINK-13867
 URL: https://issues.apache.org/jira/browse/FLINK-13867
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Network
Reporter: Kurt Young


When doing broadcast shuffle in BATCH/BLOCKING fashion, the producer should 
write only one copy of the output data for all possible consumers, instead of 
writing one copy of the data for each consumer.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Enhance Support for Multicast Communication Pattern

2019-08-26 Thread SHI Xiaogang
Hi Yun Gao,

Thanks a lot for your clarification.

Now that the notification of broadcast events requires alignment whose
implementation, in my opinion, will affect the correctness of synchronous
iterations, I prefer to postpone the discussion until you have completed
the design of the new iteration library, or at least the progress tracking
part. Otherwise, the discussion on broadcasting events may come to nothing
if it does not fit the final design.

What do you think?

Regards,
Xiaogang

Yun Gao wrote on Tue, Aug 27, 2019 at 11:33 AM:

>  Hi Xiaogang,
>
>   Very thanks for also considering the iteration case! :) These points
> are really important for iteration. As a whole, we are implementing a new
> iteration library on top of Stream API. As a library, most of its
> implementation does not need to touch Runtime layer, but it really has some
> new requirements on the API, like the one for being able to broadcast the
> progressive events. To be more detail, these events indeed carry the
> sender's index and the downstream operators need to do alignment the events
> from all the upstream operators. It works very similar to watermark, thus
> these events do not need to be contained in checkpoints.
>
> Some other points are also under implementation. However, since some part
> of the design is still under discussion internally, we may not be able to
> start a new discussion on iteration immediately. Besides, we should also
> need to fix the problems that may have new requirements on the Runtime,
> like broadcasting events, to have a complete design. Therefore, I think we
> may still first have the broadcasting problem settled in this thread? Based
> on the points learned in the discussion, now I think that we might be able
> to decouple the broadcasting events requirements and more generalized
> multicasting mechanism. :)
>
> Best,
> Yun
>
>
>
> --
> From:SHI Xiaogang 
> Send Time:2019 Aug. 27 (Tue.) 09:16
> To:dev ; Yun Gao 
> Cc:Piotr Nowojski 
> Subject:Re: [DISCUSS] Enhance Support for Multicast Communication Pattern
>
> Hi, Yun Gao
>
> The discussion seems to move in a different direction, changing from
> supporting multicasting to implementing new iteration libraries on data
> streams.
>
> Regarding the broadcast events in iterations, many details of new
> iteration libraries are unclear,
> 1. How the iteration progress is determined and notified? The iterations
> are synchronous or asynchronous? As far as i know, progress tracking for
> asynchronous iterations is very difficult.
> 2. Do async I/O operators allowed in the iterations? If so, how the
> broadcast events are checkpointed and restored? How broadcast events are
> distributed when the degree of parallelism changes?
> 3. Do the emitted broadcast events carry the sender's index? Will they be
> aligned in a similar way to checkpoint barriers in downstream operators?
> 4. In the case of synchronous iterations, do we need something similar to
> barrier buffers to guarantee the correctness of iterations?
> 5. Will checkpointing be enabled in iterations? If checkpointing is
> enabled, how will checkpoint barriers interact with broadcast events?
>
> I think a detailed design document for iterations will help understand
> these problems, hencing improving the discussion.
>
> I also suggest a new thread for the discussion on iterations.
> This thread should focus on multicasting and discuss those problems
> related to multicasting, including how data is delivered and states are
> partitioned.
>
> Regards,
> Xiaogang
> Yun Gao  于2019年8月26日周一 下午11:35写道:
>
>  Hi,
>
>  Very thanks for all the points raised !
>
>  @Piotr For using another edge to broadcast the event, I think it may not
> be able to address the iteration case. The primary problem is that with
> two edges we cannot ensure the order of records. However, In the iteration
> case, the broadcasted event is used to mark the progress of the iteration
> and it works like watermark, thus its position relative to the normal
> records can not change.
>  And @Piotr, @Xiaogang, for the requirements on the state, I think
> different options seems vary. The first option is to allow Operator to
> broadcast a separate event and have a separate process method for this
> event. To be detail, we may add a new type of StreamElement called Event
> and allow Operator to broadcastEmit Event. Then in the received side, we
> could add a new `processEvent` method to the (Keyed)ProcessFunction.
> Similar to the broadcast side of KeyedBroadcastProcessFunction, in this new
> method users cannot access keyed state with specific key, but can register
> a state function to touch all the elements in the keyed state. This option
> needs to modify the runtime to support the new type of StreamElement, but
> it does not affect the semantics of states and thus it has no requirements
> on state.
>  The second option is to allow Operator to broadcastEmit T and

[jira] [Created] (FLINK-13868) Job vertex add taskmanager id in rest api

2019-08-26 Thread lining (Jira)
lining created FLINK-13868:
--

 Summary: Job vertex add taskmanager id in rest api
 Key: FLINK-13868
 URL: https://issues.apache.org/jira/browse/FLINK-13868
 Project: Flink
  Issue Type: Improvement
Reporter: lining


In the web UI, users want to see which TaskManager a subtask runs in. But 
currently there is no TaskManager id, so users have to infer it from the host 
and port.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)