Re: [DISCUSS] Per-key event time

2017-02-22 Thread Paris Carbone
Hey Jamie! Key-based progress tracking sounds like local-only progress tracking to me; there is no need to use a low watermarking mechanism at all, since all streams of a key are handled by a single partition at a time (per operator). Thus, this could be much easier to implement and support (i.e.

[jira] [Created] (FLINK-5893) Race condition in removing previous JobManagerRegistration in ResourceManager

2017-02-22 Thread zhijiang (JIRA)
zhijiang created FLINK-5893: --- Summary: Race condition in removing previous JobManagerRegistration in ResourceManager Key: FLINK-5893 URL: https://issues.apache.org/jira/browse/FLINK-5893 Project: Flink

[jira] [Created] (FLINK-5892) Recover job state at the granularity of operator

2017-02-22 Thread MaGuowei (JIRA)
MaGuowei created FLINK-5892: --- Summary: Recover job state at the granularity of operator Key: FLINK-5892 URL: https://issues.apache.org/jira/browse/FLINK-5892 Project: Flink Issue Type: New Feature

[DISCUSS] Per-key event time

2017-02-22 Thread Jamie Grier
Hi Flink Devs, Use cases that I see quite frequently in the real world would benefit from a different watermarking / event time model than the one currently implemented in Flink. I would call Flink's current approach partition-based watermarking or maybe subtask-based watermarking. In this model

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Greg Hogan
An additional option for reducing time to build and test is parallel execution. This would help users more than on TravisCI since we're generally running on multi-core machines rather than VM slices. Is the idea that each user would only check out the modules that he or she is developing with? For

[jira] [Created] (FLINK-5891) ConnectedComponents is broken when object reuse enabled

2017-02-22 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-5891: - Summary: ConnectedComponents is broken when object reuse enabled Key: FLINK-5891 URL: https://issues.apache.org/jira/browse/FLINK-5891 Project: Flink Issue Type: B

[jira] [Created] (FLINK-5890) GatherSumApply broken when object reuse enabled

2017-02-22 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-5890: - Summary: GatherSumApply broken when object reuse enabled Key: FLINK-5890 URL: https://issues.apache.org/jira/browse/FLINK-5890 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-5889) Improving the Flink Python batch API test framework

2017-02-22 Thread Lior Amar (JIRA)
Lior Amar created FLINK-5889: Summary: Improving the Flink Python batch API test framework Key: FLINK-5889 URL: https://issues.apache.org/jira/browse/FLINK-5889 Project: Flink Issue Type: Improve

Re: [DISCUSS] Code style / checkstyle

2017-02-22 Thread Dawid Wysakowicz
So how about preparing a code style and corresponding checkstyle and enabling it gradually, module by module, whenever a shepherd/committer with expertise in a module has time to introduce/check such a change? Maybe it will make the "snowball effect" happen? I agree there is no point in preparing

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Fabian Hueske
Hi everybody, I think this should be a discussion about the benefits and drawbacks of separating the code into distinct repositories from a development point of view. So I agree with Stephan that we should not divide the community by creating separate groups of committers. Also the discussion abou

[jira] [Created] (FLINK-5888) ForwardedFields annotation is not generating optimised execution plan in example KMeans job

2017-02-22 Thread Ziyad Muhammed Mohiyudheen (JIRA)
Ziyad Muhammed Mohiyudheen created FLINK-5888: - Summary: ForwardedFields annotation is not generating optimised execution plan in example KMeans job Key: FLINK-5888 URL: https://issues.apache.org/jira/

[jira] [Created] (FLINK-5887) Make CheckpointBarrier type immutable

2017-02-22 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-5887: --- Summary: Make CheckpointBarrier type immutable Key: FLINK-5887 URL: https://issues.apache.org/jira/browse/FLINK-5887 Project: Flink Issue Type: Improvement

Re: [DISCUSS] Code style / checkstyle

2017-02-22 Thread Chesnay Schepler
For files where we don't enforce checkstyle, things should work the way they do right now. Turn off auto-formatting, and only format code that you touched and that's it. These modifications we will have to check manually in the PRs, as we do now. On 22.02.2017 16:22, Greg Hogan wrote:

Re: [DISCUSS] Table API / SQL indicators for event and processing time

2017-02-22 Thread Timo Walther
Hi everyone, I have created an issue [1] to track the progress of this topic. I have written a little design document [2] on how we could implement the indicators and which parts have to be touched. I would suggest implementing a prototype, also to see what is possible and can be integrated both

Re: [DISCUSS] Code style / checkstyle

2017-02-22 Thread Greg Hogan
Won't the code style be applied on save to any user-modified file? So this will clutter PRs and overwrite history. On Wed, Feb 22, 2017 at 6:19 AM, Dawid Wysakowicz < wysakowicz.da...@gmail.com> wrote: > I also agree with Till and Chesnay. Anyway as to "capture the current > style" I have som

[jira] [Created] (FLINK-5886) Python API for streaming applications

2017-02-22 Thread Zohar Mizrahi (JIRA)
Zohar Mizrahi created FLINK-5886: Summary: Python API for streaming applications Key: FLINK-5886 URL: https://issues.apache.org/jira/browse/FLINK-5886 Project: Flink Issue Type: New Feature

[jira] [Created] (FLINK-5885) Java code snippet instead of scala in documentation

2017-02-22 Thread Evgeny Vanslov (JIRA)
Evgeny Vanslov created FLINK-5885: - Summary: Java code snippet instead of scala in documentation Key: FLINK-5885 URL: https://issues.apache.org/jira/browse/FLINK-5885 Project: Flink Issue Typ

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Gábor Hermann
@Stephan: Although I tried to raise some issues about splitting committers, I'm still strongly in favor of some kind of restructuring. We just have to be conscious about the disadvantages. Not splitting the committers could leave the libraries in the same stalling status, described by Till.

[jira] [Created] (FLINK-5884) Integrate time indicators for Table API & SQL

2017-02-22 Thread Timo Walther (JIRA)
Timo Walther created FLINK-5884: --- Summary: Integrate time indicators for Table API & SQL Key: FLINK-5884 URL: https://issues.apache.org/jira/browse/FLINK-5884 Project: Flink Issue Type: New Fea

Re: [DISCUSS] Should we supply a new Iterator instance for Functions with Iterable input(s) like CoGroupFunction ?

2017-02-22 Thread Lin Li
I created a jira https://issues.apache.org/jira/browse/FLINK-5883, and will work on this asap. 2017-02-22 21:01 GMT+08:00 Aljoscha Krettek : > I think this was mostly an oversight on my part that was possible because > we didn't have good test-coverage that was enforcing correctness. Please go >

[jira] [Created] (FLINK-5883) Re-adding the Exception-thrown code for ListKeyGroupedIterator when the iterator is requested the second time

2017-02-22 Thread lincoln.lee (JIRA)
lincoln.lee created FLINK-5883: -- Summary: Re-adding the Exception-thrown code for ListKeyGroupedIterator when the iterator is requested the second time Key: FLINK-5883 URL: https://issues.apache.org/jira/browse/FLINK

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Stephan Ewen
Hi all! Thanks for kicking this off, Till, it is a good discussion to have. A few thoughts from my side: - From what I get from the first responses, from a development convenience point the split in repositories would be desirable. - The biggest obstacles on that way are probably the followi

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Aljoscha Krettek
I'm not against splitting but I want to highlight that there are other options: - We could split the tests run on travis logically. For example, run unit tests and integration tests separately. This would have the benefit that you would see early on if the (fast) unit tests fail. We could also sp

Re: [DISCUSS] Should we supply a new Iterator instance for Functions with Iterable input(s) like CoGroupFunction ?

2017-02-22 Thread Aljoscha Krettek
I think this was mostly an oversight on my part that was possible because we didn't have good test-coverage that was enforcing correctness. Please go ahead and open an issue for re-adding the throw. On Wed, 22 Feb 2017 at 13:28 Lin Li wrote: > Thank you for the answer! > > The discussion on FLIN

Re: [DISCUSS] Should we supply a new Iterator instance for Functions with Iterable input(s) like CoGroupFunction ?

2017-02-22 Thread Lin Li
Thank you for the answer! The discussion on FLINK-1023 is very clear to me. I agree with throwing a TraversableOnceException when the iterator is requested the second time. @Aljoscha git history shows you removed the exception-thrown code from FLINK-1110, would you mind if I create an issue and

Re: [DISCUSS] Code style / checkstyle

2017-02-22 Thread Dawid Wysakowicz
I also agree with Till and Chesnay. Anyway, as to "capture the current style" I have some doubts if this is possible, as it changes from file to file. Chesnay's suggestion as to where to enforce the checkstyle seems reasonable to me, but I am quite new to the community :). Enabling checkstyle for particula

Re: [DISCUSS] Code style / checkstyle

2017-02-22 Thread Chesnay Schepler
I agree with Till. I would propose enforcing checkstyle on a subset of the modules, basically those that are not flink-runtime, flink-java, flink-streaming-java. These are the ones imo where messing with the history can be detrimental; for the others it isn't really important imo. (Note that i

Re: [DISCUSS] Code style / checkstyle

2017-02-22 Thread Till Rohrmann
I think that not enforcing a code style is as good as not having any code style to be honest. Having an IntelliJ or Eclipse profile is nice and some people will probably use it, but my gut feeling is that the majority won't notice it. Cheers, Till On Wed, Feb 22, 2017 at 11:15 AM, Ufuk Celebi wr

[jira] [Created] (FLINK-5882) TableFunction (UDTF) should support variable types and variable arguments

2017-02-22 Thread Zhuoluo Yang (JIRA)
Zhuoluo Yang created FLINK-5882: --- Summary: TableFunction (UDTF) should support variable types and variable arguments Key: FLINK-5882 URL: https://issues.apache.org/jira/browse/FLINK-5882 Project: Flink

Re: [DISCUSS] Should we supply a new Iterator instance for Functions with Iterable input(s) like CoGroupFunction ?

2017-02-22 Thread Ufuk Celebi
On Wed, Feb 22, 2017 at 11:19 AM, Till Rohrmann wrote: > In general, you’re right Lin Li that we don’t honour the Iterable contract > which should allow you to create an arbitrary number of iterators over the > data. Honestly, I’m not sure why we did this change because it’s not very > intuitive.
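The behaviour under discussion in this thread, an Iterable that refuses to hand out a second iterator, can be illustrated with a minimal sketch. This is not Flink's implementation: `TraversableOnceIterable` and `TraversableOnceDemo` are hypothetical names, and `IllegalStateException` stands in for the `TraversableOnceException` mentioned in the thread.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical wrapper (not a Flink class): hands out its iterator exactly once. */
final class TraversableOnceIterable<T> implements Iterable<T> {
    private final Iterator<T> source;
    private boolean consumed;

    TraversableOnceIterable(Iterable<T> data) {
        this.source = data.iterator();
    }

    @Override
    public Iterator<T> iterator() {
        // A second request fails loudly instead of silently returning
        // an already exhausted iterator.
        if (consumed) {
            throw new IllegalStateException("iterator() may only be called once");
        }
        consumed = true;
        return source;
    }
}

final class TraversableOnceDemo {
    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        Iterable<Integer> once = new TraversableOnceIterable<>(values);

        // First traversal works like any other Iterable.
        int sum = 0;
        for (int v : once) {
            sum += v;
        }
        System.out.println("first pass sum = " + sum);

        // Second request is rejected.
        try {
            once.iterator();
        } catch (IllegalStateException e) {
            System.out.println("second request rejected: " + e.getMessage());
        }
    }
}
```

Throwing on the second request deviates from the usual expectation that an Iterable can be traversed repeatedly, but it makes the restriction visible to the caller, which is the trade-off the thread weighs.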

Re: KeyGroupRangeAssignment ?

2017-02-22 Thread Till Rohrmann
Hi Ovidiu, given your experiments the data distribution does not look too bad. Maybe you won't get exactly 10 keys per operator, but it should be around 10 keys per operator. Have you tried measuring the latency between the fastest and slowest subtask in your topology? But I've also heard from

[jira] [Created] (FLINK-5881) Scalar should support variable types and variable arguments

2017-02-22 Thread Zhuoluo Yang (JIRA)
Zhuoluo Yang created FLINK-5881: --- Summary: Scalar should support variable types and variable arguments Key: FLINK-5881 URL: https://issues.apache.org/jira/browse/FLINK-5881 Project: Flink

Re: [DISCUSS] Should we supply a new Iterator instance for Functions with Iterable input(s) like CoGroupFunction ?

2017-02-22 Thread Till Rohrmann
Hi Lin Li, I think the oversight is more that we don’t throw a TraversableOnceException if you request more than one iterator, as is the case for the Iterables used for the non collection mode. Otherwise you will have a different behaviour for the collection and the non collection mode. In gene

Re: Visualizing topologies

2017-02-22 Thread Ufuk Celebi
Hey Ken! This looks really good. +1 to make this available publicly. We can link it from the Flink website and the viz tool Pat linked to. The visualizer currently has some open issues; it is not up to date with the one that is part of the Flink web UI. – Ufuk On Wed, Feb 22, 2017 at 3:01 AM,

[jira] [Created] (FLINK-5880) Add documentation for object reuse for DataStream API

2017-02-22 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-5880: --- Summary: Add documentation for object reuse for DataStream API Key: FLINK-5880 URL: https://issues.apache.org/jira/browse/FLINK-5880 Project: Flink Iss

Re: [DISCUSS] Code style / checkstyle

2017-02-22 Thread Ufuk Celebi
Kurt's proposal sounds reasonable. What about the following: - We try to capture the current style in a code style configuration (for IntelliJ and maybe Eclipse) - We provide that on the website for contributors to download - We don't enforce it, but new contributions and changes are free to form

Re: [DISCUSS] Should we supply a new Iterator instance for Functions with Iterable input(s) like CoGroupFunction ?

2017-02-22 Thread Aljoscha Krettek
Hi, this is probably an oversight. If it helps you implement the feature, please go ahead and add a sub-issue for solving the Iterator problem. Best, Aljoscha On Tue, 21 Feb 2017 at 16:13 Lin Li wrote: > Hi, > > When I try to implement > https://issues.apache.org/jira/browse/FLINK-5498 > vi

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Gábor Hermann
Hi all, I'm also in favor of splitting, but only in terms of committers. I agree with Theodore that async releases would cause confusion. With time based releases [1] it should be easy to sync releases. Even if it's possible to add committers to different components, should we do a more fine

[jira] [Created] (FLINK-5879) ExecutionAttemptID should invoke super() in constructor

2017-02-22 Thread Biao Liu (JIRA)
Biao Liu created FLINK-5879: --- Summary: ExecutionAttemptID should invoke super() in constructor Key: FLINK-5879 URL: https://issues.apache.org/jira/browse/FLINK-5879 Project: Flink Issue Type: Bug