[Delta Iterations] The dirty insides(insights)

2015-03-19 Thread Andra Lungu
Hello, I've used delta iterations several times up until now, but I just realized that I never fully understood what happens inside. And the documentation only explains things from a user's perspective. Which is why I could really use your help :). Here goes nothing: In Gelly, Graph.java, there i

Re: [Delta Iterations] The dirty insides(insights)

2015-03-19 Thread Robert Metzger
Did you send an empty email to user-subscr...@flink.apache.org ? That should subscribe you. On Thu, Mar 19, 2015 at 9:25 AM, Andra Lungu wrote: > Hello, > > I've used delta iterations several times up until now, but I just realized > that I never fully understood what happens inside. And the doc

Re: [DISCUSS] Submitting small PRs rather than massive ones

2015-03-19 Thread Stephan Ewen
I like this proposal very much. We should do that as much as possible. Pull requests with renaming easily add up to many files; it is harder there. On 18.03.2015 at 19:39, "Henry Saputra" wrote: > Hi All, > > Recently there have been some PRs with massive changes which include > multiple JIRA tick

Re: [DISCUSS] Submitting small PRs rather than massive ones

2015-03-19 Thread Ufuk Celebi
On 19 Mar 2015, at 09:43, Stephan Ewen wrote: > I like this proposal very much. We should do that as much as possible. Same here. Makes it also easier to track progress. (I think this should go hand in hand with better design descriptions in the corresponding JIRAs.)

Re: [DISCUSS] Submitting small PRs rather than massive ones

2015-03-19 Thread Maximilian Michels
I agree with you, Henry. Reviewing hundreds of changed class files is a difficult and nearly impossible task to do exhaustively. However, splitting up pull requests also has some drawbacks. For example, discussions and comments are also split up and harder to keep up with. Also, pull requests migh

[jira] [Created] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library

2015-03-19 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1745: Summary: Add k-nearest-neighbours algorithm to machine learning library Key: FLINK-1745 URL: https://issues.apache.org/jira/browse/FLINK-1745 Project: Flink

[jira] [Created] (FLINK-1746) Add linear discriminant analysis to machine learning library

2015-03-19 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1746: Summary: Add linear discriminant analysis to machine learning library Key: FLINK-1746 URL: https://issues.apache.org/jira/browse/FLINK-1746 Project: Flink I

[jira] [Created] (FLINK-1747) Remove deadlock detection and pipeline breaker placement in optimizer

2015-03-19 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-1747: -- Summary: Remove deadlock detection and pipeline breaker placement in optimizer Key: FLINK-1747 URL: https://issues.apache.org/jira/browse/FLINK-1747 Project: Flink

[jira] [Created] (FLINK-1748) Integrate PageRank implementation into machine learning library

2015-03-19 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1748: Summary: Integrate PageRank implementation into machine learning library Key: FLINK-1748 URL: https://issues.apache.org/jira/browse/FLINK-1748 Project: Flink

[jira] [Created] (FLINK-1749) Add Boosting algorithm for ensemble learning to machine learning library

2015-03-19 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1749: Summary: Add Boosting algorithm for ensemble learning to machine learning library Key: FLINK-1749 URL: https://issues.apache.org/jira/browse/FLINK-1749 Project: Flink

[jira] [Created] (FLINK-1750) Add canonical correlation analysis (CCA) to machine learning library

2015-03-19 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1750: Summary: Add canonical correlation analysis (CCA) to machine learning library Key: FLINK-1750 URL: https://issues.apache.org/jira/browse/FLINK-1750 Project: Flink

Re: Queries regarding RDFs with Flink

2015-03-19 Thread Flavio Pompermaier
Hi to all, I'm back to this task again :) Summarizing again: I have some source dataset that contains RDF "stars" (SubjectURI, RdfType and a list of RDF triples belonging to this subject -> the "a.k.a." star schema) and I have to extract some sub-graphs for some RDF types of interest. As descr

Re: Overview of Memory Management in Flink

2015-03-19 Thread Maximilian Michels
We should have all parts of Flink documented that well. Very nice overview, Stephan! Two comments: Considering the lazy memory segment allocation: besides the problem of initial garbage collection of these dynamically allocated segments, wouldn't it also have the disadvantage that we could not g

Re: [DISCUSS] Submitting small PRs rather than massive ones

2015-03-19 Thread Henry Saputra
Yeah, renaming totally will contain large file changes. On Thu, Mar 19, 2015 at 1:43 AM, Stephan Ewen wrote: > I like this proposal very much. We should do that as much as possible. > > Pull requests with renaming easily add up to many files, it is harder there. > On 18.03.2015 at 19:39, "Hen

Re: [DISCUSS] Submitting small PRs rather than massive ones

2015-03-19 Thread Henry Saputra
+1 to that, Ufuk. Making the JIRAs more descriptive and including the design would make it easier to review before jumping to the diff in the PRs. On Thu, Mar 19, 2015 at 2:17 AM, Ufuk Celebi wrote: > > On 19 Mar 2015, at 09:43, Stephan Ewen wrote: > >> I like this proposal very much. We should do that as mu

Re: [DISCUSS] Submitting small PRs rather than massive ones

2015-03-19 Thread Henry Saputra
Great suggestion and observation, Max. +1 Yes, we should also split PRs into the right logical commits; that will definitely help with review. Like you said, some PRs are just large in nature and splitting them into pieces may not work well, so doing the right commits should go hand in hand with small P

[jira] [Created] (FLINK-1751) Update CLI documentation page

2015-03-19 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-1751: - Summary: Update CLI documentation page Key: FLINK-1751 URL: https://issues.apache.org/jira/browse/FLINK-1751 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-1752) Add test to Kafka streaming connector

2015-03-19 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1752: - Summary: Add test to Kafka streaming connector Key: FLINK-1752 URL: https://issues.apache.org/jira/browse/FLINK-1752 Project: Flink Issue Type: Improvement

[jira] [Created] (FLINK-1753) Add more tests for Kafka Connectors

2015-03-19 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1753: - Summary: Add more tests for Kafka Connectors Key: FLINK-1753 URL: https://issues.apache.org/jira/browse/FLINK-1753 Project: Flink Issue Type: Improvement

[jira] [Created] (FLINK-1754) Deadlock in job execution

2015-03-19 Thread Sebastian Kruse (JIRA)
Sebastian Kruse created FLINK-1754: -- Summary: Deadlock in job execution Key: FLINK-1754 URL: https://issues.apache.org/jira/browse/FLINK-1754 Project: Flink Issue Type: Bug Repor

[jira] [Created] (FLINK-1755) NullPointerException in LocalInputChannel.getNextLookAhead()

2015-03-19 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1755: - Summary: NullPointerException in LocalInputChannel.getNextLookAhead() Key: FLINK-1755 URL: https://issues.apache.org/jira/browse/FLINK-1755 Project: Flink

Re: Flink cluster dev environment in Docker

2015-03-19 Thread Henry Saputra
+1 for the idea. I am cross-posting this to the dev@ list as an FYI. - Henry On Tue, Mar 17, 2015 at 2:54 AM, Robert Metzger wrote: > Hey Emmanuel, > > thank you for this great contribution. I'm going to test the docker > deployment soon. > > I would actually like to include the files into the Flink source