Re: Gelly Blog Post

2015-08-22 Thread Andra Lungu
Hi, I updated the Gelly Blog Post. https://docs.google.com/document/d/1FMtpwKSE3kY7RfH082LzQpWrY6o-fdZVxqambIiC_rU/edit?usp=sharing Let me know if there are comments or further suggestions. Otherwise, we'll kick this one out of the nest, to see if it flies :) On Wed, Jun 3, 2015 at 11:37 AM, Va

[jira] [Created] (FLINK-2562) [py] Implement KeySelectors

2015-08-22 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-2562: --- Summary: [py] Implement KeySelectors Key: FLINK-2562 URL: https://issues.apache.org/jira/browse/FLINK-2562 Project: Flink Issue Type: Sub-task

Re: [FAILING TEST] StateCheckpoinedITCase

2015-08-22 Thread Márton Balassi
+1 for Vasia's suggestion On Aug 22, 2015 8:07 PM, "Vasiliki Kalavri" wrote: > I just came across 2 more :/ > I'm also in favor of tracking these with JIRA. How about "test-stability" > for a label? > > -V. > > On 21 August 2015 at 12:47, Matthias J. Sax > > wrote: > > > I like the idea with the

Re: Graph.fromDataSet function calls: flink-gelly

2015-08-22 Thread Sachin Goel
Hi Vasia In that case, we might as well change it. getExecutionEnvironment actually goes through the contextFactory, and it then depends on this contextFactory how it generates new environments. For example, all the Test Environments currently are shared, i.e., the factory returns the same object
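
The behaviour described in this message can be modelled with a small sketch. This is a simplified illustration, not Flink's code: `Environment`, `set_context_factory`, `get_execution_environment` and `make_shared_factory` are hypothetical names, and the assumption that a fresh environment is created when no factory is installed is ours. It only shows why callers may or may not get the same object back, depending on the installed factory.

```python
class Environment:
    """Stand-in for an execution environment (hypothetical, not Flink's class)."""

    _next_id = 0

    def __init__(self):
        Environment._next_id += 1
        self.id = Environment._next_id


# Hypothetical hook: when set, it decides how environments are produced.
_context_factory = None


def set_context_factory(factory):
    """Install (or clear, with None) the factory consulted on every lookup."""
    global _context_factory
    _context_factory = factory


def get_execution_environment():
    """Delegate to the context factory if present; otherwise create a new one (assumption)."""
    if _context_factory is not None:
        return _context_factory()
    return Environment()


def make_shared_factory():
    """Factory in the style of a shared test environment: always returns the same object."""
    shared = Environment()
    return lambda: shared
```

With a shared factory installed, two consecutive calls return the identical object; without one, this model hands out a distinct environment on each call.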

Re: [ANNOUNCE] New Committer Chesnay Schepler

2015-08-22 Thread Vasiliki Kalavri
Congrats and welcome Chesnay! On 21 August 2015 at 12:19, Stephan Ewen wrote: > Welcome! > > On Fri, Aug 21, 2015 at 10:42 AM, Fabian Hueske wrote: > > > Welcome on board Chesnay! > > > > > > 2015-08-20 21:25 GMT+02:00 Andra Lungu : > > > > > Congrats Chesnay! :) > > > > > > On Thu, Aug 20, 201

Re: [FAILING TEST] StateCheckpoinedITCase

2015-08-22 Thread Vasiliki Kalavri
I just came across 2 more :/ I'm also in favor of tracking these with JIRA. How about "test-stability" for a label? -V. On 21 August 2015 at 12:47, Matthias J. Sax wrote: > I like the idea with the special label. Otherwise, it will be difficult > to find the correct tickets. > > -Matthias > > O

Re: Graph.fromDataSet function calls: flink-gelly

2015-08-22 Thread Vasiliki Kalavri
Hi Sachin, I was actually under the impression that ExecutionEnvironment.getExecutionEnvironment() returns the current environment, if one has already been created. I don't think that creating a second one is intentional there and if that's the case, we should change it. Cheers, Vasia. On 21 Aug

[jira] [Created] (FLINK-2561) Sync Gelly Java and Scala APIs

2015-08-22 Thread Vasia Kalavri (JIRA)
Vasia Kalavri created FLINK-2561: Summary: Sync Gelly Java and Scala APIs Key: FLINK-2561 URL: https://issues.apache.org/jira/browse/FLINK-2561 Project: Flink Issue Type: Task Compo

Re: Join hints for the Gelly functions

2015-08-22 Thread Stephan Ewen
This is an interesting issue, because, quite frankly, the join hint you passed simply reversed the sides of the join. The algorithm is still the same and has the same minimum memory requirements. The fact that it made a difference is quite curious. The only thing I can imagine is that this hint ch
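
To make the point about reversed join sides concrete, here is a minimal sketch of an in-memory hash join in which a hint only selects which input is hashed. It is an illustration under our own assumptions, not Flink's join implementation: `hash_join` and its `build` parameter are hypothetical names, and the inputs are plain lists of (key, value) pairs.

```python
from collections import defaultdict


def hash_join(left, right, build="left"):
    """Join two lists of (key, value) pairs; `build` picks the side that is hashed."""
    if build == "left":
        build_rows, probe_rows = left, right
    else:
        build_rows, probe_rows = right, left

    # Build phase: hash every row of the chosen side by key.
    table = defaultdict(list)
    for key, value in build_rows:
        table[key].append(value)

    # Probe phase: stream the other side and look up matches.
    result = []
    for key, probe_value in probe_rows:
        for build_value in table.get(key, []):
            # Always emit (key, left_value, right_value), whichever side was built.
            if build == "left":
                result.append((key, build_value, probe_value))
            else:
                result.append((key, probe_value, build_value))

    # Return the number of distinct keys in the hash table as a rough size indicator.
    return result, len(table)
```

Both settings run the same build-and-probe algorithm and produce the same joined rows; only the input held in the hash table changes.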

Re: Join hints for the Gelly functions

2015-08-22 Thread Andra Lungu
Your arguments are perfectly valid. So, what I suggest is to have the functions as they are now, e.g. groupReduceOnNeighbors and to add a groupReduceOnNeighbors(blablaSameArguments, boolean useJoinHints). That way, the user can decide whether they'd like to trade speed for a program that actually

Re: Join hints for the Gelly functions

2015-08-22 Thread Vasiliki Kalavri
Hey, I agree with Martin on this. It's the optimizer's job to decide the join strategy. Maybe the join hint worked on 99% of your cases, but we can't simply generalize this for all datasets and algorithms and hard-code a join hint that assumes that the vertex set is always much smaller than the

Re: Join hints for the Gelly functions

2015-08-22 Thread Martin Junghanns
Hi, I guess enforcing a Join Strategy by default is not the best option since you can't assume what the user did before actually calling the Gelly functions and what the data looks like (maybe it's one of the 1% graphs where the relation is the other way around or the vertex data set is very la

Join hints for the Gelly functions

2015-08-22 Thread Andra Lungu
Hey everyone, When coding for my thesis, I observed that half of the current Gelly functions (the ones that use join operators) fail in a cluster environment with the following exception: java.lang.IllegalArgumentException: Too few memory segments provided. Hash Join needs at least 33 memory segm