Re: GSoC Project Proposal Draft: Code Generation in Serializers

2016-04-16 Thread Márton Balassi
Hi Gábor, I think that adding the Janino dep to flink-core should be fine, as it has quite slim dependencies [1,2] which are generally orthogonal to Flink's main dependency line (also it is already used elsewhere). As for mixing Scala code that is used from the Java parts of the same maven module

[jira] [Created] (FLINK-3773) Scanners are left unclosed in SqlExplainTest

2016-04-16 Thread Ted Yu (JIRA)
Ted Yu created FLINK-3773: Summary: Scanners are left unclosed in SqlExplainTest Key: FLINK-3773 URL: https://issues.apache.org/jira/browse/FLINK-3773 Project: Flink Issue Type: Bug Repo

Re: GSoC Project Proposal Draft: Code Generation in Serializers

2016-04-16 Thread Gábor Horváth
Hi! Table API already uses code generation and the Janino compiler [1]. Is it a dependency that is ok to add to flink-core? In case it is ok, I think I will use the same in order to be consistent with the other code generation efforts. I started to look at the Table API code generation [2] and it

[jira] [Created] (FLINK-3772) Graph algorithms for vertex and edge degree

2016-04-16 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-3772: Summary: Graph algorithms for vertex and edge degree Key: FLINK-3772 URL: https://issues.apache.org/jira/browse/FLINK-3772 Project: Flink Issue Type: New Feature

Re: Flink optimizer optimizations

2016-04-16 Thread Matthias J. Sax
Sure. WITHOUT. Thanks. Good catch :) On 04/16/2016 01:18 PM, Ufuk Celebi wrote: > On Sat, Apr 16, 2016 at 1:05 PM, Matthias J. Sax wrote: >> (with the need to sort the data, because both >> datasets will be sorted on A already). Thus, the overhead of sorting in >> the group might pay of in the j

Re: Flink optimizer optimizations

2016-04-16 Thread Ufuk Celebi
On Sat, Apr 16, 2016 at 1:05 PM, Matthias J. Sax wrote: > (with the need to sort the data, because both > datasets will be sorted on A already). Thus, the overhead of sorting in > the group might pay of in the join. I think you meant to write withOUT the need to sort the data, right?

Re: Flink optimizer optimizations

2016-04-16 Thread Matthias J. Sax
Assume you have a groupBy followed by a join.
  DataSet1 (not sorted) -> groupBy(A) --> join(1.A == 2.A)
                                          ^
  DataSet2 (sorted on A) -----------------+
For groupBy(A) of DataSet1 the optimizer can pick hash-grouping or the more expensive sort-based-grouping. If

[jira] [Created] (FLINK-3771) Methods for translating Graphs

2016-04-16 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-3771: Summary: Methods for translating Graphs Key: FLINK-3771 URL: https://issues.apache.org/jira/browse/FLINK-3771 Project: Flink Issue Type: Improvement Comp

[jira] [Created] (FLINK-3770) Fix TriangleEnumerator performance

2016-04-16 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-3770: Summary: Fix TriangleEnumerator performance Key: FLINK-3770 URL: https://issues.apache.org/jira/browse/FLINK-3770 Project: Flink Issue Type: Improvement