Re: Adding a new operator

2015-04-27 Thread Fabian Hueske
I am not sure if I got everything right. Let me first explain how a Reduce operator is executed in Flink at the moment: Assume a data set is randomly distributed across four machines and not sorted. When a Reduce function is applied on this data set, each of the four machines run a Combine operat

Re: Adding a new operator

2015-04-27 Thread Kostas Tzoumas
Some form of tree aggregation is useful in many cases, and IMO a good addition to the system. Kostas On Mon, Apr 27, 2015 at 11:04 AM, Andra Lungu wrote: > Hi Fabian, > > After a quick look at the current behaviour of Flink's combinable reduce, I > saw that it does something like this: > > http

Re: Adding a new operator

2015-04-27 Thread Andra Lungu
Hi Fabian, After a quick look at the current behaviour of Flink's combinable reduce, I saw that it does something like this: https://docs.google.com/drawings/d/1WfGJq1ZNQ-F0EQZ2TwEYS_861xc3fSdfJL9Z4VdXBQU/edit?usp=sharing It basically iterates over all the key groups two by two and if it finds tw

Re: Adding a new operator

2015-04-27 Thread Vasiliki Kalavri
Hi, Andra is working on a modified reduce operator, which would internally create an aggregation tree. This is work related to her thesis and we want to use it in graph computations for skewed inputs. It might or might not be a good idea to add it as a Flink operator and we will need to evaluate t

Re: Adding a new operator

2015-04-27 Thread Fabian Hueske
Hi Andra, is there a JIRA for the new runtime operator? Adding a new operator is a lot of work and touches many core parts of the system. It would be good to start a discussion about that early in the process to make sure that the design is aligned with the system. Otherwise, duplicated work migh

Re: Adding a new operator

2015-04-26 Thread Andra Lungu
Yes Markus, ds.reduce() -> AllReduceDriver ds.groupBy().reduce() -> ReduceDriver It's very intuitive ;) On Sun, Apr 26, 2015 at 12:34 PM, Markus Holzemer < holzemer.mar...@googlemail.com> wrote: > Hey Andrea, > perhaps you are looking at the wrong ReduceDriver? > As you can see in the DriverStr

Re: Adding a new operator

2015-04-26 Thread Markus Holzemer
Hey Andrea, perhaps you are looking at the wrong ReduceDriver? As you can see in the DriverStrategy enum there is several different ReduceDrivers depending on the strategy the optimizer chooses. best, Markus 2015-04-26 12:26 GMT+02:00 Andra Lungu : > Hey guys, > > I am trying to add a new runtim