I am not sure if I got everything right.
Let me first explain how a Reduce operator is executed in Flink at the
moment:
Assume a data set is randomly distributed across four machines and not
sorted.
When a Reduce function is applied on this data set, each of the four
machines run a Combine operat
Some form of tree aggregation is useful in many cases, and IMO a good
addition to the system.
Kostas
On Mon, Apr 27, 2015 at 11:04 AM, Andra Lungu wrote:
> Hi Fabian,
>
> After a quick look at the current behaviour of Flink's combinable reduce, I
> saw that it does something like this:
>
> http
Hi Fabian,
After a quick look at the current behaviour of Flink's combinable reduce, I
saw that it does something like this:
https://docs.google.com/drawings/d/1WfGJq1ZNQ-F0EQZ2TwEYS_861xc3fSdfJL9Z4VdXBQU/edit?usp=sharing
It basically iterates over all the key groups two by two and if it finds
tw
Hi,
Andra is working on a modified reduce operator, which would internally
create an aggregation tree.
This is work related to her thesis and we want to use it in graph
computations for skewed inputs.
It might or might not be a good idea to add it as a Flink operator and we
will need to evaluate t
Hi Andra,
is there a JIRA for the new runtime operator?
Adding a new operator is a lot of work and touches many core parts of the
system.
It would be good to start a discussion about that early in the process to
make sure that the design is aligned with the system.
Otherwise, duplicated work migh
Yes Markus,
ds.reduce() -> AllReduceDriver
ds.groupBy().reduce() -> ReduceDriver
It's very intuitive ;)
On Sun, Apr 26, 2015 at 12:34 PM, Markus Holzemer <
holzemer.mar...@googlemail.com> wrote:
> Hey Andrea,
> perhaps you are looking at the wrong ReduceDriver?
> As you can see in the DriverStr
Hey Andrea,
perhaps you are looking at the wrong ReduceDriver?
As you can see in the DriverStrategy enum there is several different
ReduceDrivers depending on the strategy the optimizer chooses.
best,
Markus
2015-04-26 12:26 GMT+02:00 Andra Lungu :
> Hey guys,
>
> I am trying to add a new runtim