Hi everyone,
I have an application that needs to combine/reduce results in
pairs, and then combine the results of that step in pairs again. The
binary way ;). In case it is not clear what I mean, have a look at the
diagram:
Assume there are 4 inputs (in my case from an earlier MR job). First,
the first two inputs need to be combined, and likewise the last two.
The two results then need to be combined again.
x ---
     |- x*y ---
y ---          |
               |- x*y*z*w
z ---          |
     |- z*w ---
w ---
Note that building x*y from x and y is an expensive operation, which is
why I want to use 2 nodes to combine the 4 inputs in the first step.
Collecting all inputs and doing the combinations on 1 node is not an
option.
I guess I basically want to know whether there is a way to chain log2(N)
reduce steps in a row, so that all N inputs are combined into just one
final output?
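
To make the idea concrete, here is a minimal Python sketch of the pairing
scheme, simulated locally rather than run on a cluster. Everything in it is
an assumption for illustration: the names combine, reduce_round and
tree_reduce are made up, combine is only a placeholder for the expensive
operation, and re-keying item i to i // 2 is just one possible way to make
sure each pair ends up at the same reducer in every round.

```python
from collections import defaultdict


def combine(a, b):
    # Placeholder for the expensive pairwise operation (x*y in the diagram).
    return f"({a}*{b})"


def reduce_round(indexed):
    """Simulate one round: re-key item i to i // 2, then combine per key."""
    groups = defaultdict(list)
    for i, value in indexed:
        # "Map" step: neighbours 2k and 2k+1 share the key k.
        groups[i // 2].append((i, value))
    out = []
    for key in sorted(groups):
        # "Reduce" step: each key holds at most two values; keep their order.
        items = [v for _, v in sorted(groups[key])]
        result = items[0] if len(items) == 1 else combine(items[0], items[1])
        out.append((key, result))
    return out


def tree_reduce(values):
    """Run rounds until one value is left; return (result, number of rounds)."""
    if not values:
        raise ValueError("need at least one input")
    indexed = list(enumerate(values))
    rounds = 0
    while len(indexed) > 1:
        indexed = reduce_round(indexed)
        rounds += 1
    return indexed[0][1], rounds


result, rounds = tree_reduce(["x", "y", "z", "w"])
print(result, rounds)
```

With 4 inputs this takes 2 rounds, and the first round has 2 independent
combine calls that could run on separate nodes. An odd leftover input is
simply passed through to the next round unchanged.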
Any suggestions are appreciated :)
- Tim