On Thu, Jun 26, 2014 at 9:15 AM, Aureliano Buendia <[email protected]> wrote:
> Summingbird is for map/reduce. Dataflow is the third generation of google's
> map/reduce, and it generalizes map/reduce the way Spark does. See more about
> this here: http://youtu.be/wtLJPvx7-ys?t=2h37m8s

Yes, my point was that Summingbird is similar in that it is a
higher-level service for batch/streaming computation, not that it is
similar in being MapReduce-based.

> It seems Dataflow is based on this paper:
> http://pages.cs.wisc.edu/~akella/CS838/F12/838-CloudPapers/FlumeJava.pdf

FlumeJava maps to Crunch in the Hadoop ecosystem. I think Dataflow is
more than that, but yes, that seems to be some of the 'language'. It is
similar in that it is a distributed collection abstraction.
