On Jan 24, 2:25 pm, Frantisek Sodomka <fsodo...@gmail.com> wrote:
> Word "streams" invokes association to data-flow languages. For a
> while, I was following Project V:
>
> Simple Example of the Difference Between Imperative, Functional and
> Data Flow:
> http://my.opera.com/Vorlath/blog/2008/01/06/simple-example-of-the-dif...

Some complaints about the article:

1. The reference to the "repeated squaring" article is kind of silly.
Computing x^y iteratively is just unrolling a sequential recursion.
It's something you do either as an optimization or if your chosen
programming language doesn't support recursion (e.g., Fortran 77).
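To make the point concrete, here's a minimal sketch (in Python, purely for illustration -- it's not code from the article) of repeated squaring written as the loop you get by unrolling the sequential recursion x^y = (x^2)^(y div 2) * x^(y mod 2):

```python
def power(x, y):
    """Compute x**y for integer y >= 0 by repeated squaring.

    This is just the sequential recursion
        x^y = (x^2)^(y // 2) * x^(y % 2)
    unrolled into an O(log y) loop.
    """
    result = 1
    while y > 0:
        if y & 1:          # low bit of the exponent set: fold in current power
            result *= x
        x *= x             # square the base
        y >>= 1            # move to the next bit of the exponent
    return result
```

Nothing data-flow-ish about it; it's an ordinary iterative optimization.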

2. Furthermore, contrary to what the article claims, it's easy to
parallelize the iterative squaring procedure, using a scan operation.  For P
processors, this requires creating P log P storage locations for
intermediate values, but you need at least P storage locations for the
pipelining approach anyway.
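Here's a sketch of what I mean by a scan (again Python, just illustrative): a Hillis-Steele-style inclusive prefix product.  Each of the log2(n) rounds is an element-wise step that could run in parallel across P processors, and the per-round input snapshot is where the extra O(P log P) intermediate storage comes from.  Running it over n copies of x gives you x^1 .. x^n in log depth.

```python
def prefix_products(xs):
    """Inclusive scan (prefix product), Hillis-Steele style.

    The inner loop's updates are mutually independent within a round,
    so a round can be executed in parallel; 'prev' is the round's
    intermediate-storage snapshot.
    """
    ys = list(xs)
    step = 1
    while step < len(ys):
        prev = ys[:]                     # snapshot: this round's input
        for i in range(step, len(ys)):   # independent updates per round
            ys[i] = prev[i - step] * prev[i]
        step *= 2
    return ys
```

For example, prefix_products over [x, x, x, x] yields [x, x^2, x^3, x^4].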

3. The "turning the call graph 90 degrees" diagram doesn't illustrate
the point of data flow for parallelism:  it's about pipelining
operations on a stream of data, not just one datum ("x", in the
diagram).  The diagram should show different values flowing through
the pipes.

The main complaint I have is that there's no need to see dataflow as a
radical new thing; it's just pipelining.  There's task parallelism and
there's pipeline parallelism.  Both are useful and of course it's nice
to have a language and runtime that can support both efficiently.  The
problem in both cases is making sure that the optimization actually
improves performance.  Both optimizations can fail if it's not
possible to find a good schedule that keeps all the processors busy.
For example, if you're mapping f(g(h(x))) over a collection, the
functions f, g, h had better take about the same time to complete.  If
you're computing a(x) = b(x) + c(x) + d(x) with task parallelism,
you'll only get a 3x speedup if b, c, and d all take about the same
time to complete.
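As a sketch of the task-parallel case (Python with hypothetical placeholder tasks b, c, d -- and note that with CPython threads the GIL would eat the speedup for pure-Python work; this only shows the structure):

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder summands.  The ~3x speedup only materializes if the
# real b, c, d take about the same time to complete.
def b(x): return x + 1
def c(x): return x * 2
def d(x): return x ** 2

def a(x):
    """a(x) = b(x) + c(x) + d(x), with the three independent
    summands submitted as concurrent tasks."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(f, x) for f in (b, c, d)]
        return sum(fut.result() for fut in futures)
```

If one of the three tasks dominates, the other two processors sit idle and the schedule buys you almost nothing.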

Pipelining is also a good sequential optimization.  When you're
mapping multiple functions over the same collection, it avoids the
creation of temporary collections, and also avoids iterating over the
collection multiple times (thus saving things like memory bandwidth
and the cost of hash table lookups).  SERIES is my favorite example of
a package (in ANSI Common Lisp) that performs this optimization; it's
a source-to-source translator that converts maps over collections into
iterative loops (recall that ANSI Common Lisp doesn't guarantee
tail-call optimization).
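The before/after of that fusion looks roughly like this (a Python sketch with throwaway f and g, not SERIES output -- SERIES emits Lisp loops, but the shape of the transformation is the same):

```python
def f(x): return x + 1
def g(x): return x * 2

def mapped_twice(xs):
    """Unfused: allocates a temporary collection and walks the data twice."""
    tmp = [g(x) for x in xs]        # temporary intermediate collection
    return [f(y) for y in tmp]

def fused(xs):
    """What a SERIES-style translator would produce: one loop over xs,
    both functions applied per element, no temporary collection."""
    return [f(g(x)) for x in xs]
```

Same result, one traversal, no intermediate allocation -- which is exactly the memory-bandwidth win.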

mfh


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---