Observations about new lazy branch

Mark Engelberg Fri, 27 Feb 2009 20:10:54 -0800

I just finished porting my combinatorics code to the new lazy
constructs, and I discovered some subtleties to using lazy-seq that
were not at first apparent.


To begin with, consider the two versions of map:
The old way:

(defn map
  ([f coll]
   (when (seq coll)
     (lazy-cons (f (first coll)) (map f (rest coll)))))

The new way:

(defn map
  ([f coll]
   (lazy-seq
    (when-let [s (seq coll)]
      (cons (f (first s)) (map f (rest s))))))

Let's imagine that you are using map on a collection for which it is
very computation intensive to generate the rest, but trivial to
generate the first.

I believe that in this case, you'd get more desirable behavior from
the old way than the new way.

That's because the original lazy-cons kept the first and rest thunks
separate.  It would force the first to get the rest, but it would NOT
force the rest to get at the first.  So if you ask for the first, it
wouldn't do all the rest-generating computation.  However, the new
version uses a delayed regular cons.  So when you invoke the first of
the sequence, both arguments of cons are evaluated, so (map f (rest
s)) is called, and therefore (rest s) must be evaluated.

If this is unclear, consider this:
(defn a [] (do (println "a") nil))
(def s (lazy-seq (cons (a) (a))))
(first s)

If you type this into the REPL, you'll see that the rest gets
evaluated when you ask for the first.

To get the desired separation of first and rest evaluation with the
new lazy-seq function, you'd actually need to do something like this:

(def lazier-map
  (let [step (fn step [f coll]
               (when-let [s (seq coll)]
                 (cons (f (first s)) (lazy-seq (step f (rest s))))))]
    (fn [f coll]
      (lazy-seq (step f coll)))))

Basically, I've made a helper function which uses lazy-seq to delay
the evaluation of the rest, and then wrapped the call to the helper
function in a lazy-seq in order to delay evaluation of the very first
item.

As a challenge, try to similarly fix up the version of filter provided
on the lazy page so that it fully separates the evaluation of first
and rest, thus protecting you against unnecessary evaluation if rest
is a slow operation on coll.  I think you'll find that you end up with
two levels of indirection, and it's extremely difficult to write it
properly.  Post your simplest solution here.

Now in both these examples, you could argue that in all likelihood,
rest will be a fast operation.  But I chose map here because everyone
knows the way we're supposed to write map, so it seemed like a good
choice to illustrate my concerns without having to explain how the
function works, etc.

But this kind of problem does actually occur in practice.  It appeared
in several places in the code I ported, because my code tends to do
most of the work within the arguments to the recursive call.  I found
it difficult to reason about where to place the calls to lazy-seq in
order to achieve the separation I needed for evaluating the first and
the rest.  I think I pulled it off correctly, but I've got to say I'm
not crazy about how, to do the "right thing", the code ends up looking
quite obfuscated.

In summary, the new version gives you the most power to place the
delays where you want, but it's hard to get it right.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to 
clojure+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---

Observations about new lazy branch

Reply via email to