On Jan 23, 3:31 am, Konrad Hinsen <konrad.hin...@laposte.net> wrote:
> On 22.01.2009, at 19:50, Rich Hickey wrote:
>
> >> Does that mean that calling seq on a stream converts the stream into
> >> a seq for all practical purposes? That sounds a bit dangerous
> >> considering that so many operations in Clojure call seq implicitly.
> >> One can easily have a seq "steal" a stream and not notice it before
> >> all memory is used up by the seq.
>
> > Calling seq on a stream yields a seq that will forever own the stream
> > - if you think about it a bit, you'll see why that has to be the case.
>
> > OTOH, that seq is lazy, so I'm not sure what the memory issue is.
>
> If my understanding is correct, then
>
> (def rand-stream (stream (fn [_] (rand))))
> (take 5 rand-stream)
>
> will create a seq on the stream that is referenced by the stream. As
> long as the stream is referenced by a var, the seq will remain
> referenced as well. Seqs being cached, this means that the whole
> random number sequence will be kept in memory.
>
Creating stateful streams and leaving them lying around in named
globals is not the intended use case. They are for immediate use in
computational pipelines. They are even less collections than are seqs,
i.e. not at all.

> The only way to avoid this seems to be not calling any sequence
> function on a stream. I could use for example
>
> (defn take-stream
>   [n s]
>   (let [iter (stream-iter s)
>         eos (Object.)
>         vs (doall (for [_ (range n)] (next! iter eos)))]
>     (do (detach! iter) vs)))
>
> (take-stream 5 rand-stream)
>
> Writing take-stream made me discover another pitfall: the stream
> seems to keep a reference to its iter object as well, meaning that it is
> never released without an explicit call to detach!. I had expected to
> be able to create a "local" iter in a let and have it disappear and
> release the stream when it goes out of scope.

Were that the case, then the map* and filter* examples wouldn't work.
The most common idiom is to obtain an iter on the incoming stream,
create a computational stage with a generator that wraps that iter,
and return a stream that owns that generator. So certainly the iter
can't go out of scope at the end of the let.

> I guess that would
> require the stream not to keep a reference to the iter, but just a
> flag that an iter exists. Which in turn requires that the iter resets
> the flag when it goes out of scope. I don't even know if that is
> doable in the JVM.
>
Nope. You can't tie things like this to the lifetime of GC-able
entities, nor would you want to try to understand a system that did.

> > Again, I don't see the enormous side effect. Streams form a safe,
> > stateful pipeline, you'll generally only call seq on the end of the
> > pipe. If you ask for a seq on a stream you are asking for a (lazy)
> > reification. That reification and ownership is what makes the pipeline
> > safe.
>
> Then why not make a pipeline using lazy sequences right from the
> start? I don't see anything that I could do better with streams than
> with lazy sequences.
>
There are a couple of advantages. First, streams are faster, at least
2x faster. Since a lazy sequence must allocate per stage, a
multi-stage pipeline would incur multiple allocations per step. A
stream could be built that has no allocation other than the results.
If your calculations per step are significant, they'll dominate the
time, but when they are not, this allocation time matters.
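
As a rough illustration of the per-stage allocation point, here is a
sketch that uses only ordinary lazy seqs (the name `pipeline` is
hypothetical and not part of the streams work). Each of the three
stages returns its own lazy seq, so every element that flows through
costs intermediate seq allocations on top of the final result:

```clojure
;; Hypothetical three-stage pipeline built from ordinary lazy seqs.
;; Each stage (map, filter, map) wraps the previous one in a new lazy
;; seq, so the intermediate stages allocate even though only the last
;; stage's values are wanted.
(defn pipeline [xs]
  (->> xs
       (map inc)          ; stage 1
       (filter even?)     ; stage 2
       (map #(* % %))))   ; stage 3

(take 5 (pipeline (iterate inc 0)))
;=> (4 16 36 64 100)
```

When the work per step is as cheap as `inc` or `even?`, that
intermediate allocation can be a noticeable share of the total cost.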
Second, streams are fully lazy. Seqs could be made fully lazy, but
currently are not.
Third, stream iters currently provide transparent MT access. Doing the
same for a seq means wrapping it in a ref.
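
A minimal sketch of what "wrapping it in a ref" could look like for a
seq, assuming one only needs to hand out successive items to several
threads. The names `next-item!` and `shared-seq` are made up for this
example:

```clojure
;; Hypothetical helper: atomically take the first item off the seq
;; held in ref r and advance the ref to the rest of that seq.
;; The transaction body has no side effects, so a retry is harmless.
(defn next-item! [r]
  (dosync
    (let [x (first @r)]
      (alter r rest)
      x)))

(def shared-seq (ref (iterate inc 0)))

(next-item! shared-seq) ;=> 0
(next-item! shared-seq) ;=> 1
```

Every consumer has to go through the ref and a transaction here,
which is the extra ceremony being contrasted with iters.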
Rich