On Thu, May 5, 2011 at 4:41 PM, Alan <a...@malloys.org> wrote:
> Right. But if I drop all references to the returned function after I'm
> done with it, it gets GCed. If there's some class holding a reference
> to it forever, it will never get cleaned up. For example, ((fnmaker4
> (range)) 1e6) will (I think?) currently result in a million ints being
> held in memory at once, as things are now. Those things will get
> thrown away shortly thereafter, though.
True. We only need to worry about those integers being held if we keep a reference to the returned function somewhere. Under my externalizability proposal, the fnmaker4 instance would hold a reference to that instance of (range) in its metadata, but it already holds such a reference in an instance variable somewhere. If the fnmaker4 instance becomes unreachable, the GC will collect it and the (range) instance, even with the metadata. Only if it's externalized is there an issue.

Case 1 is not real "externalization": it is (eval `(some stuff ~the-fnmaker4-instance more stuff)) and suchlike. In that case eval can just embed a reference to the fnmaker4 instance directly into the generated class. As long as things are configured such that whole classes can be unloaded when no longer in use, once the return value from eval is discarded and becomes eligible for GC, the (range) instance can still become GCable.

Case 2 is actual conversion of the closure into code that can recreate it. The easy way to handle it is to disallow lazy seqs, or to walk them to some point and reject them if they exceed some length/byte limit, but that's kind of icky. Alternatively, we can externalize a lazy seq by externalizing the unrealized seq, that is, the fact of a lazy seq with some particular generator function. That gives us another function to externalize.

Let's consider the output of (map #(* % %) (range)). It turns out that this is a LazySeq object, which works much like delay and force: it wraps a fn and caches the result of calling it. Somewhere in there is a fn that is closed over (range) and #(* % %) and generates each successive element of the lazy sequence. Externalizing that requires externalizing (range) and #(* % %). The latter is trivial: it's not even closed over anything. The former is a specially-implemented lazy sequence instance, and that implementation can just externalize using print-dup as #=(range), #=(range 3), #=(range 3 7), or the like.
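To make the "externalize the unrealized seq" idea concrete, here is a minimal sketch. It assumes we already have the recipe as a quoted form; recovering that form from a live closure is the hard part of the proposal and is not shown. The names recipe, text and squares are just for illustration.

```clojure
;; The recipe is ordinary data: the form that would build the lazy seq.
;; An explicit (fn [x] ...) is used so the printed text is predictable.
(def recipe '(map (fn [x] (* x x)) (range)))

;; Externalizing the recipe yields a short string, no matter how many
;; elements of the seq have been (or ever will be) realized.
(def text (pr-str recipe))

;; Reading and evaluating the text recreates a fresh, unrealized lazy seq.
(def squares (eval (read-string text)))

(println text)
(println (take 5 squares))
```

The point is that the externalized form stays a few dozen characters long even though the sequence it describes is infinite.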
The general pattern will be that lazy sequences amount to a special kind of closure over, frequently, an input sequence, plus some other values, often including more functions. The regress stops when it hits something implemented the way range is, or an explicit use of the lazy-seq macro, or similar. But the lazy-seq macro also just wraps calls to a step function that is generally closed over something.

I'm fairly confident that it can, in principle, be done. Of course, it won't always be doable: a line-seq, for example, will not be externalizable by the default means, because somewhere in its guts is some function that has closed over a file handle.
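As a rough illustration of where the regress stops and where it fails (the line-seq case), here is a sketch of a check over recipe forms. The function externalizable? is hypothetical, not part of Clojure, and it is deliberately conservative: it accepts only a handful of plain data types and collections of them.

```clojure
;; Hypothetical helper: can this recipe be written out as text and read back?
(defn externalizable? [x]
  (cond
    ;; Plain data prints and reads back as itself.
    (or (nil? x) (number? x) (string? x) (symbol? x) (keyword? x)) true
    ;; Collections (including quoted forms) are fine if every part is.
    (coll? x) (every? externalizable? x)
    ;; Anything else (open readers, sockets, arbitrary objects) is rejected.
    :else false))

;; A recipe made only of symbols and numbers.
(def ok-recipe '(map inc (range 3)))

;; A recipe that has captured a live reader, as a line-seq would.
(def bad-recipe
  (list 'line-seq (java.io.BufferedReader. (java.io.StringReader. "a\nb"))))

(println (externalizable? ok-recipe))
(println (externalizable? bad-recipe))
```

The first recipe bottoms out in symbols and numbers, so it passes; the second contains an open reader, so it is rejected.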