Well, I've dug into this some more, and I'm unfortunately no closer
to finding an answer. I decided to whittle the source code down to the
minimal set that exhibits the problem and post the result here, so
that if I'm making a mistake, there's a better chance someone can
identify it and point it out.

Without further ado, here it is:

(import '(org.apache.lucene.store FSDirectory)
        '(org.apache.lucene.index IndexReader)
        '(org.apache.lucene.search IndexSearcher))

(def *vendors* #{ "1211", "7784" })

(defn document-seq [index-path]
  (let [directory (. FSDirectory (getDirectory index-path))
        searcher (new IndexSearcher directory)
        reader (. searcher getIndexReader)
        numDocs (. reader numDocs)]
   (map (fn [i] (. reader document i)) (range 0 numDocs))))

(defn my-filter-pred [document]
  (let [item (. document get Constants/ITEM_ID)]
    (contains? *vendors* item)))

(defn splode [index-path]
  (with-local-vars [doc-count 0]
    (doseq [document (filter my-filter-pred (document-seq index-path))]
      (var-set doc-count (inc @doc-count)))
    'done))

I did the same heap analysis on this, and found exactly the same
results. That is to say that

clojure.core$filter__3364$fn__3367

is a stack local (which I think is what makes it a GC root), and it
holds a reference to "coll", the parameter in the definition of
filter, which refers to the whole gigantic list of lazy-conses. As you
can see, I'm operating on Lucene indexes. I did try to rewrite this
using something more fundamental (like a sequence of random strings),
but I could not come up with anything that caused the heap to blow out
this way.
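
For anyone who wants to try the same experiment, a Lucene-free version
along the lines described above might look roughly like the sketch
below. It is only an illustration, not the code that was actually
tried: the names vendors, random-id-seq and count-matching are made up
here, and it assumes a Clojure version where repeatedly accepts a
count argument.

```clojure
;; Hypothetical stand-in for the vendor set in the post.
(def vendors #{"1211" "7784"})

;; Lazy sequence of n random numeric strings, standing in for the
;; item ids that would otherwise be read from the Lucene documents.
(defn random-id-seq [n]
  (repeatedly n #(str (rand-int 10000))))

;; Same shape as splode: walk a filtered lazy sequence with doseq and
;; bump a local var, but return the count instead of 'done.
(defn count-matching [ids]
  (with-local-vars [doc-count 0]
    (doseq [id (filter #(contains? vendors %) ids)]
      (var-set doc-count (inc @doc-count)))
    @doc-count))
```

Calling (count-matching (random-id-seq 1000000)) with a small heap
would be one way to check whether the filtered sequence is being
retained; whether it reproduces the blowout is exactly what is in
question here.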

I would love to find that I'm making some mistake in how I've written
my sequences, but I'm running out of ideas.