After working around the seq + closure = death problem, I still had a severe memory leak in my code, which took many hours to find.
Holding a reference to a string returned by clojure.string/split somehow retains a reference to the original string. In my case I needed to hold the first column of each row in a tsv file that was 4G in size. This resulted in holding the entire 4G in memory.

Here's a demo. Function "data" returns a seq of lines that are about 1000 bytes each. The first column, however, is just a few bytes, and 10k of them should easily fit in 10M of heap space. But, no:

$ LEIN_JVM_OPTS=-Xmx10M lein repl
REPL started; server listening on localhost port 34955
user=> (defn data [] (for [i (range)] (str "row " i "\t" (clojure.string/join "" (repeat 1000 "x")))))
#'user/data
user=> (def x (vec (take 10000 (map #(first (clojure.string/split % #"\t")) (data)))))
java.lang.OutOfMemoryError: Java heap space (NO_SOURCE_FILE:4)
user=>

If I copy the returned string with the String constructor, it's fine:

$ LEIN_JVM_OPTS=-Xmx10M lein repl
REPL started; server listening on localhost port 20587
user=> (defn data [] (for [i (range)] (str "row " i "\t" (clojure.string/join "" (repeat 1000 "x")))))
#'user/data
user=> (def x (vec (take 10000 (map #(String. (first (clojure.string/split % #"\t"))) (data)))))
#'user/x
user=> (x 10)
"row 10"
user=>

Two observations about this.

First, this behavior is very unexpected to me. I don't understand whether it is a property of strings, collections, or string/split specifically that is causing it. Is there something in the docs that I overlooked, that would have warned of this?

Second, for tracking down problems like this, the available tooling is pathetic, to put it as politely as possible. jhat would not trace the leaked strings. It consistently froze up when tracing them to GC roots. visualvm traced it back to CacheLRU, as in the screenshot I posted in the other thread, which was perfectly uninformative.
Without any usable tooling, the only workflow I found to narrow the problem was to iteratively stub out portions of code and re-run the program for several minutes to determine if the leak was active. Obviously, this is incredibly painful, slow, and tedious. I'm hoping someone can tell me there's a better way.

Note that the leak did not appear when exercising subsystems independently, because in that case no references were retained from one subsystem to the other. So, "try it in the repl" was not an effective strategy.
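
For anyone hitting the same thing, the String-constructor workaround from the demo can be wrapped in a small helper. This is only a minimal sketch: "first-column" is a hypothetical name, not something from clojure.string. It also rests on an assumption the demo does not confirm, namely that the retention comes from the returned substring sharing the original string's character array (which is how String.substring behaved on JVMs before Java 7u6).

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper (not part of clojure.string): returns the first
;; tab-separated field of a line as a fresh String. The (String. ...) copy
;; is the workaround from the demo; the assumption is that without it the
;; field may share the full line's character array on older JVMs.
(defn first-column
  [^String line]
  (String. ^String (first (str/split line #"\t"))))

;; Same test data as the demo: lines of roughly 1000 bytes with a short
;; first column. Kept as a function so the head of the seq is not retained.
(defn data []
  (for [i (range)]
    (str "row " i "\t" (str/join "" (repeat 1000 "x")))))

;; Keep only the short first columns of 10k lines.
(def x (vec (take 10000 (map first-column (data)))))

(x 10) ;=> "row 10"
```

The type hints are only there to avoid reflection on the constructor call; the copy itself is what matters.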