Okay, that's great. Thanks, you guys. Was read-lines only holding onto the head of the line-seq because I bound it in the let form?
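For reference, here is a minimal sketch of the laziness problem described in the quoted reply below. It uses an in-memory reader instead of a large file so it can be pasted into a REPL. The names `demo-reader`, `lazy-lines`, and `eager-lines` are made up for illustration and are not from the thread.

```clojure
(import '(java.io BufferedReader StringReader))

;; Hypothetical stand-in for the large file: four lines held in memory.
(defn demo-reader []
  (BufferedReader. (StringReader. "a\nb\nc\nd\n")))

;; Same shape as the original read-lines: map is lazy, so the nth calls
;; only run after with-open has already closed the reader.
(defn lazy-lines [indices]
  (with-open [rdr (demo-reader)]
    (let [lines (line-seq rdr)]
      (map (partial nth lines) indices))))

;; Same shape as the suggested fix: a single pass with keep-indexed,
;; forced with doall while the reader is still open.
(defn eager-lines [indices]
  (with-open [rdr (demo-reader)]
    (let [wanted? (set indices)]
      (doall (keep-indexed #(when (wanted? %1) %2) (line-seq rdr))))))

(eager-lines [0 2])
;;=> ("a" "c")
```

Calling `(lazy-lines [0])` still works here, because line-seq reads the first line as soon as it is called, while `(lazy-lines [0 1])` fails with a "Stream closed" IOException once the result is realized. One thing to keep in mind with the keep-indexed approach: results come back in file order rather than in the order of `indices`, and duplicate indices collapse because they pass through a set.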

On Fri, Aug 17, 2012 at 11:09 AM, Ben Smith-Mannschott <bsmith.o...@gmail.com> wrote:
> On Thu, Aug 16, 2012 at 11:47 PM, David Jacobs <da...@wit.io> wrote:
>> I'm trying to grab 5 lines by their line numbers from a large (> 1GB) file
>> with Clojure.
>>
>> So far I've got:
>>
>> (defn multi-nth [values indices]
>>   (map (partial nth values) indices))
>>
>> (defn read-lines [file indices]
>>   (with-open [rdr (clojure.java.io/reader file)]
>>     (let [lines (line-seq rdr)]
>>       (multi-nth lines indices))))
>>
>> Now, (read-lines "my-file" [0]) works without a problem. However, passing in
>> [0 1] gives me the following error: "java.lang.RuntimeException:
>> java.io.IOException: Stream closed"
>>
>> It seems that the stream is being closed before I can read the second line
>> from the file. Interestingly, if I manually pull out a line from the file
>> with something like `(nth lines 200)`, the `multi-nth` call works for all
>> values <= 200.
>>
>> Any idea what's going on?
>>
>> PS This question is on SO if someone wants points:
>> http://stackoverflow.com/questions/11995807/lazily-extract-lines-from-large-file
>
> The lazyness of map is biting you. The result of read-lines will not
> have been fully realized before the file is closed. Also, calling nth
> repeatedly is not going to do wonders for efficiency. Try this on for
> size:
>
>
> (ns nthlines.core
>   (:require [clojure.java.io :as io]))
>
> (defn multi-nth [values indices]
>   (let [matches-index? (set indices)]
>     (keep-indexed #(when (matches-index? %1) %2) values)))
>
> (defn read-lines [file indices]
>   (with-open [r (io/reader file)]
>     (doall (multi-nth (line-seq r) indices))))
>
> (comment
>
>   (def words "/Users/bsmith/w/nthlines/words.txt")
>   (def nlines 84918960) ;; 856MB with one word per line
>
>   (time (read-lines words [0 1 2 (- nlines 2) (- nlines 1)]))
>
>   ;;=> "Elapsed time: 18778.904 msecs"
>   ;; ("A" "a" "aa" "Zyzomys" "Zyzzogeton")
>
> )
>
> // Ben
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en