On Thu, Aug 16, 2012 at 11:47 PM, David Jacobs <da...@wit.io> wrote:
> I'm trying to grab 5 lines by their line numbers from a large (> 1GB) file
> with Clojure.
>
> So far I've got:
>
> (defn multi-nth [values indices]
>   (map (partial nth values) indices))
>
> (defn read-lines [file indices]
>   (with-open [rdr (clojure.java.io/reader file)]
>     (let [lines (line-seq rdr)]
>       (multi-nth lines indices))))
>
> Now, (read-lines "my-file" [0]) works without a problem. However, passing in
> [0 1] gives me the following error: "java.lang.RuntimeException:
> java.io.IOException: Stream closed"
>
> It seems that the stream is being closed before I can read the second line
> from the file. Interestingly, if I manually pull out a line from the file
> with something like `(nth lines 200)`, the `multi-nth` call works for all
> values <= 200.
>
> Any idea what's going on?
>
> PS This question is on SO if someone wants points:
> http://stackoverflow.com/questions/11995807/lazily-extract-lines-from-large-file

The laziness of map is biting you. map returns a lazy sequence, so the
result of read-lines has not been fully realized by the time with-open
closes the file. Also, calling nth repeatedly is not going to do wonders
for efficiency, since each call walks the sequence from the start. Try this
on for size:

    (ns nthlines.core
      (:require [clojure.java.io :as io]))

    (defn multi-nth [values indices]
      (let [matches-index? (set indices)]
        ;; Single pass: keep only the values whose position is in the set.
        (keep-indexed #(when (matches-index? %1) %2) values)))

    (defn read-lines [file indices]
      (with-open [r (io/reader file)]
        ;; doall forces the lazy seq while the reader is still open.
        (doall (multi-nth (line-seq r) indices))))

    (comment
      (def words "/Users/bsmith/w/nthlines/words.txt")
      (def nlines 84918960) ;; 856MB with one word per line
      (time (read-lines words [0 1 2 (- nlines 2) (- nlines 1)]))
      ;;=> "Elapsed time: 18778.904 msecs"
      ;;   ("A" "a" "aa" "Zyzomys" "Zyzzogeton")
      )

// Ben
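
One possible refinement, offered as a sketch and not as part of the reply above: the keep-indexed version keeps reading to the end of the file even after the last wanted line has been found. If the requested indices are all near the start of a large file, capping the input with take should avoid that extra work. The names multi-nth-early and read-lines-early are hypothetical and were chosen for this illustration. As with the keep-indexed version, results come back in file order, not in the order the indices were given, and duplicate indices are collapsed.

```clojure
(ns nthlines.sketch
  (:require [clojure.java.io :as io]))

(defn multi-nth-early
  "Hypothetical variant of multi-nth: returns the values at the given
  indices (in sequence order) and stops consuming `values` once the
  largest requested index has been passed."
  [values indices]
  (if (empty? indices)
    ()
    (let [wanted? (set indices)
          ;; Number of items to look at: everything up to the largest index.
          limit   (inc (apply max indices))]
      (keep-indexed #(when (wanted? %1) %2)
                    (take limit values)))))

(defn read-lines-early
  "Hypothetical variant of read-lines built on multi-nth-early."
  [file indices]
  (with-open [r (io/reader file)]
    ;; doall realizes the result before with-open closes the reader.
    (doall (multi-nth-early (line-seq r) indices))))
```

Because take bounds the input, multi-nth-early also works on an infinite sequence such as (range), which the unbounded version would never finish scanning under doall.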