Okay, that's great. Thanks, you guys. Was read-lines only holding onto
the head of the line seq because I bound it in the let form?
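
For the archive, here is a minimal sketch of the failure as I now understand it. It is not the code from the thread: the sample string and the names lazy-read and eager-read are made up for illustration, and an in-memory reader stands in for the real file. As far as I can tell, line-seq reads the first line as soon as it is called, which would explain why [0] worked while [0 1] hit the closed stream.

```clojure
(import '(java.io BufferedReader StringReader))

;; Hypothetical stand-in for the large file: four short lines.
(def sample "a\nb\nc\nd\n")

(defn lazy-read
  "Mirrors my original read-lines: map is lazy, so the lines are only
  looked up after with-open has already closed the reader."
  [indices]
  (with-open [rdr (BufferedReader. (StringReader. sample))]
    (let [lines (line-seq rdr)]
      (map (partial nth lines) indices))))

(defn eager-read
  "Same lookup, but mapv realizes every element while the reader is
  still open."
  [indices]
  (with-open [rdr (BufferedReader. (StringReader. sample))]
    (let [lines (line-seq rdr)]
      (mapv (partial nth lines) indices))))
```

With these definitions, realizing (lazy-read [0 1]) should fail with the "Stream closed" IOException, while (eager-read [0 1]) returns both lines. This only shows the laziness problem; it still calls nth once per index, so Ben's single-pass keep-indexed version is the one to use on a big file.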

On Fri, Aug 17, 2012 at 11:09 AM, Ben Smith-Mannschott
<bsmith.o...@gmail.com> wrote:
> On Thu, Aug 16, 2012 at 11:47 PM, David Jacobs <da...@wit.io> wrote:
>> I'm trying to grab 5 lines by their line numbers from a large (> 1GB) file
>> with Clojure.
>>
>> So far I've got:
>>
>> (defn multi-nth [values indices]
>>   (map (partial nth values) indices))
>>
>> (defn read-lines [file indices]
>>   (with-open [rdr (clojure.java.io/reader file)]
>>     (let [lines (line-seq rdr)]
>>       (multi-nth lines indices))))
>>
>> Now, (read-lines "my-file" [0]) works without a problem. However, passing in
>> [0 1] gives me the following error: "java.lang.RuntimeException:
>> java.io.IOException: Stream closed"
>>
>> It seems that the stream is being closed before I can read the second line
>> from the file. Interestingly, if I manually pull out a line from the file
>> with something like `(nth lines 200)`, the `multi-nth` call works for all
>> values <= 200.
>>
>> Any idea what's going on?
>>
>> PS This question is on SO if someone wants points:
>> http://stackoverflow.com/questions/11995807/lazily-extract-lines-from-large-file
>
> The laziness of map is biting you. The result of read-lines will not
> have been fully realized before the file is closed.  Also, calling nth
> repeatedly is not going to do wonders for efficiency. Try this on for
> size:
>
>
> (ns nthlines.core
>   (:require [clojure.java.io :as io]))
>
> (defn multi-nth [values indices]
>   (let [matches-index? (set indices)]
>     (keep-indexed #(when (matches-index? %1) %2) values)))
>
> (defn read-lines [file indices]
>   (with-open [r (io/reader file)]
>     (doall (multi-nth (line-seq r) indices))))
>
> (comment
>
>   (def words "/Users/bsmith/w/nthlines/words.txt")
>   (def nlines 84918960) ;; 856MB with one word per line
>
>   (time (read-lines words [0 1 2 (- nlines 2) (- nlines 1)]))
>
>   ;;=> "Elapsed time: 18778.904 msecs"
>   ;;   ("A" "a" "aa" "Zyzomys" "Zyzzogeton")
>
> )
>
> // Ben
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
