Er... this version is better. It uses hasNext instead of catching the exception:

(defn lazy-read-records [file regex]
  (let [scanner (java.util.Scanner. file)
        get-next (fn get-next []
                   (if (not (.hasNext scanner))
                     ()
                     (cons (.next scanner)
                           (lazy-seq (get-next)))))]
    (.useDelimiter scanner regex)
    (get-next)))

On Aug 17, 2010, at 2:29 PM, Jeff Palmucci wrote:

> I'm assuming your problem is with memory, and not multithreaded reading.
> Given that:
>
> I also work with files much too big to fit into memory.
>
> You could just use java.util.Scanner. That has a useDelimiter method, so you
> can set the pattern to break on:
>
> (defn lazy-read-records [file regex]
>   (let [scanner (java.util.Scanner. file)
>         get-next (fn get-next []
>                    (try
>                      (cons (.next scanner)
>                            (lazy-seq (get-next)))
>                      (catch java.util.NoSuchElementException e ())))]
>     (.useDelimiter scanner regex)
>     (lazy-seq (get-next))))
>
> The trick here is that the sequence is lazy. It won't read the file until it
> needs to in order to return the next element.
>
> If you don't hold onto the head of the sequence, the front part can be
> garbage collected while you are working further down.
>
> PS If, for some reason, you want the character indices rather than the actual
> records, replace (.next scanner) with:
>
> (do (.next scanner)
>     (.start (.match scanner)))
>
> On Aug 16, 2010, at 5:22 PM, cej38 wrote:
>
>> Hello,
>> I work with text files that are, at times, too large to read in all
>> at one time. In searching for a way to read in only part of the file
>> I came across
>> http://meshy.org/2009/12/13/widefinder-2-with-clojure.html
>>
>> I am only interested in the chunk-file and read-lines-range functions.
>>
>> My problem is that I would like to change chunk-file, so that instead
>> of looking for the next line break, it would look for some regular
>> expression (to be given as part of the function call), and would then
>> report the position of the first character of every instance of that
>> regular expression.
>>
>> After working on this for a couple of days I am raising the white
>> flag. Is there someone that can help me with this?
>>
>> Thanks.
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clojure@googlegroups.com
>> Note that posts from new members are moderated - please be patient with your
>> first post.
>> To unsubscribe from this group, send email to
>> clojure+unsubscr...@googlegroups.com
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
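
A usage note on lazy-read-records above: it never closes its Scanner, and the point about not holding the head only helps if the caller consumes the sequence as it goes. Here is a rough sketch of one way to handle both. It assumes the caller opens the Scanner itself. The names scanner-records and count-records are made up for illustration, and a StringReader stands in for a real file.

```clojure
(import '(java.util Scanner)
        '(java.io StringReader))

;; Hypothetical variant of lazy-read-records: takes an already-open
;; Scanner, so the caller controls when it is closed.
(defn scanner-records
  [^Scanner scanner regex]
  (.useDelimiter scanner regex)
  ((fn get-next []
     (lazy-seq
       ;; Nothing is read until this element is actually requested.
       (when (.hasNext scanner)
         (cons (.next scanner) (get-next)))))))

;; Hypothetical helper: counts records without retaining the head.
;; reduce walks the sequence once, so records already seen can be
;; garbage collected while later ones are still being read.
(defn count-records
  [^Readable reader regex]
  (with-open [scanner (Scanner. reader)]
    (reduce (fn [n _] (inc n)) 0 (scanner-records scanner regex))))

;; The lazy seq must be fully consumed (here with doall) before
;; with-open closes the Scanner.
(with-open [scanner (Scanner. (StringReader. "alpha;beta;gamma"))]
  (doall (scanner-records scanner ";")))
;; => ("alpha" "beta" "gamma")

(count-records (StringReader. "alpha;beta;gamma") ";")
;; => 3
```

For a real file you would pass something like (java.io.FileReader. path) or a java.io.File. Note that handing Scanner a plain String makes it scan the string itself rather than open a file by that name.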