Hi everybody! I'm experimenting with Clojure, and as an exercise I'm working through the Facebook puzzles (http://www.facebook.com/careers/puzzles.php?puzzle_id=20). Most puzzles require reading from a text file "efficiently", so I try not to read the whole file at once, but to process it lazily.
For that I made a very small helper library that tries to take advantage of lazy sequences:

    ;; Pattern instances are immutable and thread-safe
    (def split-pattern (java.util.regex.Pattern/compile "\\s"))

    (defn split-words
      "Split the provided string into words. Separators are spaces and tabs."
      [string]
      (if (nil? string)
        nil
        (vec (remove #(.equals % "") (.split split-pattern string)))))

    (defn read-text-file
      "Read a text file line by line, lazily, returning nil when the end of
      the file has been reached. Each line is a vector of words."
      [file-name]
      (let [reader (java.io.BufferedReader. (java.io.FileReader. file-name))]
        (map split-words (repeatedly #(.readLine reader)))))

    (defn next-line [lines]
      (first (take 1 lines)))

So basically, a file is a lazy sequence of lines, and each line is a vector of words.

The lazy behavior seems to work at first. If I write:

    (let [data (read-text-file "liars.txt")]
      (take 5 data))
    => (["5"] ["Stephen" "1"] ["Tommaso"] ["Tommaso" "1"] ["Galileo"])

it correctly returns the first 5 lines of my file. Perfect, that's exactly what I want.

But in real use it doesn't work. If I call take several times, it always returns the first lines instead of the next ones:

    (let [data (read-text-file "liars.txt")]
      [(take 5 data) (take 5 data)])
    => [(["5"] ["Stephen" "1"] ["Tommaso"] ["Tommaso" "1"] ["Galileo"])
        (["5"] ["Stephen" "1"] ["Tommaso"] ["Tommaso" "1"] ["Galileo"])]

If I call take 10 directly, it works as expected:

    (let [data (read-text-file "liars.txt")]
      (take 10 data))
    => (["5"] ["Stephen" "1"] ["Tommaso"] ["Tommaso" "1"] ["Galileo"]
        ["Isaac" "1"] ["Tommaso"] ["Galileo" "1"] ["Tommaso"] ["George" "2"])

You might ask: why not just take all the data from the stream and then process it? Well, the file has a specific format: the first line contains one kind of data, the next few lines contain another kind, and so on. So I want one function that reads only one part of the file, another function that reads the next part, and to call them sequentially.
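The same behavior can be reproduced without the file. This is only a minimal sketch, assuming a java.io.StringReader over some made-up text stands in for liars.txt; sample-text, read-text and take-twice are names invented for this illustration and are not part of the helper library above:

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for liars.txt, so the sketch runs without a file.
(def sample-text "5\nStephen 1\nTommaso\nTommaso 1\nGalileo\n")

(defn split-words
  "Split a line into a vector of words; nil stays nil (end of input)."
  [line]
  (when line
    (vec (remove empty? (str/split line #"\s")))))

(defn read-text
  "Lazy sequence of word vectors read from an already opened reader."
  [reader]
  (map split-words (repeatedly #(.readLine reader))))

(defn take-twice
  "Call (take n ...) twice on the SAME sequence value and return both results."
  [n]
  (let [reader (java.io.BufferedReader. (java.io.StringReader. sample-text))
        data   (read-text reader)]
    [(take n data) (take n data)]))

(take-twice 2)
;; => [(["5"] ["Stephen" "1"]) (["5"] ["Stephen" "1"])]
```

Both calls return the same two lines. A lazy sequence caches each element once it has been realized, so the second take reuses the cached elements and does not call readLine again for them.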
But as the simple example above shows, it doesn't work. My understanding is that something immutable sits in the middle, so the data reference acts as if it were unchanged between calls. That's obviously not what I want, since I'm getting the data from a Java stream, which is not supposed to be immutable. How can I handle this kind of case correctly, efficiently and idiomatically?

Thanks in advance,
Nicolas.