I have done further experiments and found that if I call the readLine method directly, it works like a normal stream:

    (let [reader (java.io.BufferedReader. (java.io.FileReader. "liars.txt"))]
      [(.readLine reader) (.readLine reader) (.readLine reader) (.readLine reader)])
    => ["5" "Stephen 1" "Tommaso" "Tommaso 1"]

As you can see, no immutability comes into play. BufferedReader, being a mutable Java object, works as intended. So it is just that when I make a lazy sequence of it with repeatedly, I create an immutable object, and thus experience the behavior described in my previous post.

What would be the best (idiomatic and efficient) solution then?

- wrap my file in a lazy sequence and call take & drop recursively with recur
- just call the readLine method directly when I need more lines and process them

The second doesn't seem Clojure-ish, while I am not sure about the memory usage implications of the lazy sequence in the first.

Any insights?

Best regards,

Nicolas

On 1 Nov, 01:09, nchurch <nchubr...@gmail.com> wrote:
> The problem you're having doesn't have anything to do with file
> reads. Every time you call (take 5 data), you're calling it on the
> same item 'data'; your variable 'data' doesn't change between each
> call. The chief thing you have to understand about Clojure is that
> variables never change. Never. If you want 'change' you need to use
> refs, atoms, etc.
>
> So for instance if you wrote
>
> (let [x (iterate inc 1)]
>   [(take 5 x) (take 5 x)])
>
> you'd get
>
> [(1 2 3 4 5) (1 2 3 4 5)]
>
> You need to make the next call on the rest of the sequence, which you
> can get by calling (drop n data).
> Then your processing function could be something like
>
> (loop [x data]
>   (your-processing-function-here (take 5 x))
>   (recur (drop 5 x)))
>
> That will walk right through the sequence or the file or whatever you
> have. If it's infinite, make sure to write a terminating condition,
> e.g.:
>
> (loop [x (iterate inc 1)]
>   (print (take 5 x))
>   (if (> (first x) 50)
>     nil
>     (recur (drop 5 x))))
>
> On Oct 31, 2:53 pm, Nicolas <bousque...@gmail.com> wrote:
>
> > Hi everybody!
>
> > I'm experimenting with Clojure and as an exercise I use the Facebook
> > puzzles (http://www.facebook.com/careers/puzzles.php?puzzle_id=20).
> > Most puzzles require reading from a text file "efficiently". So I try
> > not to read the full file at once, but to process it lazily.
>
> > For that I made a very small helper library that tries to benefit from
> > lazy sequences:
>
> > ;Pattern instances are immutable and thread safe
> > (def split-pattern (java.util.regex.Pattern/compile "\\s"))
>
> > (defn split-words
> >   "Split the provided string into words. Separators are spaces and
> >   tabs"
> >   [string]
> >   (if (nil? string)
> >     nil
> >     (vec (remove #(.equals % "") (.split split-pattern string)))))
>
> > (defn read-text-file
> >   "Read a text file, line by line, lazily returning nil when end of
> >   file has been reached. Each line is a vector of words"
> >   [file-name]
> >   (let [reader (java.io.BufferedReader. (java.io.FileReader. file-name))]
> >     (map split-words (repeatedly #(.readLine reader)))))
>
> > (defn next-line [lines]
> >   (first (take 1 lines)))
>
> > So basically, a file is a lazy sequence of lines, and each line is a
> > vector of words.
>
> > Lazy behavior seems to be working at first. If I write:
>
> > (let [data (read-text-file "liars.txt")]
> >   (take 5 data))
>
> > -> (["5"] ["Stephen" "1"] ["Tommaso"] ["Tommaso" "1"] ["Galileo"])
>
> > It correctly returns the first 5 lines of my file. Perfect, that's
> > exactly what I want.
>
> > But when really using it, it doesn't work. If I call the take function
> > several times, it always returns the first lines instead of providing
> > the next ones:
>
> > (let [data (read-text-file "liars.txt")]
> >   [(take 5 data) (take 5 data)])
> > =>[(["5"] ["Stephen" "1"] ["Tommaso"] ["Tommaso" "1"] ["Galileo"])
> > (["5"] ["Stephen" "1"] ["Tommaso"] ["Tommaso" "1"] ["Galileo"])]
>
> > If I call take 10 directly, it works as expected:
>
> > (let [data (read-text-file "liars.txt")]
> >   (take 10 data))
> > =>(["5"] ["Stephen" "1"] ["Tommaso"] ["Tommaso" "1"] ["Galileo"]
> > ["Isaac" "1"] ["Tommaso"] ["Galileo" "1"] ["Tommaso"] ["George" "2"])
>
> > You would say, why not just take all data from the stream and then
> > process it?
>
> > Well, the file has a specific format: the first line contains some
> > data, then the next few lines contain other data, and so on. So I want
> > to have one function that reads only one part of the file, another
> > function that reads another part, and call them sequentially. But as
> > shown in the simple previous example, it simply doesn't work.
>
> > My understanding is that some immutable thing is in the middle and it
> > acts as if the data reference isn't changed between calls. That's not
> > what I want, obviously, as I'm getting data from a Java stream, which
> > is not supposed to be immutable.
>
> > And how can I correctly manage this kind of case? Efficiently and
> > idiomatically.
>
> > Thanks in advance,
>
> > Nicolas.
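
For reference, the first option listed at the top of this thread (wrap the file in a lazy sequence and walk it with take, drop and recur) could look roughly like the sketch below. It is only an illustration under assumptions: read-section is a hypothetical helper, and the sample data and its format (a count line followed by that many lines) are invented so the code runs without liars.txt. line-seq and split-at come from clojure.core.

```clojure
;; Hypothetical helper (not from the thread): the first line holds a count n,
;; the next n lines are the section body. Returns [section remaining-lines]
;; so the caller can keep going from where this section stopped.
(defn read-section
  [lines]
  (let [n           (Integer/parseInt (first lines))
        [body more] (split-at n (rest lines))]
    [(vec body) more]))

;; Made-up sample input standing in for a real file.
(def sample "2\nalpha\nbeta\n1\ngamma\n")

(def sections
  (with-open [rdr (java.io.BufferedReader. (java.io.StringReader. sample))]
    ;; loop/recur threads the unconsumed remainder through each step,
    ;; instead of calling take on the same unchanged sequence again.
    (loop [lines (line-seq rdr)
           acc   []]
      (if (seq lines)
        (let [[section more] (read-section lines)]
          (recur more (conj acc section)))
        acc))))

(println sections)
```

Because each recur rebinds lines to the unconsumed remainder, the loop itself does not hold on to the head of the sequence, so lines already processed should be eligible for garbage collection. The with-open form closes the reader once the loop finishes, which is why each section is realized with vec before leaving it.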