Re: Exercise: words frequency ranking

2009-01-03 Thread Emeka
Thanks. I have learnt some new. Emeka --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com To unsubscribe from this group, send email to cloj

Re: Exercise: words frequency ranking

2009-01-03 Thread Christian Vest Hansen
Hehe, "venlig hilsen" is Danish for "kind regards" :)

On Sat, Jan 3, 2009 at 3:23 PM, Emeka wrote:
> Venlig hilsen and Timothy Pratley
>
> Thanks so much.
>
> Emeka

--
Venlig hilsen / Kind regards, Christian Vest Hansen.

Re: Exercise: words frequency ranking

2009-01-03 Thread Emeka
Venlig hilsen and Timothy Pratley, thanks so much. Emeka

Re: Exercise: words frequency ranking

2008-12-29 Thread Timothy Pratley
> (defn top-words-core [s]
>      (reduce #(assoc %1 %2 (inc (%1 %2 0))) {}
>              (re-seq #"\w+"
>                      (.toLowerCase s))))

"maps are functions of their keys" means:

user=> ({:a 1, :b 2, :c 3} :a)
1

Here we created a map {:a 1, :b 2, :c 3}, and then called it like a funct
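The point about maps being functions of their keys is what makes `(%1 %2 0)` in the quoted code work: the map is called with a key and a default value. A minimal sketch, where `m` and `count-word` are names made up for illustration and are not from the thread:

```clojure
;; A map called with a key returns the value stored under that key.
(def m {:a 1, :b 2, :c 3})

(m :a)     ; looks up :a
(m :z)     ; nil, because :z is not a key of m
(m :z 0)   ; the optional second argument is returned when the key is missing

;; The anonymous reducing function from the quoted code, written with names:
;; look up the current count (defaulting to 0), increment it, store it back.
(defn count-word [counts word]
  (assoc counts word (inc (counts word 0))))

(reduce count-word {} ["a" "b" "a"])   ; => {"a" 2, "b" 1}
```

Here `%1` plays the role of `counts` and `%2` the role of `word`.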

Re: Exercise: words frequency ranking

2008-12-29 Thread Christian Vest Hansen
On Mon, Dec 29, 2008 at 10:49 AM, Emeka wrote:
> Hello sir,
>
> I would have asked this question in the thread but, I don't want to create
> noise over this issue.
> I have not been able to get my head around your code or Clojure. I need some
> support.
>
>
> (defn top-words-core [s]
> (red

Re: Exercise: words frequency ranking

2008-12-29 Thread Emeka
Hello sir,

I would have asked this question in the thread but, I don't want to create noise over this issue. I have not been able to get my head around your code or Clojure. I need some support.

(defn top-words-core [s]
  (reduce #(assoc %1 %2 (inc (%1 %2 0))) {}
          (re-seq #"\w+"

Re: Exercise: words frequency ranking

2008-12-29 Thread Timothy Pratley
You could consider using a StreamTokenizer:

(import '(java.io StreamTokenizer BufferedReader FileReader))
(defn wordfreq [filename]
  (with-local-vars [words {}]
    (let [st (StreamTokenizer. (BufferedReader. (FileReader. filename)))]
      (loop [tt (.nextToken st)]
        (when (not= tt Strea
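Since the preview above is cut off, here is a separate, minimal sketch of the same general idea: counting word tokens with `java.io.StreamTokenizer`. It is not the code from the message. It assumes the tokenizer's default syntax table, reads from any `java.io.Reader`, and threads the counts through `loop`/`recur` rather than `with-local-vars`. The name `wordfreq-from-reader` is hypothetical.

```clojure
(import '(java.io StreamTokenizer StringReader))

(defn wordfreq-from-reader
  "Hypothetical helper: returns a map of lower-cased word -> count
  for the word tokens read from rdr (any java.io.Reader)."
  [rdr]
  (let [st (StreamTokenizer. rdr)]
    (loop [counts {}]
      (let [tt (.nextToken st)]
        (cond
          ;; end of input: return what has been accumulated
          (= tt StreamTokenizer/TT_EOF) counts
          ;; a word token: its text is in the public field sval
          (= tt StreamTokenizer/TT_WORD)
          (let [w (.toLowerCase (.sval st))]
            (recur (assoc counts w (inc (counts w 0)))))
          ;; numbers, punctuation, quoted strings: skipped in this sketch
          :else (recur counts))))))

(wordfreq-from-reader (StringReader. "the cat and the hat"))
;; => {"the" 2, "cat" 1, "and" 1, "hat" 1}
```

For a file, the same function could be handed a `BufferedReader` wrapping a `FileReader`, as in the message above.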

Re: Exercise: words frequency ranking

2008-12-28 Thread Chouser
On Sun, Dec 28, 2008 at 9:22 AM, Boyd Brown wrote:
>
> Hello. I can't seem to find 'spit'.

'spit' is in clojure-contrib:
http://code.google.com/p/clojure-contrib/source/browse/trunk/src/clojure/contrib/duck_streams.clj?r=325#177

Its inclusion in clojure.core is planned (search for spit): http

Re: Exercise: words frequency ranking

2008-12-28 Thread Boyd Brown
Hello. I can't seem to find 'spit'. Java exception: unable to resolve symbol spit. I'm using Clojure Box rev1142. I tried using the clojure.jar from the 20081217 release of Clojure, but to no avail. spit is not documented on the Clojure site API page like slurp is. I can't find it in clojure co

Re: Exercise: words frequency ranking

2008-12-27 Thread Piotr 'Qertoip' Włodarek
> Some robustness notes:
>
> On a 5.2MB file, it takes 9s compared to 7s for the improved Mibu version, or
> 7s for my initial one.
>
> On a 38MB file, it takes 53s and about 270MB of memory. Similarly, the
> initial one and the Mibu versions take 39s and also about 270MB of
> memory.

I also like lpetit c

Re: Exercise: words frequency ranking

2008-12-27 Thread Piotr 'Qertoip' Włodarek
And the nice pastie version: http://pastie.org/347369 regards, Piotrek

Re: Exercise: words frequency ranking

2008-12-27 Thread Piotr 'Qertoip' Włodarek
Thank you for all improvements and suggestions. Based on your feedback, here is my final version:

(defn read-words
  "Given a file, return a seq of every word in the file, normalizing words by
  converting them to lower case and splitting on whitespace"
  [in-filepath]
  (re-seq #"\w+"

Re: Exercise: words frequency ranking

2008-12-26 Thread Mibu
I wrote what I think is the idiomatic version. Idiomatically, you delay execution of functions over lazy sequences, so if you get a sequence of a million words and you only take the first 100, you don't lowercase (or whatever else) the entire sequence. Also a smart compiler on a multi-core machine
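The laziness argument can be seen with a source that could never be processed eagerly. A minimal sketch, with `words` and `lowered` as made-up names; the endless `cycle` stands in for "a sequence of a million words":

```clojure
;; An endless stream of words; it can only be consumed lazily.
(def words (cycle ["The" "Quick" "Brown" "Fox"]))

;; map is lazy: nothing is lower-cased when this is defined.
(def lowered (map (fn [^String w] (.toLowerCase w)) words))

;; Only the elements that take asks for are ever lower-cased.
(take 5 lowered)
;; => ("the" "quick" "brown" "fox" "the")
```

One caveat for current Clojure versions: when the source is a chunked sequence such as a vector, `map` may realize elements in batches of 32, so "only the first 100" is approximate in that case.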

Re: Exercise: words frequency ranking

2008-12-26 Thread Piotr 'Qertoip' Włodarek
On Dec 25, 4:58 pm, Mibu wrote:
> My version:
>
> (defn top-words [input-filename result-filename]
>   (spit result-filename
>         (apply str
>                (map #(format "%s : %d\n" (first %) (second %))
>                     (sort-by #(-(val %))
>                              (reduce #(co

Re: Exercise: words frequency ranking

2008-12-26 Thread lpetit
What would you think of this form of coding?

- The rationale is to separate functions that deal with system "boundaries" from "core algorithmic functions". So you should have at least two functions: one that does not deal with input/output formats and only deals with Clojure/Java constructs.
-
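The separation described here might look like the following minimal sketch. The function names are hypothetical, and the sketch assumes a Clojure version where `spit` is available in clojure.core (at the time of this thread it lived in clojure-contrib).

```clojure
;; Core function: plain data in, plain data out. No files, no formatting.
(defn word-counts [^String text]
  (reduce #(assoc %1 %2 (inc (%1 %2 0))) {}
          (re-seq #"\w+" (.toLowerCase text))))

;; Boundary function: the only place that touches the file system.
;; It reads the input, delegates to the core function, and writes the result.
(defn word-counts-file [in-path out-path]
  (spit out-path (pr-str (word-counts (slurp in-path)))))

(word-counts "a b A")   ; => {"a" 2, "b" 1}
```

The core function can be tried at the REPL or unit-tested with string literals, while only the thin boundary function needs real files.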

Re: Exercise: words frequency ranking

2008-12-26 Thread lpetit
Instead of #(- (val %)), one could also use the compose function:

(comp - val)

My 0,02 EURO,

--
Laurent

On Dec 25, 4:58 pm, Mibu wrote:
> My version:
>
> (defn top-words [input-filename result-filename]
>   (spit result-filename
>     (apply str
>       (map #(format "%s : %d\n"
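To illustrate that the two spellings of the sort key are interchangeable, here is a small sketch over a made-up `counts` map:

```clojure
(def counts {"the" 3, "cat" 1, "hat" 2})

;; Anonymous-function form: negate the value of each map entry.
(sort-by #(- (val %)) counts)

;; Composed form: (comp - val) first applies val, then negates the result.
(sort-by (comp - val) counts)

;; Both calls yield the entries in descending order of count:
;; (["the" 3] ["hat" 2] ["cat" 1])
```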

Re: Exercise: words frequency ranking

2008-12-25 Thread Meikel Brandmeyer
Hi,

On 25.12.2008 at 17:24, wwmorgan wrote:

> A better implementation would split the different steps of the program into separate functions. This increases readability and testability of the source code, and encourages the reuse of code in new programs.

Yes. One can think of the data flowing

Re: Exercise: words frequency ranking

2008-12-25 Thread wwmorgan
A better implementation would split the different steps of the program into separate functions. This increases readability and testability of the source code, and encourages the reuse of code in new programs. I haven't tested this program, but hopefully you'll understand the general approach. Als
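As the program from this message is cut off above, the following is only a rough sketch of what "separate functions per step" could look like, not wwmorgan's code. All function names are hypothetical, and `->>` assumes Clojure 1.1 or later.

```clojure
;; Step 1: text -> seq of lower-cased words
(defn split-words [^String text]
  (re-seq #"\w+" (.toLowerCase text)))

;; Step 2: seq of words -> map of word -> count
(defn count-words [words]
  (reduce #(assoc %1 %2 (inc (%1 %2 0))) {} words))

;; Step 3: map -> entries sorted by descending count
(defn rank [counts]
  (sort-by (comp - val) counts))

;; Step 4: one [word count] entry -> display string
(defn format-entry [[word n]]
  (format "%s : %d" word n))

;; The steps compose into a pipeline; each can be tested on its own.
(->> "b a b c b a"
     split-words
     count-words
     rank
     (map format-entry))
;; => ("b : 3" "a : 2" "c : 1")
```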

Re: Exercise: words frequency ranking

2008-12-25 Thread Mibu
My version:

(defn top-words [input-filename result-filename]
  (spit result-filename
        (apply str
               (map #(format "%s : %d\n" (first %) (second %))
                    (sort-by #(-(val %))
                             (reduce #(conj %1 { %2 (inc (%1 %2 0)) }) {}