Processing large binary file?

2010-08-21 Thread Piotr 'Qertoip' Włodarek
I need to process large binary files, i.e. to remove ^M characters. Let's assume files are about 50MB - small enough to be processed in memory (but not with a naive implementation). The following code works, except it throws OutOfMemoryError for a file as small as 6MB: (defn read-bin-file [file]
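
One way to avoid holding the whole file in memory is to stream it byte by byte through buffered I/O. The following is only a sketch, not the code from the thread: the function name strip-cr is hypothetical, and it assumes that every byte with value 13 (^M) should be dropped, and that clojure.java.io (Clojure 1.2 or later) is available.

```clojure
(require '[clojure.java.io :as io])

(defn strip-cr
  "Copies in-path to out-path, dropping every carriage-return byte (13, ^M).
  Hypothetical helper: it streams through buffered I/O, so memory use
  does not grow with the size of the file."
  [in-path out-path]
  (with-open [^java.io.InputStream in   (io/input-stream in-path)
              ^java.io.OutputStream out (io/output-stream out-path)]
    (loop [b (.read in)]
      (when-not (neg? b)              ; .read returns -1 at end of stream
        (when-not (= b 13)            ; skip ^M
          (.write out (int b)))
        (recur (.read in))))))
```

Both io/input-stream and io/output-stream return buffered streams, so the single-byte reads and writes do not hit the disk one byte at a time.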

Re: How to convert a sequence to a byte[]?

2010-08-10 Thread Piotr 'Qertoip' Włodarek
On Aug 10, 7:19 pm, Janico Greifenberg wrote: > By default, into-array returns an array of the capital-B > Bytes (that's what the cryptic [Ljava.lang.Byte; in the error message > means). To get an array of primitive bytes (the class being printed as > [B), you can pass the type as addi
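
The difference between the two array types described in the quoted reply can be checked at the REPL. This is a small sketch; the var names boxed, primitive and also-primitive are made up for illustration.

```clojure
;; Without a type argument, into-array uses the class of the first
;; element, so a seq of boxed Bytes yields a Byte[] array.
(def boxed (into-array (map byte [1 2 3])))

;; Passing Byte/TYPE as the first argument yields a primitive byte[] array.
(def primitive (into-array Byte/TYPE (map byte [1 2 3])))

;; byte-array is another way to build a byte[] from a seq of numbers.
(def also-primitive (byte-array (map byte [1 2 3])))

(println (.getName (class boxed)))      ; the Byte[] class name
(println (.getName (class primitive)))  ; the byte[] class name
```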

How to convert a sequence to a byte[]?

2010-08-10 Thread Piotr 'Qertoip' Włodarek
I need to write raw bytes to the file. I do it with: (.write (FileOutputStream. "/path") bytes) ...where bytes must be of type byte[]. Please note it cannot be Byte[]. I tried to convert my sequence with both (bytes) and/or (into-array) functions and got frustrated, one example: user=> (
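
A minimal sketch of writing a sequence of numbers as raw bytes follows. The helper name write-bytes is hypothetical, and the sketch assumes the sequence holds integers that fit in a byte. Note the trailing dot in FileOutputStream., which is Clojure's constructor-call syntax.

```clojure
(import 'java.io.FileOutputStream)

(defn write-bytes
  "Writes byte-seq (a seq of integers in byte range) to the file at path.
  Hypothetical helper: byte-array builds the primitive byte[] that
  OutputStream.write expects."
  [path byte-seq]
  (with-open [out (FileOutputStream. (str path))]
    (.write out (byte-array byte-seq))))
```

Using with-open also closes the stream, which the one-line form in the question leaves open.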

Re: Convert arabic to roman

2009-12-26 Thread Piotr 'Qertoip' Włodarek
On Dec 26, 3:46 am, Mark Engelberg wrote: > I reworked your example in a way that I believe to be more clear. > I'll leave it to other readers to judge: This is awesome, thank you. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this grou

Convert arabic to roman

2009-12-25 Thread Piotr 'Qertoip' Włodarek
What is the most clear and idiomatic way to convert arabic numbers to roman notation? My take: (def roman (hash-map 1 "I", 4 "IV", 5 "V", 9 "IX", 10 "X", 40 "XL", 50 "L", 90 "XC", 100 "C", 400 "CD", 500 "D", 900 "CM", 1000 "M" ) ) (defn to-r
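
For comparison, here is one common greedy approach. It is neither the truncated implementation from the post nor the reworked version from the reply. The names roman-table and to-roman are illustrative, and the sketch assumes a non-negative integer input.

```clojure
;; Value/numeral pairs in descending order, using the same values as the
;; hash-map in the post.
(def roman-table
  [[1000 "M"] [900 "CM"] [500 "D"] [400 "CD"] [100 "C"] [90 "XC"]
   [50 "L"] [40 "XL"] [10 "X"] [9 "IX"] [5 "V"] [4 "IV"] [1 "I"]])

(defn to-roman
  "Greedy conversion: repeatedly take the largest value that still fits."
  [n]
  (loop [n n, table roman-table, acc []]
    (if (zero? n)
      (apply str acc)
      (let [[value numeral] (first table)]
        (if (>= n value)
          (recur (- n value) table (conj acc numeral))
          (recur n (rest table) acc))))))

(to-roman 1994) ; => "MCMXCIV"
```

An ordered vector of pairs is used instead of a hash-map because the algorithm depends on visiting the values from largest to smallest.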

Re: Parallel words frequency ranking

2008-12-30 Thread Piotr 'Qertoip' Włodarek
On Dec 29, 7:02 am, Chouser wrote: > > (defn split-string-in-two [s] > >  (let [chunk-size (quot (count s) 2)] > >    [(subs s 0 chunk-size), (subs s chunk-size)])) > > Might this cut a word in half and produce (slightly) incorrect > results? True, I decided to let it be for the sake of simplici
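
If exact counts matter, the cut point can be moved forward until it no longer falls inside a word. This is a sketch of that idea only. The names word-char? and split-string-on-boundary are hypothetical, and a word is assumed to be a run of \w characters, as in the regex used elsewhere in the thread.

```clojure
(defn- word-char?
  "True when c matches the \\w character class."
  [c]
  (boolean (re-matches #"\w" (str c))))

(defn split-string-on-boundary
  "Splits s near its midpoint, advancing the cut until it does not fall
  between two word characters."
  [s]
  (let [n   (count s)
        cut (loop [i (quot n 2)]
              (if (and (< 0 i n)
                       (word-char? (nth s (dec i)))
                       (word-char? (nth s i)))
                (recur (inc i))
                i))]
    [(subs s 0 cut) (subs s cut)]))

(split-string-on-boundary "helloworld foo") ; => ["helloworld" " foo"]
```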

Parallel words frequency ranking

2008-12-28 Thread Piotr 'Qertoip' Włodarek
Following my recent adventure with words ranking, here's the parallel version: (use 'clojure.contrib.duck-streams) (defn top-words-core [s] (reduce #(assoc %1 %2 (inc (%1 %2 0))) {} (re-seq #"\w+" (.toLowerCase s)))) (defn format-words [words] (apply
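
The general shape of such a parallel count is to count each chunk independently and then merge the partial maps. The sketch below is not the code from the post: count-words and parallel-count-words are hypothetical names, and it uses frequencies, which was added to clojure.core after this thread (Clojure 1.2).

```clojure
(defn count-words
  "Returns a map of lower-cased word -> number of occurrences in s."
  [s]
  (frequencies (re-seq #"\w+" (.toLowerCase ^String s))))

(defn parallel-count-words
  "Counts words in each chunk on separate threads via pmap, then sums the
  per-chunk counts with merge-with. Returns nil for an empty chunk list."
  [chunks]
  (apply merge-with + (pmap count-words chunks)))

(parallel-count-words ["a b a" "b c"]) ; => {"a" 2, "b" 2, "c" 1}
```

Because addition is associative and commutative, the order in which the partial maps are merged does not affect the totals.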

Re: Exercise: words frequency ranking

2008-12-27 Thread Piotr 'Qertoip' Włodarek
> Some robustness notes: > > On a 5.2MB file, it takes 9s compared to 7s of the improved Mibu version, or > 7s of my initial one. > > On a 38MB file, it takes 53s and about 270MB of memory. Similarly, the > initial one and the mibu versions take 39s and also about 270MB of > memory. I also like Ipetit c

Re: Exercise: words frequency ranking

2008-12-27 Thread Piotr 'Qertoip' Włodarek
And the nice pastie version: http://pastie.org/347369 regards, Piotrek

Re: Exercise: words frequency ranking

2008-12-27 Thread Piotr 'Qertoip' Włodarek
Thank you for all improvements and suggestions. Based on your feedback, here is my final version: (defn read-words "Given a file, return a seq of every word in the file, normalizing words by converting them to lower case and splitting on whitespace" [in-filepath] (re-seq #"\w+"

Re: Exercise: words frequency ranking

2008-12-26 Thread Piotr 'Qertoip' Włodarek
On Dec 25, 4:58 pm, Mibu wrote: > My version: > > (defn top-words [input-filename result-filename] >   (spit result-filename >         (apply str >                (map #(format "%s : %d\n" (first %) (second %)) >                     (sort-by #(-(val %)) >                              (reduce #(co

Exercise: words frequency ranking

2008-12-25 Thread Piotr 'Qertoip' Włodarek
Given the input text file, the program should write to disk a ranking of words sorted by frequency, like: the : 52483 and : 32558 of : 23477 a : 22486 to : 21993 My first implementation: (defn topwords [in-
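
The task as stated can be sketched with functions from current clojure.core and clojure.string. This is not any of the versions posted in the thread. The names rank-words and top-words are illustrative, words are assumed to be runs of \w characters compared case-insensitively, and ties are broken alphabetically as an arbitrary choice.

```clojure
(require '[clojure.string :as str])

(defn rank-words
  "Returns lines of the form \"word : count\", most frequent word first."
  [text]
  (->> (re-seq #"\w+" (str/lower-case text))
       frequencies
       (sort-by (fn [[word n]] [(- n) word]))   ; count descending, then word
       (map (fn [[word n]] (format "%s : %d" word n)))))

(defn top-words
  "Reads in-path, ranks its words, and writes the ranking to out-path."
  [in-path out-path]
  (spit out-path (str/join "\n" (rank-words (slurp in-path)))))

(rank-words "the cat and the dog and the bird")
;; => ("the : 3" "and : 2" "bird : 1" "cat : 1" "dog : 1")
```

Like the versions discussed in the thread, this reads the whole file into memory with slurp, so it suits files of tens of megabytes rather than arbitrarily large input.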

doseq and dotimes are broken in clojure coming with enclojure

2008-12-23 Thread Piotr 'Qertoip' Włodarek
Just a note, as I couldn't find that info on the web easily. The clojure.jar coming with Netbeans plugin enclojure alpha-1.1076.0 October 22, 2008 apparently has a bug. The doseq and dotimes do not work: user=> (doseq [word ["one" "two" "three"]] (println word)) Throws java.lang.Exception: Una

Re: Exercise: how to print multiplication table?

2008-12-23 Thread Piotr 'Qertoip' Włodarek
Many thanks to all of you. I have several working implementations right now and I'm feeling enlightened ;-) On Dec 22, 5:19 pm, Chouser wrote: > On Mon, Dec 22, 2008 at 10:23 AM, Piotr 'Qertoip' Włodarek > > wrote: > > > (defn multiplication-row [n k]

Simple doseq problem

2008-12-22 Thread Piotr 'Qertoip' Włodarek
user=> (doseq [word ("one" "two" "three")] (println word)) Throws java.lang.Exception: Unable to resolve symbol: word in this context. Could you please give any working example of doseq? I've seen one or two examples on the web but they don't seem to work. Regards, Piotrek
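
For reference, a doseq form that works in current Clojure releases looks like the sketch below. An unquoted list such as ("one" "two" "three") is evaluated as a function call, so the collection has to be a vector or a quoted list.

```clojure
;; Bind each element of a vector to word and print it.
(doseq [word ["one" "two" "three"]]
  (println word))

;; A quoted list works the same way.
(doseq [word '("one" "two" "three")]
  (println word))
```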

Exercise: how to print multiplication table?

2008-12-22 Thread Piotr 'Qertoip' Włodarek
Hello, Being new to Clojure, to Lisp and to functional programming in general, I have some trouble wrapping my head around it. As the first exercise, I would like to print a multiplication table of specified order, like: (print-multiplication-table 3) 1 2 3 2 4 6 3 6 9 I got this far:
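
One possible shape for this exercise separates building the table from printing it. This is a sketch, not one of the implementations posted in the thread. The helper name multiplication-table is made up, while print-multiplication-table follows the call shown in the post.

```clojure
(require '[clojure.string :as str])

(defn multiplication-table
  "Returns the n-by-n table as a seq of rows, each row a seq of products."
  [n]
  (for [row (range 1 (inc n))]
    (for [col (range 1 (inc n))]
      (* row col))))

(defn print-multiplication-table
  "Prints each row on its own line, with entries separated by spaces."
  [n]
  (doseq [row (multiplication-table n)]
    (println (str/join " " row))))

(multiplication-table 3) ; => ((1 2 3) (2 4 6) (3 6 9))
```

Keeping the pure function separate from the printing makes the table easy to check at the REPL without capturing output.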