Hi Steven, My bad. You need to invoke the code using the command
lein run -m rdp.214-intermediate-arr 1 true The `1` tells it to select a certain input file, (in this case the biggest) and the `true` tells it to use the function that internally uses a java array (as opposed to the function that internally uses a transient map.) The above mentioned command takes around 250secs on my laptop. My apologies again, I should have made it clear how to execute the project. On Saturday, 16 May 2015 05:48:43 UTC+5:30, Steven Yi wrote: > > Hi Amith, > > I checked out your project from git and just doing 'lein run' I got a > reported: > > "Elapsed time: 185.651689 msecs" > > However, if I modify the -main function in 214_intermediate.clj to wrap > the time testing with (doseq [_ (range 20)]), to run the test multiple > times, the behavior is much better after the first 9 or so runs, and by the > end it is down to: > > "Elapsed time: 35.574945 msecs" > > I think you might be running into a situation where the VM hasn't run the > functions enough times to optimize. > > Rather than use the time function, you might want to try using > criterium[1] to benchmark the code. The site explains all the wonderful > things it does to get a good benchmark. > > Unfortunately, if your data set is small, and you're running a one off > calculation like this, I don't know if there's much to improve due to the > warmup cost. You could fiddle a bit with VM flags to try to optimize with > less calls (I can't recall the flag, but I think the JVM defaults to > optimizing after a functions been called 10,000 times). On the other hand, > if you're processing larger datasets, I think it's reassuring that once > warmed up, the Clojure code performs pretty well. > > For reference, this was run on an Macbook Pro 13" early 2011, Core i7 > 2.7ghz. > > steven > > [1] - https://github.com/hugoduncan/criterium/ > > On Friday, May 15, 2015 at 3:59:22 AM UTC-4, Amith George wrote: >> >> Thanks for the detailed suggestions. Implementing them did bring the >> execution time down to around 250secs. Though that value is still much >> longer than 45secs. Could you please verify if I have implemented them >> correctly? >> >> Code - >> https://github.com/amithgeorge/reddit-dailyprogrammer-clojure/blob/4ec02600e025d257a59d76fee0ad0eb01f4785ff/src/rdp/214_intermediate_arr.clj >> >> project.clj contains the line >> >> :jvm-opts ^:replace ["-Xms1024m" "-Xmx1g" "-server"] >> >> > 0) Switch to Clojure 1.7.0-beta3 - it's faster and some things below >> are dependent on it for best performance. And use Java 1.8. >> >> Done. >> >> > 1) Parse the lines you're reading directly into longs (Clojure >> focuses on 64-bit primitives - longs and doubles) >> >> (defn- read-line-nums >> ([] (read-line-nums (read-line) #(Long/parseLong %1))) >> ([input-line parse-fn] >> (if-let [line input-line] >> (->> line >> (#(clojure.string/split %1 #" ")) >> (map parse-fn))))) >> >> > 2) Put the longs first into a data structure that preserves the >> primitive type. The two best options for that here are records (which can >> have primitive fields) and arrays. I would create a Canvas defrecord with >> ^long width and height and a Paper defrecord with all ^long fields for >> example. >> >> (defrecord Paper [^long color >> ^long x1 ^long y1 >> ^long x2 ^long y2]) >> >> (defrecord Canvas [^long width ^long height]) >> >> (defn- make-paper >> ([^long w ^long h] (make-paper [0 0 0 w h])) >> ([^longs [color x y w h]] >> (Paper. color x y (+ x w -1) (+ y h -1)))) >> >> I had to make the second arity accept a vector, as it seems impossible to >> create a function accepting more than 4 primitive arguments. Though I >> suppose I could grouped the args into a `record` of its own? >> >> > 3) Store the papers in a vector (using transient to create it) >> >> The `read-input-file` function which creates and stores the papers, it >> finishes within 20ms. So I didn't bother using a transient. >> >> > 4) I suspect visible-color and covered? could probably be tightened >> up into a reduce over papers or a single loop-recur over papers - can't say >> I totally get what's happening there. >> >> The papers vector represents a sheets of papers that have been stacked on >> top of each other. ie, the last element in the vector is the canvas, and >> the first element is the topmost sheet. Each sheet is of different >> dimensions. Depending on the coordinates where they are placed, they may >> cover a part of the canvas. Different sheets may overlap in the areas they >> cover and thus we only want to consider the topmost sheet for that area. >> >> The `visible-color` function accepts a canvas coordinate and the stack >> papers and goes through the stack to find the first sheet that covers said >> coordinate. That papers color will be visible color for that coordinate. >> >> > 5) In visible-color-frequencies, you could use "update" instead of >> get + transient assoc! on the acc map, but this is never going to be >> terribly fast. Another option here would be to create an array with the max >> color (you could track that while reading if it's not a well-known answer) >> and bash the array. That can retain int or long counters and will be *way* >> faster. >> >> (defn- visible-color-frequencies-arr >> [{:keys [colors canvas papers]}] >> (let [colorCounts (long-array (count colors))] >> (reduce >> (fn [_ ^longs coord] >> (if-let [color (visible-color coord papers)] >> (aset-long colorCounts color (+ 1 (aget colorCounts color))) >> _)) >> -1 >> (for [^long y (range (:height canvas)) >> ^long x (range (:width canvas))] >> [x y])) >> (zipmap (range) colorCounts))) >> >> It's a bit messy with me abusing reduce and totally ignoring the >> accumulator, but the version using arrays is atleast 10 seconds faster than >> the one using transient maps. That said, even the hashmap version took >> around 260-270secs. So the bulk of the time savings is caused by some other >> change. >> >> > 6) You can use (set! *unchecked-math* :warn-on-boxed) to get faster >> math (no overflow checks) and also issue warnings (added in 1.7) if you >> happened to use boxed math by accident. >> >> I added these to both the old code (which didn't use type hints) and the >> new one. I ran the `-main` function of both from within the repl and using >> lein run. I didn't see any warnings in any of the executions. How am I >> supposed to use these? >> >> (defn -main >> ([] (-main "0" "false")) >> ([index use-arrays] >> (time >> (binding [*unchecked-math* :warn-on-boxed >> *warn-on-reflection* true] >> (if-not (Boolean/parseBoolean use-arrays) >> (solve (input-files (Integer/parseInt index))) >> (solve-arr (input-files (Integer/parseInt index)))))))) >> >> > Use: ^:jvm-opts ^:replace ["-server"] >> >> In my case there isn't any noticable difference between explicitly using >> '-server' and not using it. Which brings me to my final question, >> >> > The major problem here is that you are using boxed math for >> everything instead of primitives. >> >> From what I understand of the changes I made, a reduction of almost 250 >> seconds came about from simply using type hints. Is that normal? The >> incrementing of color counts in the original code, was that the boxed math >> you were referring to? >> >> In C#, using generic types consistently meant I never had to worry about >> boxing. I tried using `no.disassemble` to verify if the type hints really >> changed anything, but I only got more confused. >> >> Consider this example code where my-val is supposed to be a long and each >> of the functions is supposed to accept a long and return a long. >> >> (-> my-val >> (func-a) >> (func-b) >> (func-c)) >> >> I will have to add typehints to original value as well to each function's >> input and output, right? Can typehints be added to denote a >> collection/sequence of longs? What about a collection/sequence of records? >> >> >> Finally, can the performance be improved any more? 250secs still feels >> too long compared to the C# version. >> > -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.