I was missing type hints on the inner calls to aget in the sum, and changing from aset-double to aset makes it even faster:
(defn sum-fields4 [^"[[D" arr1 ^"[[D" arr2 ^"[[D" result] (let [L (int (alength arr1))] (dotimes [i L] (dotimes [j L] (aset ^doubles (aget result i) j (+ (aget ^doubles (aget arr1 i) j) (aget ^doubles (aget arr2 i) j))))))) It doesn't look like the casts to double aren't necessary though. On my computer on a 1000x1000 array this takes ~30 msec! But strangely replacing the call to aset-double with aset in our fastest gaussian-matrix slows things down a lot. Everything else seems to make sense though. On Sep 21, 11:40 pm, Jason Wolfe <jawo...@berkeley.edu> wrote: > This one is more than 10x faster > > (defn sum-fields3 [^"[[D" arr1 ^"[[D" arr2 ^"[[D" result] > (let [L (int (alength arr1))] > (dotimes [i L] > (let [^doubles a1 (aget arr1 i) > ^doubles a2 (aget arr2 i) > ^doubles r (aget result i)] > (dotimes [j L] > (aset-double r j (+ (double (aget a1 j)) (double (aget a2 > j))))))))) > > Not sure what the problem was exactly with yours. Maybe the multi- > indices versions of aget. > > On Sep 21, 7:43 pm, Ranjit <rjcha...@gmail.com> wrote: > > > I was thinking I needed the Java arrays for interop. At one step in my > > simulation I need to take an FFT of a 2d array, then multiply that > > array with another array, and then inverse FFT. Java array operations > > in Clojure seem so slow though that it will actually be better to > > convert from either a built in data structure or an incanter matrix to > > an array for the fft, then back for the multiplication, then back for > > the IFFT, and then back. > > > The incanter matrices seem like a pretty good choice. This function > > using incanter is about as fast as the imperative type hinted function > > you came up with: > > > (defn gaussian-matrix [L mean std] (matrix (map #(sample-normal %1 > > mean std) (repeat L L)))) > > > ,and adding matrices is fast. Converting from an incanter matrix to an > > array like this: > > > (def B (into-array (map double-array (to-list A)))) > > > takes ~100 msec, and converting back takes a similar amount of time. > > So unfortunately all the converting back and forth means almost a half > > second extra per loop. > > > Not that I want to use this approach anymore, but what other type > > hints could I add to my Java array addition function? The only > > unhinted variables I see left are the indices. > > > Thanks for all your help, > > > -Ranjit > > > On Sep 21, 5:48 pm, Jason Wolfe <jawo...@berkeley.edu> wrote: > > > > I think you're still missing some type hints. I think there are > > > varying degrees of reflective code the compiler can emit, and it > > > doesn't always warn for the intermediate cases. > > > > Do you need to do all this array business for Java interop, or just > > > because you believe it will give you maximum performance? If the > > > latter, I'd recommend trying it with built-in data structures first -- > > > you might be surprised at how good the performance can be. Or, if you > > > want to do lots of matrix-and-vector stuff, perhaps try out the > > > Incanter matrix library or similar Java libraries. > > > > I find writing correctly-hinted Clojure code for dealing with arrays > > > to be clumsier and more difficult than just writing the equivalent > > > Java (one of very few areas where this is the case), but fortunately I > > > only very rarely find the need to do so. But, if that's really what > > > you want to do, I think you should always be able to get speed on par > > > with raw Java. > > > > -Jason > > > > On Sep 21, 1:04 pm, Ranjit <rjcha...@gmail.com> wrote: > > > > > Thanks, I just tried using the random number generator in Incanter as > > > > a replacement and it shaves another ~100 msec off the runtime of the > > > > function. That's better but still noticeably slower than python. > > > > > I also tried applying what I've learned here to writing a function > > > > that adds two arrays which is imperative and I used type hints to get > > > > rid of the reflection warnings: > > > > > (defn sum-fields [^"[[D" arr1 ^"[[D" arr2 ^"[[D" result] > > > > (let [L (alength arr1)] > > > > (dotimes [i L] > > > > (dotimes [j L] > > > > (aset-double ^doubles (aget result i) j (+ (aget arr1 i j) > > > > (aget arr2 i j))))))) > > > > > but it's surprisingly slow compared to numpy again. > > > > > On Sep 21, 2:26 pm, Jason Wolfe <jawo...@berkeley.edu> wrote: > > > > > > FYI I fired up a profiler and more than 2/3 of the runtime of > > > > > gaussian- > > > > > matrix5 is going to the .nextGaussian method of java.util.Random. I > > > > > think there are faster (and higher-quality) drop-in replacements for > > > > > java.util.Random, if that is really important to you. I seem to > > > > > recall there being a good Mersenne Twister implementation around, > > > > > which might fit your bill. > > > > > > -Jason > > > > > > On Sep 21, 6:34 am, Ranjit <rjcha...@gmail.com> wrote: > > > > > > > Yeah, I spoke too soon. All the rows are identical. This is what I > > > > > > meant to do: > > > > > > > (defn gaussian-matrix-for-the-nth-time [L] > > > > > > (into-array (map double-array (repeatedly L #(repeatedly L next- > > > > > > gaussian))))) > > > > > > > and on my computer takes ~800 msecs vs. ~400 msecs for gaussian- > > > > > > matrix5 to make an array of 1000^2 gaussians. > > > > > > > But numpy only takes about 100 msecs to do the same thing on my > > > > > > computer. I'm surprised we can't beat that or at least get close. > > > > > > But > > > > > > maybe next-gaussian is the bottleneck as you say. > > > > > > > On Sep 21, 12:20 am, Jason Wolfe <jawo...@berkeley.edu> wrote: > > > > > > > > On Sep 20, 4:43 pm, Ranjit <rjcha...@gmail.com> wrote: > > > > > > > > > I'm glad you think partition is the problem, because that was > > > > > > > > my guess > > > > > > > > too. But I think I have the answer. This is the fastest version > > > > > > > > I've > > > > > > > > seen so far: > > > > > > > > > (defn gaussian-matrix-final [L] > > > > > > > > (into-array ^doubles (map double-array (repeat L (repeatedly > > > > > > > > L next- > > > > > > > > gaussian))))) > > > > > > > > The ^doubles type hint is wrong (it means array of doubles, not > > > > > > > seq of > > > > > > > arrays of doubles); the compiler is just ignoring it. > > > > > > > > And the reason it's so fast is probably since you're repeating the > > > > > > > same L gaussian values L times here. Doesn't matter how fast it > > > > > > > runs > > > > > > > if it returns the wrong answer :-). Anyway, that might indicate > > > > > > > that > > > > > > > the .nextGaussian is actually the bottleneck in the fastest > > > > > > > versions. > > > > > > > > -Jason > > > > > > > > > If I understand what's going on now, then it looks like the > > > > > > > > only way > > > > > > > > to make this any faster is if next-gaussian could return > > > > > > > > primitives. > > > > > > > > > The for, and doseq macros seems like they're pretty slow. > > > > > > > > > -Ranjit > > > > > > > > > On Sep 20, 3:30 pm, Jason Wolfe <jawo...@berkeley.edu> wrote: > > > > > > > > > > I think partition is slowing you down (but haven't profiled to > > > > > > > > > verify). Here's a functional version that's about 70% as > > > > > > > > > fast as my > > > > > > > > > "5": > > > > > > > > > > (defn gaussian-matrix6 [L] > > > > > > > > > (to-array (for [i (range L)] (into-array Double/TYPE > > > > > > > > > (for [j > > > > > > > > > (range L)] (next-gaussian)))))) > > > > > > > > > > and I'd guess that's about as good as you're going to get, > > > > > > > > > given that > > > > > > > > > this approach is necessarily going to box and unbox the > > > > > > > > > doubles, and > > > > > > > > > create intermediate sequences, rather than stuffing the > > > > > > > > > primitive > > > > > > > > > doubles directly into the result array. > > > > > > > > > > -Jason > > > > > > > > > > On Sep 20, 12:00 pm, Ranjit <rjcha...@gmail.com> wrote: > > > > > > > > > > > Replacing the doseq's with dotimes speeds it up a little > > > > > > > > > > more: > > > > > > > > > > > (defn gaussian-matrix5 [^"[[D" arr] > > > > > > > > > > (dotimes [x (alength arr)] > > > > > > > > > > (dotimes [y (alength (first arr))] > > > > > > > > > > (aset-double ^doubles (aget arr (int x)) (int y) > > > > > > > > > > (next- > > > > > > > > > > gaussian))))) > > > > > > > > > > > but I'm getting reflection warnings on alength. I guess it > > > > > > > > > > doesn't > > > > > > > > > > cause a problem because they're only called once? > > > > > > > > > > > Also adding type hints to the more functional version of my > > > > > > > > > > first > > > > > > > > > > attempt speeds things up quite a bit: > > > > > > > > > > > (defn gaussian-matrix2 [L] > > > > > > > > > > (into-array ^doubles > > > > > > > > > > (map double-array (partition L (repeatedly (* L > > > > > > > > > > L) next- > > > > > > > > > > gaussian))))) > > > > > > > > > > > But it's still about 4x slower than gaussian-matrix5 above. > > > > > > > > > > There must > > > > > > > > > > be a way to improve on the inner loop here that doesn't > > > > > > > > > > require using > > > > > > > > > > indices, right? > > > > > > > > > > > On Sep 20, 12:32 pm, Jason Wolfe <jawo...@berkeley.edu> > > > > > > > > > > wrote: > > > > > > > > > > > > Oops, I found aset-double2 with tab completion and > > > > > > > > > > > figured it was > > > > > > > > > > > build-in. Forgot it was a utility I built some time ago, > > > > > > > > > > > a stub for a > > > > > > > > > > > Java method that does the setting. > > > > > > > > > > > > Also, I got the type hint for the "arr" arg wrong, > > > > > > > > > > > although it didn't > > > > > > > > > > > seem to matter. > > > > > > > > > > > > Here's a fixed version in standard Clojure that's > > > > > > > > > > > basically as fast: > > > > > > > > > > > > user> (defn gaussian-matrix4 [^"[[D" arr ^int L] > > > > > > > > > > > (doseq [x (range L) y (range L)] (aset-double > > > > > > > > > > > ^doubles > > > > > > > > > > > (aget arr (int x)) (int y) (.nextGaussian ^Random r)))) > > > > > > > > > > > #'user/gaussian-matrix4 > > > > > > > > > > > user> (do (microbench (gaussian-matrix3 (make-array > > > > > > > > > > > Double/TYPE 10 > > > > > > > > > > > 10) 10)) (microbench (gaussian-matrix4 (make-array > > > > > > > > > > > Double/TYPE 10 10) > > > > > > > > > > > 10)) ) > > > > > > > > > > > min; avg; max ms: 0.000 ; 0.033 ; 8.837 ( 56828 > > > > > > > > > > > iterations) > > > > > > > > > > > min; avg; max ms: 0.009 ; 0.038 ; 7.132 ( 50579 > > > > > > > > > > > iterations) > > > > > > > > > > > > It seems like you should be able to just use aset-double > > > > > > > > > > > with multiple > > > > > > > > > > > indices (in place of aset-double2), but I can't seem to > > > > > > > > > > > get the type > > > > > > > > > > > hints right. > > > > > > > > > > > > -Jason > > > > > > > > > > > > On Sep 20, 7:36 am, Ranjit <rjcha...@gmail.com> wrote: > > > > > > > > > > > > > Thanks Jason, this is great. > > > > > > > > > > > > > I was confused earlier because I wasn't seeing > > > > > > > > > > > > reflection warnings, > > ... > > read more » -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en