Re: why the big difference in speed?

Jason Wolfe Tue, 21 Sep 2010 14:48:41 -0700

I think you're still missing some type hints.  I think there are
varying degrees of reflective code the compiler can emit, and it
doesn't always warn for the intermediate cases.


Do you need to do all this array business for Java interop, or just
because you believe it will give you maximum performance?  If the
latter, I'd recommend trying it with built-in data structures first --
you might be surprised at how good the performance can be.  Or, if you
want to do lots of matrix-and-vector stuff, perhaps try out the
Incanter matrix library or similar Java libraries.

I find writing correctly-hinted Clojure code for dealing with arrays
to be clumsier and more difficult than just writing the equivalent
Java (one of very few areas where this is the case), but fortunately I
only very rarely find the need to do so.  But, if that's really what
you want to do, I think you should always be able to get speed on par
with raw Java.

-Jason

On Sep 21, 1:04 pm, Ranjit <rjcha...@gmail.com> wrote:
> Thanks, I just tried using the random number generator in Incanter as
> a replacement and it shaves another ~100 msec off the runtime of the
> function. That's better but still noticeably slower than python.
>
> I also tried applying what I've learned here to writing a function
> that adds two arrays which is imperative and I used type hints to get
> rid of the reflection warnings:
>
> (defn sum-fields [^"[[D" arr1 ^"[[D" arr2 ^"[[D" result]
>   (let [L (alength arr1)]
>     (dotimes [i L]
>       (dotimes [j L]
>         (aset-double ^doubles (aget result i) j (+ (aget arr1 i j)
>                              (aget arr2 i j)))))))
>
> but it's surprisingly slow compared to numpy again.
>
> On Sep 21, 2:26 pm, Jason Wolfe <jawo...@berkeley.edu> wrote:
>
> > FYI I fired up a profiler and more than 2/3 of the runtime of gaussian-
> > matrix5 is going to the .nextGaussian method of java.util.Random.  I
> > think there are faster (and higher-quality) drop-in replacements for
> > java.util.Random, if that is really important to you.  I seem to
> > recall there being a good Mersenne Twister implementation around,
> > which might fit your bill.
>
> > -Jason
>
> > On Sep 21, 6:34 am, Ranjit <rjcha...@gmail.com> wrote:
>
> > > Yeah, I spoke too soon. All the rows are identical. This is what I
> > > meant to do:
>
> > > (defn gaussian-matrix-for-the-nth-time [L]
> > >   (into-array (map double-array (repeatedly L #(repeatedly L next-
> > > gaussian)))))
>
> > > and on my computer takes ~800 msecs vs. ~400 msecs for gaussian-
> > > matrix5 to make an array of 1000^2 gaussians.
>
> > > But numpy only takes about 100 msecs to do the same thing on my
> > > computer. I'm surprised we can't beat that or at least get close. But
> > > maybe next-gaussian is the bottleneck as you say.
>
> > > On Sep 21, 12:20 am, Jason Wolfe <jawo...@berkeley.edu> wrote:
>
> > > > On Sep 20, 4:43 pm, Ranjit <rjcha...@gmail.com> wrote:
>
> > > > > I'm glad you think partition is the problem, because that was my guess
> > > > > too. But I think I have the answer. This is the fastest version I've
> > > > > seen so far:
>
> > > > > (defn gaussian-matrix-final [L]
> > > > >   (into-array ^doubles (map double-array (repeat L (repeatedly L next-
> > > > > gaussian)))))
>
> > > > The ^doubles type hint is wrong (it means array of doubles, not seq of
> > > > arrays of doubles); the compiler is just ignoring it.
>
> > > > And the reason it's so fast is probably since you're repeating the
> > > > same L gaussian values L times here.  Doesn't matter how fast it runs
> > > > if it returns the wrong answer :-).   Anyway, that might indicate that
> > > > the .nextGaussian is actually the bottleneck in the fastest versions.
>
> > > > -Jason
>
> > > > > If I understand what's going on now, then it looks like the only way
> > > > > to make this any faster is if next-gaussian could return primitives.
>
> > > > > The for, and doseq macros seems like they're pretty slow.
>
> > > > > -Ranjit
>
> > > > > On Sep 20, 3:30 pm, Jason Wolfe <jawo...@berkeley.edu> wrote:
>
> > > > > > I think partition is slowing you down (but haven't profiled to
> > > > > > verify).  Here's a functional version that's about 70% as fast as my
> > > > > > "5":
>
> > > > > > (defn gaussian-matrix6 [L]
> > > > > >      (to-array (for [i (range L)] (into-array Double/TYPE (for [j
> > > > > > (range L)] (next-gaussian))))))
>
> > > > > > and I'd guess that's about as good as you're going to get, given 
> > > > > > that
> > > > > > this approach is necessarily going to box and unbox the doubles, and
> > > > > > create intermediate sequences, rather than stuffing the primitive
> > > > > > doubles directly into the result array.
>
> > > > > > -Jason
>
> > > > > > On Sep 20, 12:00 pm, Ranjit <rjcha...@gmail.com> wrote:
>
> > > > > > > Replacing the doseq's with dotimes speeds it up a little more:
>
> > > > > > > (defn gaussian-matrix5 [^"[[D" arr]
> > > > > > >   (dotimes [x (alength arr)]
> > > > > > >     (dotimes [y (alength (first arr))]
> > > > > > >       (aset-double ^doubles (aget arr (int x)) (int y) (next-
> > > > > > > gaussian)))))
>
> > > > > > > but I'm getting reflection warnings on alength. I guess it doesn't
> > > > > > > cause a problem because they're only called once?
>
> > > > > > > Also adding type hints to the more functional version of my first
> > > > > > > attempt speeds things up quite a bit:
>
> > > > > > > (defn gaussian-matrix2 [L]
> > > > > > >      (into-array ^doubles
> > > > > > >           (map double-array (partition L (repeatedly (* L L) next-
> > > > > > > gaussian)))))
>
> > > > > > > But it's still about 4x slower than gaussian-matrix5 above. There 
> > > > > > > must
> > > > > > > be a way to improve on the inner loop here that doesn't require 
> > > > > > > using
> > > > > > > indices, right?
>
> > > > > > > On Sep 20, 12:32 pm, Jason Wolfe <jawo...@berkeley.edu> wrote:
>
> > > > > > > > Oops, I found aset-double2 with tab completion and figured it 
> > > > > > > > was
> > > > > > > > build-in.  Forgot it was a utility I built some time ago, a 
> > > > > > > > stub for a
> > > > > > > > Java method that does the setting.
>
> > > > > > > > Also, I got the type hint for the "arr" arg wrong, although it 
> > > > > > > > didn't
> > > > > > > > seem to matter.
>
> > > > > > > > Here's a fixed version in standard Clojure that's basically as 
> > > > > > > > fast:
>
> > > > > > > > user> (defn gaussian-matrix4 [^"[[D" arr ^int L]
> > > > > > > >             (doseq [x (range L) y (range L)] (aset-double 
> > > > > > > > ^doubles
> > > > > > > > (aget arr (int x)) (int y) (.nextGaussian ^Random r))))
> > > > > > > > #'user/gaussian-matrix4
> > > > > > > > user> (do   (microbench (gaussian-matrix3 (make-array 
> > > > > > > > Double/TYPE 10
> > > > > > > > 10) 10)) (microbench (gaussian-matrix4 (make-array Double/TYPE 
> > > > > > > > 10 10)
> > > > > > > > 10)) )
> > > > > > > > min; avg; max ms:  0.000 ; 0.033 ; 8.837    ( 56828  iterations)
> > > > > > > > min; avg; max ms:  0.009 ; 0.038 ; 7.132    ( 50579  iterations)
>
> > > > > > > > It seems like you should be able to just use aset-double with 
> > > > > > > > multiple
> > > > > > > > indices (in place of aset-double2), but I can't seem to get the 
> > > > > > > > type
> > > > > > > > hints right.
>
> > > > > > > > -Jason
>
> > > > > > > > On Sep 20, 7:36 am, Ranjit <rjcha...@gmail.com> wrote:
>
> > > > > > > > > Thanks Jason, this is great.
>
> > > > > > > > > I was confused earlier because I wasn't seeing reflection 
> > > > > > > > > warnings,
> > > > > > > > > but it turns out that was only because I was evaluating the 
> > > > > > > > > function
> > > > > > > > > definitions in the emacs buffer, and the warnings weren't 
> > > > > > > > > visible.
>
> > > > > > > > > I have a question about gaussian-matrix3 though. What is 
> > > > > > > > > "aset-
> > > > > > > > > double2"? Is that a macro that has a type hint for an array of
> > > > > > > > > doubles?
>
> > > > > > > > > Thanks,
>
> > > > > > > > > -Ranjit
> > > > > > > > > On Sep 19, 5:37 pm, Jason Wolfe <jawo...@berkeley.edu> wrote:
>
> > > > > > > > > > Hi Ranjit,
>
> > > > > > > > > > The big perf differences you're seeing are due to 
> > > > > > > > > > reflective calls.
> > > > > > > > > > Getting the Java array bits properly type-hinted is 
> > > > > > > > > > especially tricky,
> > > > > > > > > > since you don't always get good reflection warnings.
>
> > > > > > > > > > Note that aset is only fast for reference types:
>
> > > > > > > > > > user> (doc aset)
> > > > > > > > > > -------------------------
> > > > > > > > > > clojure.core/aset
> > > > > > > > > > ([array idx val] [array idx idx2 & idxv])
> > > > > > > > > >   Sets the value at the index/indices. Works on Java arrays 
> > > > > > > > > > of
> > > > > > > > > >   reference types. Returns val.
>
> > > > > > > > > > So, if you want to speed things up ... here's your starting 
> > > > > > > > > > point:
>
> > > > > > > > > > user> (set! *warn-on-reflection* true)
> > > > > > > > > > true
> > > > > > > > > > user> (import java.util.Random)
> > > > > > > > > > (def r (Random. ))
>
> > > > > > > > > > (defn next-gaussian [] (.nextGaussian r))
>
> > > > > > > > > > (defn gaussian-matrix1 [arr L]
> > > > > > > > > >      (doseq [x (range L) y (range L)] (aset arr x y 
> > > > > > > > > > (next-gaussian))))
>
> > > > > > > > > > (defn gaussian-matrix2 [L]
> > > > > > > > > >      (into-array (map double-array (partition L (repeatedly 
> > > > > > > > > > (* L L)
> > > > > > > > > > next-gaussian)))))
>
> > > > > > > > > > Reflection warning, NO_SOURCE_FILE:1 - reference to field 
> > > > > > > > > > nextGaussian
> > > > > > > > > > can't be resolved.
>
> > > > > > > > > > user> (do  (microbench (gaussian-matrix1 (make-array 
> > > > > > > > > > Double/TYPE 10
> > > > > > > > > > 10) 10)) (microbench (gaussian-matrix2  10)) )
> > > > > > > > > > min; avg; max ms:  2.944 ; 4.693 ; 34.643    ( 424  
> > > > > > > > > > iterations)
> > > > > > > > > > min; avg; max ms:  0.346 ; 0.567 ; 11.006    ( 3491  
> > > > > > > > > > iterations)
>
> > > > > > > > > > ;; Now, we can get rid of the reflection in next-guassian:
>
> > > > > > > > > > user> (defn next-gaussian [] (.nextGaussian #^Random r))
> > > > > > > > > > #'user/next-gaussian
> > > > > > > > > > user> (do  (microbench (gaussian-matrix1 (make-array 
> > > > > > > > > > Double/TYPE 10
> > > > > > > > > > 10) 10)) (microbench (gaussian-matrix2  10)) )
> > > > > > > > > > min; avg; max ms:  2.639 ; 4.194 ; 25.024    ( 475  
> > > > > > > > > > iterations)
> > > > > > > > > > min; avg; max ms:  0.068 ; 0.130 ; 10.766    ( 15104  
> > > > > > > > > > iterations)
> > > > > > > > > > nil
>
> > > > > > > > > > ;; which has cut out the main bottleneck in 
> > > > > > > > > > gaussian-matrix2.
> > > > > > > > > > ;; 1 is still slow because of its array handling.
> > > > > > > > > > ;; here's a fixed version:
>
> > > > > > > > > > user> (defn gaussian-matrix3 [^doubles arr ^int L]
> > > > > > > > > >      (doseq [x (range L) y (range L)] (aset-double2 arr 
> > > > > > > > > > (int x) (int
> > > > > > > > > > y) (.nextGaussian ^Random r))))
> > > > > > > > > > #'user/gaussian-matrix3
>
> > > > > > > > > > user> (do  (microbench (gaussian-matrix1 (make-array 
> > > > > > > > > > Double/TYPE 10
> > > > > > > > > > 10) 10)) (microbench (gaussian-matrix2  10)) (microbench 
> > > > > > > > > > (gaussian-
> > > > > > > > > > matrix3 (make-array Double/TYPE 10 10) 10)) )
> > > > > > > > > > min; avg; max ms:  2.656 ; 4.164 ; 12.752    ( 479  
> > > > > > > > > > iterations)
> > > > > > > > > > min; avg; max ms:
>
> ...
>
> read more »

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: why the big difference in speed?

Reply via email to