Re: Pure-functional N-body benchmark implementation

2009-08-18 Thread Aaron Cohen
On Tue, Aug 18, 2009 at 3:32 PM, Aaron Cohen wrote: > On Tue, Aug 18, 2009 at 11:28 AM, Brad > Beveridge wrote: >> >> On 2009-08-17, at 8:58 PM, FFT wrote: >> >>> On Mon, Aug 17, 2009 at 9:25 AM, Bradbev >>> wrote: Ah, that makes more sense re the "cheating" then.  Your insight for

Re: Pure-functional N-body benchmark implementation

2009-08-18 Thread Aaron Cohen
On Tue, Aug 18, 2009 at 11:28 AM, Brad Beveridge wrote: > > On 2009-08-17, at 8:58 PM, FFT wrote: > >> On Mon, Aug 17, 2009 at 9:25 AM, Bradbev >> wrote: >>> >>> Ah, that makes more sense re the "cheating" then.  Your insight for >>> array range check elimination got me thinking - why can't the >

Re: Pure-functional N-body benchmark implementation

2009-08-18 Thread FFT
On Mon, Aug 17, 2009 at 9:25 AM, Bradbev wrote: > > On Aug 17, 1:32 am, Nicolas Oury wrote: >> I was referring to the rules of the benchmark game. When you benchmark >> language, using another language is not fair. >> >> If you were to do your own program, of course you could use Java. >> However

Re: Pure-functional N-body benchmark implementation

2009-08-18 Thread Brad Beveridge
On 2009-08-17, at 8:58 PM, FFT wrote: > On Mon, Aug 17, 2009 at 9:25 AM, Bradbev > wrote: >> >> On Aug 17, 1:32 am, Nicolas Oury wrote: >>> I was referring to the rules of the benchmark game. When you >>> benchmark >>> language, using another language is not fair. >>> >>> If you were to

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Aaron Cohen
On Mon, Aug 17, 2009 at 7:45 PM, Mark Engelberg wrote: > > On Mon, Aug 17, 2009 at 9:25 AM, Bradbev wrote: >> I found >> another 2-3x speed up by coercing the indexes with (int x), ie >> (defmacro mass [p] `(double (aget ~p (int 0)))) > > Which makes me wonder why aget doesn't automatically coerce

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Mark Engelberg
On Mon, Aug 17, 2009 at 9:25 AM, Bradbev wrote: > I found > another 2-3x speed up by coercing the indexes with (int x), ie > (defmacro mass [p] `(double (aget ~p (int 0)))) Which makes me wonder why aget doesn't automatically coerce an index to an int. Would an input that can't be coerced to an
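
A minimal sketch of the index coercion discussed in this message, assuming each body is a Java double array with its mass in slot 0. That layout and the `total-mass` helper are hypothetical, not taken from the posted benchmark code, and the `^doubles` hint uses the newer reader syntax rather than the `#^doubles` form current when the thread was written.

```clojure
;; Accessor macro in the style quoted above: the index is coerced
;; with (int ...) and the result with (double ...).
;; Assumption: slot 0 of each body's double array holds its mass.
(defmacro mass [p]
  `(double (aget ~p (int 0))))

;; Hypothetical helper: sums the masses of a collection of bodies.
;; The ^doubles hint tells the compiler that b is a double[], so the
;; aget call can be resolved without reflection.
(defn total-mass [bodies]
  (reduce + (map (fn [^doubles b] (mass b)) bodies)))
```

Because `mass` is a macro, the coerced `aget` call is expanded at each call site, so the accessor itself adds no function-call overhead.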

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Nicolas Oury
Seems to mean that I was wrong and that the cost is both in bound check and unpacking the indices, mostly the second one. On Mon, 2009-08-17 at 09:25 -0700, Bradbev wrote: > On Aug 17, 1:32 am, Nicolas Oury wrote: > > I was referring to the rules of the benchmark game. When you benchmark > > la

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Bradbev
On Aug 17, 1:32 am, Nicolas Oury wrote: > I was referring to the rules of the benchmark game. When you benchmark > language, using another language is not fair. > > If you were to do your own program, of course you could use Java. > However, in the particular circumstance, it is a bit annoying to

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread David Nolen
On Sun, Aug 16, 2009 at 6:50 AM, Nicolas Oury wrote: > > Dear all, > > > The good news: I have a version of the N-body benchmark that goes "as > fast as java". > > The bad news: I am cheating a little bit... You're only cheating if you care about the fantasy world that is microbenchmarks. I thin

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Nicolas Oury
On this particular example, I think we are a bit further than what Transients currently offers. Even using a mutable primitive Java array results in code 2 or 3 times slower than the Java implementation of the benchmarks. I have no doubt the struct and transients in Clojure will allow to do that a

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread e
i don't know much about this (haven't followed closely, lately), but do the new Transients come into play to somewhat address this? Sounds like they were designed just for this sort of thing: inner-loop optimization and low-level mutation that still works functionally to everything outside... On
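
For readers unfamiliar with the feature mentioned here, a small sketch of the transient pattern: mutation is confined to one function and callers only ever see a persistent value. The `squares` function is a made-up illustration, not code from this thread, and it assumes a Clojure version that ships `transient`, `conj!` and `persistent!`.

```clojure
;; Builds a vector of the first n squares.
;; Inside the loop the vector is transient and updated in place with
;; conj!; persistent! turns it back into an ordinary immutable vector
;; before it leaves the function.
(defn squares [n]
  (persistent!
    (loop [i 0
           acc (transient [])]
      (if (< i n)
        (recur (inc i) (conj! acc (* i i)))
        acc))))
```

Note that a transient vector still stores boxed numbers, so this addresses allocation of intermediate collections rather than the primitive-arithmetic cost discussed elsewhere in the thread.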

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread David Nolen
On Mon, Aug 17, 2009 at 4:32 AM, Nicolas Oury wrote: > > I was referring to the rules of the benchmark game. When you benchmark > language, using another language is not fair. > > If you were to do your own program, of course you could use Java. > However, in the particular circumstance, it is a b

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Mark Engelberg
Here's what I've learned from following this benchmark thread: From the various things I've read about Clojure's performance, I've always had this sense that: a) if you have a performance problem, there's probably some inner loop that needs to be optimized, and so b) you can use Clojure's type-h

Re: Pure-functional N-body benchmark implementation

2009-08-17 Thread Nicolas Oury
I was referring to the rules of the benchmark game. When you benchmark a language, using another language is not fair. If you were to write your own program, of course you could use Java. However, in the particular circumstance, it is a bit annoying to use Java just to create a data structure type. B

Re: Pure-functional N-body benchmark implementation

2009-08-16 Thread Bradbev
> > Why can't we write programs in Clojure and > drop down to Java if necessary? That's what I find funny about these threads: Clojure's Java interop is good, and Java is easy to write performant code in. There is a clear path to getting the best JVM performance possible from a Clojure environment.

Re: Pure-functional N-body benchmark implementation

2009-08-16 Thread Meikel Brandmeyer
Hi, On 16.08.2009 at 12:50, Nicolas Oury wrote: The bad news: I am cheating a little bit... Why is this cheating? People wrote programs in C and dropped down to Assembly if necessary. People write programs in Python and drop down to C if necessary. Why can't we write programs in Clojure and

Re: Pure-functional N-body benchmark implementation

2009-08-16 Thread Nicolas Oury
Dear all, The good news: I have a version of the N-body benchmark that goes "as fast as java". The bad news: I am cheating a little bit... As I suspected that a lot of time was spent in the array bound check arithmetic, I replaced #^doubles in the implementation of body by an object implemente

Re: Pure-functional N-body benchmark implementation

2009-08-13 Thread Nicolas Oury
-XX:+AggressiveOpts improves another 5-10%. EscapeAnalysis seems more important than BiasedLocking. I don't have a disassembling module installed. Could someone use the PrintAssembly option and put the asm for the JITed method somewhere? It could be interesting to see it side by side with the

Re: Pure-functional N-body benchmark implementation

2009-08-12 Thread Aaron Cohen
On Wed, Aug 12, 2009 at 4:49 PM, Aaron Cohen wrote: > I'm getting a very significant performance improvement by adding a > couple of JVM parameters (using jdk 1.6.0_14).  They are: > -XX:+DoEscapeAnalysis > -XX:+UseBiasedLocking (I think the -server flag is required for those > two flags to do any

Re: Pure-functional N-body benchmark implementation

2009-08-12 Thread Aaron Cohen
I'm getting a very significant performance improvement by adding a couple of JVM parameters (using jdk 1.6.0_14). They are: -XX:+DoEscapeAnalysis -XX:+UseBiasedLocking (I think the -server flag is required for those two flags to do anything). My runtime with n = 5,000,000 goes from ~7.5 seconds

Re: Pure-functional N-body benchmark implementation

2009-08-12 Thread Nicolas Oury
Hello, I tried to inline everything in the main loop (the updater loops) and obtained a 15% speed-up on my machine. One possible source of slowdown is having arrays rather than objects. Maybe each access needs to perform a size check on the array, which is not very costly but not negligible

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Aaron Cohen
On Tue, Aug 11, 2009 at 8:13 PM, Andy Fingerhut wrote: > > On Aug 11, 2:36 pm, Aaron Cohen wrote: >> At that point is it possible you're just paying the price of >> PersistentVector for the "bodies" vector?  Does it improve much if you >> change bodies to an array? > > About 7% faster changing bo

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Andy Fingerhut
On Aug 11, 2:36 pm, Aaron Cohen wrote: > At that point is it possible you're just paying the price of > PersistentVector for the "bodies" vector?  Does it improve much if you > change bodies to an array? About 7% faster changing bodies to a Java array of java.lang.Object's, each of which happens

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 11, 2:26 pm, Andy Fingerhut wrote: > As always, suggestions or improved versions are welcome. I noticed that when I wrap ~new-mass in (double ...) in this (defmacro set-mass! [p new-mass] `(aset ~p 0 ~new-mass)) and other setters, I get warnings.

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Aaron Cohen
On Tue, Aug 11, 2009 at 5:26 PM, Andy Fingerhut wrote: > > In case it matters to anyone, my intent in creating these Clojure > programs to compare their speed to others isn't to try to rip into > Clojure, or start arguments.  It is for me to get my feet wet with > Clojure, and perhaps produce some

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Andy Fingerhut
In case it matters to anyone, my intent in creating these Clojure programs to compare their speed to others isn't to try to rip into Clojure, or start arguments. It is for me to get my feet wet with Clojure, and perhaps produce some examples that others can learn from on what performs well in Clo

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Jonathan Smith
On Aug 11, 2:43 pm, fft1976 wrote: > On Aug 11, 4:50 am, Jonathan Smith wrote: > > > I don't think you have to put *everything* in the let, just your > > constants. (so days per year and solar mass, the bodies themselves). > > How will they "escape" from the LET though? I see that in your code

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 11, 4:50 am, Jonathan Smith wrote: > I don't think you have to put *everything* in the let, just your > constants. (so days per year and solar mass, the bodies themselves). How will they "escape" from the LET though? I see that in your code everything is inside a LET. That's what I tried

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Jonathan Smith
On Aug 11, 4:42 am, fft1976 wrote: > On Aug 10, 11:42 pm, Jonathan Smith > wrote: > > > The way your code is setup, you will spend a lot of time in funcall > > overhead just because you used a lot of functions instead of doing the > > calculation in bigger chunks. > > I thought, as I understoo

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 11, 2:25 am, fft1976 wrote: > Hmmm > > I just ran your version #8, and it's almost as slow as mine > (nbody_v2.clj): 53 times slower than Java, but I'm running Clojure 1.0 > and Strike that. I f'ed up the namespaces and was actually measuring my own version. Yours is 8x slower than Ja

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 11, 12:39 am, Andy Fingerhut wrote: > On Aug 10, 11:50 pm, Christophe Grand wrote: > > > Hi Andy, > > > On Tue, Aug 11, 2009 at 8:15 AM, Andy Fingerhut < > > > andy_finger...@alum.wustl.edu> wrote: > > > I've tried an approach like you suggest, using mutable Java arrays of > > > doubles

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 11, 12:39 am, Andy Fingerhut wrote: > > http://github.com/jafingerhut/clojure-benchmarks/blob/9dc56d8ff53f0b8... > Why isn't the array-using version as fast as Java? Isn't using Java's data structures, mutation and no reflection supposed to be equivalent to using Java?

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread fft1976
On Aug 10, 11:42 pm, Jonathan Smith wrote: > The way your code is setup, you will spend a lot of time in funcall > overhead just because you used a lot of functions instead of doing the > calculation in bigger chunks. I thought, as I understood from Rich's lectures, JVM inlines whatever it want

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Mark Engelberg
On Tue, Aug 11, 2009 at 12:39 AM, Andy Fingerhut wrote: > Wow, you ain't kiddin.  I changed about 10 lines from my last version, > to avoid using aset-double, using aset and type hints until the > reflection warnings went away, and it sped up by a factor of 10.  I'm > leaving the previous version'

Re: Pure-functional N-body benchmark implementation

2009-08-11 Thread Andy Fingerhut
On Aug 10, 11:50 pm, Christophe Grand wrote: > Hi Andy, > > On Tue, Aug 11, 2009 at 8:15 AM, Andy Fingerhut < > > andy_finger...@alum.wustl.edu> wrote: > > I've tried an approach like you suggest, using mutable Java arrays of > > doubles, macros using aget / aset-double for reading and writing th

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Christophe Grand
Hi Andy, On Tue, Aug 11, 2009 at 8:15 AM, Andy Fingerhut < andy_finger...@alum.wustl.edu> wrote: > I've tried an approach like you suggest, using mutable Java arrays of > doubles, macros using aget / aset-double for reading and writing these > arrays, and loop/recur everywhere iteration is needed

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Andy Fingerhut
On Aug 10, 11:33 pm, Mark Engelberg wrote: > On Mon, Aug 10, 2009 at 11:15 PM, Andy > > Fingerhut wrote: > > I suspect I'm doing something wrong in my mutable Java array > > implementation, but I don't see what it could be. > > There still seems to be a lot of boxing and unboxing going on.  For e

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Jonathan Smith
On Aug 10, 11:08 pm, fft1976 wrote: > On Aug 10, 2:19 pm, Jonathan Smith wrote: > > > 1.) use something mutable > > 2.) unroll all the loops (mapping is a loop) > > 3.) try not to coerce between seq/vec/hash-map too much. > > Are you saying this w.r.t. my code or in general? If the former, be

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Mark Engelberg
On Mon, Aug 10, 2009 at 11:15 PM, Andy Fingerhut wrote: > I suspect I'm doing something wrong in my mutable Java array > implementation, but I don't see what it could be. There still seems to be a lot of boxing and unboxing going on. For example, in: (let [[momx momy momz] (offset-momentum bodie

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Andy Fingerhut
On Aug 10, 5:57 pm, Mark Engelberg wrote: > Andy, > > My understanding is that any double that gets stored in a vector or > map is boxed, and therefore, the vast majority of your double > conversions aren't really doing anything, because when you pull them > out of the vector or map, they'll just

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread fft1976
On Aug 10, 2:19 pm, Jonathan Smith wrote: > 1.) use something mutable > 2.) unroll all the loops (mapping is a loop) > 3.) try not to coerce between seq/vec/hash-map too much. Are you saying this w.r.t. my code or in general? If the former, be specific, better yet, show us your code. I avoided (

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread fft1976
On Aug 10, 5:15 pm, Andy Fingerhut wrote: > OK, I've got a new Clojure program for the n-body benchmark, and it is > significantly faster than my previous one -- down from 138 x Java run > time, to 37 x Java run time.  Still room for improvement somewhere > there, I'm sure, including perhaps us

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Mark Engelberg
Andy, My understanding is that any double that gets stored in a vector or map is boxed, and therefore, the vast majority of your double conversions aren't really doing anything, because when you pull them out of the vector or map, they'll just be Double objects again. I believe that the biggest
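
The boxing point made in this message can be checked directly. This is a rough sketch with invented names (`v`, `a`, `sum-array`), assuming a Clojure version that accepts the `^doubles` hint syntax; it is not code from the benchmark programs under discussion.

```clojure
;; A persistent vector holds object references, so each element
;; comes back as a boxed java.lang.Double.
(def v [1.0 2.0 3.0])

;; A Java double[] holds unboxed primitive doubles.
(def a (double-array [1.0 2.0 3.0]))

;; Sums a double[] with areduce; the ^doubles hint lets aget be
;; resolved against double[] instead of going through reflection.
(defn sum-array [^doubles xs]
  (areduce xs i acc 0.0 (+ acc (aget xs i))))
```

Evaluating `(class (nth v 0))` shows the boxed element type, while `(class a)` shows the primitive array class, which is why coercions applied before storing into a vector do not survive the round trip.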

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Isaac Gouy
On Aug 10, 3:00 pm, Andy Fingerhut wrote: > On Aug 10, 2:19 pm, Jonathan Smith wrote: > > > 1.) use something mutable > > 2.) unroll all the loops (mapping is a loop) > > 3.) try not to coerce between seq/vec/hash-map too much. > > > in real world, stuff like the shootout is pretty useless, as g

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Andy Fingerhut
OK, I've got a new Clojure program for the n-body benchmark, and it is significantly faster than my previous one -- down from 138 x Java run time, to 37 x Java run time. Still room for improvement somewhere there, I'm sure, including perhaps using Java arrays instead of Clojure vectors. http://g

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Andy Fingerhut
On Aug 10, 2:19 pm, Jonathan Smith wrote: > 1.) use something mutable > 2.) unroll all the loops (mapping is a loop) > 3.) try not to coerce between seq/vec/hash-map too much. > > in real world, stuff like the shootout is pretty useless, as generally > you'd reach for a better algorithm rather th

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Jonathan Smith
1.) use something mutable 2.) unroll all the loops (mapping is a loop) 3.) try not to coerce between seq/vec/hash-map too much. in real world, stuff like the shootout is pretty useless, as generally you'd reach for a better algorithm rather than implementing the shackled, crippled, naive algorith

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Andy Fingerhut
On Aug 10, 11:35 am, fft1976 wrote: > On Aug 10, 4:46 am, Jarkko Oranen wrote: > > > I'm not going to start optimising, > > Somebody'd better! > > You always hear this dogma that one should write "elegant" code first > and optimize later, and when you do that, a few little changes can > make Clo

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread fft1976
On Aug 10, 4:46 am, Jarkko Oranen wrote: > I'm not going to start optimising, Somebody'd better! You always hear this dogma that one should write "elegant" code first and optimize later, and when you do that, a few little changes can make Clojure as fast as Java. Here's your chance to show it

Re: Pure-functional N-body benchmark implementation

2009-08-10 Thread Jarkko Oranen
On Aug 10, 12:41 pm, fft1976 wrote: > I just uploaded to the group an implementation of the n-body benchmark > in Clojure (see nbody_init.clj) > > http://shootout.alioth.debian.org/u32/benchmark.php?test=nbody&lang=j... > > My goal was to write a pure-functional version and to avoid any micro- > opti

Pure-functional N-body benchmark implementation

2009-08-10 Thread fft1976
I just uploaded to the group an implementation of the n-body benchmark in Clojure (see nbody_init.clj) http://shootout.alioth.debian.org/u32/benchmark.php?test=nbody&lang=java&box=1 My goal was to write a pure-functional version and to avoid any micro-optimizations. There are no type declaratio