Even without typed racket, this: (define (my-sqrt y) (let loop [(x (/ y 2.0))] (let [(error (- y (* x x)))] (if (< (abs error) epsilon) x (loop (+ x (/ error 2 x)))))))
is much faster than this: (define (my-sqrt y) (let loop [(x (/ y 2))] (let [(error (- y (* x x)))] (if (< (abs error) epsilon) x (loop (+ x (/ error 2 x))))))) Robby On Mon, May 9, 2011 at 9:34 PM, Matthias Felleisen <matth...@ccs.neu.edu> wrote: > > [Sam, Vincent: I have no clue how TR can beat R by two orders of magnitude. > See end of message.] > I wrote a little function to test your two implementations of sqrt on lists > of random numbers: > (define (time-and-compare LARGE) > (define n (build-list LARGE (lambda (_) (random 10)))) > (collect-garbage) (collect-garbage) (collect-garbage) > (define m (time (for-each approved-sqrt n))) > (collect-garbage) (collect-garbage) (collect-garbage) > (define l (time (for-each the-sqrt n))) > (void)) > As you can see below, approved-sqrt is consistently faster on 'short' lists > (100 numbers) than my-sqrt; on 'long' lists (10,000 numbers) the opposite is > true. > > On May 9, 2011, at 7:38 PM, Patrick King wrote: > > Question 1: Is my use of the temporary "error" worth it in terms of > performance (the approved solution calculates (* x x) twice)? If it's not > worth it, how big does the calculation have to become before it does? > > It depends on where in the inner loop your calculations are and how many > your program must perform. See below. > > Can the compiler make such a temporary live only in a register? > > In principle yes. I am not sure about our JIT. > > Question 1a: How creeped are you by my changing the meaning of error for > these few lines? I find myself doing stuff like this, in the name of my idea > of readability, a lot. > > Not one bit. I tend to agree with readability, though I catch myself > thinking in the back of my head that this is also good for performance. (It > shouldn't make a difference.) > > Question 2: The approved solution favored (/ ... (+ x x)) over my (/ ... 2 > x). Cost of division versus readability. Wasn't a conscious decision of mine > at the time, but now I'm curious. How smart is the compiler when presented > with a division by 2? With multiple recalls of the same value? > > The two versions differ in two regards: > 1. Your version flips the +/- calculations on the errors. (I eliminated > this, the performance results stay the same.) > 2. You also replace an addition for duplication with a multiplication. A > compiler would go the other way around (and I may do so too, and think how > silly I am). > I find myself unable to explain the differences in performance. Someone > should probably look at the assembly and check what's going on. > > Here are five measurements for lists of 10,000 numbers: > >> (time-and-compare 10000) > > cpu time: 1196 real time: 1223 gc time: 26 > > cpu time: 1177 real time: 1229 gc time: 27 > >> (time-and-compare 10000) > > cpu time: 1192 real time: 1242 gc time: 27 > > cpu time: 1172 real time: 1250 gc time: 27 > >> (time-and-compare 10000) > > cpu time: 1197 real time: 1308 gc time: 28 > > cpu time: 1172 real time: 1215 gc time: 27 > >> (time-and-compare 10000) > > cpu time: 1198 real time: 1230 gc time: 29 > > cpu time: 1186 real time: 1286 gc time: 29 > >> (time-and-compare 10000) > > cpu time: 1208 real time: 1289 gc time: 26 > > cpu time: 1194 real time: 1303 gc time: 28 > > Now let's look at small values: > >> (time-and-compare 100) > > cpu time: 27 real time: 28 gc time: 0 > > cpu time: 33 real time: 40 gc time: 0 > >> (time-and-compare 100) > > cpu time: 28 real time: 29 gc time: 0 > > cpu time: 33 real time: 34 gc time: 0 > >> (time-and-compare 100) > > cpu time: 27 real time: 30 gc time: 0 > > cpu time: 32 real time: 38 gc time: 0 > >> (time-and-compare 100) > > cpu time: 30 real time: 31 gc time: 0 > > cpu time: 32 real time: 33 gc time: 0 > >> (time-and-compare 100) > > cpu time: 26 real time: 27 gc time: 0 > > cpu time: 34 real time: 33 gc time: 0 > > Now here is the scary part. Typed Racket is a TON faster and it was nearly > no work to get types to work: > >> (time-and-compare 10000) > cpu time: 9 real time: 9 gc time: 0 > cpu time: 17 real time: 18 gc time: 0 > - : Boolean > #t >> (time-and-compare 10000) > cpu time: 24 real time: 34 gc time: 0 > cpu time: 17 real time: 18 gc time: 0 > - : Boolean > #t >> (time-and-compare 10000) > cpu time: 13 real time: 14 gc time: 0 > cpu time: 28 real time: 37 gc time: 0 > - : Boolean > #t >> (time-and-compare 10000) > cpu time: 27 real time: 28 gc time: 0 > cpu time: 17 real time: 18 gc time: 0 > - : Boolean > #t >> (time-and-compare 10000) > cpu time: 24 real time: 29 gc time: 0 > cpu time: 17 real time: 18 gc time: 0 > - : Boolean > > > ;; > ----------------------------------------------------------------------------- > #lang typed/racket > (define epsilon 1e-7) > (: approved-sqrt : Natural -> Real) > (define (approved-sqrt y) > (let loop [(x (/ y 2.0))] > (if (< (abs (- (* x x) y)) epsilon) > x > (loop (- x (/ (- (* x x) y) (+ x x))))))) > (: my-sqrt : Natural -> Real) > (define (my-sqrt y) > (let loop [(x (/ y 2.0))] > (let [(error (- y (* x x)))] > (if (< (abs error) epsilon) > x > (loop (+ x (/ error 2 x))))))) > (: time-and-compare : Natural -> Boolean) > (define (time-and-compare LARGE) > (define n (build-list LARGE (lambda (_) (random 10)))) > (collect-garbage) (collect-garbage) (collect-garbage) > (define m (time (map approved-sqrt n))) > (collect-garbage) (collect-garbage) (collect-garbage) > (define l (time (map my-sqrt n))) > (andmap = m l)) > (time-and-compare 1000) > > > > _________________________________________________ > For list-related administrative tasks: > http://lists.racket-lang.org/listinfo/users > _________________________________________________ For list-related administrative tasks: http://lists.racket-lang.org/listinfo/users