Is it possible that the C compiler can't inline fabs, or just isn't inlining it?
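
To make the question concrete, here is a minimal sketch of the kind of loop I have in mind. It is not the code from the gist: the name newton_sqrt, the starting guess, the tolerance, and the iteration cap are all my own assumptions for illustration. The point is only that fabs sits in the convergence test of the inner loop, so a real function call there (instead of an inlined bit-mask) could plausibly matter at this time scale.

```c
#include <math.h>

/* Hypothetical stand-in for the benchmarked routine (not the gist's code).
 * Newton's method for sqrt(x): repeat guess <- (guess + x/guess) / 2
 * until two successive guesses agree to a relative tolerance. */
double newton_sqrt(double x)
{
    /* Assumed starting point: at or above sqrt(x) for any x >= 0. */
    double guess = x > 1.0 ? x : 1.0;

    for (int i = 0; i < 100; i++) {   /* assumed iteration cap */
        double next = 0.5 * (guess + x / guess);

        /* fabs is on the hot path here; if it is not inlined,
         * every iteration pays for a call into libm. */
        if (fabs(next - guess) <= 1e-12 * next) {
            return next;
        }
        guess = next;
    }
    return guess;
}
```

One way to check would be to compile with something like gcc -O2 -S and look at the generated assembly for a call to fabs in the loop. As far as I know gcc normally treats fabs as a builtin, so a call showing up there would suggest something (for example -fno-builtin) is getting in the way.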
> On Jan 27, 2015, at 5:39 PM, Miles Lubin <[email protected]> wrote:
>
> I'm working on a microbenchmark and found a surprising result (maybe not?)
> that Julia is about 2x faster than the same algorithm hand-coded in C. I
> wanted to check if I'm doing anything obviously wrong here before reporting
> these results. The timings reproduce across different systems and compiler
> options (clang/gcc -O2/-O3).
>
> The test is just to compute square root using newton's method. The relevant
> code is in this gist: https://gist.github.com/mlubin/4994c65c7a2fa90a3c7e.
>
> On Julia 0.3.5, each function call takes 8.85*10^-8 seconds. The best timing
> I've seen from C is 1.61*10^-7 using gcc -O2 -march=native.
>
> I did my best to check for common mistakes:
> - Julia and C use the exact same timing routine with 10,000 repetitions
> - Both give the correct answer, and the important code isn't being optimized
>   away.
>
> Any ideas as to why Julia is faster on this very simple code? I know that
> performance comparisons with runtimes on the order of nanoseconds are
> probably not too meaningful, but people still like absolute numbers, and it's
> a bit surprising that I can't match the performance of Julia from C.
>
> Thanks,
> Miles
