And here is my guess at how it's arriving at that optimization:

Step 0: base code
    x := 2.5
    res := 0.0
    for n := 0; n < NMAX; n++ {
        for i := 0; i < 4; i++ {
            res += math.Pow(x, float64(i))
        }
    }

Step 1: unroll the inner loop

    x := 2.5
    res := 0.0
    for n := 0; n < NMAX; n++ {
        res += math.Pow(x, float64(0))
        res += math.Pow(x, float64(1))
        res += math.Pow(x, float64(2))
        res += math.Pow(x, float64(3))
    }

Step 2: detect that x doesn't change and inline it

    res := 0.0
    for n := 0; n < NMAX; n++ {
        res += math.Pow(2.5, float64(0))
        res += math.Pow(2.5, float64(1))
        res += math.Pow(2.5, float64(2))
        res += math.Pow(2.5, float64(3))
    }

Step 3: evaluate the now-constant calls to Pow

    res := 0.0
    for n := 0; n < NMAX; n++ {
        res += 1.0
        res += 2.5
        res += 6.25
        res += 15.625
    }

On Friday, 4 August 2017 11:35:43 UTC+3, Egon wrote:
>
> Use the Assembly Luke.
>
> https://godbolt.org/g/nGFMbf
>
> It looks like clang manages to compute a table of powers of x. Which, is
> very very impressive.
>
> Which is roughly https://play.golang.org/p/CZkiJKfe7s -- except clang,
> also does inner loop unrolling.
>
> + Egon
>
> On Friday, 4 August 2017 11:16:10 UTC+3, Dorival Pedroso wrote:
>>
>> wait, what?!
>>
>> the same code:
>>
>> #include "stdio.h"
>> #include "math.h"
>> int main() {
>>     double res = 0.0;
>>     double x = 2.5;
>>     int Nmax = 10000000;
>>     for (int N=0; N<Nmax; N++) {
>>         for (int i=0; i<20; i++) {
>>             res += pow(x, i);
>>         }
>>     }
>>     printf("res = %g\n", res);
>> }
>>
>> compiled with Clang is much faster?
>>
>> yes, I get:
>>
>> gcc -O2 c-code-sum.c -o c-code-sum_GCC -lm
>> time ./c-code-sum_GCC
>> res = 6.0633e+14
>>
>> real 0m8.210s
>> user 0m8.208s
>> sys 0m0.000s
>>
>> and:
>>
>> clang -O2 c-code-sum.c -o c-code-sum_CLANG -lm
>> time ./c-code-sum_CLANG
>> res = 6.0633e+14
>>
>> real 0m0.157s
>> user 0m0.156s
>> sys 0m0.004s
>>
>> So the question might be: what is CLANG doing? (and how to do this with
>> Go?...)
>>
>>
>> On Friday, August 4, 2017 at 6:04:01 PM UTC+10, Sebastien Binet wrote:
>>>
>>>
>>>
>>> On Fri, Aug 4, 2017 at 9:51 AM, Henrik Johansson <dahan...@gmail.com>
>>> wrote:
>>>
>>>> Actually I get the same as the original program on my mac.
>>>>
>>>> time ./ccode
>>>> sum=606329794183272.375000
>>>> ./ccode 0.17s user 0.00s system 98% cpu 0.170 total
>>>>
>>>> The Go version -O2 -Wall
>>>> time ./pow
>>>> sum=6.063297941832724e+14./pow 5.47s user 0.01s system 99% cpu 5.490
>>>> total
>>>>
>>>
>>> interesting :)
>>>
>>> I have:
>>>
>>> $> gcc --version
>>> gcc (GCC) 7.1.1 20170630
>>> Copyright (C) 2017 Free Software Foundation, Inc.
>>> This is free software; see the source for copying conditions. There is
>>> NO
>>> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
>>> PURPOSE.
>>>
>>> also interestingly, with clang:
>>> $> clang --version
>>> clang version 4.0.1 (tags/RELEASE_401/final)
>>> Target: x86_64-unknown-linux-gnu
>>> Thread model: posix
>>> InstalledDir: /usr/bin
>>>
>>> $> time ./c-code
>>>
>>> real 0m0.001s
>>> user 0m0.001s
>>> sys 0m0.000s
>>>
>>> $> time ./c-code-sum
>>> sum=606329794183272.375000
>>>
>>> real 0m0.228s
>>> user 0m0.227s
>>> sys 0m0.000s
>>>
>>> I haven't looked at the assembly to see what's really going on for the
>>> GCC version...
>>>
>>> -s
>>>
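As for "how to do this with Go": the timings quoted above suggest the Go compiler does not do this transformation on its own, so one option is to hoist the Pow calls by hand. Below is a minimal sketch of that idea, not a copy of the playground link. The names naive, hoisted, nmax and terms are made up for illustration, and nmax is kept far smaller than the 10000000 used in the thread so that the comparison runs quickly.

```go
package main

import (
	"fmt"
	"math"
)

const (
	nmax  = 1000 // hypothetical small outer count; the thread uses 10000000
	terms = 20   // same inner bound as the C program in the thread
)

// naive calls math.Pow inside the inner loop on every outer iteration.
func naive(x float64) float64 {
	res := 0.0
	for n := 0; n < nmax; n++ {
		for i := 0; i < terms; i++ {
			res += math.Pow(x, float64(i))
		}
	}
	return res
}

// hoisted computes each power of x once, stores it in a table, and only
// adds table entries inside the hot loop.
func hoisted(x float64) float64 {
	var pows [terms]float64
	for i := range pows {
		pows[i] = math.Pow(x, float64(i))
	}
	res := 0.0
	for n := 0; n < nmax; n++ {
		for _, p := range pows {
			res += p
		}
	}
	return res
}

func main() {
	fmt.Println("naive:  ", naive(2.5))
	fmt.Println("hoisted:", hoisted(2.5))
}
```

Both functions add the same values in the same order, so they should give the same result; only the number of Pow calls differs (nmax*terms versus terms). Collapsing the additions further into a single multiplication would change the floating-point rounding, which is presumably why a compiler leaves the additions in place without something like -ffast-math.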