Thanks, Peter. I actually don't mind a larger code if the performance can be improved. Also, we probably do computations more often with smaller integers (at least in my applications); that's why I overlooked the case greater than 10...
But now I'm using your (+Michael Jones and others) suggestions: // PowP computes real raised to positive integer xⁿ func PowP(x float64, n uint32) (r float64) { if n == 0 { return 1.0 } if n == 1 { return x } if n == 2 { return x * x } if n == 3 { return x * x * x } if n == 4 { r = x * x return r * r } if n == 5 { r = x * x return r * r * x } if n == 6 { r = x * x * x return r * r } if n == 7 { r = x * x * x return r * r * x } if n == 8 { r = x * x * x * x return r * r } if n == 9 { r = x * x * x return r * r * r } if n == 10 { r = x * x * x return r * r * r * x } r = 1.0 for i := n; i > 0; i >>= 1 { if i&1 == 1 { r *= x } x *= x } return } The output of the benchmark is: BenchmarkPowP10-32 50000000 26.4 ns/op BenchmarkPowP10std-32 5000000 245 ns/op BenchmarkPowP20-32 20000000 79.2 ns/op BenchmarkPowP20std-32 3000000 556 ns/op BenchmarkPowP50-32 5000000 255 ns/op BenchmarkPowP50std-32 1000000 1566 ns/op BenchmarkPowP100-32 2000000 616 ns/op BenchmarkPowP100std-32 500000 3594 ns/op BenchmarkPowP200-32 1000000 1365 ns/op BenchmarkPowP200std-32 200000 9184 ns/op So, the speedups for each N are (more or less): N=10: 9x N=20: 7x N=50: 6x N=100: 5x N=200: 6x (files here: https://gist.github.com/cpmech/9f871df5b59fa8407221b0d3fb361a3d) I'm thinking about looking at Go-ASM to see if I can improve this further. I may try to optimize the math.Sin() function too... Thanks also to Michael Jones. On Friday, August 4, 2017 at 10:36:16 PM UTC+10, peterGo wrote: > > Dorival Pedroso, > > PowP has a lot of code. PowG is simpler and it is modestly slower for > small values of n and much faster for larger values of n. > > func PowG(x float64, n uint32) float64 { > y := 1.0 > for i := n; i > 0; i >>= 1 { > if i&1 == 1 { > y *= x > } > x *= x > } > return y > } > > > BenchmarkPowP1 1000000000 2.89 ns/op > BenchmarkPowP5 100000000 13.6 ns/op > BenchmarkPowP10 50000000 29.2 ns/op > BenchmarkPowP20 20000000 89.6 ns/op > BenchmarkPowP200 200000 11666 ns/op > BenchmarkPowG1 1000000000 2.66 ns/op > BenchmarkPowG5 100000000 17.9 ns/op > BenchmarkPowG10 30000000 45.1 ns/op > BenchmarkPowG20 20000000 97.7 ns/op > BenchmarkPowG200 300000 3972 ns/op > > Peter > > On Friday, August 4, 2017 at 10:08:36 AM UTC, Dorival Pedroso wrote: >> >> Since my ASM skills are limited, for positive integers, I'm planning on >> using: >> >> // PowP computes real raised to positive integer xⁿ >> func PowP(x float64, n uint32) (r float64) { >> if n == 0 { >> return 1.0 >> } >> if n == 1 { >> return x >> } >> if n == 2 { >> return x * x >> } >> if n == 3 { >> return x * x * x >> } >> if n == 4 { >> r = x * x >> return r * r >> } >> if n == 5 { >> r = x * x >> return r * r * x >> } >> if n == 6 { >> r = x * x * x >> return r * r >> } >> if n == 7 { >> r = x * x * x >> return r * r * x >> } >> if n == 8 { >> r = x * x * x * x >> return r * r >> } >> if n == 9 { >> r = x * x * x >> return r * r * r >> } >> if n == 10 { >> r = x * x * x >> return r * r * r * x >> } >> r = x * x * x >> r = r * r * r * x >> var i uint32 >> for i = 11; i <= 20; i++ { >> r *= x >> if n == i { >> return >> } >> } >> return math.Pow(x, float64(n)) >> } >> >> which gives a speedup of 5-8 times for small positive integers (<20). the >> case of negative integers could be handled too. >> >> output: >> BenchmarkPowP10-32 50000000 30.4 ns/op >> BenchmarkPowP10std-32 5000000 247 ns/op >> BenchmarkPowP20-32 20000000 98.2 ns/op >> BenchmarkPowP20std-32 3000000 561 ns/op >> BenchmarkPowP200-32 200000 10145 ns/op >> BenchmarkPowP200std-32 200000 9231 ns/op >> >> all files are here: >> https://gist.github.com/cpmech/9f871df5b59fa8407221b0d3fb361a3d >> >> I'm also looking at the Cephes library: http://www.netlib.org/cephes/ >> for ideas >> >> maybe we could have a similar function in Go (and many more optimised >> ones, especially for complex numbers...) >> >> cheers >> d >> >> >> >> On Friday, August 4, 2017 at 4:20:41 PM UTC+10, Dorival Pedroso wrote: >>> >>> I've noticed that this C code: >>> >>> #include "math.h" >>> int main() { >>> double x = 2.5; >>> int Nmax = 10000000; >>> for (int N=0; N<Nmax; N++) { >>> for (int i=0; i<20; i++) { >>> pow(x, i); >>> } >>> } >>> } >>> >>> can run up to 50x faster than this Go code: >>> >>> package main >>> >>> import "math" >>> >>> func main() { >>> x := 2.5 >>> Nmax := 10000000 >>> for N := 0; N < Nmax; N++ { >>> for i := 0; i < 20; i++ { >>> math.Pow(x, float64(i)) >>> } >>> } >>> } >>> >>> The C code was compiled with: gcc -O2 ccode.c -o ccode -lm >>> then run with time ./ccode >>> >>> The Go code was compiled with: go build gcode.go >>> then run with time ./gcode >>> >>> I've used the time command on Linux (Ubuntu) to get some estimate. >>> >>> So the question is: how can we make the Go code faster? >>> >> -- You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.