As Ian has pointed out, software and hardware optimizations may distort the results of microbenchmarks.
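One distortion that is easy to rule out is setup work that lands inside the timed region. The sketch below is only an illustration of where a ResetTimer() call would go; it is not the playground code. The names copyFloat64, copyG, benchCopy, benchCopyG and the slice length n are assumptions made up for this example. It uses testing.Benchmark so that it can run as an ordinary program rather than through go test.

```go
package main

import (
	"fmt"
	"testing"
)

// n is an assumed slice length; the original benchmark may use another size.
const n = 1024

// copyFloat64 is a hypothetical non-generic copy of a float64 slice.
func copyFloat64(dst, src []float64) int { return copy(dst, src) }

// copyG is a hypothetical generic equivalent (requires Go 1.18 or later).
func copyG[T any](dst, src []T) int { return copy(dst, src) }

func benchCopy(b *testing.B) {
	src := make([]float64, n)
	dst := make([]float64, n)
	// Zero the elapsed time and allocation counters so that the
	// allocations above are not charged to the measured loop.
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		copyFloat64(dst, src)
	}
}

func benchCopyG(b *testing.B) {
	src := make([]float64, n)
	dst := make([]float64, n)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		copyG(dst, src)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	fmt.Println("Copy ", testing.Benchmark(benchCopy))
	fmt.Println("CopyG", testing.Benchmark(benchCopyG))
}
```

The reported ns/op will vary with the machine and the Go version, so no expected output is shown.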
If I run the original benchmarks using go1.18 and go1.19, the Copy and CopyG ns/op results are reversed.

# Original: https://go.dev/play/p/m1ClnbdbdWi

~/x$ go1.18 version && go1.18 test x_0_test.go -bench=.
go version go1.18.1 linux/amd64
BenchmarkCopy-8     3820009    310.7 ns/op
BenchmarkCopyG-8    7552230    158.3 ns/op

~/x$ go version && go test x_0_test.go -bench=.
go version devel go1.19-a11a885cb5 Mon Apr 18 23:57:00 2022 +0000 linux/amd64
BenchmarkCopy-8     7577499    158.2 ns/op
BenchmarkCopyG-8    3870822    309.7 ns/op

If I run the benchmarks with a ResetTimer() using go1.18 and go1.19, the Copy and CopyG ns/op results are effectively the same.

# ResetTimer: https://go.dev/play/p/hansq5ARrSh

~/x$ go1.18 version && go1.18 test x_1_test.go -bench=.
go version go1.18.1 linux/amd64
BenchmarkCopy-8     7581037    158.0 ns/op
BenchmarkCopyG-8    7590849    157.9 ns/op

~/x$ go version && go test x_1_test.go -bench=.
go version devel go1.19-a11a885cb5 Mon Apr 18 23:57:00 2022 +0000 linux/amd64
BenchmarkCopy-8     7525525    158.5 ns/op
BenchmarkCopyG-8    7521787    158.5 ns/op

If I run the original benchmarks on a Celeron N3450 using go1.18 and go1.19, the Copy and CopyG ns/op results are effectively the same. I run benchmarks on a Celeron N3450 because Intel disables many hardware optimizations on cheap hardware.

# Original: https://go.dev/play/p/m1ClnbdbdWi

$ go1.18 version && go1.18 test x_0_test.go -bench=.
go version go1.18.1 linux/amd64
BenchmarkCopy-4     2773934    428.0 ns/op
BenchmarkCopyG-4    2783043    429.7 ns/op

$ go version && go test x_0_test.go -bench=.
go version devel go1.19-a11a885cb5 Mon Apr 18 23:57:00 2022 +0000 linux/amd64
BenchmarkCopy-4     2781676    428.6 ns/op
BenchmarkCopyG-4    2765179    429.0 ns/op

Peter

On Tuesday, April 19, 2022 at 12:37:48 AM UTC-4 Ian Lance Taylor wrote:

> On Mon, Apr 18, 2022 at 1:14 PM Feng Tian <fen...@gmail.com> wrote:
> >
> > Hi, I have the following simple benchmark code,
> >
> > https://go.dev/play/p/m1ClnbdbdWi
> >
> > I run this on my laptop since Go playground does not run benchmark code.
> > The strange thing is that Copy of float64 is slower than copy using
> > generics. I can imagine generics may add no overhead, but how can it be
> > faster?
> >
> > ftian@DESKTOP-16FCU43:~/tmp$ go test -bench=.
> > goos: linux
> > goarch: amd64
> > pkg: a
> > cpu: 11th Gen Intel(R) Core(TM) i7-11370H @ 3.30GHz
> > BenchmarkCopy-8     5693944    221.7 ns/op
> > BenchmarkCopyG-8    8885454    137.1 ns/op
> > PASS
> > ok a 2.838s
>
> The numbers for this kind of micro-benchmark can be deceptive. For
> example, they can be highly affected by alignment of the instruction
> loop. I don't know exactly what is happening for you. I compiled the
> code with "go test -c" and disassembled it: both benchmark functions
> contained exactly the same instructions.
>
> Ian