As Ian has pointed out, software and hardware optimizations may distort the 
results of microbenchmarks.

If I run the original benchmarks using go1.18 and go1.19, the Copy and CopyG 
ns/op results are reversed.

# Original: https://go.dev/play/p/m1ClnbdbdWi

~/x$ go1.18 version && go1.18 test x_0_test.go -bench=.
go version go1.18.1 linux/amd64
BenchmarkCopy-8     3820009    310.7 ns/op
BenchmarkCopyG-8    7552230    158.3 ns/op

~/x$ go version && go test x_0_test.go -bench=.
go version devel go1.19-a11a885cb5 Mon Apr 18 23:57:00 2022 +0000 linux/amd64
BenchmarkCopy-8     7577499    158.2 ns/op
BenchmarkCopyG-8    3870822    309.7 ns/op

If I run the benchmarks with a ResetTimer() call added, using go1.18 and 
go1.19, the Copy and CopyG ns/op results are effectively the same.

# ResetTimer: https://go.dev/play/p/hansq5ARrSh

~/x$ go1.18 version && go1.18 test x_1_test.go -bench=.
go version go1.18.1 linux/amd64
BenchmarkCopy-8     7581037    158.0 ns/op
BenchmarkCopyG-8    7590849    157.9 ns/op

~/x$ go version && go test x_1_test.go -bench=.
go version devel go1.19-a11a885cb5 Mon Apr 18 23:57:00 2022 +0000 linux/amd64
BenchmarkCopy-8     7525525    158.5 ns/op
BenchmarkCopyG-8    7521787    158.5 ns/op
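
For readers who cannot open the playground link, here is a minimal sketch of 
the kind of change involved. It is not the code from the link: the Number 
constraint, the bodies of Copy and CopyG, and the slice length of 1024 are 
assumptions made for illustration. It also calls testing.Benchmark from main 
so that it runs as an ordinary program; in a _test.go file you would drop 
main and use "go test -bench=." as above. Generics require go1.18 or later.

```go
package main

import (
	"fmt"
	"testing"
)

// Number is a hypothetical constraint; the original code may use another.
type Number interface {
	~float32 | ~float64
}

// Copy copies src into dst element by element (non-generic version).
// dst must be at least as long as src.
func Copy(dst, src []float64) {
	for i := range src {
		dst[i] = src[i]
	}
}

// CopyG is the generic counterpart of Copy.
// dst must be at least as long as src.
func CopyG[T Number](dst, src []T) {
	for i := range src {
		dst[i] = src[i]
	}
}

func BenchmarkCopy(b *testing.B) {
	// Setup: the slice length is an assumption, not taken from the original.
	src := make([]float64, 1024)
	dst := make([]float64, 1024)
	b.ResetTimer() // discard time and allocations spent in setup
	for i := 0; i < b.N; i++ {
		Copy(dst, src)
	}
}

func BenchmarkCopyG(b *testing.B) {
	src := make([]float64, 1024)
	dst := make([]float64, 1024)
	b.ResetTimer() // discard time and allocations spent in setup
	for i := 0; i < b.N; i++ {
		CopyG(dst, src)
	}
}

func main() {
	// testing.Benchmark runs a single benchmark function outside "go test".
	fmt.Println("Copy: ", testing.Benchmark(BenchmarkCopy))
	fmt.Println("CopyG:", testing.Benchmark(BenchmarkCopyG))
}
```

The numbers this prints will depend on the machine and the Go version, so 
treat them as a relative comparison only.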

If I run the original benchmarks on a Celeron N3450 using go1.18 and go1.19, 
the Copy and CopyG ns/op results are effectively the same. I run benchmarks 
on a Celeron N3450 because Intel disables many hardware optimizations on 
its cheap processors.
 
# Original: https://go.dev/play/p/m1ClnbdbdWi

$ go1.18 version && go1.18 test x_0_test.go -bench=.
go version go1.18.1 linux/amd64
BenchmarkCopy-4         2773934           428.0 ns/op
BenchmarkCopyG-4        2783043           429.7 ns/op

$ go version && go test x_0_test.go -bench=.
go version devel go1.19-a11a885cb5 Mon Apr 18 23:57:00 2022 +0000 linux/amd64
BenchmarkCopy-4         2781676           428.6 ns/op
BenchmarkCopyG-4        2765179           429.0 ns/op


Peter

On Tuesday, April 19, 2022 at 12:37:48 AM UTC-4 Ian Lance Taylor wrote:

> On Mon, Apr 18, 2022 at 1:14 PM Feng Tian <fen...@gmail.com> wrote:
> >
> > Hi, I have the following simple benchmark code,
> >
> > https://go.dev/play/p/m1ClnbdbdWi
> >
> > I run this on my laptop since Go playground does not run benchmark code.
> > The strange thing is that Copy of float64 is slower than copy using
> > generics. I can imagine generics may add no overhead, but how can it be
> > faster?
> >
> > ftian@DESKTOP-16FCU43:~/tmp$ go test -bench=.
> > goos: linux
> > goarch: amd64
> > pkg: a
> > cpu: 11th Gen Intel(R) Core(TM) i7-11370H @ 3.30GHz
> > BenchmarkCopy-8 5693944 221.7 ns/op
> > BenchmarkCopyG-8 8885454 137.1 ns/op
> > PASS
> > ok a 2.838s
>
> The numbers for this kind of micro-benchmark can be deceptive. For
> example, they can be highly affected by alignment of the instruction
> loop. I don't know exactly what is happening for you. I compiled the
> code with "go test -c" and disassembled it: both benchmark functions
> contained exactly the same instructions.
>
> Ian
>
