This is not a serious problem, but it surprised me. (By the way, how can I post a message with code formatting?)
I'd like to create table-driven benchmarks: https://blog.golang.org/subtests. To that end, I started with code from *The Go Programming Language*: // PopCount is based on an example from chapter 2 of THE GO PROGRAMMING LANGUAGE. // Copyright © 2016 Alan A. A. Donovan & Brian W. Kernighan. // License: https://creativecommons.org/licenses/by-nc-sa/4.0/ package popcount // pc[i] is the population count of i. var pc [256]byte func init() { for i := range pc { pc[i] = pc[i/2] + byte(i&1) } } // PopCount returns the population count (number of set bits) of x. func PopCount(x uint64) int { return int(pc[byte(x>>(0*8))] + pc[byte(x>>(1*8))] + pc[byte(x>>(2*8))] + pc[byte(x>>(3*8))] + pc[byte(x>>(4*8))] + pc[byte(x>>(5*8))] + pc[byte(x>>(6*8))] + pc[byte(x>>(7*8))]) } I then wrote a simple test: // popcount_simple_test.go package popcount import "testing" func BenchmarkPopCountSimple(b *testing.B) { sum := 0 // Avoid dead code elimination. for i := 0; i < b.N; i++ { sum += PopCount(0x1234567890abcdef) } } Because I was paranoid about dead code elimination, I wrote an identical test that calls the function many times (inside the loop to call it b.N times, which should make no difference if the hypothetically-still-dead code is being eliminated): // popcount_slow_simple_test.go package popcount import "testing" func BenchmarkPopCountSlowSimple(b *testing.B) { sum := 0 // Avoid dead code elimination. for i := 0; i < b.N; i++ { // Exactly as before, but call the function many times. for j:= 0; j < 1_000_000; j++ { sum += PopCount(0x1234567890abcdef) } } } Then I wrote an "almost simple" test that uses testing.B.Run() but is hardwired to call the function being benchmarked: // popcount_almost_simple_test.go package popcount import "testing" func BenchmarkPopCountAlmostSimple(b *testing.B) { b.Run("BenchmarkPopCountAlmostSimple", func(b *testing.B) { sum := 0 // Avoid dead code elimination. for i := 0; i < b.N; i++ { sum += PopCount(0x1234567890abcdef) } }) } Finally, I wrote a "nearly-simple" test that passes arguments to Run() that could have come from a table: // popcount_nearly_simple.go package popcount import "testing" func BenchmarkPopCountNearlySimple(b *testing.B) { f := PopCount name := "PopCountNearlySimple" b.Run(name, func(b *testing.B) { sum := 0 // Avoid dead code elimination. for i := 0; i < b.N; i++ { sum += f(0x1234567890abcdef) } }) } The simple and almost-simple results are nearly identical (happily, the slow results are not), but the nearly-simple results are an order of magnitude slower: $ go version go version go1.16.4 windows/amd64 $ go test -cpu=1 -bench=. goos: windows goarch: amd64 pkg: gopl.io/popcount cpu: Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz BenchmarkPopCountAlmostSimple/BenchmarkPopCountAlmostSimple 1000000000 0.6102 ns/op BenchmarkPopCountNearlySimple/PopCountNearlySimple 194925662 6.197 ns/op BenchmarkPopCountSimple 1000000000 0.5994 ns/op BenchmarkPopCountSlowSimple 1953 606194 ns/op PASS ok gopl.io/popcount 4.534s After reading this article: https://medium.com/a-journey-with-go/go-inlining-strategy-limitation-6b6d7fc3b1be I ran: $ go test -cpu=1 -bench=. -gcflags=-m 2>&1 | egrep 'inlining call to PopCount' .\popcount_almost_simple_test.go:9:19: inlining call to PopCount .\popcount_simple_test.go:8:18: inlining call to PopCount .\popcount_slow_simple_test.go:10:19: inlining call to PopCount That makes sense. 0.6 ns is less than a function call. The simple and almost-simple benchmarks can inline the call to the hardwired function, but the nearly-simple benchmark can't. In practice, this isn't much of a problem. Any function small enough to be inlined is unlikely to be a performance bottleneck. If it ever is, a non-table-driven benchmark can still measure it. Hope this helps. --PSRC P.S.: Out of curiosity, how can I post a message with fancy code examples like this one? https://groups.google.com/g/golang-nuts/c/5DgtH2Alt_I/m/hlsqdRSGAgAJ -- You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/CAG%3D3cjmT5CO3sTptR-dy3ByQB65EHGhYJCYuY_ZuQS52EOaJXQ%40mail.gmail.com.
// PopCount is based on an example from chapter 2 of THE GO PROGRAMMING LANGUAGE. // Copyright © 2016 Alan A. A. Donovan & Brian W. Kernighan. // License: https://creativecommons.org/licenses/by-nc-sa/4.0/ package popcount // pc[i] is the population count of i. var pc [256]byte func init() { for i := range pc { pc[i] = pc[i/2] + byte(i&1) } } // PopCount returns the population count (number of set bits) of x. func PopCount(x uint64) int { return int(pc[byte(x>>(0*8))] + pc[byte(x>>(1*8))] + pc[byte(x>>(2*8))] + pc[byte(x>>(3*8))] + pc[byte(x>>(4*8))] + pc[byte(x>>(5*8))] + pc[byte(x>>(6*8))] + pc[byte(x>>(7*8))]) }
package popcount import "testing" func BenchmarkPopCountAlmostSimple(b *testing.B) { b.Run("BenchmarkPopCountAlmostSimple", func(b *testing.B) { sum := 0 // Avoid dead code elimination. for i := 0; i < b.N; i++ { sum += PopCount(0x1234567890abcdef) } }) }
package popcount import "testing" func BenchmarkPopCountSlowSimple(b *testing.B) { sum := 0 // Avoid dead code elimination. for i := 0; i < b.N; i++ { // Exactly as before, but call the function many times. for j:= 0; j < 1_000_000; j++ { sum += PopCount(0x1234567890abcdef) } } }
package popcount import "testing" func BenchmarkPopCountSimple(b *testing.B) { sum := 0 // Avoid dead code elimination. for i := 0; i < b.N; i++ { sum += PopCount(0x1234567890abcdef) } }
package popcount import "testing" func BenchmarkPopCountNearlySimple(b *testing.B) { f := PopCount name := "PopCountNearlySimple" b.Run(name, func(b *testing.B) { sum := 0 // Avoid dead code elimination. for i := 0; i < b.N; i++ { sum += f(0x1234567890abcdef) } }) }