Your latest benchmarks are invalid. In your benchmarks, replace b.StartTimer() with b.ResetTimer().
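For reference, here is a minimal sketch of the pattern. It is not the code from the gist: `setupSubscribers` and `benchmarkPublish` are hypothetical names made up for this illustration. The timer is already running when the testing package calls a benchmark function, so b.StartTimer() after setup has no effect on a running timer and the setup cost ends up in the measurement. b.ResetTimer() zeroes the elapsed time and the allocation counters, so only the loop is reported.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// setupSubscribers is a hypothetical stand-in for whatever setup
// the real benchmark performs before the measured loop.
func setupSubscribers(n int) []chan int {
	subs := make([]chan int, n)
	for i := range subs {
		subs[i] = make(chan int, 1)
	}
	return subs
}

// benchmarkPublish sends one message to every subscriber per iteration
// and waits until all of them have received it.
func benchmarkPublish(b *testing.B) {
	subs := setupSubscribers(8) // setup: should not be measured
	b.ReportAllocs()
	b.ResetTimer() // discard time and allocations accumulated so far

	for i := 0; i < b.N; i++ {
		var wg sync.WaitGroup
		for _, ch := range subs {
			wg.Add(1)
			go func(ch chan int) {
				defer wg.Done()
				<-ch // each subscriber receives exactly one message
			}(ch)
		}
		for _, ch := range subs {
			ch <- i
		}
		wg.Wait() // all receives finish before the next iteration
	}
}

func main() {
	// testing.Benchmark lets the sketch run as an ordinary program.
	r := testing.Benchmark(benchmarkPublish)
	fmt.Println(r.String(), r.MemString())
}
```

In a real _test.go file the function would be named BenchmarkXxx and run with go test -bench, with no race detector or profiler attached.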
Simply run benchmarks. Don't run race detectors. Don't run profilers. For example,

$ go version
go version devel +504deee Sun Jul 16 03:57:11 2017 +0000 linux/amd64
$ go test -run=! -bench=. -benchmem -cpu=1,2,4,8 pubsub_test.go
goos: linux
goarch: amd64
BenchmarkPubSubPrimitiveChannelsMultiple      20000     86328 ns/op    61 B/op    2 allocs/op
BenchmarkPubSubPrimitiveChannelsMultiple-2    30000     50844 ns/op    54 B/op    2 allocs/op
BenchmarkPubSubPrimitiveChannelsMultiple-4    10000    112833 ns/op    83 B/op    2 allocs/op
BenchmarkPubSubPrimitiveChannelsMultiple-8    10000    160011 ns/op    88 B/op    2 allocs/op
BenchmarkPubSubWaitGroupMultiple             100000     21231 ns/op    40 B/op    2 allocs/op
BenchmarkPubSubWaitGroupMultiple-2            10000    107165 ns/op    46 B/op    2 allocs/op
BenchmarkPubSubWaitGroupMultiple-4            20000     73235 ns/op    43 B/op    2 allocs/op
BenchmarkPubSubWaitGroupMultiple-8            20000     82917 ns/op    42 B/op    2 allocs/op
PASS
ok      command-line-arguments  15.481s
$

Peter

On Sunday, July 16, 2017 at 9:51:38 PM UTC-4, Zohaib Sibte Hassan wrote:
>
> Thanks for pointing the issues out. I updated my code to get rid of the race
> conditions (nothing critical, I was always doing a reader-writer race). Anyhow,
> I updated my code on
> https://gist.github.com/maxpert/f3c405c516ba2d4c8aa8b0695e0e054e. That still
> doesn't explain the new results:
>
> $> go test -race -run=! -bench=. -benchmem -cpu=1,2,4,8 -cpuprofile=cpu.out -memprofile=mem.out pubsub_test.go
> BenchmarkPubSubPrimitiveChannelsMultiple        50    21121694 ns/op    8515 B/op    39 allocs/op
> BenchmarkPubSubPrimitiveChannelsMultiple-2     100    19302372 ns/op    4277 B/op    20 allocs/op
> BenchmarkPubSubPrimitiveChannelsMultiple-4      50    22674769 ns/op    8182 B/op    35 allocs/op
> BenchmarkPubSubPrimitiveChannelsMultiple-8      50    21201533 ns/op    8469 B/op    38 allocs/op
> BenchmarkPubSubWaitGroupMultiple              3000      501804 ns/op      63 B/op     2 allocs/op
> BenchmarkPubSubWaitGroupMultiple-2             200    15417944 ns/op     407 B/op     6 allocs/op
> BenchmarkPubSubWaitGroupMultiple-4             300     5010273 ns/op     231 B/op     4 allocs/op
> BenchmarkPubSubWaitGroupMultiple-8             200     5444634 ns/op     334 B/op     5 allocs/op
> PASS
> ok      command-line-arguments  21.775s
>
> So far my testing shows channels are slower for the pubsub scenario. I tried
> looking into pprof dumps of memory and CPU and it's not making sense to me.
> What am I missing here?
>
> On Sunday, July 16, 2017 at 10:27:04 AM UTC-7, peterGo wrote:
>>
>> When you have data races the results are undefined.
>>
>> $ go version
>> go version devel +dd81c37 Sat Jul 15 05:43:45 2017 +0000 linux/amd64
>> $ go test -race -run=! -bench=. -benchmem -cpu=1,2,4,8 pubsub_test.go
>> ==================
>> WARNING: DATA RACE
>> Read at 0x00c4200140c0 by goroutine 18:
>>   command-line-arguments.BenchmarkPubSubPrimitiveChannelsMultiple()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:59 +0x51d
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.(*B).run1.func1()
>>       /home/peter/go/src/testing/benchmark.go:214 +0x6b
>>
>> Previous write at 0x00c4200140c0 by goroutine 57:
>>   [failed to restore the stack]
>>
>> Goroutine 18 (running) created at:
>>   testing.(*B).run1()
>>       /home/peter/go/src/testing/benchmark.go:207 +0x8c
>>   testing.(*B).Run()
>>       /home/peter/go/src/testing/benchmark.go:513 +0x482
>>   testing.runBenchmarks.func1()
>>       /home/peter/go/src/testing/benchmark.go:417 +0xa7
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.runBenchmarks()
>>       /home/peter/go/src/testing/benchmark.go:423 +0x86d
>>   testing.(*M).Run()
>>       /home/peter/go/src/testing/testing.go:928 +0x51e
>>   main.main()
>>       command-line-arguments/_test/_testmain.go:46 +0x1d3
>>
>> Goroutine 57 (finished) created at:
>>   command-line-arguments.BenchmarkPubSubPrimitiveChannelsMultiple()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:40 +0x290
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.(*B).run1.func1()
>>       /home/peter/go/src/testing/benchmark.go:214 +0x6b
>> ==================
>> --- FAIL: BenchmarkPubSubPrimitiveChannelsMultiple
>>     benchmark.go:147: race detected during execution of benchmark
>> ==================
>> WARNING: DATA RACE
>> Read at 0x00c42000c030 by goroutine 1079:
>>   command-line-arguments.BenchmarkPubSubWaitGroupMultiple.func1()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:76 +0x9e
>>
>> Previous write at 0x00c42000c030 by goroutine 7:
>>   command-line-arguments.BenchmarkPubSubWaitGroupMultiple()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:101 +0x475
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.(*B).run1.func1()
>>       /home/peter/go/src/testing/benchmark.go:214 +0x6b
>>
>> Goroutine 1079 (running) created at:
>>   command-line-arguments.BenchmarkPubSubWaitGroupMultiple()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:93 +0x2e6
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.(*B).run1.func1()
>>       /home/peter/go/src/testing/benchmark.go:214 +0x6b
>>
>> Goroutine 7 (running) created at:
>>   testing.(*B).run1()
>>       /home/peter/go/src/testing/benchmark.go:207 +0x8c
>>   testing.(*B).Run()
>>       /home/peter/go/src/testing/benchmark.go:513 +0x482
>>   testing.runBenchmarks.func1()
>>       /home/peter/go/src/testing/benchmark.go:417 +0xa7
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.runBenchmarks()
>>       /home/peter/go/src/testing/benchmark.go:423 +0x86d
>>   testing.(*M).Run()
>>       /home/peter/go/src/testing/testing.go:928 +0x51e
>>   main.main()
>>       command-line-arguments/_test/_testmain.go:46 +0x1d3
>> ==================
>> ==================
>> WARNING: DATA RACE
>> Write at 0x00c42000c030 by goroutine 7:
>>   command-line-arguments.BenchmarkPubSubWaitGroupMultiple()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:101 +0x475
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.(*B).run1.func1()
>>       /home/peter/go/src/testing/benchmark.go:214 +0x6b
>>
>> Previous read at 0x00c42000c030 by goroutine 1078:
>>   command-line-arguments.BenchmarkPubSubWaitGroupMultiple.func1()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:76 +0x9e
>>
>> Goroutine 7 (running) created at:
>>   testing.(*B).run1()
>>       /home/peter/go/src/testing/benchmark.go:207 +0x8c
>>   testing.(*B).Run()
>>       /home/peter/go/src/testing/benchmark.go:513 +0x482
>>   testing.runBenchmarks.func1()
>>       /home/peter/go/src/testing/benchmark.go:417 +0xa7
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.runBenchmarks()
>>       /home/peter/go/src/testing/benchmark.go:423 +0x86d
>>   testing.(*M).Run()
>>       /home/peter/go/src/testing/testing.go:928 +0x51e
>>   main.main()
>>       command-line-arguments/_test/_testmain.go:46 +0x1d3
>>
>> Goroutine 1078 (running) created at:
>>   command-line-arguments.BenchmarkPubSubWaitGroupMultiple()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:93 +0x2e6
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.(*B).run1.func1()
>>       /home/peter/go/src/testing/benchmark.go:214 +0x6b
>> ==================
>> ==================
>> WARNING: DATA RACE
>> Read at 0x00c4200140c8 by goroutine 7:
>>   command-line-arguments.BenchmarkPubSubWaitGroupMultiple()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:109 +0x51d
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.(*B).run1.func1()
>>       /home/peter/go/src/testing/benchmark.go:214 +0x6b
>>
>> Previous write at 0x00c4200140c8 by goroutine 175:
>>   sync/atomic.AddInt64()
>>       /home/peter/go/src/runtime/race_amd64.s:276 +0xb
>>   command-line-arguments.BenchmarkPubSubWaitGroupMultiple.func1()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:88 +0x19a
>>
>> Goroutine 7 (running) created at:
>>   testing.(*B).run1()
>>       /home/peter/go/src/testing/benchmark.go:207 +0x8c
>>   testing.(*B).Run()
>>       /home/peter/go/src/testing/benchmark.go:513 +0x482
>>   testing.runBenchmarks.func1()
>>       /home/peter/go/src/testing/benchmark.go:417 +0xa7
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.runBenchmarks()
>>       /home/peter/go/src/testing/benchmark.go:423 +0x86d
>>   testing.(*M).Run()
>>       /home/peter/go/src/testing/testing.go:928 +0x51e
>>   main.main()
>>       command-line-arguments/_test/_testmain.go:46 +0x1d3
>>
>> Goroutine 175 (finished) created at:
>>   command-line-arguments.BenchmarkPubSubWaitGroupMultiple()
>>       /home/peter/gopath/src/nuts/pubsub_test.go:93 +0x2e6
>>   testing.(*B).runN()
>>       /home/peter/go/src/testing/benchmark.go:141 +0x12a
>>   testing.(*B).run1.func1()
>>       /home/peter/go/src/testing/benchmark.go:214 +0x6b
>> ==================
>> --- FAIL: BenchmarkPubSubWaitGroupMultiple
>>     benchmark.go:147: race detected during execution of benchmark
>> FAIL
>> exit status 1
>> FAIL    command-line-arguments  0.726s
>> $
>>
>> Peter
>>
>> On Sunday, July 16, 2017 at 10:20:21 AM UTC-4, Zohaib Sibte Hassan wrote:
>>>
>>> I have spent my day implementing an efficient PubSub
>>> system. I had implemented one before using channels, and I wanted to
>>> benchmark that against sync.Cond. Here is the quick and dirty test that I
>>> put together:
>>> https://gist.github.com/maxpert/f3c405c516ba2d4c8aa8b0695e0e054e. Now
>>> my confusion starts when I change GOMAXPROCS to test how it would perform
>>> on my age-old Raspberry Pi. Here are the results:
>>>
>>> mxp@carbon:~/repos/raspchat/src/sibte.so/rascore$ GOMAXPROCS=8 go test -run none -bench Multiple -cpuprofile=cpu.out -memprofile=mem.out -benchmem
>>> BenchmarkPubSubPrimitiveChannelsMultiple-8    10000    165419 ns/op    92 B/op    2 allocs/op
>>> BenchmarkPubSubWaitGroupMultiple-8            10000    204685 ns/op    53 B/op    2 allocs/op
>>> PASS
>>> ok      sibte.so/rascore        3.749s
>>> mxp@carbon:~/repos/raspchat/src/sibte.so/rascore$ GOMAXPROCS=4 go test -run none -bench Multiple -cpuprofile=cpu.out -memprofile=mem.out -benchmem
>>> BenchmarkPubSubPrimitiveChannelsMultiple-4    20000    101704 ns/op    60 B/op    2 allocs/op
>>> BenchmarkPubSubWaitGroupMultiple-4            10000    204039 ns/op    52 B/op    2 allocs/op
>>> PASS
>>> ok      sibte.so/rascore        5.087s
>>> mxp@carbon:~/repos/raspchat/src/sibte.so/rascore$ GOMAXPROCS=2 go test -run none -bench Multiple -cpuprofile=cpu.out -memprofile=mem.out -benchmem
>>> BenchmarkPubSubPrimitiveChannelsMultiple-2    30000     51255 ns/op    54 B/op    2 allocs/op
>>> BenchmarkPubSubWaitGroupMultiple-2            20000     60871 ns/op    43 B/op    2 allocs/op
>>> PASS
>>> ok      sibte.so/rascore        4.022s
>>> mxp@carbon:~/repos/raspchat/src/sibte.so/rascore$ GOMAXPROCS=1 go test -run none -bench Multiple -cpuprofile=cpu.out -memprofile=mem.out -benchmem
>>> BenchmarkPubSubPrimitiveChannelsMultiple      20000     79534 ns/op    61 B/op    2 allocs/op
>>> BenchmarkPubSubWaitGroupMultiple             100000     19066 ns/op    40 B/op    2 allocs/op
>>> PASS
>>> ok      sibte.so/rascore        4.502s
>>>
>>> I tried multiple times and the results are consistent. I am using Go 1.8,
>>> Linux x64, 8GB RAM. I have multiple questions:
>>>
>>>    - Why do channels perform worse than sync.Cond in the single-core
>>>    results? Context switching is the same; if anything, sync.Cond should
>>>    perform worse.
>>>    - As I increase the max procs the sync.Cond results go down, which
>>>    might be explainable, but what is up with channels? 20k to 30k to 20k to
>>>    10k :( I have an i5 with 4 cores, so it should have peaked at 4 procs
>>>    (psst, I tried 3 as well; it's consistent).
>>>
>>> I still suspect I am making some kind of mistake in the code. Any
>>> ideas?
>>>
>>> - Thanks
>>>