Thanks for the detailed, thoughtful answer! Statistically you are correct: 95% of Go's features are reasonably fast and will get 99% of the job done quickly. But can we really dismiss benchmarks of the remaining slow 5% as micro benchmarking?
This can easily become a subjective discussion based on different experiences. We could get 99% of people to agree that this is micro benchmarking, but that would not help anything; indeed it seems like an ostrich approach. We are talking about the Go language here, not any specific application. A fair benchmark can measure the differences against other languages, showing the gap Go needs to close. Anything that feels slow should be tracked, so that we can see the performance improvement over time and newcomers can have a correct expectation of the cost.

Chris

On Thu, Feb 2, 2017 at 3:26 PM, Michael Jones <michael.jo...@gmail.com> wrote:

> Insight here feels tenuous.
>
> It is rare that a well-written "real" program would be measurably
> influenced by something this small. As it happens, I often write such rare
> programs (where a 2x faster math.Log() really might make the whole program
> run 1.9x faster). Even as a poster child for it, I think it is very
> uncommon. To get there you can never read or write data, transcode data,
> communicate with devices, processes, or users, or do anything else so
> typical of real programs.
>
> Benchmarks that measure one single thing in the absence of everything else
> are often misleading. They hide the effect of the "rest of the time" as
> outlined above and, worse, they are immune to the good and bad of the
> reality of real computers. For example, integer divide is slow, but it is
> also often infrequent. If you do such a divide every 100 cycles, the nature
> of modern CPUs is that the delay will be hidden in overlapped execution of
> the instruction stream and, at worst, other instruction streams. A
> nothing-but-integer-divide micro benchmark will block on the 19-cycle (or
> whatever) completion rate. That's the good. On the other hand, a "write
> over and over to the same memory" micro benchmark will run fast thanks to
> multiple layers of caching, while the same thing at real-program scale will
> have cache contention effects that could cut throughput to 1/20th. (This is
> the same as measuring highway drive time at 3am vs 8am.)
>
> The most meaningful benchmark measures a real use case. In that situation
> the resulting measurements translate directly into application performance.
> With any synthetic benchmark, however, you can often be unsure how to apply
> the result to the real world. The smaller and more focused the benchmark,
> the less easy it is to learn from the result.
>
> Even simple statistical inference is subtly difficult; consider the fact
> that the average person has fewer than two legs. If this is hard for
> people, then how much harder is it to properly understand the whole-program
> meaning of a 10x slowdown in type assertions?
>
> Michael
>
> On Thu, Feb 2, 2017 at 1:55 PM, Chris Lu <chris...@gmail.com> wrote:
>
>> Cool! Upgrading to 1.8rc3 shows great improvement! I wish all problems
>> could be resolved by upgrading. :) Here are the before and after results.
>>
>> chris$ go test -bench=.
>> testing: warning: no tests to run
>> BenchmarkAssertion-8     200000000     9.57 ns/op
>> BenchmarkAssertionOK-8   200000000     9.02 ns/op
>> BenchmarkBare-8         1000000000     2.09 ns/op
>> BenchmarkIface-8          50000000     27.0 ns/op
>> BenchmarkReflect-8       200000000     9.00 ns/op
>> PASS
>> ok   _/Users/chris/tmp/test   11.980s
>>
>> chris$ go version
>> go version go1.8rc3 darwin/amd64
>>
>> chris$ go test -bench=.
>> BenchmarkAssertion-8    1000000000     2.67 ns/op
>> BenchmarkAssertionOK-8  1000000000     2.32 ns/op
>> BenchmarkBare-8         1000000000     2.10 ns/op
>> BenchmarkIface-8          50000000     26.0 ns/op
>> BenchmarkReflect-8       200000000     8.80 ns/op
>> PASS
>> ok   _/Users/chris/tmp/test   11.761s
>>
>> The time taken by the original type assertion code shrinks from 8 seconds
>> to 474ms!
>>
>> count=1073741824 time taken=474.843325ms
>> count=1073741824 time taken=308.870452ms
>>
>> On Thu, Feb 2, 2017 at 1:31 PM, Ian Lance Taylor <i...@golang.org> wrote:
>>
>>> On Thu, Feb 2, 2017 at 1:04 PM, Chris Lu <chris...@gmail.com> wrote:
>>> >
>>> > I am trying to build a generic distributed map reduce system similar
>>> > to Spark. Without generics, the APIs pass data via interface{}. For
>>> > example, a reducer is written this way:
>>> >
>>> > func sum(x, y interface{}) (interface{}, error) {
>>> >     return x.(uint64) + y.(uint64), nil
>>> > }
>>> >
>>> > To be more generic, this framework also supports LuaJIT.
>>> >
>>> > There is a noticeable performance difference: LuaJIT is faster than
>>> > pure Go. Profiling the pure Go version showed that assertE2T and
>>> > assertI2T cost a non-trivial amount of time.
>>>
>>> Note that in the 1.8 release assertE2T and assertI2T no longer exist.
>>> They were removed by https://golang.org/cl/32313. That should speed
>>> up these cases; assertE2T and assertI2T were trivial, but in some
>>> cases they did call typedmemmove. When your interface values store
>>> non-pointers, and the code is inlined as it is in 1.8, the calls to
>>> typedmemmove disappear.
>>>
>>> You might want to retry your benchmarks with the 1.8 release candidate
>>> to see if you can observe any real difference.
>>>
>>> Ian
>
> --
> Michael T. Jones
> michael.jo...@gmail.com
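
For anyone who wants to reproduce these numbers: the _test.go file itself never appears in this thread, so the following is only a minimal sketch of what benchmarks with these names might look like. The function bodies and the package-level value/boxed/sink variables are my own guesses, and BenchmarkIface is omitted because the thread does not say what it measures.

package main

import (
	"reflect"
	"testing"
)

var (
	value uint64      = 42
	boxed interface{} = value // a non-pointer value stored in an empty interface
	sink  uint64              // assigned to so the compiler cannot elide the work
)

// Plain type assertion, the case that formerly went through assertE2T.
func BenchmarkAssertion(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = boxed.(uint64)
	}
}

// Comma-ok form of the same assertion.
func BenchmarkAssertionOK(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if v, ok := boxed.(uint64); ok {
			sink = v
		}
	}
}

// Baseline: the same assignment with no interface involved.
func BenchmarkBare(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = value
	}
}

// Pulling the value out via the reflect package instead of an assertion.
func BenchmarkReflect(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = reflect.ValueOf(boxed).Uint()
	}
}

Saved as something like assert_test.go (a hypothetical file name) and run with "go test -bench=." as in the thread, this should let you compare the assertion cost under go1.7 and go1.8rc3 yourself; the exact numbers will of course differ from the ones quoted above.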