Superb! Thank you, Ian. That is indeed insightful. In my case (as can be seen in the commit message showing the benchstat output), allocations for small ASCII test cases have a delta of +860%, which is roughly 9.6 times as many bytes allocated. You've said:
> Then you have to figure out why they are different.

This is exactly what I was trying to work out how to do and, more specifically, whether there is an easy way to find out.

On Tuesday, 10 October 2017 15:38:58 UTC+2, Ian Lance Taylor wrote:
> On Tue, Oct 10, 2017 at 12:50 AM, Gabriel Aszalos <gabriel...@gmail.com> wrote:
> >
> > I would love to find out the answer to this. Even if you don't know the
> > answer but know how to investigate it (using pprof or some tracing
> > flags), I would also appreciate being guided in the right direction, and I
> > would love to embark on the journey of finding out myself.
> >
> > What I'm basically saying is that I'd be more interested to find out the way
> > in which I can tell why one is faster than the other, as opposed to hearing
> > just the final answer. Hope that makes sense.
>
> Start by writing a standalone x_test.go program that provides both
> versions of the code and uses the benchmark framework to measure both.
> When you can repeat the issue with -bench=., run it with -benchmem to
> print the memory allocations. If they are different, that is probably
> the cause. Then you have to figure out why they are different. If
> they are not different, you'll need to look at the generated code. On
> GNU/Linux, the system's perf tool can help you identify which parts of
> the code take more time. For a larger program I would suggest pprof,
> but pprof is better at pointing you at a specific function than it is
> at identifying which part of the function is slow.
>
> Ian
>
> > On Thursday, 5 October 2017 15:22:47 UTC+2, Marvin Stenger wrote:
> >>
> >> I can reproduce the numbers. The only thing I'm seeing is that the spans
> >> array is allocated on the stack. Not sure though if this is the only reason.
> >>
> >> On Thursday, 5 October 2017 13:13:56 UTC+2, Gabriel Aszalos wrote:
> >>>
> >>> I was playing around with the implementation of FieldsFunc from the bytes
> >>> package and I was wondering how it would affect the benchmarks to disregard
> >>> the extra slice that was used there to calculate offsets. It only made sense
> >>> that it would make things faster.
> >>>
> >>> To my amusement (although expected), it didn't. But I'm quite curious why
> >>> one is faster than the other and if this reveals any good practices when
> >>> working with similar algorithms. The benchmark and diff I am talking about
> >>> can be viewed here:
> >>>
> >>> https://github.com/gbbr/go/commit/2f6e92bc746fa232f2f2aea66dae3fa0c04700a5?diff=split
> >>>
> >>> Many thanks for looking!
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "golang-nuts" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to golang-nuts...@googlegroups.com.
> > For more options, visit https://groups.google.com/d/optout.
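
For reference, here is a minimal sketch of the kind of standalone benchmark Ian describes. It assumes two simplified stand-ins, fieldsWithSpans and fieldsDirect, which are hypothetical and split only on ASCII spaces; they are not the code from the linked commit. To keep it runnable as an ordinary program it calls testing.Benchmark from main rather than living in an x_test.go file.

```go
package main

import (
	"fmt"
	"testing"
)

// span marks the half-open range [start, end) of one field in the input.
type span struct{ start, end int }

// fieldsWithSpans (hypothetical) records the offsets of every field first,
// then builds a result slice of exactly the right length.
func fieldsWithSpans(s []byte) [][]byte {
	spans := make([]span, 0, 32)
	start := -1 // -1 means "not currently inside a field"
	for i, c := range s {
		if c == ' ' {
			if start >= 0 {
				spans = append(spans, span{start, i})
				start = -1
			}
		} else if start < 0 {
			start = i
		}
	}
	if start >= 0 {
		spans = append(spans, span{start, len(s)})
	}
	out := make([][]byte, len(spans))
	for i, sp := range spans {
		out[i] = s[sp.start:sp.end]
	}
	return out
}

// fieldsDirect (hypothetical) skips the offsets slice and appends each
// field to the result as soon as its end is found.
func fieldsDirect(s []byte) [][]byte {
	var out [][]byte
	start := -1
	for i, c := range s {
		if c == ' ' {
			if start >= 0 {
				out = append(out, s[start:i])
				start = -1
			}
		} else if start < 0 {
			start = i
		}
	}
	if start >= 0 {
		out = append(out, s[start:])
	}
	return out
}

// sink is package-level so the compiler cannot discard the results.
var sink [][]byte

// bench measures f on a small ASCII input and returns timing and
// allocation statistics.
func bench(f func([]byte) [][]byte) testing.BenchmarkResult {
	input := []byte("  a few small ASCII fields  ")
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = f(input)
		}
	})
}

func main() {
	cases := []struct {
		name string
		f    func([]byte) [][]byte
	}{
		{"fieldsWithSpans", fieldsWithSpans},
		{"fieldsDirect", fieldsDirect},
	}
	for _, c := range cases {
		r := bench(c.f)
		// MemString reports bytes and allocations per operation.
		fmt.Printf("%-16s %s %s\n", c.name, r.String(), r.MemString())
	}
}
```

The same loop bodies could instead sit in an x_test.go file as BenchmarkXxx functions and be run with go test -bench=. -benchmem, which is what Ian suggests. If the B/op and allocs/op columns differ between the two versions, that is the first place to look.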