Hi, thanks again for all the explanations.
I kept digging, and now I'm seeing the same behavior as in this thread:
https://groups.google.com/forum/#!topic/golang-nuts/P1EFc4kMBfo

I run the tests with memprofile enabled, then use pprof to inspect the results, but the numbers are all 0.

```
$ go test -v -bench=. -cpuprofile=cpu.out -memprofile mem.out -benchmem -short stream/*
=== RUN   TestReader
--- PASS: TestReader (0.00s)
=== RUN   TestReader2
--- PASS: TestReader2 (0.00s)
BenchmarkByLine-4   100000   14654 ns/op   2248 B/op   39 allocs/op
PASS
ok      command-line-arguments  1.635s

$ go tool pprof stream.test mem.out
Entering interactive mode (type "help" for commands)
(pprof) top10
0 of 0 total (0%)
      flat  flat%   sum%        cum   cum%
         0     0%     0%          0     0%  bytes.(*Buffer).WriteString
         0     0%     0%          0     0%  bytes.(*Buffer).grow
         0     0%     0%          0     0%  bytes.Split
         0     0%     0%          0     0%  bytes.genSplit
         0     0%     0%          0     0%  bytes.makeSlice
         0     0%     0%          0     0%  command-line-arguments.(*Readable).Close
         0     0%     0%          0     0%  command-line-arguments.(*Readable).Pipe
         0     0%     0%          0     0%  command-line-arguments.(*Readable).Push
         0     0%     0%          0     0%  command-line-arguments.(*Readable).Read
         0     0%     0%          0     0%  command-line-arguments.(*Readable).ReadClose
```

I tried to put b.ReportAllocs() in front of the benchmark loop. It did not help.

At some point I added variations to the benchmark:

```
$ go test -v -bench=. -cpuprofile=cpu.out -memprofile mem.out -benchmem -short stream/*
=== RUN   TestReader
--- PASS: TestReader (0.00s)
=== RUN   TestReader2
--- PASS: TestReader2 (0.00s)
BenchmarkReader1-4    500000    2337 ns/op    976 B/op   12 allocs/op
BenchmarkByLine-4     100000   16560 ns/op   2248 B/op   39 allocs/op
BenchmarkByLine2-4   1000000    2113 ns/op    848 B/op   13 allocs/op
PASS
ok      command-line-arguments  5.153s
```

which gives:

```
$ go tool pprof stream.test mem.out
Entering interactive mode (type "help" for commands)
(pprof) top50
512.20kB of 512.20kB total (100%)
Dropped 29 nodes (cum <= 2.56kB)
      flat  flat%   sum%        cum   cum%
  512.20kB   100%   100%   512.20kB   100%  runtime.malg
         0     0%   100%   512.20kB   100%  runtime.allocm
         0     0%   100%   512.20kB   100%  runtime.newm
         0     0%   100%   512.20kB   100%  runtime.startTheWorldWithSema
         0     0%   100%   512.20kB   100%  runtime.systemstack

$ go tool pprof -inuse_objects stream.test mem.out
Entering interactive mode (type "help" for commands)
(pprof) top100
1260 of 1260 total (100%)
Dropped 29 nodes (cum <= 6)
      flat  flat%   sum%        cum   cum%
      1260   100%   100%       1260   100%  runtime.malg
         0     0%   100%       1260   100%  runtime.allocm
         0     0%   100%       1260   100%  runtime.newm
         0     0%   100%       1260   100%  runtime.startTheWorldWithSema
         0     0%   100%       1260   100%  runtime.systemstack
```

I'm unclear about this, because the tested func is definitely allocating:

```go
// BytesByLine emits chunks of []byte split by line (\n).
func BytesByLine() *Transform {
	var buf []byte
	var EOL = []byte("\n")
	byLine := func() [][]byte {
		ret := make([][]byte, 0)
		lines := bytes.Split(buf, EOL)
		isEOL := len(buf) > 0 && buf[len(buf)-1] == EOL[0]
		for i, line := range lines {
			if i == len(lines)-1 {
				if isEOL && len(line) > 0 {
					buf = buf[:0]
					ret = append(ret, line)
				} else {
					buf = line
				}
			} else {
				ret = append(ret, line)
			}
		}
		return ret
	}
	var TT TransformFunc
	TT = func(s Stream, data interface{}) {
		if val, ok := data.([]byte); ok {
			buf = append(buf, val...)
			for _, line := range byLine() {
				s.Push(line)
			}
		} else {
			s.Emit(errors.New("value must be of type []byte"))
		}
	}
	var FF FlushFunc
	FF = func(s Stream) {
		for _, line := range byLine() {
			s.Push(line)
		}
		if len(buf) > 0 {
			s.Push(buf)
		}
	}
	return NewTransformFlush(TT, FF)
}
```

On Tuesday, September 20, 2016 at 6:37:55 PM UTC+2, Ian Lance Taylor wrote:
>
> On Tue, Sep 20, 2016 at 8:17 AM, mhhcbon <cpasmabo...@gmail.com> wrote:
> >
> > on my way to learn to use pprof, i now face some intriguing results with
> > lots of runtime.* calls,
> >
> > can you help me to understand what those calls are and how much i should
> > care about them?
> >
> > (pprof) top30
> > 8720ms of 14380ms total (60.64%)
> > Dropped 118 nodes (cum <= 71.90ms)
> > Showing top 30 nodes out of 143 (cum >= 3800ms)
> >       flat  flat%   sum%        cum   cum%
> >      720ms  5.01%  5.01%     1050ms  7.30%  runtime.scanobject
> >      700ms  4.87%  9.87%     3640ms 25.31%  runtime.gentraceback
> >      670ms  4.66% 14.53%     1780ms 12.38%  runtime.pcvalue
> >      490ms  3.41% 17.94%      490ms  3.41%  runtime.readvarint
> >      420ms  2.92% 20.86%      910ms  6.33%  runtime.step
> >      420ms  2.92% 23.78%      420ms  2.92%  runtime/internal/atomic.Xchg
> >      380ms  2.64% 26.43%      380ms  2.64%  runtime.heapBitsForObject
> >      340ms  2.36% 28.79%     1580ms 10.99%  runtime.mallocgc
> >      310ms  2.16% 30.95%      360ms  2.50%  runtime.gopark
> >      270ms  1.88% 32.82%      270ms  1.88%  runtime.heapBitsSetType
> >      260ms  1.81% 34.63%      510ms  3.55%  runtime.scanblock
> >      250ms  1.74% 36.37%      250ms  1.74%  runtime/internal/atomic.Cas
> >      240ms  1.67% 38.04%     4040ms 28.09%  github.com/mh-cbon/chan/stream.(*Transform).Push
> >      240ms  1.67% 39.71%     1210ms  8.41%  runtime.schedule
> >      230ms  1.60% 41.31%      230ms  1.60%  runtime.newdefer
> >      220ms  1.53% 42.84%      320ms  2.23%  runtime.deferreturn
> >      210ms  1.46% 44.30%      260ms  1.81%  bytes.genSplit
> >      210ms  1.46% 45.76%      260ms  1.81%  runtime.greyobject
> >      210ms  1.46% 47.22%     6080ms 42.28%  runtime.systemstack
> >      200ms  1.39% 48.61%      210ms  1.46%  runtime.acquireSudog
> >      200ms  1.39% 50.00%     1760ms 12.24%
> > runtime.scanframeworker
> >      190ms  1.32% 51.32%      790ms  5.49%  sync.(*Mutex).Lock
> >      180ms  1.25% 52.57%      180ms  1.25%  runtime.gogo
> >      180ms  1.25% 53.82%      180ms  1.25%  runtime.memmove
> >      170ms  1.18% 55.01%      790ms  5.49%  runtime.chanrecv
> >      170ms  1.18% 56.19%      530ms  3.69%  runtime.lock
> >      170ms  1.18% 57.37%      170ms  1.18%  runtime.usleep
> >      160ms  1.11% 58.48%      220ms  1.53%  runtime.siftdownTimer
> >      160ms  1.11% 59.60%      160ms  1.11%  runtime/internal/atomic.Load
> >      150ms  1.04% 60.64%     3800ms 26.43%  main.main.func2
> >
> > Also, I'm not totally comfortable with the flat vs cum meanings,
> > and those basics are not covered in
> > https://blog.golang.org/profiling-go-programs
> >
> > Can you take a moment to explain those two notions?
>
> All the top functions in this profile are being run by the garbage
> collector.
>
> The "flat" numbers are the time used by just that function. The "cum"
> numbers are the time used by that function and all the functions that
> it calls.
>
> > One more Q. In this pprof output
> >
> > (pprof) list Push
> > Total: 14.38s
> > ROUTINE ======================== github.com/mh-cbon/chan/stream.(*Readable).Push
> >     in /home/mh-cbon/gow/src/github.com/mh-cbon/chan/stream/reader.go
> >          0      1.65s (flat, cum) 11.47% of Total
> >          .          .     79:func (s *Readable) Push(b interface{}) {
> >          .          .     80:    for s.Paused() {
> >          .          .     81:        <-time.After(time.Millisecond * 1)
> >          .          .     82:    }
> >          .          .     83:    s.writeMutex.Lock()
> >          .       10ms     84:    defer s.writeMutex.Unlock()
> >          .          .     85:    for _, o := range s.Pipes {
> >          .      1.64s     86:        o.Write(b)
> >          .          .     87:    }
> >          .          .     88:}
> >          .          .     89:func (s *Readable) Catch(fn ErrorFunc) Stream {
> >          .          .     90:    s.E = append(s.E, fn)
> >          .          .     91:    return s
> > ROUTINE ======================== github.com/mh-cbon/chan/stream.(*Transform).Push
> >     in /home/mh-cbon/gow/src/github.com/mh-cbon/chan/stream/transform.go
> >      240ms      4.13s (flat, cum) 28.72% of Total
> >          .          .     72:    } else {
> >          .          .     73:        s.Push(b)
> >          .          .     74:    }
> >          .          .     75:}
> >          .          .     76:// Push writes data down in the stream.
> >       40ms       40ms     77:func (s *Transform) Push(b interface{}) {
> >          .          .     78:    // s.pipeMutex.Lock()
> >          .          .     79:    // defer s.pipeMutex.Unlock()
> >      190ms      190ms     80:    for _, o := range s.Pipes {
> >       10ms      3.90s     81:        o.Write(b)
> >          .          .     82:    }
> >          .          .     83:}
> >          .          .     84:func (s *Transform) Catch(fn ErrorFunc) Stream {
> >          .          .     85:    s.E = append(s.E, fn)
> >          .          .     86:    return s
> >
> > The two methods are very similar, but Transform.Push shows 190ms for the
> > for loop, where Readable.Push does not show anything.
> > How should I understand this?
>
> Without more information, I would simply assume that Transform.Push is
> being called more often. The times are the total amount of time that
> the program is spending in that function.
>
> Ian

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.