Goroutines in a waiting state should not be consuming CPU. Are you certain 
they are not constantly transitioning between waiting and running? That churn 
can show up as high CPU usage even though everything looks blocked. 

I would use pprof; github.com/robaho/goanalyzer might be of assistance here to 
see the actual work being done. 
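
If the server doesn't already expose profiling endpoints, the stock 
net/http/pprof hook is usually enough. A minimal sketch (the side port 6060 and 
the init placement are just illustrative, wire it in wherever your server 
starts):

    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers the /debug/pprof/ handlers on http.DefaultServeMux
    )

    func init() {
        go func() {
            // serve the profiling endpoints on a side port inside the pod
            log.Println(http.ListenAndServe("localhost:6060", nil))
        }()
    }

Then "go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30" shows 
where the CPU time is really going, and /debug/pprof/goroutine?debug=2 gives the 
full goroutine stacks. Running the process with GODEBUG=schedtrace=1000 is 
another cheap way to see whether the scheduler is churning rather than idle.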

> On Aug 24, 2020, at 9:10 AM, Siddhesh Divekar <siddhesh.dive...@gmail.com> 
> wrote:
> 
> 
> Hi Ian,
> 
> Thanks for replying. 
> 
> We have a Go server running which handles user requests & collects data from 
> various sources like GCP's Cloud SQL and BigQuery.
> We are using Shopify's sarama library for Kafka operations.
> 
> We are seeing lots of goroutines in the waiting state for several minutes. 
> Over a period of time, around 587 goroutines have been spun up.
> 
> We see that two goroutines are stuck on GCP BigQuery, and we are using wait 
> groups there.
> However, it's not clear why that would cause all the other goroutines to get 
> hung & make CPU usage go high.
> 
> goroutine 3332131 [semacquire, 79 minutes]:
> sync.runtime_Semacquire(0xc001c4fcf8)
>         /usr/local/go/src/runtime/sema.go:56 +0x42
> sync.(*WaitGroup).Wait(0xc001c4fcf0)
>         /usr/local/go/src/sync/waitgroup.go:130 +0x64
> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).runParallelQuery(0xc001b54d40,
>  0xc002912c00, 0x330e1b0, 0xf, 0xc002912cf0, 0x3)
>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:488 +0x1d7
> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).GetMainUi(0xc001b54d40,
>  0xc002912db8, 0xc001870e68, 0x746121, 0xc0010fcaf8, 0x17)
>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:567 +0xa0d
> git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).fetchMainUiTeamInterval(0xc001b56780,
>  0xc002356810, 0x24, 0x32f7b78, 0x5)
>         /builds/fusionio/fusion/controller/stats/prefetcher.go:77 +0xf2
> created by 
> git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).prefetchStats
>         /builds/fusionio/fusion/controller/stats/prefetcher.go:100 +0xd8
> 
> 
> goroutine 3332149 [semacquire, 79 minutes]:
> sync.runtime_Semacquire(0xc0015ede48)
>         /usr/local/go/src/runtime/sema.go:56 +0x42
> sync.(*WaitGroup).Wait(0xc0015ede40)
>         /usr/local/go/src/sync/waitgroup.go:130 +0x64
> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).runParallelQuery(0xc001b54d40,
>  0xc00249dc00, 0x330e1b0, 0xf, 0xc00249dcf0, 0x3)
>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:488 +0x1d7
> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).GetMainUi(0xc001b54d40,
>  0xc00249ddb8, 0xc003200668, 0xc00407a520, 0xc003200590, 0x46ee97)
>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:567 +0xa0d
> git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).fetchMainUiTeamInterval(0xc001b56780,
>  0xc002356ba0, 0x24, 0x32f7b78, 0x5)
>         /builds/fusionio/fusion/controller/stats/prefetcher.go:77 +0xf2
> created by 
> git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).prefetchStats
>         /builds/fusionio/fusion/controller/stats/prefetcher.go:100 +0xd8
> 
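Since both of those goroutines are parked in sync.WaitGroup.Wait, the question 
is why the matching workers never call Done - typically an early return or 
panic on a path that skips Done, or a BigQuery call with no deadline that never 
comes back. I don't know what your runParallelQuery looks like inside, but the 
robust shape is roughly this (the query type and the 2 minute bound are made up 
for illustration):

    import (
        "context"
        "sync"
        "time"
    )

    // query stands in for whatever wraps one BigQuery / Cloud SQL call.
    type query interface {
        run(ctx context.Context) error
    }

    func runParallelQuery(ctx context.Context, queries []query) error {
        // bound the whole fan-out so Wait can never block forever
        ctx, cancel := context.WithTimeout(ctx, 2*time.Minute)
        defer cancel()

        var wg sync.WaitGroup
        for _, q := range queries {
            wg.Add(1)
            go func(q query) {
                defer wg.Done() // defer guarantees Done on every exit path, including panics
                _ = q.run(ctx)  // the client call has to honor ctx for the bound to help
            }(q)
        }
        wg.Wait()
        return ctx.Err()
    }

If the workers themselves are stuck in net/http reads, as the rest of your dump 
suggests, the deadline has to reach the HTTP layer as well - more on that below.
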
> I found the link below, which kind of correlates with our scenario.
> https://stackoverflow.com/questions/42238695/goroutine-in-io-wait-state-for-long-time
> 
> Most of the goroutines in the backtrace are in the net/http package, so our 
> suspicion is that the above bug in our code might be causing that.
> Even the BigQuery calls are getting hung in net/http.
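
Worth checking on this point: the zero-value http.Client (and the one inside 
most SDKs if you never override it) has no timeout at all, so a peer that stops 
responding parks the goroutine in readLoop/pollWait indefinitely. A hedged 
example of a client with explicit limits, values picked purely for illustration:

    client := &http.Client{
        Timeout: 60 * time.Second, // hard cap on the whole request
        Transport: &http.Transport{
            DialContext: (&net.Dialer{
                Timeout:   10 * time.Second,
                KeepAlive: 30 * time.Second,
            }).DialContext,
            TLSHandshakeTimeout:   10 * time.Second,
            ResponseHeaderTimeout: 30 * time.Second,
            IdleConnTimeout:       90 * time.Second,
        },
    }

For the BigQuery client specifically, passing a context.WithTimeout into each 
query call is usually the simpler lever, assuming the context is propagated all 
the way down.
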
> 
> We are using Go version 1.13.8 & are running on a GCP Kubernetes cluster, on 
> Ubuntu 18.04 Docker images.
> 
> go env
> GO111MODULE=""
> GOARCH="amd64"
> GOBIN=""
> GOCACHE="/root/.cache/go-build"
> GOENV="/root/.config/go/env"
> GOEXE=""
> GOFLAGS=""
> GOHOSTARCH="amd64"
> GOHOSTOS="linux"
> GONOPROXY=""
> GONOSUMDB=""
> GOOS="linux"
> GOPATH="/root/go"
> GOPRIVATE=""
> GOPROXY="https://proxy.golang.org,direct"
> GOROOT="/usr/local/go"
> GOSUMDB="sum.golang.org"
> GOTMPDIR=""
> GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
> GCCGO="gccgo"
> AR="ar"
> CC="gcc"
> CXX="g++"
> CGO_ENABLED="1"
> GOMOD="/builds/prosimoio/prosimo/pdash/go.mod"
> CGO_CFLAGS="-g -O2"
> CGO_CPPFLAGS=""
> CGO_CXXFLAGS="-g -O2"
> CGO_FFLAGS="-g -O2"
> CGO_LDFLAGS="-g -O2"
> PKG_CONFIG="pkg-config"
> GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 
> -fdebug-prefix-map=/tmp/go-build048009048=/tmp/go-build 
> -gno-record-gcc-switches"
> 
> Let me know if any other information is needed.
> 
> 
>> On Sat, Aug 22, 2020 at 12:30 PM Ian Lance Taylor <i...@golang.org> wrote:
>> On Sat, Aug 22, 2020 at 12:06 PM Siddhesh Divekar
>> <siddhesh.dive...@gmail.com> wrote:
>> >
>> > We saw an issue with our process running in k8s on Ubuntu 18.04.
>> > Most of the goroutines are stuck for several minutes in the net/http and 
>> > http2 code.
>> >
>> > Have you seen similar issues ?
>> >
>> > goroutine 2800143 [select, 324 minutes]:
>> > net/http.(*persistConn).readLoop(0xc00187d440)
>> >         /usr/local/go/src/net/http/transport.go:2032 +0x999
>> > created by net/http.(*Transport).dialConn
>> >         /usr/local/go/src/net/http/transport.go:1580 +0xb0d
>> > 
>> > goroutine 2738894 [IO wait, 352 minutes]:
>> > internal/poll.runtime_pollWait(0x7f5b61b280c0, 0x72, 0xffffffffffffffff)
>> >         /usr/local/go/src/runtime/netpoll.go:184 +0x55
>> > internal/poll.(*pollDesc).wait(0xc0017e7e18, 0x72, 0x1000, 0x1000, 0xffffffffffffffff)
>> >         /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
>> > internal/poll.(*pollDesc).waitRead(...)
>> >         /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
>> > internal/poll.(*FD).Read(0xc0017e7e00, 0xc0044a9000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
>> >         /usr/local/go/src/internal/poll/fd_unix.go:169 +0x1cf
>> > net.(*netFD).Read(0xc0017e7e00, 0xc0044a9000, 0x1000, 0x1000, 0xc0026359e8, 0x49d7fd, 0xc0017e7e00)
>> >         /usr/local/go/src/net/fd_unix.go:202 +0x4f
>> > net.(*conn).Read(0xc0000db8b8, 0xc0044a9000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
>> >         /usr/local/go/src/net/net.go:184 +0x68
>> > net/http.(*connReader).Read(0xc004a4fec0, 0xc0044a9000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
>> >         /usr/local/go/src/net/http/server.go:785 +0xf4
>> > bufio.(*Reader).fill(0xc003f1a360)
>> >         /usr/local/go/src/bufio/bufio.go:100 +0x103
>> > bufio.(*Reader).Peek(0xc003f1a360, 0x4, 0x0, 0x0, 0x0, 0x0, 0xc002635ad0)
>> >         /usr/local/go/src/bufio/bufio.go:138 +0x4f
>> > net/http.(*conn).readRequest(0xc0028e1d60, 0x393ed20, 0xc0024e9780, 0x0, 0x0, 0x0)
>> >         /usr/local/go/src/net/http/server.go:962 +0xb3b
>> > net/http.(*conn).serve(0xc0028e1d60, 0x393ed20, 0xc0024e9780)
>> >         /usr/local/go/src/net/http/server.go:1817 +0x6d4
>> > created by net/http.(*Server).Serve
>> >         /usr/local/go/src/net/http/server.go:2928 +0x384
>> >
>> > Is there a known issue, or anything obvious from the backtrace here?
>> 
>> It's entirely normal for goroutines to sit in pollWait if they are
>> waiting for network I/O.  There may be reasons why this is incorrect
>> for your program, but you'll have to tell us those reasons.
>> 
>> Also, along with those reasons, please tell us the version of Go and
>> the exact environment that you are running.  Thanks.
>> 
>> Ian
> 
> 
> -- 
> -Siddhesh.
> <backtrace.txt>

