The TCP protocol allows a connection to wait for hours. Goroutines stuck in a wait state do not burn CPU. Are the clients local or remote (over the internet)?
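If they are remote, a likely failure mode is a peer that disappears without closing the connection - with no timeouts configured, those reads can sit parked for hours, which would look exactly like the traces below. Along the lines of what the Stack Overflow discussion referenced below suggests, here is a minimal sketch of a client with explicit timeouts; the helper name and URL are just for illustration, and the durations are placeholders to tune, not recommendations:

package main

import (
    "fmt"
    "net"
    "net/http"
    "time"
)

// newClient builds an *http.Client that cannot wait forever on a dead peer.
// Every duration below is only an example value.
func newClient() *http.Client {
    return &http.Client{
        Timeout: 2 * time.Minute, // hard cap on the whole request
        Transport: &http.Transport{
            DialContext: (&net.Dialer{
                Timeout:   10 * time.Second, // TCP connect
                KeepAlive: 30 * time.Second, // detect dead peers sooner
            }).DialContext,
            TLSHandshakeTimeout:   10 * time.Second,
            ResponseHeaderTimeout: 30 * time.Second, // time to first response byte
            IdleConnTimeout:       90 * time.Second, // reap idle keep-alive conns
        },
    }
}

func main() {
    resp, err := newClient().Get("https://example.com/")
    if err != nil {
        fmt.Println("request failed:", err)
        return
    }
    defer resp.Body.Close()
    fmt.Println("status:", resp.Status)
}

The server side has the analogous knobs on http.Server (ReadTimeout, WriteTimeout, IdleTimeout), which bound how long a server connection goroutine can sit waiting for the next request.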
> On Aug 24, 2020, at 10:29 PM, Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>
> Robert,
>
> We will do the profiling next time we hit the issue again and see what is happening.
> This was the first time we saw the issue, and we don't want to give up the advantages of HTTP/2 without making sure it's the actual culprit.
>
> Do you think in the meanwhile we should do what the discussion below suggests anyway?
> https://stackoverflow.com/questions/42238695/goroutine-in-io-wait-state-for-long-time
>
>> On Mon, Aug 24, 2020 at 5:37 PM Robert Engels <reng...@ix.netcom.com> wrote:
>> I think it is too hard to tell with the limited information. It could be exhausted connections or it could be thrashing (the claim of high CPU).
>>
>> I think you want to run a profiling capture prior to hitting the stuck state - you should be able to detect what is happening.
>>
>> If the referenced issue is related, I would assume you should be able to connect by forcing HTTP/1.
>>
>> You can also try disabling HTTP/2 and see if your issue goes away.
>>
>>> On Aug 24, 2020, at 6:15 PM, Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>>>
>>> Hi Robert,
>>>
>>> Sorry I missed your earlier response.
>>>
>>> From what we saw, our UI was blocked, and since everything was unresponsive we had to recover the system by sending SIGABRT.
>>>
>>>> On Mon, Aug 24, 2020 at 4:11 PM Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>>>> Looking at the number of goroutines we have, does this apply to this issue?
>>>> https://github.com/golang/go/issues/27044
>>>>
>>>>> On Mon, Aug 24, 2020 at 12:54 PM Robert Engels <reng...@ix.netcom.com> wrote:
>>>>> Goroutines in a waiting state should not be consuming CPU. Are you certain they are not in constant transition from waiting to processing - this could show up as high CPU usage while everything looks blocked.
>>>>>
>>>>> I would use pprof - github.com/robaho/goanalyzer might be of assistance here to see the actual work being done.
>>>>>
>>>>>> On Aug 24, 2020, at 9:10 AM, Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Ian,
>>>>>>
>>>>>> Thanks for replying.
>>>>>>
>>>>>> We have a Go server running which handles user requests and collects data from various sources like GCP's Cloud SQL and BigQuery. We are using Shopify's sarama library to do Kafka operations.
>>>>>>
>>>>>> We are seeing lots of goroutines in a waiting state for several minutes. Over that period around 587 goroutines have been spun up.
>>>>>>
>>>>>> We see that two goroutines are stuck on GCP BigQuery, and we are using wait groups there. However, it's not clear why that would cause all the other goroutines to get hung and make the CPU go high.
>>>>>>
>>>>>> goroutine 3332131 [semacquire, 79 minutes]:
>>>>>> sync.runtime_Semacquire(0xc001c4fcf8)
>>>>>>         /usr/local/go/src/runtime/sema.go:56 +0x42
>>>>>> sync.(*WaitGroup).Wait(0xc001c4fcf0)
>>>>>>         /usr/local/go/src/sync/waitgroup.go:130 +0x64
>>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).runParallelQuery(0xc001b54d40, 0xc002912c00, 0x330e1b0, 0xf, 0xc002912cf0, 0x3)
>>>>>>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:488 +0x1d7
>>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).GetMainUi(0xc001b54d40, 0xc002912db8, 0xc001870e68, 0x746121, 0xc0010fcaf8, 0x17)
>>>>>>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:567 +0xa0d
>>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).fetchMainUiTeamInterval(0xc001b56780, 0xc002356810, 0x24, 0x32f7b78, 0x5)
>>>>>>         /builds/fusionio/fusion/controller/stats/prefetcher.go:77 +0xf2
>>>>>> created by git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).prefetchStats
>>>>>>         /builds/fusionio/fusion/controller/stats/prefetcher.go:100 +0xd8
>>>>>>
>>>>>> goroutine 3332149 [semacquire, 79 minutes]:
>>>>>> sync.runtime_Semacquire(0xc0015ede48)
>>>>>>         /usr/local/go/src/runtime/sema.go:56 +0x42
>>>>>> sync.(*WaitGroup).Wait(0xc0015ede40)
>>>>>>         /usr/local/go/src/sync/waitgroup.go:130 +0x64
>>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).runParallelQuery(0xc001b54d40, 0xc00249dc00, 0x330e1b0, 0xf, 0xc00249dcf0, 0x3)
>>>>>>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:488 +0x1d7
>>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).GetMainUi(0xc001b54d40, 0xc00249ddb8, 0xc003200668, 0xc00407a520, 0xc003200590, 0x46ee97)
>>>>>>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:567 +0xa0d
>>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).fetchMainUiTeamInterval(0xc001b56780, 0xc002356ba0, 0x24, 0x32f7b78, 0x5)
>>>>>>         /builds/fusionio/fusion/controller/stats/prefetcher.go:77 +0xf2
>>>>>> created by git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).prefetchStats
>>>>>>         /builds/fusionio/fusion/controller/stats/prefetcher.go:100 +0xd8
>>>>>>
>>>>>> I found the link below, which kind of correlates to our scenario:
>>>>>> https://stackoverflow.com/questions/42238695/goroutine-in-io-wait-state-for-long-time
>>>>>>
>>>>>> Most of the goroutines in the backtrace are in the net/http package, so our suspicion is that the above bug in our code might be causing it. Even the BigQuery call is getting hung in net/http.
>>>>>>
>>>>>> We are using Go version 1.13.8 and are running on a GCP Kubernetes cluster on Ubuntu 18.04 Docker images.
>>>>>>
>>>>>> go env
>>>>>> GO111MODULE=""
>>>>>> GOARCH="amd64"
>>>>>> GOBIN=""
>>>>>> GOCACHE="/root/.cache/go-build"
>>>>>> GOENV="/root/.config/go/env"
>>>>>> GOEXE=""
>>>>>> GOFLAGS=""
>>>>>> GOHOSTARCH="amd64"
>>>>>> GOHOSTOS="linux"
>>>>>> GONOPROXY=""
>>>>>> GONOSUMDB=""
>>>>>> GOOS="linux"
>>>>>> GOPATH="/root/go"
>>>>>> GOPRIVATE=""
>>>>>> GOPROXY="https://proxy.golang.org,direct"
>>>>>> GOROOT="/usr/local/go"
>>>>>> GOSUMDB="sum.golang.org"
>>>>>> GOTMPDIR=""
>>>>>> GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
>>>>>> GCCGO="gccgo"
>>>>>> AR="ar"
>>>>>> CC="gcc"
>>>>>> CXX="g++"
>>>>>> CGO_ENABLED="1"
>>>>>> GOMOD="/builds/prosimoio/prosimo/pdash/go.mod"
>>>>>> CGO_CFLAGS="-g -O2"
>>>>>> CGO_CPPFLAGS=""
>>>>>> CGO_CXXFLAGS="-g -O2"
>>>>>> CGO_FFLAGS="-g -O2"
>>>>>> CGO_LDFLAGS="-g -O2"
>>>>>> PKG_CONFIG="pkg-config"
>>>>>> GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build048009048=/tmp/go-build -gno-record-gcc-switches"
>>>>>>
>>>>>> Let me know if any other information is needed.
>>>>>>
>>>>>>> On Sat, Aug 22, 2020 at 12:30 PM Ian Lance Taylor <i...@golang.org> wrote:
>>>>>>> On Sat, Aug 22, 2020 at 12:06 PM Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>>>>>>> >
>>>>>>> > We saw an issue with our process running in k8s on Ubuntu 18.04.
>>>>>>> > Most of the goroutines are stuck for several minutes in http/http2 net code.
>>>>>>> >
>>>>>>> > Have you seen similar issues?
>>>>>>> >
>>>>>>> > goroutine 2800143 [select, 324 minutes]:
>>>>>>> > net/http.(*persistConn).readLoop(0xc00187d440)
>>>>>>> >         /usr/local/go/src/net/http/transport.go:2032 +0x999
>>>>>>> > created by net/http.(*Transport).dialConn
>>>>>>> >         /usr/local/go/src/net/http/transport.go:1580 +0xb0d
>>>>>>> >
>>>>>>> > goroutine 2738894 [IO wait, 352 minutes]:
>>>>>>> > internal/poll.runtime_pollWait(0x7f5b61b280c0, 0x72, 0xffffffffffffffff)
>>>>>>> >         /usr/local/go/src/runtime/netpoll.go:184 +0x55
>>>>>>> > internal/poll.(*pollDesc).wait(0xc0017e7e18, 0x72, 0x1000, 0x1000, 0xffffffffffffffff)
>>>>>>> >         /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
>>>>>>> > internal/poll.(*pollDesc).waitRead(...)
>>>>>>> >         /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
>>>>>>> > internal/poll.(*FD).Read(0xc0017e7e00, 0xc0044a9000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
>>>>>>> >         /usr/local/go/src/internal/poll/fd_unix.go:169 +0x1cf
>>>>>>> > net.(*netFD).Read(0xc0017e7e00, 0xc0044a9000, 0x1000, 0x1000, 0xc0026359e8, 0x49d7fd, 0xc0017e7e00)
>>>>>>> >         /usr/local/go/src/net/fd_unix.go:202 +0x4f
>>>>>>> > net.(*conn).Read(0xc0000db8b8, 0xc0044a9000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
>>>>>>> >         /usr/local/go/src/net/net.go:184 +0x68
>>>>>>> > net/http.(*connReader).Read(0xc004a4fec0, 0xc0044a9000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
>>>>>>> >         /usr/local/go/src/net/http/server.go:785 +0xf4
>>>>>>> > bufio.(*Reader).fill(0xc003f1a360)
>>>>>>> >         /usr/local/go/src/bufio/bufio.go:100 +0x103
>>>>>>> > bufio.(*Reader).Peek(0xc003f1a360, 0x4, 0x0, 0x0, 0x0, 0x0, 0xc002635ad0)
>>>>>>> >         /usr/local/go/src/bufio/bufio.go:138 +0x4f
>>>>>>> > net/http.(*conn).readRequest(0xc0028e1d60, 0x393ed20, 0xc0024e9780, 0x0, 0x0, 0x0)
>>>>>>> >         /usr/local/go/src/net/http/server.go:962 +0xb3b
>>>>>>> > net/http.(*conn).serve(0xc0028e1d60, 0x393ed20, 0xc0024e9780)
>>>>>>> >         /usr/local/go/src/net/http/server.go:1817 +0x6d4
>>>>>>> > created by net/http.(*Server).Serve
>>>>>>> >         /usr/local/go/src/net/http/server.go:2928 +0x384
>>>>>>> >
>>>>>>> > Is there a known issue, or something obvious from the backtrace here?
>>>>>>>
>>>>>>> It's entirely normal for goroutines to sit in pollWait if they are waiting for network I/O. There may be reasons why this is incorrect for your program, but you'll have to tell us those reasons.
>>>>>>>
>>>>>>> Also, along with those reasons, please tell us the version of Go and the exact environment that you are running. Thanks.
>>>>>>>
>>>>>>> Ian
>>>>>>
>>>>>> --
>>>>>> -Siddhesh.
>>>>>> <backtrace.txt>
>>>>
>>>> --
>>>> -Siddhesh.
>>>
>>> --
>>> -Siddhesh.
>
> --
> -Siddhesh.
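One more concrete note on the profiling suggestion earlier in the thread: if you expose net/http/pprof on a side port, you can pull a full goroutine dump (the same stacks the SIGABRT gave you, but without killing the process) and a CPU profile the next time it wedges. A minimal sketch - the port and the bare select are just placeholders for however your real server is wired up:

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
    // Serve the pprof endpoints on a private, localhost-only port.
    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // ... the rest of the application runs as usual ...
    select {}
}

Then hitting http://localhost:6060/debug/pprof/goroutine?debug=2 dumps every goroutine with its wait time, and "go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30" captures 30 seconds of CPU while the CPU is pegged.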
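And since disabling HTTP/2 to rule it out came up above: on the client side you can either start the process with GODEBUG=http2client=0, or give the transport a non-nil, empty TLSNextProto map so it negotiates HTTP/1.1 only. A rough sketch - the client construction here is illustrative, not your actual code:

package main

import (
    "crypto/tls"
    "fmt"
    "net/http"
)

func main() {
    // A non-nil, empty TLSNextProto map disables HTTP/2 on this transport,
    // so requests fall back to HTTP/1.1.
    tr := &http.Transport{
        TLSNextProto: make(map[string]func(string, *tls.Conn) http.RoundTripper),
    }
    client := &http.Client{Transport: tr}

    resp, err := client.Get("https://example.com/")
    if err != nil {
        fmt.Println("request failed:", err)
        return
    }
    defer resp.Body.Close()
    fmt.Println("negotiated protocol:", resp.Proto) // expect HTTP/1.1
}

The server-side equivalent is GODEBUG=http2server=0, or setting Server.TLSNextProto to a non-nil empty map. If the hangs disappear after that, it points back at the HTTP/2 path discussed above.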