Those timeouts should allow your server to clean up “dead” clients. Typically you use them in conjunction with a ‘keep alive’ in the protocol.
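
For example, a minimal sketch (the field names are from net/http; the values are placeholders you would tune to what a healthy client of yours actually needs):

    import (
        "net/http"
        "time"
    )

    func newServer(handler http.Handler) *http.Server {
        return &http.Server{
            Addr:    ":8443", // placeholder address
            Handler: handler,
            // ReadTimeout/WriteTimeout bound a single request/response.
            ReadTimeout:  30 * time.Second,
            WriteTimeout: 30 * time.Second,
            // IdleTimeout is what actually reclaims keep-alive connections
            // that have gone quiet ("dead" clients).
            IdleTimeout: 120 * time.Second,
        }
    }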

I am doubtful that a bunch of dead clients hanging around would cause a CPU 
spike. You really don’t have that many Go routines/connections involved (I’ve 
worked with 1000’s of live connections).

I would look at something else. I am guessing your synchronization is incorrect: 
your threads are blocking, and you have a few that are spinning, expecting 
something to happen that never will (because the source of the event is blocked 
on the mutex/lock).
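
One way to tell the two apart without killing the process (a sketch, assuming you can expose a debug-only port; the blank import registers the handlers on the default mux):

    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
    )

    func init() {
        go func() {
            // Debug-only listener; keep it off the public interface.
            log.Println(http.ListenAndServe("localhost:6060", nil))
        }()
    }

Then "go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30" shows where the CPU is actually going, and "curl http://localhost:6060/debug/pprof/goroutine?debug=2" gives you the same stack dump you got from the abort without taking the service down. Parked goroutines will not show up in the CPU profile; spinning ones will.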



> On Aug 26, 2020, at 10:09 AM, Siddhesh Divekar <siddhesh.dive...@gmail.com> 
> wrote:
> 
> Robert,
> 
> I assume we can safely add these timeouts based on what we expect
> should be a reasonable timeout for our clients ?
> 
> s.ReadTimeout = expTimeOut * time.Second
> s.WriteTimeout = expTimeOut * time.Second
> 
> On Tue, Aug 25, 2020 at 1:14 PM Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
> Both servers and data sources are in the cloud.
> I would not say a lot of data, it's precomputed data which shouldn't take 
> that long.
> 
> 
> On Tue, Aug 25, 2020 at 11:25 AM Robert Engels <reng...@ix.netcom.com> wrote:
> Are you transferring a lot of data? Are the servers non-cloud hosted? 
> 
> You could be encountering “tcp stalls”. 
> 
>> On Aug 25, 2020, at 9:24 AM, Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>> 
>> 
>> Clients are over the internet.
>> 
>> On Tue, Aug 25, 2020 at 3:25 AM Robert Engels <reng...@ix.netcom.com> wrote:
>> The tcp protocol allows the connection to wait for hours. Go routines stuck 
>> in wait do not burn CPU. Are the clients local or remote (over internet)?
>> 
>>> On Aug 24, 2020, at 10:29 PM, Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>>> 
>>> 
>>> Robert,
>>> 
>>> We will do the profiling next time we hit the issue again & see what is 
>>> happening.
>>> This was the first time we saw the issue & don't want to get rid of http2 
>>> advantages without making sure it's the actual culprit.
>>> 
>>> Do you think in the meantime we should do what the discussion below suggests anyway?
>>> https://stackoverflow.com/questions/42238695/goroutine-in-io-wait-state-for-long-time
>>> 
>>> 
>>> On Mon, Aug 24, 2020 at 5:37 PM Robert Engels <reng...@ix.netcom.com> wrote:
>>> I think it is too hard to tell with the limited information. It could be 
>>> exhausted connections or it could be thrashing (hence the claim of high CPU).
>>> 
>>> I think you want to run a profiling capture prior to hitting the stuck state 
>>> - you should be able to detect what is happening. 
>>> 
>>> If the referenced issue is related I would assume you should be able to 
>>> connect by forcing http/1. 
>>> 
>>> You can also try disabling http/2 and see if your issue goes away. 
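
For reference, a sketch of forcing http/1 with the standard library: an empty, non-nil TLSNextProto map turns off the bundled HTTP/2 support on either side, and GODEBUG=http2client=0,http2server=0 does the same without a rebuild.

    import (
        "crypto/tls"
        "net/http"
    )

    // Client side: never negotiate h2 over ALPN.
    func http1OnlyClient() *http.Client {
        return &http.Client{
            Transport: &http.Transport{
                TLSNextProto: make(map[string]func(string, *tls.Conn) http.RoundTripper),
            },
        }
    }

    // Server side: serve TLS without advertising h2.
    func disableHTTP2(srv *http.Server) {
        srv.TLSNextProto = make(map[string]func(*http.Server, *tls.Conn, http.Handler))
    }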
>>> 
>>> On Aug 24, 2020, at 6:15 PM, Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>>> 
>>>> Hi Robert,
>>>> 
>>>> Sorry I missed your earlier response.
>>>> 
>>>> From what we saw, our UI was blocked, and since everything was unresponsive
>>>> we had to recover the system by sending SIGABRT. 
>>>> 
>>>> On Mon, Aug 24, 2020 at 4:11 PM Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>>>> Looking at the no. of goroutines we have, does this apply to this issue?
>>>> https://github.com/golang/go/issues/27044
>>>> 
>>>> On Mon, Aug 24, 2020 at 12:54 PM Robert Engels <reng...@ix.netcom.com> wrote:
>>>> Go routines in a waiting state should not be consuming CPU. Are you 
>>>> certain they are not in constant transition from waiting to processing - 
>>>> this could show up as high CPU usage while everything looks blocked. 
>>>> 
>>>> I would use pprof - github.com/robaho/goanalyzer might be of assistance 
>>>> here to see the actual work being done. 
>>>> 
>>>>> On Aug 24, 2020, at 9:10 AM, Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>>>>> 
>>>>> 
>>>>> Hi Ian,
>>>>> 
>>>>> Thanks for replying. 
>>>>> 
>>>>> We have a Go server running which handles user requests & collects data 
>>>>> from various sources like GCP's Cloud SQL and BigQuery.
>>>>> We are using Shopify's sarama library to do Kafka operations.
>>>>> 
>>>>> We are seeing lots of goroutines in the waiting state for several minutes. 
>>>>> Over a period of time, around 587 goroutines have been spun up.
>>>>> 
>>>>> We see that two goroutines are stuck on GCP BigQuery, and we are using 
>>>>> wait groups there.
>>>>> However, it's not clear why that would cause all the other goroutines to 
>>>>> get hung & make the CPU go high.
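
A hedged aside on the wait group pattern: if every fan-out query is bounded by a context deadline, the WaitGroup can never block forever even when BigQuery or the network stalls. A minimal sketch - queryOne and the two-minute budget are made-up placeholders, not the real runParallelQuery:

    import (
        "context"
        "sync"
        "time"
    )

    func runQueriesWithDeadline(parent context.Context, queries []string,
        queryOne func(context.Context, string) error) error {
        // Every query shares one overall deadline.
        ctx, cancel := context.WithTimeout(parent, 2*time.Minute)
        defer cancel()

        var wg sync.WaitGroup
        errs := make(chan error, len(queries)) // buffered so sends never block
        for _, q := range queries {
            wg.Add(1)
            go func(q string) {
                defer wg.Done()
                // queryOne must actually honour ctx (e.g. pass it through to
                // the BigQuery client); otherwise the deadline has no effect.
                errs <- queryOne(ctx, q)
            }(q)
        }
        wg.Wait()
        close(errs)
        for err := range errs {
            if err != nil {
                return err
            }
        }
        return nil
    }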
>>>>> 
>>>>> goroutine 3332131 [semacquire, 79 minutes]:
>>>>> sync.runtime_Semacquire(0xc001c4fcf8)
>>>>>         /usr/local/go/src/runtime/sema.go:56 +0x42
>>>>> sync.(*WaitGroup).Wait(0xc001c4fcf0)
>>>>>         /usr/local/go/src/sync/waitgroup.go:130 +0x64
>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).runParallelQuery(0xc001b54d40, 0xc002912c00, 0x330e1b0, 0xf, 0xc002912cf0, 0x3)
>>>>>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:488 +0x1d7
>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).GetMainUi(0xc001b54d40, 0xc002912db8, 0xc001870e68, 0x746121, 0xc0010fcaf8, 0x17)
>>>>>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:567 +0xa0d
>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).fetchMainUiTeamInterval(0xc001b56780, 0xc002356810, 0x24, 0x32f7b78, 0x5)
>>>>>         /builds/fusionio/fusion/controller/stats/prefetcher.go:77 +0xf2
>>>>> created by git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).prefetchStats
>>>>>         /builds/fusionio/fusion/controller/stats/prefetcher.go:100 +0xd8
>>>>> 
>>>>> 
>>>>> goroutine 3332149 [semacquire, 79 minutes]:
>>>>> sync.runtime_Semacquire(0xc0015ede48)
>>>>>         /usr/local/go/src/runtime/sema.go:56 +0x42
>>>>> sync.(*WaitGroup).Wait(0xc0015ede40)
>>>>>         /usr/local/go/src/sync/waitgroup.go:130 +0x64
>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).runParallelQuery(0xc001b54d40, 0xc00249dc00, 0x330e1b0, 0xf, 0xc00249dcf0, 0x3)
>>>>>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:488 +0x1d7
>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*InsMgr).GetMainUi(0xc001b54d40, 0xc00249ddb8, 0xc003200668, 0xc00407a520, 0xc003200590, 0x46ee97)
>>>>>         /builds/fusionio/fusion/controller/stats/ins_mgr.go:567 +0xa0d
>>>>> git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).fetchMainUiTeamInterval(0xc001b56780, 0xc002356ba0, 0x24, 0x32f7b78, 0x5)
>>>>>         /builds/fusionio/fusion/controller/stats/prefetcher.go:77 +0xf2
>>>>> created by git.fusion.io/fusionio/fusion/controller.git/stats.(*Prefetcher).prefetchStats
>>>>>         /builds/fusionio/fusion/controller/stats/prefetcher.go:100 +0xd8
>>>>> 
>>>>> I found the link below, which seems to correlate with our scenario.
>>>>> https://stackoverflow.com/questions/42238695/goroutine-in-io-wait-state-for-long-time
>>>>> 
>>>>> Most of the goroutines in the backtrace are in the net/http package, so our 
>>>>> suspicion is that the above bug in our code might be causing that.
>>>>> Even the BigQuery calls are getting hung in net/http.
>>>>> 
>>>>> We are using Go version 1.13.8 & are running on a GCP Kubernetes cluster on 
>>>>> Ubuntu 18.04 Docker.
>>>>> 
>>>>> go env
>>>>> GO111MODULE=""
>>>>> GOARCH="amd64"
>>>>> GOBIN=""
>>>>> GOCACHE="/root/.cache/go-build"
>>>>> GOENV="/root/.config/go/env"
>>>>> GOEXE=""
>>>>> GOFLAGS=""
>>>>> GOHOSTARCH="amd64"
>>>>> GOHOSTOS="linux"
>>>>> GONOPROXY=""
>>>>> GONOSUMDB=""
>>>>> GOOS="linux"
>>>>> GOPATH="/root/go"
>>>>> GOPRIVATE=""
>>>>> GOPROXY="https://proxy.golang.org,direct"
>>>>> GOROOT="/usr/local/go"
>>>>> GOSUMDB="sum.golang.org"
>>>>> GOTMPDIR=""
>>>>> GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
>>>>> GCCGO="gccgo"
>>>>> AR="ar"
>>>>> CC="gcc"
>>>>> CXX="g++"
>>>>> CGO_ENABLED="1"
>>>>> GOMOD="/builds/prosimoio/prosimo/pdash/go.mod"
>>>>> CGO_CFLAGS="-g -O2"
>>>>> CGO_CPPFLAGS=""
>>>>> CGO_CXXFLAGS="-g -O2"
>>>>> CGO_FFLAGS="-g -O2"
>>>>> CGO_LDFLAGS="-g -O2"
>>>>> PKG_CONFIG="pkg-config"
>>>>> GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 
>>>>> -fdebug-prefix-map=/tmp/go-build048009048=/tmp/go-build 
>>>>> -gno-record-gcc-switches"
>>>>> 
>>>>> Let me know if any other information is needed.
>>>>> 
>>>>> 
>>>>> On Sat, Aug 22, 2020 at 12:30 PM Ian Lance Taylor <i...@golang.org> wrote:
>>>>> On Sat, Aug 22, 2020 at 12:06 PM Siddhesh Divekar <siddhesh.dive...@gmail.com> wrote:
>>>>> >
>>>>> > We saw an issue with our process running in k8s on ubuntu 18.04.
>>>>> > Most of the go routines are stuck for several minutes in http/http2 net 
>>>>> > code.
>>>>> >
>>>>> > Have you seen similar issues ?
>>>>> >
>>>>> > goroutine 2800143 [select, 324 minutes]:
>>>>> > net/http.(*persistConn).readLoop(0xc00187d440)
>>>>> >         /usr/local/go/src/net/http/transport.go:2032 +0x999
>>>>> > created by net/http.(*Transport).dialConn
>>>>> >         /usr/local/go/src/net/http/transport.go:1580 +0xb0d
>>>>> >
>>>>> > goroutine 2738894 [IO wait, 352 minutes]:
>>>>> > internal/poll.runtime_pollWait(0x7f5b61b280c0, 0x72, 0xffffffffffffffff)
>>>>> >         /usr/local/go/src/runtime/netpoll.go:184 +0x55
>>>>> > internal/poll.(*pollDesc).wait(0xc0017e7e18, 0x72, 0x1000, 0x1000, 0xffffffffffffffff)
>>>>> >         /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
>>>>> > internal/poll.(*pollDesc).waitRead(...)
>>>>> >         /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
>>>>> > internal/poll.(*FD).Read(0xc0017e7e00, 0xc0044a9000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
>>>>> >         /usr/local/go/src/internal/poll/fd_unix.go:169 +0x1cf
>>>>> > net.(*netFD).Read(0xc0017e7e00, 0xc0044a9000, 0x1000, 0x1000, 0xc0026359e8, 0x49d7fd, 0xc0017e7e00)
>>>>> >         /usr/local/go/src/net/fd_unix.go:202 +0x4f
>>>>> > net.(*conn).Read(0xc0000db8b8, 0xc0044a9000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
>>>>> >         /usr/local/go/src/net/net.go:184 +0x68
>>>>> > net/http.(*connReader).Read(0xc004a4fec0, 0xc0044a9000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
>>>>> >         /usr/local/go/src/net/http/server.go:785 +0xf4
>>>>> > bufio.(*Reader).fill(0xc003f1a360)
>>>>> >         /usr/local/go/src/bufio/bufio.go:100 +0x103
>>>>> > bufio.(*Reader).Peek(0xc003f1a360, 0x4, 0x0, 0x0, 0x0, 0x0, 0xc002635ad0)
>>>>> >         /usr/local/go/src/bufio/bufio.go:138 +0x4f
>>>>> > net/http.(*conn).readRequest(0xc0028e1d60, 0x393ed20, 0xc0024e9780, 0x0, 0x0, 0x0)
>>>>> >         /usr/local/go/src/net/http/server.go:962 +0xb3b
>>>>> > net/http.(*conn).serve(0xc0028e1d60, 0x393ed20, 0xc0024e9780)
>>>>> >         /usr/local/go/src/net/http/server.go:1817 +0x6d4
>>>>> > created by net/http.(*Server).Serve
>>>>> >         /usr/local/go/src/net/http/server.go:2928 +0x384
>>>>> >
>>>>> > Is there a known issue, or is something obvious from the backtrace here?
>>>>> 
>>>>> It's entirely normal for goroutines to sit in pollWait if they are
>>>>> waiting for network I/O.  There may be reasons why this is incorrect
>>>>> for your program, but you'll have to tell us those reasons.
>>>>> 
>>>>> Also, along with those reasons, please tell us the version of Go and
>>>>> the exact environment that you are running.  Thanks.
>>>>> 
>>>>> Ian
>>>>> 
>>>>> 
>>>>> -- 
>>>>> -Siddhesh.
>>>>> 
>>>>> <backtrace.txt>
>>>> 
>>>> 
>>>> -- 
>>>> -Siddhesh.
>>> 
>>> 
>>> -- 
>>> -Siddhesh.
>> 
>> 
>> -- 
>> -Siddhesh.
>> 
> 
> 
> -- 
> -Siddhesh.
> 

