Although what you are saying makes a lot of sense (thanks!), I see that my program is only using a 5-second timeout. Is it possible that this could still lead to the same performance profile?
On Tuesday, October 18, 2016 at 10:22:09 AM UTC-7, rhys.h...@gmail.com wrote:
>
> Does your program set a very large Timeout on its mdns requests (maybe tens of hours long)?
>
> It looks like your program is consuming a lot of CPU cycles on managing timers. On the left half of the flame graph, lots of CPU cycles are spent in runtime.timerproc. Time here indicates a large number of active timers (from time.NewTimer or time.After). The CPU cycles attributed to runtime.sysmon in the right half of the flame graph are a side effect of the runtime.timerproc goroutine doing a large number of short sleeps.
>
> So why are there a large number of active timers in your process? It looks like the mdns package has a bug wherein it sets a timeout on operations but never cancels that timeout if the operation completes successfully. Instead of using time.After, the mdns package should use time.NewTimer and then defer a call to Stop:
> https://github.com/hashicorp/mdns/blob/9d85cf22f9f8d53cb5c81c1b2749f438b2ee333f/client.go#L235
>
> The default timeout is one second, but it seems likely that your process specifies a much larger timeout, likely a couple of days, to match how long it takes before the CPU usage levels out.
>
> You can fix this behavior in your program by using a smaller timeout, so the "leaked" timers are released sooner and fewer are active at any time. The mdns package should also be changed to clean up its timer before the query method returns, via time.NewTimer and a deferred Stop.
>
> On Monday, October 17, 2016 at 11:51:52 AM UTC-7, Abhay Bothra wrote:
>>
>> We are using Hashicorp's mdns library (https://github.com/hashicorp/mdns) for node discovery, with a frequency of 1 mdns query / minute. The CPU consumption by the process increases very gradually, going from 2-3% to 20-30% over 3-4 days. From the runtime instrumentation we have done, the number of goroutines seems to be fairly static.
>>
>> The attached flame graph from the pprof output suggests that a lot of CPU is being spent on runtime.goexit and runtime.mstart. To me this seems to suggest that we are starting very short-lived goroutines.
>> - Is it fair to blame lots of short-lived goroutines for this?
>> - What else can lead to this sort of behavior?
>> - How should we go about instrumenting our code in order to get a root cause?
>>
>> Really appreciate any help.
>>
>> Thanks!
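For anyone following along, here is a minimal sketch of the timer-leak pattern described above and the suggested fix. The function names and the string channel are hypothetical stand-ins, not the real hashicorp/mdns API; the point is only the difference between time.After (timer stays in the runtime's timer heap until it fires) and time.NewTimer with a deferred Stop (timer is released as soon as the function returns):

package main

import (
	"fmt"
	"time"
)

// leakyCollect mimics the problematic pattern: time.After creates a timer
// that stays active for the full timeout, even when the loop returns early
// because the results channel was closed. With a long timeout, every call
// leaves a long-lived timer behind for the runtime to manage.
func leakyCollect(results <-chan string, timeout time.Duration) {
	finish := time.After(timeout) // this timer cannot be stopped early
	for {
		select {
		case r, ok := <-results:
			if !ok {
				return // the time.After timer is still pending here
			}
			fmt.Println("got:", r)
		case <-finish:
			return
		}
	}
}

// fixedCollect uses time.NewTimer plus a deferred Stop, so the timer is
// removed from the runtime's timer heap as soon as the function returns.
func fixedCollect(results <-chan string, timeout time.Duration) {
	finish := time.NewTimer(timeout)
	defer finish.Stop() // release the timer even on early return
	for {
		select {
		case r, ok := <-results:
			if !ok {
				return
			}
			fmt.Println("got:", r)
		case <-finish.C:
			return
		}
	}
}

func main() {
	// Both calls return as soon as their channel is drained, but only
	// fixedCollect cleans up its timeout timer immediately; leakyCollect
	// leaves one pending for the full 48 hours.
	a := make(chan string, 1)
	a <- "example.local"
	close(a)
	leakyCollect(a, 48*time.Hour)

	b := make(chan string, 1)
	b <- "example.local"
	close(b)
	fixedCollect(b, 48*time.Hour)
}

With the time.After form and one query per minute, pending timers accumulate for as long as the timeout, which would match the gradual CPU growth described above; with a 5-second timeout the accumulation should be far smaller, so it may be worth confirming what timeout the process actually passes to the library.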