I'm not sure. You mention that there's only a single mDNS query per minute, so the five-second timeouts would never overlap.
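(To put numbers on that: a timer started with time.After stays live until it fires, whether or not anyone is still waiting on it, so the number of leaked timers alive at once is roughly timeout / query interval. A 5-second timeout at one query per minute means at most one stale timer at a time, while a timeout of, say, two days would keep on the order of 2 * 24 * 60 = 2880 timers alive, which is enough to show up in runtime.timerproc.)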
Looking at the profile again, a lot of the time seems to be spent on the
locks involved in sleeping for some amount of time. This makes it seem like
there are a lot of active timers. The mDNS package might be a red herring,
but timers still sound like they're heavily involved. Maybe there's another,
heavier user of timers in the program? At the very least, there's probably
a timer that triggers the mDNS call every minute.

Does the program happen to be open source?

On Tuesday, October 18, 2016 at 5:53:48 PM UTC-7, Abhay Bothra wrote:
>
> Although what you are saying makes a lot of sense (thanks!), I see that my
> program is just using a 5 second timeout. Is it possible that it can still
> lead to this performance profile?
>
> On Tuesday, October 18, 2016 at 10:22:09 AM UTC-7, rhys.h...@gmail.com
> wrote:
>>
>> Does your program set a very large Timeout on its mdns requests (maybe
>> tens of hours long)?
>>
>> It looks like your program is consuming a lot of CPU cycles on managing
>> timers. On the left half of the flame graph, lots of CPU cycles are spent
>> in runtime.timerproc. Time here indicates a large number of active timers
>> (from time.NewTimer or time.After). The CPU cycles attributed to
>> runtime.sysmon in the right half of the flame graph are a side effect of
>> the runtime.timerproc goroutine doing a large number of short sleeps.
>>
>> So why are there a large number of active timers in your process? It
>> looks like the mdns package has a bug wherein it sets a timeout on
>> operations but never cancels that timeout if the operation completes
>> successfully. Instead of using time.After, the mdns package should use
>> time.NewTimer and then defer a call to Stop:
>> https://github.com/hashicorp/mdns/blob/9d85cf22f9f8d53cb5c81c1b2749f438b2ee333f/client.go#L235
>>
>> The default timeout is one second, but it seems likely that your process
>> specifies a much larger timeout, likely a couple of days, to match how
>> long it takes before the CPU usage levels out.
>>
>> You can fix this behavior in your program by using a smaller timeout: the
>> "leaked" timers are then released sooner, so fewer are active at any one
>> time. The mdns package should also be changed to clean up its timer
>> before the query method returns, via time.NewTimer and defer Stop.
>>
>> On Monday, October 17, 2016 at 11:51:52 AM UTC-7, Abhay Bothra wrote:
>>>
>>> We are using Hashicorp's mdns library (https://github.com/hashicorp/mdns)
>>> for node discovery, with a frequency of 1 mdns query / minute. The CPU
>>> consumption by the process increases very gradually over a couple of
>>> days, going from 2-3% to 20-30% over 3-4 days. From the runtime
>>> instrumentation we have done, the number of goroutines seems to be
>>> fairly static.
>>>
>>> The attached flame graph from the pprof output suggests that a lot of
>>> CPU is being spent on runtime.goexit and runtime.mstart. To me this
>>> seems to suggest that we are starting very short-lived goroutines.
>>> - Is it fair to blame lots of short-lived goroutines for this?
>>> - What else can lead to this sort of behavior?
>>> - How should we go about instrumenting our code in order to be able to
>>> get a root cause?
>>>
>>> Really appreciate any help.
>>>
>>> Thanks!
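For reference, here is a minimal sketch of the cleanup pattern suggested above: replace time.After with time.NewTimer plus a deferred Stop so the timer is released as soon as the operation finishes. The queryWithTimeout function and its results channel are made up for illustration; this is not the actual hashicorp/mdns code.

package main

import (
    "fmt"
    "time"
)

// queryWithTimeout waits for one result or gives up after timeout.
// time.NewTimer plus a deferred Stop releases the timer as soon as the
// function returns, instead of leaving it live until it fires the way
// time.After would.
func queryWithTimeout(results <-chan string, timeout time.Duration) (string, bool) {
    t := time.NewTimer(timeout)
    defer t.Stop() // clean up the timer even when a result arrives early

    select {
    case r := <-results:
        return r, true
    case <-t.C:
        return "", false // timed out
    }
}

func main() {
    results := make(chan string, 1)
    results <- "host.local"
    fmt.Println(queryWithTimeout(results, 5*time.Second))
}

With a very large timeout the difference matters: time.After keeps its timer in the runtime's timer heap until it fires, while the deferred Stop removes it as soon as queryWithTimeout returns.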