I'm not sure. You mention that there's only a single mDNS query per minute, 
so the five-second timeouts would never overlap.

Looking at the profile again, a lot of the time seems to be spent on the 
locks involved in sleeping for some amount of time. This makes it seem like 
there are a lot of active timers.

The mDNS package might be a red herring, but timers still sound like 
they're heavily involved. Maybe there's another, heavier, user of timers in 
the program? At the very least, there's probably a timer that triggers the 
mDNS call every minute.

Does the program happen to be open source?

On Tuesday, October 18, 2016 at 5:53:48 PM UTC-7, Abhay Bothra wrote:
>
> Although what you are saying makes a lot of sense(thanks!), I see that my 
> program is just using a 5 second timeout. Is it possible that it can still 
> lead to this performance profile? 
>
> On Tuesday, October 18, 2016 at 10:22:09 AM UTC-7, rhys.h...@gmail.com 
> wrote:
>>
>> Does your program set a very large Timeout on its mdns requests (maybe 
>> tens of hours long)?
>>
>> It looks like your program is consuming a lot of CPU cycles on managing 
>> timers. On the left half of the flame graph, lots of CPU cycles are spent 
>> in runtime.timerproc. Time here indicates a large number of active timers 
>> (from time.NewTimer or time.After). The CPU cycles attributed to 
>> runtime.sysmon in the right half of the flame graph are a side effect of 
>> the runtime.timerproc goroutine doing a large number of short sleeps.
>>
>> So why are there a large number of active timers in your process? It 
>> looks like the mdns package has a bug wherein it sets a timeout on 
>> operations, but never cancels that timeout if the operation completes 
>> successfully. Instead of using time.After, the mdns package should use 
>> time.NewTimer and then defer a call to Stop: 
>> https://github.com/hashicorp/mdns/blob/9d85cf22f9f8d53cb5c81c1b2749f438b2ee333f/client.go#L235
>>
>> The default timeout is one second, but it seems likely that your process 
>> specifies a much larger timeout—likely a couple of days to match how long 
>> it takes before the CPU usage levels out.
>>
>> You can fix this behavior in your program by using a smaller timeout, so 
>> the "leaked" timers are released sooner, so there's a smaller number active 
>> at any time. The mdns package should also be changed to clean up its timer 
>> before the query method returns, via time.NewTimer and defer Stop.
>>
>> On Monday, October 17, 2016 at 11:51:52 AM UTC-7, Abhay Bothra wrote:
>>>
>>> We are using Hashicorp's mdns library (https://github.com/hashicorp/mdns) 
>>> for node discovery, with a frequency of 1 mdns query / minute. The CPU 
>>> consumption by the process increases very gradually over a couple of days, 
>>> going from 2-3% to 20-30% over 3-4 days. From the runtime instrumentation 
>>> we have done, the number of go-routines seems to be fairly static.
>>>
>>> The attached flame-graph from the pprof output suggests that a log of 
>>> CPU is being spent on runtime.goexit and runtime.mstart. To me this seems 
>>> to suggest that we are starting very short lived go-routines.
>>> - Is it fair to blame lots of short-lived go-routines for this?
>>> - What else can lead to this sort of behavior?
>>> - How should be go about instrumenting our code in order to be able to 
>>> get a root cause?
>>>
>>> Really appreciate any help.
>>>
>>> Thanks!
>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to