Well, that bug is closed as fixed, and I'm using Go 1.9, which includes
the mutex fairness patch.

The problem I'm seeing isn't even unfairness among the users of a
mutex. The problem is the delay between unlock and some other
goroutine successfully acquiring the lock: the aggregate time where no
goroutine has the lock is large.
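
For anyone who wants to poke at this, here is a rough, self-contained
sketch of the shape of the workload (not the real server; the worker
count, the critical-section length, and all names are illustrative). It
sums the gap between each Unlock and the next successful Lock, i.e. the
time the lock sits idle even though there are waiters:

package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	const (
		workers = 300  // illustrative; same rough scale as the server below
		iters   = 1000 // iterations per worker
	)

	var (
		mu         sync.Mutex
		lastUnlock time.Time     // written while holding mu, read right after Lock
		idle       time.Duration // total unlock-to-next-acquire time; guarded by mu
	)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < iters; j++ {
				mu.Lock()
				if !lastUnlock.IsZero() {
					idle += time.Since(lastUnlock)
				}
				// Crude stand-in for a ~10 μs critical section
				// (busy wait so the goroutine stays on its P).
				for start := time.Now(); time.Since(start) < 10*time.Microsecond; {
				}
				lastUnlock = time.Now()
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Println("aggregate time with no goroutine holding the lock:", idle)
}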

On Mon, Oct 16, 2017 at 1:13 PM, Ian Davis <m...@iandavis.com> wrote:
> This sounds like https://github.com/golang/go/issues/13086
>
> On Mon, 16 Oct 2017, at 09:01 PM, Caleb Spare wrote:
>> I have a server that does log processing, and I'm trying to improve
>> its throughput. The bottleneck is access to a shared resource
>> protected by a sync.Mutex. The issue is that even though there are
>> nearly always worker goroutines blocked on Lock, only about 10% of the
>> time (in aggregate) is spent running the code inside the critical
>> section.
>>
>> To be concrete, I have 300 worker goroutines, and they collectively
>> spend about 280 seconds per second blocked on the mutex, but only
>> about 100 ms per second inside the critical section (as measured from
>> Lock returning to Unlock returning).
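>>
>> (Put differently: the mutex is held for only about 10% of wall-clock
>> time, while on average about 280 of the 300 workers are parked waiting
>> for it at any given instant.)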
>>
>> To investigate this, I added some instrumentation around sync.Mutex. I
>> log an event when Lock is called, when Lock returns (acquire lock),
>> and when Unlock returns. This shows that my code typically takes
>> around 5-15 μs between acquire and unlock, but tens or hundreds of μs
>> pass between unlock and some other goroutine acquiring the lock.
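>>
>> Roughly, the wrapper is something like this (a simplified sketch, not
>> my exact code; goroutine-ID tagging is left out, timestamps are μs
>> since the trace epoch, and the "(+N)" on an acquire line is the gap
>> since the previous unlock):
>>
>>     package mutextrace // hypothetical package name for the sketch
>>
>>     import (
>>         "log"
>>         "sync"
>>         "time"
>>     )
>>
>>     type tracedMutex struct {
>>         mu         sync.Mutex
>>         epoch      time.Time // set when the trace starts
>>         lastUnlock float64   // μs since epoch; guarded by mu
>>     }
>>
>>     func (m *tracedMutex) now() float64 {
>>         return float64(time.Since(m.epoch)) / float64(time.Microsecond)
>>     }
>>
>>     func (m *tracedMutex) Lock() {
>>         log.Printf("%12.3f   call lock", m.now())
>>         m.mu.Lock()
>>         t := m.now()
>>         log.Printf("%12.3f   acquire lock (+%.3f)", t, t-m.lastUnlock)
>>     }
>>
>>     func (m *tracedMutex) Unlock() {
>>         t := m.now()
>>         m.lastUnlock = t // recorded just before releasing, still under mu
>>         m.mu.Unlock()
>>         log.Printf("%12.3f   unlock", t)
>>     }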
>>
>> Here's a short section of logged events. The number in square brackets
>> is the goroutine ID; the timestamp is μs since the trace started.
>>
>> [209]    73018.628   acquire lock (+38.798)
>> [209]    73028.707   unlock
>> [298]    73049.222   call lock
>> [199]    73068.748   call lock
>> [214]    73153.182   call lock
>> [355]    73259.473   call lock
>> [157]    73308.336   call lock
>> [256]    73339.965   call lock
>> [389]    73517.476   call lock
>> [325]    73541.077   acquire lock (+512.370)
>> [325]    73545.650   unlock
>> [396]    73581.819   call lock
>> [174]    73645.743   call lock
>> [224]    73708.319   call lock
>> [292]    73801.945   call lock
>> [285]    73855.717   call lock
>> [357]    73915.478   call lock
>> [276]    74013.148   acquire lock (+467.498)
>>
>> Here you can see that goroutine 209 held the lock for only about 10 μs
>> before returning from mu.Unlock at t=73028.707, but the next goroutine
>> (325) didn't return from mu.Lock until t=73541.077, more than 500 μs
>> later. Goroutine 325 then returned from mu.Unlock after about 5 μs,
>> but it took another 467 μs for the next goroutine (276) to return from
>> mu.Lock.
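>>
>> (Spelling out the arithmetic: 73541.077 - 73028.707 = 512.370 μs and
>> 74013.148 - 73545.650 = 467.498 μs, which is exactly what the (+N)
>> deltas on the acquire lines show.)
>>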
>> What is the best way to debug (and, hopefully, fix) this issue? I'm
>> not really sure how to diagnose scheduling problems. I looked at the
>> GODEBUG=schedtrace=1000 output, but I'm not sure how to interpret it.
>> Here are a few seconds of that output for this server:
>>
>> 2017/10/16 10:38:00 SCHED 253295ms: gomaxprocs=36 idleprocs=0
>> threads=56 spinningthreads=18 idlethreads=15 runqueue=1 [0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:01 SCHED 254305ms: gomaxprocs=36 idleprocs=1
>> threads=56 spinningthreads=18 idlethreads=14 runqueue=0 [0 0 0 0 0 0 0
>> 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:02 SCHED 255307ms: gomaxprocs=36 idleprocs=2
>> threads=56 spinningthreads=18 idlethreads=15 runqueue=0 [0 1 0 0 0 0 0
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:03 SCHED 256311ms: gomaxprocs=36 idleprocs=1
>> threads=56 spinningthreads=18 idlethreads=13 runqueue=0 [0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0]
>> 2017/10/16 10:38:04 SCHED 257313ms: gomaxprocs=36 idleprocs=1
>> threads=56 spinningthreads=17 idlethreads=13 runqueue=0 [0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:05 SCHED 258318ms: gomaxprocs=36 idleprocs=1
>> threads=56 spinningthreads=18 idlethreads=15 runqueue=0 [0 0 0 0 0 0 0
>> 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:06 SCHED 259318ms: gomaxprocs=36 idleprocs=3
>> threads=56 spinningthreads=17 idlethreads=17 runqueue=0 [0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:07 SCHED 260323ms: gomaxprocs=36 idleprocs=1
>> threads=56 spinningthreads=18 idlethreads=14 runqueue=0 [0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:08 SCHED 261324ms: gomaxprocs=36 idleprocs=2
>> threads=56 spinningthreads=18 idlethreads=15 runqueue=0 [0 0 0 0 0 0 0
>> 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:09 SCHED 262327ms: gomaxprocs=36 idleprocs=0
>> threads=56 spinningthreads=18 idlethreads=15 runqueue=1 [0 0 1 0 0 0 0
>> 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:10 SCHED 263333ms: gomaxprocs=36 idleprocs=1
>> threads=56 spinningthreads=18 idlethreads=14 runqueue=0 [0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:11 SCHED 264333ms: gomaxprocs=36 idleprocs=4
>> threads=56 spinningthreads=18 idlethreads=15 runqueue=0 [0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:12 SCHED 265336ms: gomaxprocs=36 idleprocs=3
>> threads=56 spinningthreads=19 idlethreads=14 runqueue=1 [0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> 2017/10/16 10:38:13 SCHED 266339ms: gomaxprocs=36 idleprocs=0
>> threads=56 spinningthreads=14 idlethreads=15 runqueue=1 [0 0 0 0 0 0 1
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>>
>> Any ideas?
>>
>> Thanks in advance!
>> Caleb
