On Mon, Feb 19, 2018 at 4:56 AM, Chris Hopkins <cbehopk...@gmail.com> wrote:
>
> I would have expected the compiler to be allowed to change:
> if len(l.shared) > 0 { // the racy check for non-emptiness
>   l.Lock()
>   last := len(l.shared) - 1
> to
> tmp := len(l.shared)
> if tmp > 0 { // the racy check for non-emptiness
>   l.Lock()
>   last := tmp - 1
>
> I'd be interested to know whether this is a legal optimisation. If it were, you
> could get incorrect behaviour.

I don't think this is allowed.  I don't think the compiler can assume
that a load before a lock and a load after a lock return the same
value.
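
To make that concrete, here is a minimal, self-contained sketch of the
pattern under discussion (simplified types, not the real sync.Pool
internals). The unlocked length read is only a hint; the length has to be
re-read after Lock() because another goroutine may have pushed or popped in
between, which is exactly why caching the pre-lock value as in the
transformation above would be unsafe:

package main

import (
	"fmt"
	"sync"
)

// poolLocal is a simplified stand-in for the per-P shared queue; the real
// type is internal to sync.Pool.
type poolLocal struct {
	sync.Mutex
	shared []interface{}
}

func (l *poolLocal) get() interface{} {
	var x interface{}
	// Racy hint: intentionally unsynchronized, so the race detector will
	// flag it (sync.Pool itself runs with the detector disabled internally).
	if len(l.shared) > 0 {
		l.Lock()
		// Re-read under the lock. A concurrent get() may have emptied the
		// slice since the hint above; if the compiler reused the pre-lock
		// length here, l.shared[last] could index out of range.
		last := len(l.shared) - 1
		if last >= 0 {
			x = l.shared[last]
			l.shared = l.shared[:last]
		}
		l.Unlock()
	}
	return x
}

func (l *poolLocal) put(x interface{}) {
	l.Lock()
	l.shared = append(l.shared, x)
	l.Unlock()
}

func main() {
	var l poolLocal
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 100000; j++ {
				l.put(j) // 1 Put for every 2 Gets, as in the benchmark quoted below
				l.get()
				l.get()
			}
		}()
	}
	wg.Wait()
	fmt.Println("leftover:", len(l.shared))
}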

Ian



> On Friday, 16 February 2018 14:57:59 UTC, Carlo Alberto Ferraris wrote:
>>
>> Also, keep in mind that "there being nothing in the pool" is the common state
>> of pools immediately after every GC.
>>
>> On Friday, February 16, 2018 at 12:05:27 PM UTC+9, Kevin Malachowski
>> wrote:
>>>
>>> If there is likely nothing in the Pool, then maybe it's better not to use
>>> one at all. Can you compare the internal workload with an implementation
>>> where all callers just call the `New` function directly? What's the purpose
>>> of using a memory pool if there's often nothing in it?
>>>
>>> On Wednesday, February 14, 2018 at 3:56:34 PM UTC-8, Carlo Alberto
>>> Ferraris wrote:
>>>>
>>>> In an attempt to reduce the time pprof says is spent in sync.Pool
>>>> (internal workloads, sorry) I modified Get and getSlow to skip locking the
>>>> per-P shared pools if the pools are likely to be empty. This yields
>>>> promising results, but I'm not sure the approach is sound since the check I
>>>> do is inherently racy.
>>>>
>>>> As an (artificial and contrived) benchmark, I'm using this:
>>>>
>>>> func BenchmarkPoolUnderflow(b *testing.B) {
>>>>   var p Pool
>>>>   b.RunParallel(func(pb *testing.PB) {
>>>>     for pb.Next() {
>>>>       p.Put(1)
>>>>       p.Get()
>>>>       p.Get()
>>>>     }
>>>>   })
>>>> }
>>>>
>>>> This is meant to simulate a pool in which more objects are Get() than
>>>> are Put() (it wouldn't make much sense to simulate a pool from which we only
>>>> Get, so to keep things simple I opted for a 2:1 ratio).
>>>>
>>>> The change I applied to Get and getSlow is the following. Starting from
>>>> the current pattern of:
>>>>
>>>> l := ... // per-P poolLocal
>>>> l.Lock()
>>>> last := len(l.shared) - 1
>>>> if last >= 0 {
>>>>   x = l.shared[last]
>>>>   l.shared = l.shared[:last]
>>>> }
>>>> l.Unlock()
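>>>>
>>>> For context, the Put side of the current implementation appends to the same
>>>> shared slice under the same mutex; that append is what any unlocked read of
>>>> len(l.shared) would race against. Roughly (a sketch from memory, not the
>>>> verbatim source):
>>>>
>>>> l := ... // per-P poolLocal
>>>> if l.private == nil {
>>>>   l.private = x // fast path: the private, per-P slot
>>>>   x = nil
>>>> }
>>>> if x != nil {
>>>>   l.Lock()
>>>>   l.shared = append(l.shared, x) // slow path: the mutex-protected shared slice
>>>>   l.Unlock()
>>>> }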
>>>>
>>>> I add a check (not protected by the mutex, since the mutex is the expensive
>>>> operation we're trying to skip when it's not necessary) to see if the pool is
>>>> likely to be non-empty:
>>>>
>>>> l := ... // per-P poolLocal
>>>> if len(l.shared) > 0 { // the racy check for non-emptiness
>>>>   l.Lock()
>>>>   last := len(l.shared) - 1
>>>>   if last >= 0 {
>>>>     x = l.shared[last]
>>>>     l.shared = l.shared[:last]
>>>>   }
>>>>   l.Unlock()
>>>> }
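>>>>
>>>> An alternative that avoids the plain racy read entirely (and so doesn't rely
>>>> on sync.Pool's race-detector exemption) would be to keep an explicit atomic
>>>> length hint next to the slice, using sync/atomic. A sketch, assuming a
>>>> hypothetical sharedLen int32 field added to poolLocal and updated under the
>>>> lock by both Get and Put:
>>>>
>>>> l := ... // per-P poolLocal, with an extra hypothetical field: sharedLen int32
>>>> if atomic.LoadInt32(&l.sharedLen) > 0 { // atomic hint: not a data race
>>>>   l.Lock()
>>>>   last := len(l.shared) - 1
>>>>   if last >= 0 {
>>>>     x = l.shared[last]
>>>>     l.shared = l.shared[:last]
>>>>     atomic.AddInt32(&l.sharedLen, -1) // keep the hint in sync under the lock
>>>>   }
>>>>   l.Unlock()
>>>> }
>>>>
>>>> Put would do the mirror image: append under l.Lock() and then
>>>> atomic.AddInt32(&l.sharedLen, 1).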
>>>>
>>>> I know I should not call this a benign race, but in this case I don't
>>>> see how it can lead to problems. If the racy check gets it right, then
>>>> it's almost a net win. If it gets it wrong, either we do what we do now
>>>> (i.e. we lock, just to find an empty pool), or we skip an otherwise
>>>> non-empty pool - thereby failing to immediately return an otherwise
>>>> reusable object (note that 1. there is a separate shared pool for each P,
>>>> so I'd estimate the chances of this happening on all of them to be pretty
>>>> low, and 2. the Pool documentation explicitly says that Get is allowed to
>>>> treat the pool as empty). Also note that the race detector is already
>>>> explicitly disabled in all sync.Pool methods.
>>>>
>>>> The reason I'm asking is to understand whether my reasoning is sound
>>>> and, regardless, whether anybody has suggestions about how to do this in a
>>>> better way.
>>>>
>>>> The current results (of the approach above, plus some refactoring to
>>>> recover some lost performance on the other benchmarks) on my laptop are the
>>>> following:
>>>>
>>>>     name             old time/op  new time/op  delta
>>>>     Pool-4           14.5ns ± 3%  14.2ns ± 2%   -1.64%  (p=0.023 n=9+10)
>>>>     PoolOverflow-4   1.99µs ±12%  1.78µs ± 1%  -10.62%  (p=0.000 n=10+8)
>>>>     PoolUnderflow-4   152ns ± 6%    30ns ± 1%  -80.00%  (p=0.000 n=10+8)
>>>>
>>>> (the first two benchmarks are already part of sync.Pool, the last one is
>>>> the one I described above)
>>>>
>>>> Any feedback is welcome. If this is deemed safe I'm going to submit a
>>>> CL.
>>>>
>>>> Carlo
>>>>
>>>>