The part of the array you are currently working with sits in the L1/L2 
cache. The moment you access a different part of the array that is several 
megabytes away, the processor has to pull that data from RAM into the 
cache. I don't know the exact numbers, but it takes roughly 300 machine 
cycles. I don't know how long an L2 miss takes; it's in the manuals.

On Wednesday, August 3, 2016 at 4:24:08 PM UTC+2, ondrej...@gmail.com wrote:
>
> Downgrading to 1.6.3, I'm also getting consistent benchmark results. I'll 
> try 1.7 on my Mac at home later today, to see if it's a 1.7 thing or a 
> Windows thing or...?
>
> On Wednesday, 3 August 2016 14:55:20 UTC+1, C Banning wrote:
>>
>> PS - that's with Go v1.6.
>>
>> On Wednesday, August 3, 2016 at 7:49:49 AM UTC-6, C Banning wrote:
>>>
>>> On MacBook Pro, 2.6 GHz Intel Core i7, 8 GB 1600 MHz memory, running OS 
>>> X 10.11.6, your benchmarks look pretty consistent:
>>>
>>> BenchmarkStart-4         2000000000         1.45 ns/op
>>> BenchmarkEnd-4           2000000000         1.47 ns/op
>>> BenchmarkHereThere-4     2000000000         1.46 ns/op
>>> BenchmarkStartEnd-4      2000000000         1.46 ns/op
>>> BenchmarkEndStart-4      2000000000         1.46 ns/op
>>> BenchmarkFirst-4         2000000000         0.59 ns/op
>>> BenchmarkSecond-4        2000000000         0.59 ns/op
>>> BenchmarkLast-4          2000000000         0.59 ns/op
>>> BenchmarkPenultimate-4   2000000000         0.58 ns/op
>>>
>>> On Wednesday, August 3, 2016 at 5:56:32 AM UTC-6, Ondrej wrote:
>>>>
>>>> I wanted to see if there was a difference when loading values from a 
>>>> large-ish slice (10000 elements) - to see if caches, locality and other 
>>>> things had any meaningful impacts. Whilst individual value loading (just a 
>>>> single element) seemed to be equally fast regardless of element position 
>>>> (see bench of First, Second, Last, Penultimate), when combining loading of 
>>>> various values, there seems to be almost a 2.5x difference between loading 
>>>> first four values and loading last four values (first two benchmarks).
>>>> Loading the same values, just in different order, also yields different 
>>>> execution times. But alternating loading (0, n, 1, n-1) seems to be faster 
>>>> than loading first two values and last two values.
>>>>
>>>> (Setting the test slice to be an array instead wipes all differences 
>>>> between benchmarks.)
>>>>
>>>> Can anyone point me to a resource - be it Go specific or on computer 
>>>> science principles - that would explain these large differences?
>>>>
>>>> Thanks!
>>>>
>>>> https://play.golang.org/p/oMqDvXI9YW
>>>>
>>>
