Hello,

While writing code for unit normal scaling, I found a big performance 
difference depending on where the function passed to broadcast! is 
defined, *globally* vs. *locally*. Consider the functions below:

function scun2!(A)
    shift = mean(A, 1)
    stretch = std(A, 1)
    
    f(a, b, c) = (a - b) / c           # defined locally
    broadcast!(f, A, A, shift, stretch)
    
    shift, stretch
end

f_scun(a, b, c) = (a - b) / c          # defined globally
function scun3!(A)
    shift = mean(A, 1)
    stretch = std(A, 1)
    
    broadcast!(f_scun, A, A, shift, stretch)
    
    shift, stretch
end

The resulting performance is:

R2 = copy(T)

@time sh2, sc2 = scun2!(R2);

  0.035527 seconds (19.51 k allocations: 967.273 KB)


R3 = copy(T)

@time sh3, sc3 = scun3!(R3);

  0.009705 seconds (54 allocations: 17.547 KB)


How can it be explained that performance is about 3.6 times better when 
f_scun is defined outside the function (the local version also makes far 
more allocations: 19.51 k vs. 54)? I'm using Julia 0.4.3.
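
In case it helps to reproduce this, here is a minimal sketch that isolates 
the broadcast! call from mean and std. It calls each variant once before 
timing, since the first call to a function also includes compilation. The 
matrix T, its size, the shift and stretch vectors, and the names g_scale, 
scale_global! and scale_local! are all assumptions made for illustration, 
not my original data or code:

```julia
# Hypothetical stand-in for T: the real matrix is not shown in this post.
T = rand(1000, 100)
shift = rand(1, 100)            # arbitrary row vector, one value per column
stretch = rand(1, 100) .+ 1.0   # values in [1, 2), so no division by ~0

g_scale(a, b, c) = (a - b) / c  # defined globally

function scale_global!(A, shift, stretch)
    broadcast!(g_scale, A, A, shift, stretch)
    A
end

function scale_local!(A, shift, stretch)
    f(a, b, c) = (a - b) / c    # defined locally
    broadcast!(f, A, A, shift, stretch)
    A
end

# Warm-up calls, so compilation is not included in the timings below.
scale_local!(copy(T), shift, stretch)
scale_global!(copy(T), shift, stretch)

R2 = copy(T)
R3 = copy(T)
@time scale_local!(R2, shift, stretch)
@time scale_global!(R3, shift, stretch)
```

Both variants apply the same arithmetic, so R2 and R3 should come out 
identical; only the timings and allocation counts should differ.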

Thank you,
Igor



