I’m trying to optimize my code, and since one of the bottlenecks is 
garbage collection, I’ve been trying to get rid of unnecessary memory 
allocation.

In the process, I ended up dealing with this case:

let X = Int64[1,2,3,14]


function sub2(X::Vector{Int64}, nNodes::Int64)
    N = length(X)::Int64
    index = X[N]::Int64
    stride = 1::Int64
    for k in (N-1):-1:1
        stride = stride::Int64 * nNodes::Int64
        index  += ((X[k]-1)::Int64 * stride::Int64)::Int64
    end
    return index::Int64
end

#force compiling
sub2(X,1)

#profile
@time for j in 1:1e7 sub2(X,21) end
@time for j in 1:1e7 sub2(X,22) end

end

The result is:
elapsed time: 0.318801734 seconds (96 bytes allocated)
elapsed time: 0.434234715 seconds (160000096 bytes allocated, 11.08% gc time)

I’m trying to understand why calling the function with nNodes=21 (or lower) 
allocates almost no memory, whereas calling it with nNodes=22 (or higher) 
allocates a lot. With the value of X I picked, sub2(X,21)=497 and 
sub2(X,22)=542. I tried different values of X: if the return value is 
larger than 512, the call allocates more memory; if it is less than 512, 
it does not.
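
To make the two return values easy to check by hand, here is a minimal sketch that repeats the arithmetic of sub2 and records every partial value of index. The helper name trace_sub2 is made up for illustration and is not part of the code above.

```julia
# Hypothetical helper: same arithmetic as sub2, without the type
# assertions, keeping each intermediate value of `index`.
function trace_sub2(X::Vector{Int64}, nNodes::Int64)
    index = X[end]
    stride = 1
    steps = Int64[index]            # value before the loop starts
    for k in (length(X) - 1):-1:1
        stride *= nNodes
        index += (X[k] - 1) * stride
        push!(steps, index)         # value after processing X[k]
    end
    return steps
end

X = Int64[1, 2, 3, 14]
println(trace_sub2(X, 21))  # partial values for nNodes = 21
println(trace_sub2(X, 22))  # partial values for nNodes = 22
```

Worked by hand, nNodes=21 gives 14 + 2*21 + 1*441 + 0*9261 = 497 and nNodes=22 gives 14 + 2*22 + 1*484 + 0*10648 = 542, so the two calls land on opposite sides of 512.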

I did my best to make sure types are asserted to avoid type instability. 

This does not have a large effect in the big picture, but I am trying to 
wrap my head around how / when memory is allocated.
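
One experiment that might help narrow this down, offered as a sketch rather than a diagnosis: move the timing loop into its own function and measure that function with @allocated, so that the loop runs in compiled code instead of at the top level of the let block. The name bench and the choice to sum the results are assumptions made for this example; sub2 is copied here (without the type assertions) only so the snippet runs on its own.

```julia
# Copy of sub2 from the post, minus the type assertions.
function sub2(X::Vector{Int64}, nNodes::Int64)
    N = length(X)
    index = X[N]
    stride = 1
    for k in (N - 1):-1:1
        stride = stride * nNodes
        index += (X[k] - 1) * stride
    end
    return index
end

# Hypothetical wrapper: calls sub2 `reps` times inside a function.
# The results are summed so the calls cannot be discarded as unused.
function bench(X::Vector{Int64}, nNodes::Int64, reps::Int64)
    total = 0
    for j in 1:reps
        total += sub2(X, nNodes)
    end
    return total
end

X = Int64[1, 2, 3, 14]
bench(X, 1, 1)                            # force compilation first
println(@allocated bench(X, 21, 10^7))    # bytes allocated, nNodes = 21
println(@allocated bench(X, 22, 10^7))    # bytes allocated, nNodes = 22
```

If the two numbers come out the same here but differ in the original let-block loop, that would suggest the extra allocation happens in the calling code rather than inside sub2 itself.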

FWIW, here is the output of running with --track-allocation=all:

        - let X = Int64[1,2,3,14]
        - 
        - 
        - function sub2(X::Vector{Int64}, nNodes::Int64)
160151688     N = length(X)::Int64
        0     index = X[N]::Int64
        0     stride = 1::Int64
        0     for k in (N-1):-1:1
        0         stride = stride::Int64 * nNodes::Int64
        0         index  += ((X[k]-1)::Int64 * stride::Int64)::Int64
        -     end
        0     return index::Int64
        - end
        - 
        - #force compiling
        - sub2(X,1)
        - 
        - #profile
        - @time for j in 1:1e7 sub2(X,21) end
        - @time for j in 1:1e7 sub2(X,22) end
        - 
        - end
