I am encountering an unexpected amount of storage allocation in the 
cfactor!{A::HBlkDiag) method in the MixedModels package.  See
https://github.com/dmbates/MixedModels.jl/blob/master/src/cfactor.jl  for 
the code.

An HBlkDiag matrix is a homogeneous block diagonal matrix where 
"homogeneous" refers to the fact that all the diagonal blocks are square 
and of the same size.  Because of this homogeneity the blocks can be stored 
in an r by r by k array where r is the size of each of the square block and 
k is the number of such blocks.

On entry to this method the blocks are symmetric, positive semi-definite. 
 I want to overwrite the upper triangle of each of these blocks with its 
Cholesky factor.  I call LAPACK.potrf! directly because I don't want 
cholfact! to throw a non-positive-definite error.  The strange thing to me 
is that when I monitor the storage allocation, I get a huge amount of 
storage being allocated in the line with that call.  This may be because 
LAPACK.potrf! returns a tuple of the original matrix and an Int (the info 
code) but I'm not sure.

To see an example of this unusual amount of allocation try the following 
code with julia --track-allocation=user

using Feather, MixedModels
cd(Pkg.dir("MixedModels", "test", "data"))
sleepstudy = Feather.read("sleepstudy.feather", nullable=false)
fm1 = fit!(lmm(Reaction ~ 1 + Days + (1 + Days | Subject), sleepstudy))
Profile.clear_malloc_data()
devs, vars, betas, thetas = bootstrap(10_000, fm1)

I get

        - function cfactor!{T}(A::HBlkDiag{T})
        -     Aa = A.arr
        0     r, s, t = size(Aa)
        0     if r ≠ s
        0         throw(ArgumentError("HBlkDiag matrix A must be square"))
        -     end
 94428000     scm = Array(T, (r, r))
        0     for k in 1 : t  # FIXME: Lots of allocations in this loop
        0         for j in 1 : r, i in 1 : j
        0             scm[i, j] = Aa[i, j, k]
        -         end
566568000         LAPACK.potrf!('U', scm)
        0         for j in 1 : r, i in 1 : j
        0             Aa[i, j, k] = scm[i, j]
        -         end
        -     end
 10492000     UpperTriangular(A)
        - end

In this case the HBlkDiag matrix being decomposed has is 36 by 36 
consisting of 18 2 by 2 diagonal blocks.  scm is a scratch 2 by 2 
matrixthat is overwritten in sequence by the upper triangle of each of the 
original 2 by 2 blocks and passed to LAPACK.potrf!

Reply via email to