I moved this discussion to https://discourse.julialang.org where it is
easier to format the code chunks.
On Monday, October 31, 2016 at 2:04:02 PM UTC-5, Douglas Bates wrote:
>
> I am encountering an unexpected amount of storage allocation in the
> cfactor!{A::HBlkDiag) method in the MixedModels package. See
> https://github.com/dmbates/MixedModels.jl/blob/master/src/cfactor.jl for
> the code.
>
> An HBlkDiag matrix is a homogeneous block diagonal matrix where
> "homogeneous" refers to the fact that all the diagonal blocks are square
> and of the same size. Because of this homogeneity the blocks can be stored
> in an r by r by k array where r is the size of each of the square block and
> k is the number of such blocks.
>
> On entry to this method the blocks are symmetric, positive semi-definite.
> I want to overwrite the upper triangle of each of these blocks with its
> Cholesky factor. I call LAPACK.potrf! directly because I don't want
> cholfact! to throw a non-positive-definite error. The strange thing to me
> is that when I monitor the storage allocation, I get a huge amount of
> storage being allocated in the line with that call. This may be because
> LAPACK.potrf! returns a tuple of the original matrix and an Int (the info
> code) but I'm not sure.
>
> To see an example of this unusual amount of allocation try the following
> code with julia --track-allocation=user
>
> using Feather, MixedModels
> cd(Pkg.dir("MixedModels", "test", "data"))
> sleepstudy = Feather.read("sleepstudy.feather", nullable=false)
> fm1 = fit!(lmm(Reaction ~ 1 + Days + (1 + Days | Subject), sleepstudy))
> Profile.clear_malloc_data()
> devs, vars, betas, thetas = bootstrap(10_000, fm1)
>
> I get
>
> - function cfactor!{T}(A::HBlkDiag{T})
> - Aa = A.arr
> 0 r, s, t = size(Aa)
> 0 if r ≠ s
> 0 throw(ArgumentError("HBlkDiag matrix A must be square"))
> - end
> 94428000 scm = Array(T, (r, r))
> 0 for k in 1 : t # FIXME: Lots of allocations in this loop
> 0 for j in 1 : r, i in 1 : j
> 0 scm[i, j] = Aa[i, j, k]
> - end
> 566568000 LAPACK.potrf!('U', scm)
> 0 for j in 1 : r, i in 1 : j
> 0 Aa[i, j, k] = scm[i, j]
> - end
> - end
> 10492000 UpperTriangular(A)
> - end
>
> In this case the HBlkDiag matrix being decomposed has is 36 by 36
> consisting of 18 2 by 2 diagonal blocks. scm is a scratch 2 by 2
> matrixthat is overwritten in sequence by the upper triangle of each of the
> original 2 by 2 blocks and passed to LAPACK.potrf!
>