On Friday, March 18, 2016 at 10:49:37 AM UTC-5, Christina Castellani wrote:
>
> I apologize for the beginner nature of my question but I am not
> understanding the file structure in Julia.
>
> I want to run lmm on a very large file (470000x2000); I would like
> to run one lmm on each of the 470000 observations.
>
Those reading this list may need to know that the lmm function is from the
MixedModels package.

There is an unexported function in that package called refit! with signature

    refit!(m::LinearMixedModel, y)

which takes an existing LinearMixedModel object, replaces the response and
refits the model with this response. Because it is unexported, you must
call it as

    MixedModels.refit!(m, y)

Because Julia uses column-major ordering, it would be an advantage to
transpose the betas matrix so that the values for each probe form a column,
which is contiguous in memory. In fact, I might suggest transposing into a
memory-mapped array so that in later runs you can simply open and
memory-map the file. That operation is more or less instantaneous.
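
The transpose-and-map step might look something like the sketch below. It
assumes a recent Julia (1.x), where Mmap is a standard library that must be
loaded with "using Mmap". The small betas matrix and the file name
betas_t.bin are made-up stand-ins for the real data, not anything from the
original post.

```julia
using Mmap

# Hypothetical small stand-in for the real 470000x2000 betas matrix:
# rows are probes, columns are subjects.
nprobe, nsubj = 6, 4
betas = reshape(collect(1.0:nprobe*nsubj), nprobe, nsubj)

# Hypothetical file name for the transposed copy.
path = joinpath(mktempdir(), "betas_t.bin")

# One-time step: write the transpose into a memory-mapped file.
open(path, "w+") do io
    bt = Mmap.mmap(io, Matrix{Float64}, (nsubj, nprobe))
    permutedims!(bt, betas, (2, 1))   # bt[j, i] = betas[i, j]
    Mmap.sync!(bt)                    # flush the mapped pages to disk
end

# Later runs: map the existing file; data are paged in only when touched.
betasT = open(path, "r") do io
    Mmap.mmap(io, Matrix{Float64}, (nsubj, nprobe))
end

# Responses for the first probe, copied out of a contiguous column.
y = betasT[:, 1]
```

Note that the raw file stores only the numbers, so the dimensions
(nsubj, nprobe) and the element type have to be recorded somewhere and
supplied again when the file is mapped.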

If you store the first column of the transposed betas array as "betas" in
the covarsInput DataFrame, you should be able to fit the first of these
models. After that, you cycle over the remaining columns of the transposed
betas with refit!.

Do note that the '!' in the name "refit!" means that it modifies the model
in place. This is good news because it saves on memory allocation and
garbage collection. It is bad news in that you must make sure to save
whatever results you want before the next call to refit! overwrites them.
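
A rough sketch of the fit-once-then-refit loop follows. It is written
against what I understand to be the interface of recent MixedModels
releases, where a model is fit with fit(MixedModel, @formula(...), table)
rather than with lmm as in the code quoted below, so treat the exact calls
as assumptions to check against the version you have installed. The data
are simulated; the names x, g, Y and results are hypothetical and stand in
for the covariates, ChipID, the transposed betas and whatever you choose to
keep from each fit.

```julia
using MixedModels, Random

rng = MersenneTwister(42)
ngrp, nper = 10, 20                                # 10 groups, 20 rows each
g = repeat(string.("chip", 1:ngrp), inner = nper)  # hypothetical grouping factor
x = randn(rng, ngrp * nper)                        # hypothetical covariate
Y = randn(rng, ngrp * nper, 5)                     # each column is one response

# Fit the first model with the first column as the response.
dat = (y = Y[:, 1], x = x, g = g)
m = fit(MixedModel, @formula(y ~ 1 + x + (1 | g)), dat)

# Copy out what is needed from each fit before the next refit! changes m.
results = Matrix{Float64}(undef, length(fixef(m)), size(Y, 2))
results[:, 1] = fixef(m)

for j in 2:size(Y, 2)
    MixedModels.refit!(m, Y[:, j])   # replace the response and refit in place
    results[:, j] = fixef(m)         # assignment copies the estimates
end
```

Here only the fixed-effects estimates are kept, one column per response.
Anything else of interest (standard errors, variance components, the
objective) would need to be copied out inside the loop in the same way.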
> I have loaded in my matrix and my covariates to Julia using the following
> code:
> betas = readtable("betas.txt", separator='\t')
>
> covarsInput=readtable("covarsInputJulia.csv")
>
> covarsInput[:CNValues]=PooledDataArray(covarsInput[:CNValues])
> covarsInput[:Age]=PooledDataArray(covarsInput[:Age])
> covarsInput[:Sex]=PooledDataArray(covarsInput[:Sex])
> covarsInput[:ChipID]=PooledDataArray(covarsInput[:ChipID])
>
>
> I am then running an lmm, in this fashion:
> m=fit!(lmm(betas~CNValues+Age+Sex+(1|ChipID1911),covarsInput))
>
> This will not run because covarsInput is the dataframe being specified and
> betas is not in covarsInput. I have no idea how to remedy this?
> So a) how do I get all my files in the same "dataframe" even if one is a
> large matrix, and
> b) how do I get the lmm to run on each probe in betas
>
> Any help you can provide is greatly appreciated, I am very new to this and
> pretty confused!
>
> Thank you!
>