Re: [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices

spencerg Fri, 15 May 2009 09:40:52 -0700

Dear Doug, et al.:

What would you recommend for analyzing a longitudinal abundancesurvey of 22 species, when the species were not selected at random? Aprominent scientist tried to tell me that mixed-effects modeling isinappropriate in that case because the species were selectedpurposefully not at random.

My response is that even in that case, one should still usemixed-effects modeling, because it will tend to produce more appropriateestimates for the deviations of individual species from the average ofall species -- potentially much lower variance with slight bias -- thannaive ordinary least squares. The estimated variance components willnot represent the between-species variance for the actual population ofall hypothetical species of the particular type, but will represent thebetween-species variability in a hypothetical population from which theselected species might be considered a random sample.


     Best Wishes,
     Spencer Graves

p.s. I appreciate very much Doug's comment on this. I thought aboutadding something like that to my reply but didn't feel I could affordthe time then.


Douglas Bates wrote:

On Wed, May 13, 2009 at 5:21 PM,  <avraham.ad...@guycarp.com> wrote:

Hello.

I am trying to optimize a set of parameters using /optim/ in which the
actual function to be minimized contains matrix multiplication and is of
the form:

SUM ((A%*%X - B)^2)

where A is a matrix and X and B are vectors, with X as parameter vector.


As Spencer Graves pointed out, what you are describing here is a
linear least squares problem, which has a direct (i.e. non-iterative)
solution.  A comparison of the speed of various ways of solving such a
system is given in one of the vignettes in the Matrix package.

This has worked well so far. Recently, I was given a data set A of size
360440 x 1173, which could not be handled as a normal matrix. I brought it
into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix
package), and the formulæ and gradient work, but /optim/ returns an error
of the form "no method for coercing this S4 class to a vector".


If you just want the least squares solution X then

X <- solve(crossprod(A), crossprod(A, B))

will likely be the fastest method where A is the sparse matrix.

I do feel obligated to point out that the least squares solution for
such large systems is rarely a sensible solution to the underlying
problem.  If you have over 1000 columns in A and it is very sparse
then likely at least parts of A are based on indicator columns for a
categorical variable.  In such situations a model with random effects
for the category is often preferable to the fixed-effects model you
are fitting.

After briefly looking into methods and classes, I realize I am in way over
my head. Is there any way I could use /optim/ or another optimization
algorithm, on sparse matrices?

Thank you very much,

--Avraham Adler
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices

Reply via email to