Dear useRs:
I'm not sure whether this is the right place to ask, but I'll give it a try.
I've been reading about how to perform Principal Component Analysis (PCA) on
microarray data (see [1]), and there is something I don't understand. It
concerns performing PCA on data sets in which the number of variables is
greater than the number of samples. For example, in the paper mentioned
above, the number of variables (genes) is 8538 and the number of samples
(tumors) is 104. My understanding is that, in PCA, the number of samples (n)
must be greater than the number of variables (p), and that the goal is to
find k components, with k < p, such that the variance retained in the new
data set is maximized. Am I wrong? Could somebody please tell me how it is
possible to perform PCA when the number of variables is greater than the
number of samples, and how to do it in R? I'm really confused. In R I've
tried "prcomp" and "princomp", but they didn't work.
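
For reference, a minimal reproducible sketch of the p > n situation described
above. It uses simulated data rather than the data from [1]; the matrix x, its
scaled-down dimensions, and the object names are assumptions made purely for
illustration, and it is not necessarily the call that failed here.

```r
# Simulated stand-in for a microarray matrix: samples in rows, genes in columns.
# The sizes are scaled down from the 104 x 8538 case purely for speed.
set.seed(1)
n <- 10   # number of samples (hypothetical)
p <- 50   # number of variables (hypothetical)
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# prcomp() works from a singular value decomposition of the centered data.
pc <- prcomp(x)
dim(pc$x)   # scores: at most min(n, p) components

# princomp() works from an eigen decomposition of the covariance matrix;
# wrap the call in try() so the script keeps going if it rejects p > n input.
res <- try(princomp(x), silent = TRUE)
inherits(res, "try-error")
```

If the genes are stored in rows, as in many expression matrices, x would
presumably need to be transposed with t() before either call.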

I'm using Windows XP SP2, an Intel Core 2 Duo 2.4 GHz, and R 2.7.0 Patched.


Thanks in advance,


Jorge Ivan Velez



[1] Ringnér, M. What is principal component analysis? Nature Biotechnology
26, 303-304 (2008).
http://www.nature.com/nbt/journal/v26/n3/full/nbt0308-303.html

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
