Dear useRs:

I'm not sure if this is the correct place to ask, but I'll try. I've been reading about how to perform Principal Component Analysis (PCA) on microarray data (see [1]) and there's something I don't understand. It concerns performing PCA on data sets in which the number of variables is greater than the number of samples. For example, in the paper mentioned above, the number of variables (genes) is 8538 and the number of samples (tumors) is 104.

My understanding is that, in PCA, the number of samples (n) must be greater than the number of variables (p), and that the goal is to find k components, with k < p, such that the variance retained in the new data set is maximized. Am I wrong? Could somebody please tell me how it is possible to perform PCA when the number of variables is greater than the number of samples, and how to do it in R? I'm really confused. In R I've tried "prcomp" and "princomp" but they didn't work.
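A minimal, self-contained sketch of the situation described above, using simulated data with the same dimensions as in [1]. The random values and the orientation (samples in rows, genes in columns) are assumptions made for illustration only; this is not the data from the paper, nor necessarily the exact calls that failed.

```r
# Simulated stand-in for the microarray data: only the dimensions
# (104 samples, 8538 genes) come from [1]; the values are random.
set.seed(1)
n <- 104    # samples (tumors)
p <- 8538   # variables (genes)
x <- matrix(rnorm(n * p), nrow = n, ncol = p)  # rows = samples, columns = genes

# prcomp() is based on the singular value decomposition of the centered
# data matrix, so it does not require n > p.
pca <- prcomp(x, center = TRUE, scale. = FALSE)
dim(pca$x)         # matrix of component scores, one row per sample
dim(pca$rotation)  # matrix of loadings, one row per gene
length(pca$sdev)   # one standard deviation per component

# princomp() works from the covariance matrix and stops with an error
# when there are fewer units (rows) than variables (columns).
res <- try(princomp(x), silent = TRUE)
inherits(res, "try-error")
```

With p > n, at most min(n, p) components are returned, and after centering the last of them carries essentially zero variance.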
I'm using Win XP SP2, Intel Core 2 Duo 2.4 GHz and R 2.7.0 Patched.

Thanks in advance,

Jorge Ivan Velez

[1] Ringnér, M. What is principal components analysis? Nature Biotechnology 26, 303-304 (2008). http://www.nature.com/nbt/journal/v26/n3/full/nbt0308-303.html
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.