On Jan 31, 2010, at 2:19 PM, Zoppoli, Gabriele (NIH/NCI) [G] wrote:

Hello,

here is the problem:

I want to demonstrate that, on average, the Pearson correlations among a specified subset of genes from a huge list (>18,000 columns) are higher than those among any randomly chosen subset of that list. I would therefore like to run a number of tests comparing the specified subset with randomly chosen ones from the "mother" list.

How could I do that? What would be an appropriate statistical test?

You should construct a small dataset that resembles your data and post that. You are currently using (and have in the past used) terms that have specific meanings in R ("list", "columns", "subset", "header"), but I don't think you have enough experience in R to use them for unambiguous communication with experienced users. For example, in the current question, it has no unambiguous meaning to say that "genes" have "correlations". Some measurements regarding genes might, but you have not indicated what sort of measurements you performed. Use R expressions to exemplify what the data look like. That removes ambiguities.
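As a rough sketch of what such a post could look like, here is a toy dataset built entirely from simulated numbers. Everything in it is an assumption made for illustration: the object name `expr`, the layout (rows as samples, columns as genes), the sizes, the gene names, and the idea that the measurements are something like expression values. The second half shows only one possible reading of the question (mean pairwise Pearson correlation within the subset, compared with the same statistic for randomly drawn column subsets of equal size); it is not meant as the definitive test.

```r
# Toy stand-in for the real data: rows are samples, columns are genes.
# All names, sizes and values are invented for illustration.
set.seed(1)
n_samples <- 20
n_genes   <- 200
expr <- matrix(rnorm(n_samples * n_genes), nrow = n_samples,
               dimnames = list(paste0("sample", 1:n_samples),
                               paste0("gene", 1:n_genes)))

# Hypothetical subset of interest: give the first 10 genes a shared signal
# so that they are correlated with one another.
subset_genes <- paste0("gene", 1:10)
shared <- rnorm(n_samples)
expr[, subset_genes] <- expr[, subset_genes] + shared

# Expressions that show readers what the object actually is.
str(expr)
expr[1:3, 1:5]

# One possible summary: mean of the pairwise Pearson correlations
# between the columns of a matrix (upper triangle only).
mean_pairwise_cor <- function(m) {
  cm <- cor(m, method = "pearson")
  mean(cm[upper.tri(cm)])
}

observed <- mean_pairwise_cor(expr[, subset_genes])

# Same statistic for randomly drawn column subsets of the same size.
n_draws <- 1000
random_means <- replicate(n_draws, {
  idx <- sample(colnames(expr), length(subset_genes))
  mean_pairwise_cor(expr[, idx])
})

# Empirical proportion of random subsets at least as correlated as the
# specified one (with the usual +1 correction).
p_emp <- (sum(random_means >= observed) + 1) / (n_draws + 1)
observed
summary(random_means)
p_emp
```

Posting something of this size, together with the `str()` output, lets readers see whether "columns" really means matrix columns, data frame columns, or list elements.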

--
David.


Thank you for your help!


Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD

Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
