On Jan 31, 2010, at 2:19 PM, Zoppoli, Gabriele (NIH/NCI) [G] wrote:
Hello,
Here is the problem:
I want to demonstrate that, on average, the Pearson correlations
of a specified subset of genes from a huge list (>18,000 columns)
are higher than those of any randomly chosen subset of that list. I would
therefore like to run a number of tests comparing that specified subset
with randomly chosen ones from the "mother" list.
How could I do that? What would be an appropriate statistical test?
You should construct a small dataset that resembles your data and post
that. You are currently using (as you have in the past) terms that have
specific meanings in R ("list", "columns", "subset", "header"), but I
don't think you have enough experience in R to use them for
unambiguous communication with experienced users. For example, in the
current question, it really has no unambiguous meaning to say that
"genes" have "correlations". Some measurements regarding genes might,
but you have not indicated what sort of measurements you performed.
Use R expressions to exemplify what the data looks like. That removes
ambiguities.
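
As a minimal sketch of what such a post could contain, assuming the
data are a numeric matrix with one row per sample and one column per
gene (the original message does not say so), something like the
following would do. Every name here (expr, my_genes,
mean_pairwise_cor) and every dimension is invented for illustration,
and the values are simulated, not real measurements.

```r
set.seed(42)                       # make the simulated values reproducible

n_samples <- 10                    # rows: samples (hypothetical count)
n_genes   <- 50                    # columns: genes (stand-in for >18,000)

# Numeric matrix of made-up measurements, one column per gene
expr <- matrix(rnorm(n_samples * n_genes),
               nrow = n_samples, ncol = n_genes,
               dimnames = list(paste0("sample", seq_len(n_samples)),
                               paste0("gene", seq_len(n_genes))))

# The "specified subset", given as column names (hypothetical choice)
my_genes <- paste0("gene", 1:5)

# Mean of the pairwise Pearson correlations among a set of columns;
# this is one possible reading of "correlations of a subset of genes"
mean_pairwise_cor <- function(m, cols) {
  r <- cor(m[, cols], method = "pearson")
  mean(r[upper.tri(r)])            # each pair counted once, diagonal excluded
}

# The same statistic for the specified subset and for one random subset
# of equal size (the random draw may overlap with my_genes)
obs  <- mean_pairwise_cor(expr, my_genes)
rand <- mean_pairwise_cor(expr, sample(colnames(expr), length(my_genes)))

str(expr)                          # object type and dimensions
dput(expr[1:3, 1:4])               # pasteable text form of a small piece
```

Posting the output of str() and dput() on a small piece of the real
object, plus a function like the one above that states what is being
compared, lets readers see exactly what "list", "columns" and
"subset" refer to.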
--
David.
Thank you for your help!
Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology,
University of Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD
Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT