Hi, I think (or hope) there is a simple solution to this, but Googling has turned up no answers (I am probably not using the right search terms).
I have a long data frame with more than 2,000,000 rows and 6 columns. It contains 24,000 combinations of gene (one column, n = 12,000) and gender (another column, n = 2, obviously). I want to add two new columns to the data frame: for each row, one column should give the mean gene expression (taken from the column called "value") for that row's gene and gender combination, and the other should give the standard deviation for the same combination.

Any suggestions?

Rob

Example of the top of the data frame:

     gene      variable    value gender line rep
1 CG10000 X208.F1.30456 4.758010 Female  208   1
2 CG10000 X365.F2.30478 4.915395 Female  365   2
3 CG10000 X799.F2.30509 4.641636 Female  799   2
4 CG10000 X306.M2.32650 4.550676   Male  306   2
5 CG10000 X712.M2.30830 4.633811   Male  712   2
6 CG10000 X732.M2.30504 4.857564   Male  732   2
7 CG10000 X707.F1.31120 5.104165 Female  707   1
8 CG10000 X514.F2.30493 4.730814 Female  514   2
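
For reference, one way the task described above might be approached in base R is with ave(), which applies a function within each group and returns a vector the same length as the input, so the result can be assigned straight into a new column. This is only a sketch: the data frame name df and the new column names group_mean and group_sd are assumptions made for illustration, and the toy data simply repeats the eight example rows.

```r
# Toy data frame rebuilt from the eight example rows in the post.
# The name `df` is an assumption; substitute the real data frame.
df <- data.frame(
  gene     = rep("CG10000", 8),
  variable = c("X208.F1.30456", "X365.F2.30478", "X799.F2.30509",
               "X306.M2.32650", "X712.M2.30830", "X732.M2.30504",
               "X707.F1.31120", "X514.F2.30493"),
  value    = c(4.758010, 4.915395, 4.641636, 4.550676,
               4.633811, 4.857564, 5.104165, 4.730814),
  gender   = c("Female", "Female", "Female", "Male",
               "Male", "Male", "Female", "Female"),
  line     = c(208, 365, 799, 306, 712, 732, 707, 514),
  rep      = c(1, 2, 2, 2, 2, 2, 1, 2)
)

# ave() splits `value` by every gene x gender combination, applies FUN
# to each group, and returns one result per original row.
# `group_mean` and `group_sd` are hypothetical column names.
df$group_mean <- ave(df$value, df$gene, df$gender, FUN = mean)
df$group_sd   <- ave(df$value, df$gene, df$gender, FUN = sd)

print(df)
```

Every row then carries the mean and standard deviation of its own gene and gender group. How well this scales to 2,000,000 rows and 24,000 groups has not been measured here; if it turns out to be slow, a grouped operation in a package such as data.table or dplyr may be worth trying instead.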