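I think this does what you want using two packages, plyr and reshape2, that you may have to install. If so, install.packages(c("plyr", "reshape2")) should do the trick.

library(plyr)
library(reshape2)

# using the supplied data frame 'myfile' from below
# total of the Time_zero column
time0total = sum(myfile[,2])
# keep Time_zero plus the residue columns X1..X8
mydata <- myfile[, 2:10]
# long format: one row per protein and position
md1 <- melt(mydata, id = "Time_zero")
# share of the Time_zero total for each residue at each position
ddply(md1, .(variable, value), summarise, sum = sum(Time_zero)/time0total)
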
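For comparison, here is a minimal base-R sketch of the same calculation that needs no extra packages. It is not the plyr/reshape2 code above but a swapped-in alternative built on tapply(); the names residue_cols and ratios are made up for illustration, and the data frame is rebuilt with data.frame() from the values in the dput output quoted below.

```r
# Sample data from the thread, rebuilt with data.frame() for brevity
myfile <- data.frame(
  Proteins  = c("p1", "p2", "p3", "p4"),
  Time_zero = c(0.0050723, 0.0002731, 9.76e-05, 0.0002077),
  X1 = c("L", "T", "L", "R"),
  X2 = c("E", "E", "M", "E"),
  X3 = c("Y", "N", "Y", "Y"),
  X4 = c("I", "L", "Q", "L"),
  X5 = c("I", "V", "I", "I"),
  X6 = c("P", "P", "P", "S"),
  X7 = c("D", "G", "E", "E"),
  X8 = c("A", "A", "C", "A"),
  stringsAsFactors = TRUE
)

# Total of Time_zero across all proteins
time0total <- sum(myfile$Time_zero)

# Hypothetical helper name: the columns that hold residues
residue_cols <- paste0("X", 1:8)

# For each position, sum Time_zero within each residue level
# and divide by the overall total
ratios <- lapply(myfile[residue_cols], function(residue) {
  tapply(myfile$Time_zero, residue, sum) / time0total
})

# Ratios for position 1 (levels L, R and T)
print(ratios$X1)
```

Each element of ratios is a named vector with one entry per residue observed at that position, so the ratios at any one position should add up to 1.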
John Kane
Kingston ON Canada

-----Original Message-----
From: z...@cornell.edu
Sent: Tue, 24 Jul 2012 10:25:21 -0400
To: jrkrid...@inbox.com
Subject: Re: [R] How to do the same thing for all levels of a column?

Hi John,

Thank you for the tips. My apologies about the unreadable sample data... So here is the output of the sample data, and hopefully it works this time :)

myfile <- structure(list(
  Proteins = structure(1:4, .Label = c("p1", "p2", "p3", "p4"), class = "factor"),
  Time_zero = c(0.0050723, 0.0002731, 9.76e-05, 0.0002077),
  X1 = structure(c(1L, 3L, 1L, 2L), .Label = c("L", "R", "T"), class = "factor"),
  X2 = structure(c(1L, 1L, 2L, 1L), .Label = c("E", "M"), class = "factor"),
  X3 = structure(c(2L, 1L, 2L, 2L), .Label = c("N", "Y"), class = "factor"),
  X4 = structure(c(1L, 2L, 3L, 2L), .Label = c("I", "L", "Q"), class = "factor"),
  X5 = structure(c(1L, 2L, 1L, 1L), .Label = c("I", "V"), class = "factor"),
  X6 = structure(c(1L, 1L, 1L, 2L), .Label = c("P", "S"), class = "factor"),
  X7 = structure(c(1L, 3L, 2L, 2L), .Label = c("D", "E", "G"), class = "factor"),
  X8 = structure(c(1L, 1L, 2L, 1L), .Label = c("A", "C"), class = "factor")),
  .Names = c("Proteins", "Time_zero", "X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8"),
  row.names = c(NA, 4L), class = "data.frame")

And here is my original question:

Basically, I have a bunch of protein sequences composed of different amino acid residues, and each residue is represented by an uppercase letter. I want to calculate the ratio of different amino acid residues at each position of the proteins.
If I name this table as myfile.txt, I have the following scripts to calculate the ratio of each amino acid residue at position 1:

# showing levels of the 3rd column, which means the types of residues
>myfile[,3]
# calculating the ratio of L
>list=c(which(myfile[,3]=="L"))
>time0total=sum(myfile[,2])
>AA_L=0
>for (i in 1:length(list)){AA_L=sum(myfile[list[[i]],2]+AA_L)}
>ratio_L=AA_L/time0total

So how can I write a script to do the same thing for the other two levels (T and R) in column 3, and also do this for every column that contains amino acid residues? Thanks a lot!

Regards,
Zhao

2012/7/24 John Kane <[1]jrkrid...@inbox.com>

First thing is to supply the data in a usable format. As it is, it is essentially unreadable. All R beginners do this. :) Have a look at the dput function (?dput) for a good way to supply sample data in an email. If you have a large dataset, a few dozen lines of data would probably be enough. Something like dput(head(mydata)) should be fine. Just copy and paste the output into your email.

Welcome to R. I think you will like it.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: [2]z...@cornell.edu
> Sent: Mon, 23 Jul 2012 18:01:11 -0400
> To: [3]r-help@r-project.org
> Subject: [R] How to do the same thing for all levels of a column?
>
> Dear all,
>
> I am an R beginner, and I am looking for a way to do the same thing for
> all levels of a column in a table.
>
> Basically, I have a bunch of protein sequences composed of different
> amino acid residues, and each residue is represented by an uppercase
> letter. I want to calculate the ratio of different amino acid residues
> at each position of the proteins.
> Here is an example table:
>
> Proteins  Time_zero  1  2  3  4  5  6  7  8
> p1        0.0050723  L  E  Y  I  I  P  D  A
> p2        0.0002731  T  E  N  L  V  P  G  A
> p3        9.757E-05  L  M  Y  Q  I  P  E  C
> p4        0.0002077  R  E  Y  L  I  S  E  A
>
> If I name this table as myfile.txt, I have the following scripts to
> calculate the ratio of each amino acid residue at position 1:
>
> # showing levels of the 3rd column, which means the types of residues
> >myfile[,3]
>
> # calculating the ratio of L
> >list=c(which(myfile[,3]=="L"))
> >time0total=sum(myfile[,2])
> >AA_L=0
> >for (i in 1:length(list)){AA_L=sum(myfile[list[[i]],2]+AA_L)}
> >ratio_L=AA_L/time0total
>
> So how can I write a script to do the same thing for the other two levels
> (T and R) in column 3, and also do this for every column that contains
> amino acid residues?
>
> Many thanks for any help you could give me on this topic! :)
>
> Regards,
> Zhao
> --
> Zhao JIN
> Ph.D. Candidate
> Ruth Ley Lab
> 467 Biotech
> Field of Microbiology, Cornell University
> Lab: 607.255.4954
> Cell: 412.889.3675
>
> ______________________________________________
> [4]R-help@r-project.org mailing list
> [5]https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> [6]http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

References

1. mailto:jrkrid...@inbox.com
2. mailto:z...@cornell.edu
3. mailto:r-help@r-project.org
4. mailto:R-help@r-project.org
5. https://stat.ethz.ch/mailman/listinfo/r-help
6. http://www.R-project.org/posting-guide.html
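
The dput() advice quoted earlier in the thread can be tried on a toy example. The sketch below is only an illustration: the data frame mydata and its columns are made up, and capture.output() stands in for copying the printed text into an email by hand.

```r
# Small made-up data frame standing in for real data
mydata <- data.frame(
  id    = 1:10,
  group = factor(rep(c("a", "b"), 5))
)

# dput() prints an R expression that rebuilds the object;
# capture that text the way one would paste it into an email
txt <- paste(capture.output(dput(head(mydata))), collapse = "\n")
cat(txt, "\n")

# The recipient evaluates the pasted text to recover the same data
rebuilt <- eval(parse(text = txt))
print(all.equal(rebuilt, head(mydata)))
```

Because the pasted text is plain R code, column types such as factors survive the trip, which is what makes this format easier to work with than a table pasted into the message body.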