> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Daniel Malter
> Sent: Tuesday, July 19, 2011 1:51 AM
> To: r-help@r-project.org
> Subject: Re: [R] Centering data frame by factor
>
> P1-tapply(P1,Experiment,mean)[Experiment]

Note that the above solution works in this example because Experiment
takes the values 1 and 2. A numeric subscript selects elements by
position, and the means returned by tapply() sit in positions 1 and 2,
so those codes happen to pick the right mean. If Experiment were coded
as, say, 101 and 102, the subscript would point past the end of that
vector and the above would not work. This is a case where converting
Experiment to a factor would avoid problems. E.g.,

  > RAW <- data.frame("Experiment"=c(2,2,2,1,1,1),"Group"=c("B","A","B","B","A","B"),"P1"=c(-2,0,2,1,-1,0),"P2"=c(-4,0,4,-1,0,1))
  > RAW$E <- RAW$Experiment + 100 # relabeled Experiment
  > with(RAW, P1-tapply(P1,Experiment,mean)[Experiment]) # good
   2  2  2  1  1  1
  -2  0  2  1 -1  0
  > with(RAW, P1-tapply(P1,E,mean)[E]) # bad
  <NA> <NA> <NA> <NA> <NA> <NA>
    NA   NA   NA   NA   NA   NA
  > RAW$E <- factor(RAW$E) # convert to factor
  > with(RAW, P1-tapply(P1,E,mean)[E]) # good
  102 102 102 101 101 101
   -2   0   2   1  -1   0

Another way to approach the problem is to think of your normalized
data as the residuals from a linear model:

  > residuals(lm(data=RAW, cbind(P1,P2) ~ E))
               P1            P2
  1 -2.000000e+00 -4.000000e+00
  2  4.385598e-17  8.771196e-17
  3  2.000000e+00  4.000000e+00
  4  1.000000e+00 -1.000000e+00
  5 -1.000000e+00  8.771196e-17
  6  4.385598e-17  1.000000e+00
  > zapsmall(.Last.value) # make reading easier
    P1 P2
  1 -2 -4
  2  0  0
  3  2  4
  4  1 -1
  5 -1  0
  6  0  1

That approach can make generalizations to more factors or to
smoothing approaches easier.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> HTH,
> Daniel
>
>
> ronny wrote:
> >
> > Hi,
> >
> > I would like to center P1 and P2 of the following data frame by the factor
> > "Experiment", i.e. subtract from each value the average of its
> > experiment, and keep the original data structure, i.e. the experiment and
> > the group of each value.
> >
> > RAW=
> > data.frame("Experiment"=c(2,2,2,1,1,1),"Group"=c("A","A","B","A","A","B"),"P1"=c(10,12,14,5,3,4),"P2"=c(8,12,16,2,3,4))
> >
> > Desired result:
> >
> > NORMALIZED=
> > data.frame("Experiment"=c(2,2,2,1,1,1),"Group"=c("B","A","B","B","A","B"),"P1"=c(-2,0,2,1,-1,0),"P2"=c(-4,0,4,-1,0,1))
> >
> > I tried using "by", but then I lose the original order, and the "Group"
> > variable. Can you help?
> >
> >> RAW
> >   Experiment Group P1 P2
> >            2     A 10  8
> >            2     A 12 12
> >            2     B 14 16
> >            1     A  5  2
> >            1     A  3  3
> >            1     B  4  4
> >
> > NOT.OK<- within (RAW,
> > {P1<-do.call(rbind,by(RAW$P1,RAW$Experiment,scale,scale=F))})
> >
> >> NOT.OK
> >   Experiment Group P1 P2
> >            2     A  1  8
> >            2     A -1 12
> >            2     B  0 16
> >            1     A -2  2
> >            1     A  0  3
> >            1     B  2  4
> >
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Centering-data-frame-by-factor-tp3677609p3677620.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
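
For what it is worth, here is a small sketch of two ways to center by Experiment without first converting it to a factor. It is not taken from either message in this thread: it uses ronny's original RAW plus the relabeled column E from the reply, the names m, by_position and by_name are purely illustrative, and ave() is swapped in as an alternative to the tapply() subscript rather than something either poster proposed.

```r
# ronny's original data, plus the relabeled codes (101/102) from the reply.
RAW <- data.frame(
  Experiment = c(2, 2, 2, 1, 1, 1),
  Group      = c("A", "A", "B", "A", "A", "B"),
  P1         = c(10, 12, 14, 5, 3, 4),
  P2         = c(8, 12, 16, 2, 3, 4)
)
RAW$E <- RAW$Experiment + 100

# tapply() returns the group means named by group label ("101", "102").
m <- with(RAW, tapply(P1, E, mean))

# A numeric subscript is positional: positions 101 and 102 do not exist.
by_position <- m[RAW$E]                # all NA

# A character subscript looks the means up by name instead (illustrative fix).
by_name <- m[as.character(RAW$E)]
P1_centered <- as.vector(RAW$P1 - by_name)

# ave() returns the group mean already aligned with each row, so the
# row order and the Group column are left untouched.
NORMALIZED <- RAW[c("Experiment", "Group", "P1", "P2")]
for (v in c("P1", "P2")) {
  NORMALIZED[[v]] <- RAW[[v]] - ave(RAW[[v]], RAW$E)
}
print(NORMALIZED)
```

Either route should give the same centered values as the factor-based subscript in the reply; ave() is probably the easier one to extend to several columns, since it needs no subscripting at all.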