On Jul 19, 2011, at 11:58 AM, William Dunlap wrote:
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Daniel Malter
Sent: Tuesday, July 19, 2011 1:51 AM
To: r-help@r-project.org
Subject: Re: [R] Centering data frame by factor
P1-tapply(P1,Experiment,mean)[Experiment]
Note that the above solution works in this example
because Experiment takes the values 1 and 2. If
Experiment were coded as, say, 101 and 102 the above
would not work. This is a case where converting
Experiment to a factor would avoid problems.
I checked to see if my ave solution was subject to the same caveats
and it is not. The help page is less categorical about what the
grouping variables' structure should be, saying only that they are
"typically factors".
E.g.,
RAW <- data.frame("Experiment"=c(2,2,2,1,1,1),
                  "Group"=c("B","A","B","B","A","B"),
                  "P1"=c(-2,0,2,1,-1,0),
                  "P2"=c(-4,0,4,-1,0,1))
RAW$E <- RAW$Experiment + 100 # relabeled Experiment
with(RAW, P1-tapply(P1,Experiment,mean)[Experiment]) # good
2 2 2 1 1 1
-2 0 2 1 -1 0
with(RAW, P1-tapply(P1,E,mean)[E]) # bad
<NA> <NA> <NA> <NA> <NA> <NA>
NA NA NA NA NA NA
with(RAW, ave(P1, E, FUN=function(x) scale(x, scale=FALSE) ) )
# [1] -2 0 2 1 -1 0 good
RAW$E <- factor(RAW$E) # convert to factor
with(RAW, P1-tapply(P1,E,mean)[E]) # good
102 102 102 101 101 101
-2 0 2 1 -1 0
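For the original question (center both P1 and P2 while keeping row order and the Group column), the ave() idea can be applied column by column. This is only a sketch: center_by() is a made-up helper name, not a base R function, and the data are the toy values from the first post in the thread.

```r
# Toy data copied from the original question.
dat <- data.frame(
  Experiment = c(2, 2, 2, 1, 1, 1),
  Group      = c("A", "A", "B", "A", "A", "B"),
  P1         = c(10, 12, 14, 5, 3, 4),
  P2         = c(8, 12, 16, 2, 3, 4)
)

# center_by() is a hypothetical helper: subtract each group's mean,
# returning values in the original row order (ave() takes care of that).
center_by <- function(x, g) ave(x, g, FUN = function(v) v - mean(v))

# Center only the measurement columns; Experiment and Group are untouched.
cols <- c("P1", "P2")
NORMALIZED <- dat
NORMALIZED[cols] <- lapply(dat[cols], center_by, g = dat$Experiment)
NORMALIZED
```

Because ave() groups on the values themselves rather than using them as positions, relabeling Experiment as 101 and 102 should not change the result.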
And take note that Bill made his variable a factor outside the tapply
environment. If he had just used it in the tapply function (as I often
do ...possibly unwisely in light of this gotcha) it would fail:
> with(RAW, P1-tapply(P1, factor(E), mean)[E])
<NA> <NA> <NA> <NA> <NA> <NA>
NA NA NA NA NA NA
... that is unless you also use factor(E) as the index:
> with(RAW, P1-tapply(P1, factor(E), mean)[factor(E)])
102 102 102 101 101 101
-2 0 2 1 -1 0
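As far as I understand R's indexing rules, the three outcomes above come down to how the subscript is interpreted: a numeric index is a position, a character index is a name, and a factor index is its integer codes. A small sketch, where m is just a stand-in for the named vector that tapply() returns:

```r
# Stand-in for a tapply() result: the names are the group labels.
m <- c("101" = 10, "102" = 20)
E <- c(102, 102, 101)

m[E]                # numeric index = positions; 101 and 102 are out of range, so NA
m[as.character(E)]  # character index = lookup by name
m[factor(E)]        # factor index = integer codes (2, 2, 1), which match the
                    # positions only because the levels sort like the names of m
```

Indexing by as.character(E) is therefore another way around the gotcha, assuming the labels convert to exactly the strings tapply() used as names (safe for whole numbers like these, less obviously so for fractional values).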
Thanks, Bill. I've learned a lot of R from you.
--
David.
Another way to approach the problem is to think of
your normalized data as the residuals from a linear model:
residuals(lm(data=RAW, cbind(P1,P2) ~ E))
P1 P2
1 -2.000000e+00 -4.000000e+00
2 4.385598e-17 8.771196e-17
3 2.000000e+00 4.000000e+00
4 1.000000e+00 -1.000000e+00
5 -1.000000e+00 8.771196e-17
6 4.385598e-17 1.000000e+00
zapsmall(.Last.value) # make reading easier
P1 P2
1 -2 -4
2 0 0
3 2 4
4 1 -1
5 -1 0
6 0 1
That approach can make generalizations to more factors
or to smoothing approaches easier.
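One possible reading of "more factors", sketched with the toy data from the original question: centering within every Experiment-by-Group cell. With a full interaction model the residuals should agree with what ave() gives for two grouping variables. The object names (dat, res_lm, res_ave) are made up for illustration.

```r
# Toy data copied from the original question.
dat <- data.frame(
  Experiment = c(2, 2, 2, 1, 1, 1),
  Group      = c("A", "A", "B", "A", "A", "B"),
  P1         = c(10, 12, 14, 5, 3, 4),
  P2         = c(8, 12, 16, 2, 3, 4)
)

# Residuals from a cell-means (full interaction) model: one mean per
# Experiment x Group combination is removed from each response column.
res_lm <- residuals(lm(cbind(P1, P2) ~ factor(Experiment) * Group, data = dat))

# Same centering done directly with ave() and two grouping variables.
res_ave <- sapply(dat[c("P1", "P2")], function(x)
  ave(x, dat$Experiment, dat$Group, FUN = function(v) v - mean(v)))

# Compare up to floating-point noise, ignoring row and column names.
all.equal(unname(res_lm), unname(res_ave))
```

The model formula is where the flexibility comes in: swapping the interaction for additive terms or other predictors changes what is removed without touching the rest of the code.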
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
HTH,
Daniel
ronny wrote:
Hi,
I would like to center P1 and P2 of the following data frame by the factor
"Experiment", i.e. subtract from each value the average of its
experiment, and keep the original data structure, i.e. the experiment and
the group of each value.
RAW=data.frame("Experiment"=c(2,2,2,1,1,1),
               "Group"=c("A","A","B","A","A","B"),
               "P1"=c(10,12,14,5,3,4),
               "P2"=c(8,12,16,2,3,4))
Desired result:
NORMALIZED=data.frame("Experiment"=c(2,2,2,1,1,1),
                      "Group"=c("B","A","B","B","A","B"),
                      "P1"=c(-2,0,2,1,-1,0),
                      "P2"=c(-4,0,4,-1,0,1))
I tried using "by", but then I lose the original order and the "Group"
variable. Can you help?
RAW
Experiment Group P1 P2
2 A 10 8
2 A 12 12
2 B 14 16
1 A 5 2
1 A 3 3
1 B 4 4
NOT.OK<- within (RAW,
{P1<-do.call(rbind,by(RAW$P1,RAW$Experiment,scale,scale=F))})
NOT.OK
Experiment Group P1 P2
2 A 1 8
2 A -1 12
2 B 0 16
1 A -2 2
1 A 0 3
1 B 2 4
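A likely explanation for the scrambled P1 column above: by(), like split(), returns its pieces in sorted level order (experiment 1, then 2), so binding them back together puts experiment 1's values in the rows that belong to experiment 2. A sketch of the effect and of one base R way to restore row order with unsplit(); the names dat, pieces and centered are made up here.

```r
# Toy data copied from the question above.
dat <- data.frame(
  Experiment = c(2, 2, 2, 1, 1, 1),
  P1         = c(10, 12, 14, 5, 3, 4)
)

pieces <- split(dat$P1, dat$Experiment)
names(pieces)   # "1" "2": sorted levels, not order of appearance

centered <- lapply(pieces, function(v) v - mean(v))

# Gluing the pieces end to end reproduces the NOT.OK ordering.
unlist(centered, use.names = FALSE)   # 1 -1 0 -2 0 2

# unsplit() puts every value back in the row it came from.
unsplit(centered, dat$Experiment)     # -2 0 2 1 -1 0
```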
--
View this message in context:
http://r.789695.n4.nabble.com/Centering-data-frame-by-factor-tp3677609p3677620.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT