You can use 'ave' to add a new column with the state average:

> id<-as.character(c(01001:01010, 02001:02010))
> st<-substr(id,1,1)
> cnty<-substr(id,2,5)
> tfr10<-rnorm(20)
>
> mydata<-data.frame(id,st,cnty,tfr10)
> mydata$stAvg <- ave(mydata$tfr10, mydata$st)
> print(mydata)
     id st cnty      tfr10      stAvg
1  1001  1  001  1.1896489 -0.3190678
2  1002  1  002 -1.0504707 -0.3190678
3  1003  1  003 -1.6130538 -0.3190678
4  1004  1  004 -1.1573924 -0.3190678
5  1005  1  005 -0.2013412 -0.3190678
6  1006  1  006  0.5176950 -0.3190678
7  1007  1  007 -1.3256951 -0.3190678
8  1008  1  008  0.4367956 -0.3190678
9  1009  1  009  0.2025659 -0.3190678
10 1010  1  010 -0.1894306 -0.3190678
11 2001  2  001 -0.9337906 -0.3842536
12 2002  2  002  0.2999035 -0.3842536
13 2003  2  003  0.5091345 -0.3842536
14 2004  2  004 -0.4787584 -0.3842536
15 2005  2  005 -1.6958660 -0.3842536
16 2006  2  006 -0.4430861 -0.3842536
17 2007  2  007  0.2100123 -0.3842536
18 2008  2  008 -1.7471779 -0.3842536
19 2009  2  009  0.1778717 -0.3842536
20 2010  2  010  0.2592210 -0.3842536
>
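Since 'ave' returns a vector the same length as its input, the state average can go straight into a further per-county calculation without a loop. A rough sketch of that idea follows; the column names devFromSt and stMedian and the seed are made up for illustration and are not from the original post:

```r
# Sketch only: devFromSt and stMedian are illustrative column names.
set.seed(1)  # assumed seed, only so the example is repeatable
id     <- as.character(c(1001:1010, 2001:2010))
st     <- substr(id, 1, 1)
tfr10  <- rnorm(20)
mydata <- data.frame(id, st, tfr10)

# ave() returns one value per row, so it lines up with the original column.
mydata$stAvg     <- ave(mydata$tfr10, mydata$st)
mydata$devFromSt <- mydata$tfr10 - mydata$stAvg

# A different per-group summary: FUN must be passed by name.
mydata$stMedian <- ave(mydata$tfr10, mydata$st, FUN = median)

# One row per state, if only the group means themselves are needed.
stMeans <- aggregate(tfr10 ~ st, data = mydata, FUN = mean)
print(stMeans)
```

The same pattern should carry over to 32 states unchanged, since the grouping comes from the st column rather than from anything written per state.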
On Wed, Aug 31, 2011 at 12:50 PM, jour4life <jour4l...@gmail.com> wrote:
> Hello all,
>
> I hope something is not already posted regarding this exact problem I am
> trying to solve. I've read through the forums and previous postings and am
> still confused as to how to approach this. Basically, what I am trying to do
> is construct variables that utilize an average of a variable from a
> grouping, or higher order, variable. For instance, in my dataset I have
> variables, with each observation being a county. Of those counties, we have
> an ID variable, for which, I have extracted variables from the substring of
> the ID variable. Thus, I was able to extract a state variable, for which, I
> want to use the averages, calculated at the state level, and utilize those
> averages for another variable. I know this may be confusing, so I'm posting
> an example dataset here:
>
> id<-as.character(01001:01010)
> st<-substr(id,1,1)
> cnty<-substr(id,2,5)
> tfr10<-rnorm(10)
>
> mydata<-cbind(id,st,cnty,tfr10)
> print(mydata)
>       id     st  cnty  tfr10
>  [1,] "1001" "1" "001" "1.07505442756833"
>  [2,] "1002" "1" "002" "-0.882434417011687"
>  [3,] "1003" "1" "003" "2.29276525788035"
>  [4,] "1004" "1" "004" "-0.312320296652298"
>  [5,] "1005" "1" "005" "1.09001860766383"
>  [6,] "1006" "1" "006" "-0.781940988103414"
>  [7,] "1007" "1" "007" "-0.614135968631341"
>  [8,] "1008" "1" "008" "0.515142965880679"
>  [9,] "1009" "1" "009" "0.0274456168157293"
> [10,] "1010" "1" "010" "-0.538584996182184"
>
> What I want to do is get the average of the variable "tfr10" by state.
> Based on that, I will create another calculation that will output variables.
> In other words, for each observation, calculate a new variable using the
> average at the state level. Of course, this is a simple example and will
> have 32 states, for which I do not want to create a "mean variable" for each
> state to calculate another variable and would rather do this using a loop.
>
> Or, I can potentially create a "mean" variable, but based on the
> observations at the state level using a loop. Whichever way is best and
> easiest. I hope that this example is understandable. Any help or direction
> would be greatly appreciated!!!
>
> Thanks,
>
> Carlos
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/looping-by-grouping-variable-tp3781580p3781580.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?