Hi, I need to perform calculations on subsets of a data frame:
DF <- read.table(textConnection("
   A B    C D E F
   1 a 1995 0 4 1
   2 a 1997 1 1 3
   3 b 1995 3 7 0
   4 b 1996 1 2 3
   5 b 1997 1 2 3
   6 b 1998 6 0 0
   7 b 1999 3 7 0
   8 c 1997 1 2 3
   9 c 1998 1 2 3
  10 c 1999 6 0 0
  11 d 1999 3 7 0
  12 e 1995 1 2 3
  13 e 1998 1 2 3
  14 e 1999 6 0 0"), header = TRUE, stringsAsFactors = FALSE)

I'd like to create a new data frame for each unique year (column C) in which, for each value of B, the values of D, E and F are summed over the last 3 years (e.g. 1998 = 1998 + 1997 + 1996).

Question 1: How do I go from DF to newDFyear?

Examples:

newDF1995
  B D E F
  a 0 4 1
  b 3 7 0
  e 1 2 3

newDF1998
  B D E F
  a 1 1 3
  b 8 4 6
  c 2 4 6
  e 1 2 3

Then, for each new data frame, I need to generate a square matrix after doing the following:

  newDF1998$G <- newDF1998$D + newDF1998$E + newDF1998$F
  newDF1998$D <- newDF1998$D / newDF1998$G
  newDF1998$E <- newDF1998$E / newDF1998$G
  newDF1998$F <- newDF1998$F / newDF1998$G
  newDF1998 <- newDF1998[, -5]   # drop the helper column G

newDF1998 (values rounded to one decimal)
  B   D   E   F
  a 0.2 0.2 0.6
  b 0.4 0.2 0.3
  c 0.2 0.3 0.5
  e 0.2 0.3 0.5

Question 2: How do I go from newDF1998 to a matrix with rows and columns a, b, c, e

     a b c e
  a
  b
  c
  e

in which

  Cell ab = (0.2*0.4 + 0.2*0.2 + 0.6*0.3) /
            ((0.2*0.2 + 0.2*0.2 + 0.6*0.6)^0.5 * (0.4*0.4 + 0.2*0.2 + 0.3*0.3)^0.5)
          = 0.84

Thanks a lot for your help!

--
View this message in context: http://r.789695.n4.nabble.com/Yearly-aggregates-and-matrices-tp3438140p3438140.html
Sent from the R help mailing list archive at Nabble.com.
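
For reference, here is a minimal base-R sketch of the two steps described in the post. It is one possible reading, not a definitive answer. It assumes the grouping column is B and the year column is C, and that "last 3 years" means the target year plus the two before it. The names `window_sums`, `cosine_matrix` and `yearly` are hypothetical helpers introduced only for illustration, and `read.table(text = ...)` assumes R 2.14 or later. The cell formula in Question 2 is the cosine similarity between two rows, which does not change when a row is rescaled, so the sketch skips the division by G.

```r
DF <- read.table(text = "
 A B    C D E F
 1 a 1995 0 4 1
 2 a 1997 1 1 3
 3 b 1995 3 7 0
 4 b 1996 1 2 3
 5 b 1997 1 2 3
 6 b 1998 6 0 0
 7 b 1999 3 7 0
 8 c 1997 1 2 3
 9 c 1998 1 2 3
10 c 1999 6 0 0
11 d 1999 3 7 0
12 e 1995 1 2 3
13 e 1998 1 2 3
14 e 1999 6 0 0", header = TRUE, stringsAsFactors = FALSE)

# Question 1 (hypothetical helper): sum D, E, F per value of B
# over the window (year - width + 1):year.
window_sums <- function(df, year, width = 3) {
  in_window <- df$C > year - width & df$C <= year
  aggregate(cbind(D, E, F) ~ B, data = df[in_window, ], FUN = sum)
}

# One data frame per unique year, kept in a named list
# instead of separate newDF1995, newDF1996, ... objects.
years  <- sort(unique(DF$C))
yearly <- setNames(lapply(years, function(y) window_sums(DF, y)), years)

# Question 2 (hypothetical helper): cosine similarity between all pairs of rows.
# A row whose D, E and F are all zero would give NaN here.
cosine_matrix <- function(sums) {
  m <- as.matrix(sums[, c("D", "E", "F")])
  rownames(m) <- sums$B
  unit <- m / sqrt(rowSums(m^2))  # scale every row to unit length
  tcrossprod(unit)                # unit %*% t(unit)
}

s98 <- yearly[["1998"]]
sim <- cosine_matrix(s98)
round(sim["a", "b"], 2)  # 0.84, matching the hand calculation in the post
```

Keeping the per-year results in a list makes it possible to build every matrix in one call, e.g. `lapply(yearly, cosine_matrix)`.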