Hi! I am using R version 2.7.0 and am working on a panel dataset read into R as a data frame; I call it "ex". The variables in "ex" are: id, year, x.
id: a character string which identifies the unit
year: identifies the time period
x: the variable of interest (which might contain NAs).

Here is an example:

> id <- rep(c("A","B","C"),2)
> year <- c(rep(1970,3),rep(1980,3))
> x <- c(20,30,40,25,35,45)
> ex <- data.frame(id=id,year=year,x=x)
> ex
  id year  x
1  A 1970 20
2  B 1970 30
3  C 1970 40
4  A 1980 25
5  B 1980 35
6  C 1980 45

I want to draw a subset of "ex" by selecting only the A and B units:

> ex1 <- subset(ex[which(ex$id=="A"|ex$id=="B"),])

Now I want to do some computations on x for each selected unit only:

> tapply(ex1$x, ex1$id, mean)
   A    B    C
22.5 32.5   NA

But this gives me an NA value for the unit C, which I thought I had already left out. How do I ensure that the computation (in the last step) is limited to only the units I have selected in the first step?

Dipankar

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
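
A note on the likely cause, offered as a sketch rather than a definitive answer: in R 2.7.0, data.frame() converts character columns to factors by default, and row subsetting keeps all of a factor's levels, including ones that no longer occur. tapply() then reports one cell per level, so the now-empty level C shows up as NA. One way around this is to rebuild the factor after subsetting. The sketch below makes id a factor explicitly (an assumption added so that it behaves the same in newer R versions, where strings are no longer converted automatically); the name res is only illustrative.

```r
# Rebuild the example data. id is created as a factor explicitly,
# which mirrors what data.frame() did by default in R 2.7.0.
ex <- data.frame(id   = factor(rep(c("A", "B", "C"), 2)),
                 year = c(rep(1970, 3), rep(1980, 3)),
                 x    = c(20, 30, 40, 25, 35, 45))

# Select only the A and B units.
ex1 <- ex[ex$id %in% c("A", "B"), ]

# The subset still carries the unused level "C".
levels(ex1$id)

# Re-create the factor so that only levels present in the data remain.
ex1$id <- factor(ex1$id)

# The per-unit means now cover A and B only.
res <- tapply(ex1$x, ex1$id, mean)
res
```

After the factor is rebuilt, the result has entries for A (22.5) and B (32.5) only. Writing ex1$id[, drop = TRUE] is an equivalent way to discard unused levels.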