Hi!

I am using R version 2.7.0 and am working with a panel dataset that I have
read into R as a data frame called "ex". The variables in "ex" are id, year
and x:

id: a character string which identifies the unit
year: identifies the time period
x: the variable of interest (which might contain NAs).

Here is an example:
> id <- rep(c("A","B","C"),2)
> year <- c(rep(1970,3),rep(1980,3))
> x <- c(20,30,40,25,35,45)
> ex <- data.frame(id=id,year=year,x=x)
> ex
  id year  x
1  A 1970 20
2  B 1970 30
3  C 1970 40
4  A 1980 25
5  B 1980 35
6  C 1980 45

I want to draw a subset of "ex" by selecting only the A and B units:

> ex1 <- subset(ex[which(ex$id=="A"|ex$id=="B"),])

Now I want to do some computations on x for each selected unit only:

> tapply(ex1$x, ex1$id, mean)
   A    B    C
22.5 32.5   NA

But this gives me an NA value for the unit C, which I thought I had already
left out. How do I ensure that the computation (in the last step) is limited
to only the units I have selected in the first step?
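
A possible explanation, offered as a sketch rather than a definitive answer: if "id" is stored as a factor (which is what data.frame() does with character columns by default in this R version), then subsetting rows removes the C observations but keeps "C" as an unused factor level, and tapply() reports one cell per level. Re-creating the factor after subsetting should drop the unused level. The stringsAsFactors = TRUE argument and the use of %in% below are my own additions, there to make the example behave the same way on newer R versions; "res" is just an illustrative name.

```r
# Rebuild the example; stringsAsFactors = TRUE is an assumption that forces
# the factor behaviour seen in the post, whatever the R version in use.
id <- rep(c("A", "B", "C"), 2)
year <- c(rep(1970, 3), rep(1980, 3))
x <- c(20, 30, 40, 25, 35, 45)
ex <- data.frame(id = id, year = year, x = x, stringsAsFactors = TRUE)

# Select units A and B. The rows for C are gone, but the factor
# still carries "C" as an unused level.
ex1 <- subset(ex, id %in% c("A", "B"))
levels(ex1$id)

# Re-create the factor so that only levels present in the data remain.
ex1$id <- factor(ex1$id)
levels(ex1$id)

# The group means now cover only the selected units.
res <- tapply(ex1$x, ex1$id, mean)
res
```

Calling factor() on an existing factor is used here because it does not depend on any newer helper function; ex1$id[, drop = TRUE] should have the same effect.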

Dipankar

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.