On May 28, 2010, at 5:58 PM, Christian Schoder wrote:

Dear R-users,

I use firm-level data in panel structure. I would like to drop all firms that have fewer than x observations over the time span in any of the variables considered. I would appreciate any help that (a) points to relevant literature or websites or (b) shows code that could solve the problem.

Here is a detailed illustration of my problem. My data set has the form
df
  id  y  z
1   a  1  1
2   b NA  2
3   b  3  3
4   c  2  2
5   c  4  4
6   c  5 NA
7   d  6 NA
8   d  5  5
9   d  6  6
10  d  7  7
11  e NA NA
12  e NA  4
13  e  3  3
where id is the index of the firm, and y and z are observations such as assets and sales. Now I would like to apply a procedure that drops all firms which have fewer than 2 observed realizations in y or z.


I try to avoid naming objects after common functions such as df (the density function of the F distribution), so I call the data frame dfrm:

> dfrm$nrecy <- ave(dfrm$y , dfrm$id, FUN=function(x) sum(!is.na(x)) )
> dfrm$nrecz <- ave(dfrm$z , dfrm$id, FUN=function(x) sum(!is.na(x)) )
> dfrm
   id  y  z nrecy nrecz
1   a  1  1     1     1
2   b NA  2     1     2
3   b  3  3     1     2
4   c  2  2     3     2
5   c  4  4     3     2
6   c  5 NA     3     2
7   d  6 NA     4     3
8   d  5  5     4     3
9   d  6  6     4     3
10  d  7  7     4     3
11  e NA NA     1     2
12  e NA  4     1     2
13  e  3  3     1     2
> dfrm[with(dfrm, pmin(nrecy, nrecz)>1), ]
   id y  z nrecy nrecz
4   c 2  2     3     2
5   c 4  4     3     2
6   c 5 NA     3     2
7   d 6 NA     4     3
8   d 5  5     4     3
9   d 6  6     4     3
10  d 7  7     4     3
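
For anyone who wants to run this without typing in the data, here is a self-contained sketch of the same idea that skips the helper columns and renumbers the rows as in the requested df1. The data values are copied from the question; the names min_obs, n_obs, and keep are only illustrative and not part of the original code.

```r
# Rebuild the example data from the question (values copied from the post).
dfrm <- data.frame(
  id = c("a", "b", "b", "c", "c", "c", "d", "d", "d", "d", "e", "e", "e"),
  y  = c(1, NA, 3, 2, 4, 5, 6, 5, 6, 7, NA, NA, 3),
  z  = c(1, 2, 3, 2, 4, NA, NA, 5, 6, 7, NA, 4, 3)
)

min_obs <- 2  # illustrative name for the threshold "x" in the question

# Per-id count of non-missing values, repeated on every row of that id.
n_obs <- function(v, g) ave(as.numeric(!is.na(v)), g, FUN = sum)

# Keep a row only if its firm has enough observed values in BOTH y and z.
keep <- n_obs(dfrm$y, dfrm$id) >= min_obs & n_obs(dfrm$z, dfrm$id) >= min_obs

df1 <- dfrm[keep, ]
rownames(df1) <- NULL  # renumber rows from 1, as in the requested df1
df1
```

With the example data this keeps the seven rows belonging to firms c and d.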

Note that this does not assure that each id has at least 2 complete observations, i.e. rows in which both y and z are non-missing. If you want a solution to that problem, you will need a better test data.frame.
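
If the stricter criterion is what is wanted, one possible sketch (an assumption about the intent, not something asked for in the question) is to count complete rows per id with complete.cases() and filter on that count. The column name ncomplete and the object names cc and res_cc are illustrative only.

```r
# Rebuild the example data from the question (values copied from the post).
dfrm <- data.frame(
  id = c("a", "b", "b", "c", "c", "c", "d", "d", "d", "d", "e", "e", "e"),
  y  = c(1, NA, 3, 2, 4, 5, 6, 5, 6, 7, NA, NA, 3),
  z  = c(1, 2, 3, 2, 4, NA, NA, 5, 6, 7, NA, 4, 3)
)

# TRUE for rows where both y and z are observed.
cc <- complete.cases(dfrm[, c("y", "z")])

# Number of complete rows per id, repeated on every row of that id.
dfrm$ncomplete <- ave(as.numeric(cc), dfrm$id, FUN = sum)

# Keep firms with at least 2 complete rows.
res_cc <- dfrm[dfrm$ncomplete >= 2, ]
res_cc
```

On this particular example the stricter rule also keeps exactly firms c and d, so these data cannot distinguish the two criteria.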


Thus, it should give me a data.frame which looks like
df1
  id  y  z
1   c  2  2
2   c  4  4
3   c  5 NA
4   d  6 NA
5   d  5  5
6   d  6  6
7   d  7  7

Thank you very much!
Christian Schoder

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
