Dear R-users,

I use firm-level data in a panel structure. I would like to drop all firms that have fewer than x observations over the sample period in any of the variables considered. I would appreciate any help that (a) points to relevant literature or websites or (b) shows code that could solve the problem.
Here is a detailed illustration of my problem. My data set is of the form

> df
   id  y  z
1   a  1  1
2   b NA  2
3   b  3  3
4   c  2  2
5   c  4  4
6   c  5 NA
7   d  6 NA
8   d  5  5
9   d  6  6
10  d  7  7
11  e NA NA
12  e NA  4
13  e  3  3

where id is the firm identifier, and y and z are observed variables such as assets and sales. Now I would like to apply a procedure that drops all firms that have fewer than 2 observed (non-NA) realizations in y or z. Thus, it should give me a data.frame that looks like

> df1
  id y  z
1  c 2  2
2  c 4  4
3  c 5 NA
4  d 6 NA
5  d 5  5
6  d 6  6
7  d 7  7

Thank you very much!

Christian Schoder

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
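For reference, the filtering described above could be sketched in base R roughly as follows. This is only one possible approach, not a definitive solution. The names min_obs, vars, counts and keep are hypothetical helpers introduced for illustration, and the sketch assumes that "observed" means non-NA and that a firm is kept only if every variable in vars reaches the threshold.

```r
# Example data from the post
df <- data.frame(
  id = c("a", "b", "b", "c", "c", "c", "d", "d", "d", "d", "e", "e", "e"),
  y  = c(1, NA, 3, 2, 4, 5, 6, 5, 6, 7, NA, NA, 3),
  z  = c(1, 2, 3, 2, 4, NA, NA, 5, 6, 7, NA, 4, 3),
  stringsAsFactors = FALSE
)

min_obs <- 2            # the threshold "x" (hypothetical name)
vars    <- c("y", "z")  # variables that must each have enough observations

# For every row, count the non-NA values of each variable within that row's firm.
# ave() returns a vector as long as df, so counts is an nrow(df) x length(vars) matrix.
counts <- sapply(vars, function(v) {
  ave(as.integer(!is.na(df[[v]])), df$id, FUN = sum)
})

# Keep a row only if its firm reaches the threshold in all variables
keep <- rowSums(counts >= min_obs) == length(vars)

df1 <- df[keep, ]
rownames(df1) <- NULL   # renumber rows 1..n as in the desired output
print(df1)
```

With the example data, only firms c and d remain: a, b and e each have a single non-NA value in y. Changing min_obs or extending vars should adapt the sketch to other thresholds and variable sets.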