mathijsdevaan wrote on 12/09/2010 04:21:54 PM:

> I have two columns with data (both identifiers - it's an affiliation list)
> and I would like to delete the rows in which the observations in the second
> column have a frequency < 5 in the entire second column. Example:
>
> 1 a
> 1 b
> 1 c
> 2 a
> 2 b
> 2 d
>
> Let's say, I would like to delete the rows in which the observation in the
> second column has a frequency < 2 in the entire second column. This would
> result in:
>
> 1 a
> 1 b
> 2 a
> 2 b
>
> How can I do this? Thanks in advance!
>

It's not clear whether you want to delete rows where the value in the second
column occurs fewer than 5 times or fewer than 2 times. I'll assume the
latter.

  foo <- data.frame(k=rep(1:2, each=3), x=letters[c(1,2,3,1,2,4)])
  bar <- subset(foo, x %in% names(table(foo$x))[table(foo$x)>=2])

Here table(foo$x) counts how often each value of x occurs, names(...)[...>=2]
picks the values that occur at least twice, and subset() keeps only the rows
whose x is among them. For the threshold of 5 in your real data, change the 2
to a 5.

No doubt others can write this more succinctly.

-- 
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638
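
A possible shorter variant of the same idea, offered only as a sketch: instead of looking up names in a frequency table, attach each row's group size with ave() and filter on that. The names min_count and n are made up for this illustration and do not come from the original posts; foo is the same example data as in the question.

```r
# Example data from the question: k is the first column, x the second.
foo <- data.frame(k = rep(1:2, each = 3),
                  x = letters[c(1, 2, 3, 1, 2, 4)])

# Hypothetical threshold name; use 5 for the real data described in the question.
min_count <- 2

# For every row, count how many rows share the same value of x.
n <- ave(seq_along(foo$x), foo$x, FUN = length)

# Keep only the rows whose x value occurs at least min_count times.
bar <- foo[n >= min_count, ]
print(bar)
```

Because ave() returns one count per row, the filter is a plain logical index and the threshold sits in one place, which may be easier to adapt than calling table() twice.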