On Fri, 2009-03-27 at 15:11 -0400, Laura Rodriguez Murillo wrote:
> Hi dear list,
>
> I have a list of around 2000 identifiers arranged in a dataframe in one
> column and I would like to choose a random subset of these. I wonder
> if somebody can tell me if I could do this with R...

Not sure what you mean by identifiers, but to select a subset of the
2000 cells in that column, you could use sample(). See ?sample for
details, but here is an example.

## choose a random subset of 500 out of 2000 entries
## dummy data
dat <- data.frame(identifiers = sample(2000, 2000), X = rnorm(2000))
## set seed so the sampled positions are the same on your PC as mine
## comment this out if you want a different subset each time you run
set.seed(1234)
## random subset of 500
want <- sample(2000, 500)
## select out that subset
## head to show only first n of the selected
head(dat$identifiers[want])

Gives (your values will differ, because the dummy data are generated
before the seed is set):

> head(dat$identifiers[want])
[1] 1327  587  835  430 1422 1687

This assumes the identifiers are unique.

HTH

G

>
> Thank you so much!
>
> Laura RM
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
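
A possible variation on the sample() approach described in the reply,
offered only as a sketch. It assumes you want whole rows of the data
frame rather than just the identifier column, and it uses nrow() instead
of a hard-coded 2000. The names `rows`, `sub`, `ids` and `sub2` are made
up for illustration, and the data are the same kind of dummy data as in
the reply, not the poster's real identifiers.

```r
## dummy data (hypothetical), seeded first so the whole run is repeatable
set.seed(1234)
dat <- data.frame(identifiers = sample(2000, 2000), X = rnorm(2000))

## sample 500 row positions without replacement, sized from the data
rows <- sample(nrow(dat), 500)
## keep all columns for the sampled rows
sub <- dat[rows, , drop = FALSE]

## if identifiers might repeat, sample from the distinct values instead
## and keep every row that carries one of the chosen identifiers
ids <- sample(unique(dat$identifiers), 500)
sub2 <- dat[dat$identifiers %in% ids, , drop = FALSE]
```

With unique identifiers, as in the dummy data, both `sub` and `sub2`
have 500 rows. With repeated identifiers, `sub2` may have more rows than
the number of identifiers drawn.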