On Sat, Jul 10, 2010 at 6:28 PM, pdb <ph...@philbrierley.com> wrote:
>
> Hi all,
>
> I have a large data set and want to immediately build a 'blind' model
> without first examining the data. Now it appears in the data there are a lot
> of fields that are constant or all missing values - which prevents the model
> from being built.
>
> Can someone point me the right direction as to how I can automatically purge
> my data file of these useless fields.
>
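Try this. It removes constant columns (such as column b below), all-NA
columns (such as column a below) and columns that are constant apart from
NAs (such as column d below). In this example only column c should survive:

  # test data
  DF <- data.frame(a = NA, b = 1, c = 1:5, d = c(NA, NA, 1, 1, 1))

  # standard deviation of each column, ignoring NAs; current versions of R
  # no longer accept a data frame in sd(), so apply it column by column
  sd. <- sapply(DF, sd, na.rm = TRUE)

  # keep only the columns whose sd is defined and greater than zero
  DF[!is.na(sd.) & sd. > 0]

Note that this assumes every column is numeric (or logical), since sd() is
only meaningful for such columns.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.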
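
If the data also contains character or factor fields, one possible variation on the sd()-based filter above is to count the distinct non-missing values in each column instead. This is only a sketch: the helper name drop_uninformative and the extra column e are made up for illustration and are not part of the original suggestion.

```r
# Hypothetical helper: keep only columns that have more than one
# distinct non-NA value. Works for numeric, character and factor columns.
drop_uninformative <- function(df) {
  keep <- vapply(
    df,
    function(col) length(unique(col[!is.na(col)])) > 1,
    logical(1)
  )
  df[, keep, drop = FALSE]  # drop = FALSE keeps the result a data frame
}

# Same test data as above, plus an illustrative character column e
DF <- data.frame(a = NA, b = 1, c = 1:5, d = c(NA, NA, 1, 1, 1),
                 e = c("x", "y", "x", NA, "y"))

names(drop_uninformative(DF))
```

With this test data, columns a, b and d are dropped for the same reasons as before, while c and e are kept because each has at least two distinct non-missing values.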