On Sat, Jul 10, 2010 at 6:28 PM, pdb <ph...@philbrierley.com> wrote:
>
> Hi all,
>
> I have a large data set and want to immediately build a 'blind' model
> without first examining the data. Now it appears in the data there are a lot
> of fields that are constant or all missing values - which prevents the model
> from being built.
>
> Can someone point me in the right direction as to how I can automatically purge
> my data file of these useless fields?
>

Try this. It will remove constant columns (such as column b below),
all NA columns (such as column a below) and columns which are constant
aside from NAs (such as column d below).  In this example only column
c should survive:

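# test data
DF <- data.frame(a = NA, b = 1, c = 1:5, d = c(NA, NA, 1, 1, 1))
# per-column sd ignoring NAs (sd() no longer accepts a whole data frame in current R)
sd. <- sapply(DF, sd, na.rm = TRUE)
# keep only columns whose sd is defined and positive
DF[!is.na(sd.) & sd. > 0]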
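
Note that sd() is meant for numeric data, so the above assumes every column is numeric (or logical). If the data also contains character or factor columns, one possible variation is to count the distinct non-NA values in each column instead. This is only a sketch: the function name drop_useless and the extra columns e and f are made up for illustration.

```r
# Hypothetical helper: keep only columns that have at least two
# distinct non-NA values. Works for any atomic column type.
drop_useless <- function(df) {
  n_distinct <- vapply(df, function(x) length(unique(x[!is.na(x)])), integer(1))
  df[n_distinct > 1]
}

# test data: same as above plus a constant character column (e)
# and a varying character column with an NA (f)
DF2 <- data.frame(a = NA, b = 1, c = 1:5, d = c(NA, NA, 1, 1, 1),
                  e = c("x", "x", "x", "x", "x"),
                  f = c("x", "y", NA, "x", "y"),
                  stringsAsFactors = FALSE)

res <- drop_useless(DF2)
names(res)
```

Here only columns c and f should survive, since a is all NA and b, d and e each have a single distinct non-NA value.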

