I don't think na.rm is a valid parameter for the subset function. I would normally use the is.na function to logically test for NA values. I also don't know where your VALID_EMAIL variable is coming from.
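To illustrate the is.na approach, here is a minimal sketch on a made-up data frame. The values, and the assumption that VALID_EMAIL is a logical column of mydf, are invented for the example, since the real data is not shown. As far as I can tell, subset accepts extra arguments through "..." without complaint, which would explain why na.rm = TRUE produced no error and no effect.

```r
# Hypothetical stand-in for the poster's mydf; all values are invented.
mydf <- data.frame(
  EMAIL_ADDRESS = c("a@example.com", NA, "not-an-email", NA),
  VALID_EMAIL   = c(TRUE, FALSE, FALSE, FALSE),  # assumed to be a logical column
  stringsAsFactors = FALSE
)

# Filtering on VALID_EMAIL alone keeps rows whose EMAIL_ADDRESS is NA,
# because the condition only looks at VALID_EMAIL.
with_na <- subset(mydf, VALID_EMAIL == FALSE, select = EMAIL_ADDRESS)

# Adding an explicit is.na() test on EMAIL_ADDRESS drops those rows.
no_na <- subset(mydf, VALID_EMAIL == FALSE & !is.na(EMAIL_ADDRESS),
                select = EMAIL_ADDRESS)

print(with_na)
print(no_na)
```

The point of the second call is that the NA test has to be part of the subset condition itself; there is no separate switch that removes NA values from the selected column.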
a <- subset(mydf, !is.na(EMAIL_ADDRESS))

The na.strings argument to read.csv and friends is used to help recognise strings in the input that should be treated as NA. The <NA> you see on screen is just how R prints a missing character value, not text from your file. If you don't see "<NA>" in your input file then adding it to na.strings will have no effect on the data import.

---------------------------------------------------------------------------
Jeff Newmiller
DCN:<jdnew...@dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------

Jeff Johnson <mrjeffto...@gmail.com> wrote:
>I have a dataset "mydf" with a field EMAIL_ADDRESS. When importing, I
>specified:
>mydf <- read.csv(file = extract, header = TRUE, stringsAsFactors = FALSE,
>na.strings=c("NA",""))
>
>I've also tried setting na.strings= c("NA","","<NA>") but I don't know if
>it's appropriate to put <NA> there.
>
>I'm running
>a <- subset(mydf, VALID_EMAIL == FALSE, na.rm = TRUE, select = EMAIL_ADDRESS)
>dput(head(a,5))
>
>structure(list(EMAIL_ADDRESS = c(NA_character_, NA_character_,
>NA_character_, NA_character_, NA_character_)), .Names = "EMAIL_ADDRESS",
>row.names = c(17L, 22L, 23L, 24L, 30L), class = "data.frame")
>
>The results show a lot of <NA> values on screen and in the dput statement.
>
>I don't quite understand why it is doing that. I would have expected it to
>exclude those since I had the na.rm = TRUE statement. Do you have any
>suggestions?
>
>Thanks!