subset.data.frame() does not have an na.rm argument! -pd
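
One way to get the result the poster expected is to put the NA test into the subset condition itself. The following is a minimal sketch rather than a definitive fix: the toy data frame is invented for illustration, and only the names mydf, VALID_EMAIL and EMAIL_ADDRESS come from the quoted message. It assumes that the stray na.rm = TRUE is absorbed by the `...` argument and ignored, which is consistent with the quoted call running without error.

```r
# Toy stand-in for the poster's data; the values are invented for illustration.
mydf <- data.frame(
  EMAIL_ADDRESS = c("a@example.com", NA, "not-an-email", NA),
  VALID_EMAIL   = c(TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# The call from the question, minus na.rm: rows whose EMAIL_ADDRESS is NA
# are kept, because the condition only looks at VALID_EMAIL.
a <- subset(mydf, VALID_EMAIL == FALSE, select = EMAIL_ADDRESS)

# Add the NA test to the condition so those rows are dropped.
b <- subset(mydf, VALID_EMAIL == FALSE & !is.na(EMAIL_ADDRESS),
            select = EMAIL_ADDRESS)

print(a)
print(b)
```

Note that subset() already treats an NA in the condition as FALSE, so rows where VALID_EMAIL itself is missing are dropped in both calls. The extra !is.na() term is only needed for the selected column.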

On 23 Jan 2014, at 00:58 , Jeff Johnson <mrjeffto...@gmail.com> wrote:

> I have a dataset "mydf" with a field EMAIL_ADDRESS. When importing, I
> specified:
> mydf <- read.csv(file = extract, header = TRUE, stringsAsFactors = FALSE,
> na.strings=c("NA",""))
>
> I've also tried setting na.strings= c("NA","","<NA>") but I don't know if
> it's appropriate to put <NA> there.
>
> I'm running
> a <- subset(mydf, VALID_EMAIL == FALSE, na.rm = TRUE, select =
> EMAIL_ADDRESS)
> dput(head(a,5))
>
> structure(list(EMAIL_ADDRESS = c(NA_character_, NA_character_,
> NA_character_, NA_character_, NA_character_)), .Names = "EMAIL_ADDRESS",
> row.names = c(17L,
> 22L, 23L, 24L, 30L), class = "data.frame")
>
> The results show a lot of <NA> values on screen and in the dput statement.
>
> I don't quite understand why it is doing that. I would have expected it to
> exclude those since I had the na.rm = TRUE statement. Do you have any
> suggestions?
>
> Thanks!
> --
> Jeff
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com