G'day Ted,

On Sun, 18 Jul 2010 09:25:09 +0100 (BST)
(Ted Harding) <ted.hard...@manchester.ac.uk> wrote:

> On 18-Jul-10 05:47:03, Suresh Singh wrote:
> > I have a data file in which one of the columns is country code and
> > NA is the code for Namibia.
> > When I read the data file using read.csv, NA for Namibia is being
> > treated as null or "NA"
> >
> > How can I prevent this from happening?
> >
> > I tried the following but it didn't work
> > input <- read.csv("padded.csv",header = TRUE,as.is = c("code2"))
> >
> > thanks,
> > Suresh
>
> I suppose this was bound to happen, and in my view it represent
> a bit of a mess! With a test file temp.csv:
>
>   Code,Country
>   DE,Germany
>   IT,Italy
>   NA,Namibia
>   FR,France

Thanks for providing an example.

> leads to exactly the same result. And I have tried every variation
> I can think of of "as.is" and "colClasses", still with exactly the
> same result!

Did you think of trying some variations of "na.strings"? ;-)

IMO, the simplest way of coding missing values in CSV files is to have
two consecutive commas; not some code (whether NA, 99, 999, -1, ...)
between them.

> Conclusion: If an entry in a data file is intended to become the
> character value "NA", there seems to be no way of reading it in
> directly. This should not be so: it should be preventable!

It is, through simple use of the "na.strings" argument:

R> X <- read.csv("temp.csv", na.strings="")
R> X
  Code Country
1   DE Germany
2   IT   Italy
3   NA Namibia
4   FR  France
R> which(is.na(X))
integer(0)

HTH.
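
For anyone wanting to reproduce this without creating temp.csv by hand,
here is a minimal sketch. It makes some assumptions that are not part of
the exchange above: the data are inlined through the "text" argument of
read.csv (available in current versions of R), the object names are made
up, and the last row with an empty Code field is a hypothetical addition
to show how two consecutive commas (here, an empty leading field) end up
as the only missing value.

```r
# The thread's temp.csv, inlined so the example needs no file on disk.
# ASSUMPTION: the final row (empty Code) is a hypothetical addition,
# included only to illustrate an empty field being read as missing.
csv_text <- "Code,Country
DE,Germany
IT,Italy
NA,Namibia
FR,France
,Unknown"

# Default na.strings = "NA": the Namibia code is read as a missing value.
default_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# na.strings = "": only empty fields are missing; "NA" stays a string.
fixed_read <- read.csv(text = csv_text, na.strings = "",
                       stringsAsFactors = FALSE)

is.na(default_read$Code[3])    # Namibia row under the defaults
fixed_read$Code[3]             # Namibia row with na.strings = ""
which(is.na(fixed_read$Code))  # rows whose Code field was empty
```

With a real file, the same call would presumably be
read.csv("temp.csv", na.strings = ""), as shown above.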

Cheers,

        Berwin

========================== Full address ============================
Berwin A Turlach                      Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019)            +61 (8) 6488 3383 (self)
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009                e-mail: ber...@maths.uwa.edu.au
Australia                        http://www.maths.uwa.edu.au/~berwin