Re: [R] a question on sqldf's handling of missing value and factor

xin wei Wed, 02 Mar 2011 08:07:42 -0800

Dear Mr. Grothendieck,
Thank you so much for your attention. You are the real expert here. The
following is a mock text file:
a       b       c
aa              23
aaa     34.6
aaaa            77.8


Note that both the b and c columns contain missing values (blanks).
I saved the file on my C drive and used both read.table and sqldf to import it
into R, then used the identical() function to compare the results. The
following is the result:

> setwd("c:/")
> library(sqldf)
> test <- file("test.txt") 
> testx <- sqldf("select * from test", 
+                 dbname = tempfile(),
+                 file.format = list(header = T, sep = "\t", row.names = F))
> testy<- read.table("test.txt", header = T, sep="\t")
> identical(testx, testy)
[1] FALSE
> testx
     a    b    c
1   aa      23.0
2  aaa 34.6  0.0
3 aaaa      77.8
> testy
     a    b    c
1   aa   NA 23.0
2  aaa 34.6   NA
3 aaaa   NA 77.8
> class(testx$b)
[1] "factor"
> class(testy$b)
[1] "numeric"
> 
 
read.table seems to get it right, while sqldf treats b as a factor (if I add
method = "raw", b becomes character). What is more troubling is that column c
has the number 0 in the second row, while in the original file that value is
missing. In my real-world situation, with a much larger text file, the problem
is that many cells come out empty even though they all have values in the
original text file.
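
For anyone who wants to reproduce the comparison without my file, here is a
minimal sketch in base R only (it does not call sqldf). It rebuilds the mock
file in a temporary location, reads it with read.table as the reference, and
then imitates a text-only import to show one possible way of turning blank
strings back into NA. The helper blank_to_na() is a hypothetical name made up
for this illustration; it is not part of sqldf or base R, and the tab-separated
file contents are my assumption about how the mock file above was laid out.

```r
# Recreate the mock file (assumed tab-separated; blank fields = missing).
path <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc",
             "aa\t\t23",
             "aaa\t34.6\t",
             "aaaa\t\t77.8"), path)

# Reference import: read.table turns blank numeric fields into NA.
testy <- read.table(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# Imitate a text-only import: every column is character, blanks stay "".
raw <- read.table(path, header = TRUE, sep = "\t", colClasses = "character")

# Hypothetical helper (not from sqldf): blank strings -> NA, then let
# type.convert() pick numeric/character for each column.
blank_to_na <- function(df) {
  df[] <- lapply(df, function(col) {
    col <- as.character(col)                    # also handles factor columns
    col[!is.na(col) & trimws(col) == ""] <- NA  # blank or whitespace-only -> NA
    type.convert(col, as.is = TRUE)             # re-guess the column type
  })
  df
}

fixed <- blank_to_na(raw)
str(fixed)                  # inspect the resulting column classes
all.equal(fixed, testy)     # compare against the read.table result
```

This kind of clean-up can only help for columns that arrive as character or
factor with empty strings. It cannot recover a value such as the 0 in column c,
because once a numeric 0 is stored there is no way to tell it apart from a
genuine zero.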

I would greatly appreciate your help if you can shed some light on this.

Thanks

--
View this message in context: 
http://r.789695.n4.nabble.com/a-question-on-sqldf-s-handling-of-missing-value-and-factor-tp3331007p3331662.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
