On 23/09/15 11:19, peter dalgaard wrote:

On 23 Sep 2015, at 00:33, Rolf Turner <r.tur...@auckland.ac.nz> wrote:


[read.csv() doesn't distinguish "123.4" from 123.4]

IMHO this is a bug in read.csv().


Dunno about that:

pd$ cat ~/tmp/junk.csv
"1";1
2;"2"
pd$ open !$
open ~/tmp/junk.csv

And lo and behold, Excel opens with

1 1
2 2

and all cells numeric.
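
For comparison, the same two-line file can be tried in R. This is only a minimal sketch: the file contents are supplied inline through the `text` argument rather than read from ~/tmp/junk.csv, and the variable names `csv_text` and `d` are made up for illustration.

```r
# The same two-line, semicolon-separated content as junk.csv, supplied inline.
csv_text <- '"1";1\n2;"2"'

# header = FALSE because the file has no header row.
d <- read.csv(text = csv_text, sep = ";", header = FALSE)

# Both columns should come back as integer, whether or not a field was quoted.
str(d)
col_classes <- sapply(d, class)
print(col_classes)
```

So read.csv() appears to treat the quoted and unquoted fields alike here, much as Excel does.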

I would say that this phenomenon ("Excel does it") is *overwhelming* evidence that it is bad practice!!! :-)

I don't think the CSV standard (if there is one...) specifies that
quoted strings are necessarily text.

Duncan Murdoch has pointed out that this is definitely *not* the case.

I think we have been here before, and found that even if we decide
that it is a bug (or misfeature), it would be hard to change, because
the modus operandi of read.* is to first read everything as character
and _then_ see (in type.convert()) which entries can be converted to
numeric, logical, etc.
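
A rough sketch of that second stage, assuming the fields have already been read in as character strings with their quote marks stripped (the vectors below are invented examples, not taken from any real file):

```r
# By the time type.convert() sees the fields they are plain strings, so
# nothing records whether a field was quoted in the file.
as_int  <- type.convert(c("1", "2"), as.is = TRUE)         # integer
as_num  <- type.convert(c("123.4", "5"), as.is = TRUE)     # double
as_lgl  <- type.convert(c("TRUE", "FALSE"), as.is = TRUE)  # logical
as_chr  <- type.convert(c("123.4", "abc"), as.is = TRUE)   # stays character

# The conversion can be switched off per call with colClasses, but the
# quoted and unquoted fields are then still indistinguishable strings.
d_chr <- read.csv(text = '"1";1\n2;"2"', sep = ";", header = FALSE,
                  colClasses = "character")
print(sapply(d_chr, class))
```

With colClasses = "character" both columns hold the strings "1" and "2", so the quoting information is already gone before any type conversion happens.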

As Arunkumar Srinivasan has pointed out, fread() from the data.table package can handle this, so it is *not impossible*.
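
As a rough illustration that the quoting information can be recovered even in base R (this is not a description of how fread() works), one possible workaround is to switch off quote processing and inspect the raw fields. The helper `was_quoted` is hypothetical, and the approach assumes that no field contains an embedded separator or newline.

```r
csv_text <- '"1";1\n2;"2"'

# quote = "" disables quote processing, so the quote marks stay in the data;
# colClasses = "character" keeps type.convert() from touching the fields.
raw <- read.table(text = csv_text, sep = ";", quote = "",
                  colClasses = "character", header = FALSE)

# Hypothetical helper: TRUE where a field is wrapped in double quotes.
was_quoted <- function(x) grepl('^".*"$', x)

# Logical matrix with one column per data column.
quoted <- sapply(raw, was_quoted)
print(quoted)

# Strip the surrounding quotes once their presence has been recorded.
stripped <- as.data.frame(lapply(raw, function(x) gsub('^"|"$', "", x)),
                          stringsAsFactors = FALSE)
print(stripped)
```

What to do with the `quoted` matrix is a separate design question, since a column mixing quoted and unquoted fields can still have only one type in a data frame.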

cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
