On 23/09/15 11:19, peter dalgaard wrote:
On 23 Sep 2015, at 00:33 , Rolf Turner <r.tur...@auckland.ac.nz> wrote:
[read.csv() doesn't distinguish "123.4" from 123.4]
IMHO this is a bug in read.csv().
Dunno about that:
pd$ cat ~/tmp/junk.csv
"1";1
2;"2"
pd$ open !$
open ~/tmp/junk.csv
And lo and behold, Excel opens with
1 1
2 2
and all cells numeric.
I would say that this phenomenon ("Excel does it") is *overwhelming*
evidence that it is bad practice!!! :-)
I don't think the CSV standard (if there is one...) specifies that
quoted strings are necessarily text.
Duncan Murdoch has pointed out that the standard definitely does *not*
specify this.
I think we have been here before, and found that even if we decide
that it is a bug (or misfeature), it would be hard to change, because
the modus operandi of read.* is to first read everything as character
and _then_ see (in type.convert()) which entries can be converted to
numeric, logical, etc.
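The two-step behaviour described above can be made visible with a small sketch. The inline data and the variable names (csv_text, df_default, df_char, converted) are made up for illustration. This only mimics what read.* does, using its documented arguments; it is not the internal code.

```r
# Made-up two-column CSV: column a is quoted, column b is not.
csv_text <- 'a,b\n"123.4",123.4\n"5",5\n'

# Default behaviour: the quotes are stripped and both columns end up numeric.
df_default <- read.csv(text = csv_text)
print(sapply(df_default, class))

# Step 1, made explicit: keep every field as character.
df_char <- read.csv(text = csv_text, colClasses = "character")
print(sapply(df_char, class))

# Step 2: type.convert() only sees the strings, not whether they were quoted,
# so it converts the formerly quoted column to numeric as well.
converted <- type.convert(df_char$a, as.is = TRUE)
print(class(converted))
```

By the time type.convert() runs, the information about which fields were quoted is already gone, which is why the distinction is hard to retrofit.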
As Arunkumar Srinivasan has pointed out, fread() from the data.table
package can handle this, so it is *not impossible*.
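As a rough base-R illustration that the distinction can be kept (this is *not* how fread() does it, just a sketch under simplifying assumptions): read with quote = "" so the quote characters survive, then treat a column as text only if every field in it was quoted. The helper names was_quoted() and convert_column() are hypothetical, and the approach breaks on quoted fields that contain the separator.

```r
# Made-up data: column a is quoted, column b is not.
csv_text <- 'a,b\n"123.4",123.4\n"5",5\n'

# quote = "" disables quote processing, so the quote marks stay in the fields.
raw <- read.csv(text = csv_text, quote = "", colClasses = "character")

# Hypothetical helper: TRUE if every field in the column is wrapped in quotes.
was_quoted <- function(x) all(grepl('^".*"$', x))

# Hypothetical helper: quoted columns stay character (quotes stripped),
# unquoted columns go through the usual type.convert() step.
convert_column <- function(x) {
  if (was_quoted(x)) gsub('^"|"$', "", x) else type.convert(x, as.is = TRUE)
}

result <- as.data.frame(lapply(raw, convert_column), stringsAsFactors = FALSE)
print(sapply(result, class))
```

Here column a stays character and column b becomes numeric, at the cost of not handling embedded separators or escaped quotes.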
cheers,
Rolf
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276