On 14-Jun-09 18:56:01, Gabor Grothendieck wrote:
> If read.csv's colClasses= argument is NOT used then read.csv accepts
> double quoted numerics:
>
> 1: > read.csv(stdin())
> 0: A,B
> 1: "1",1
> 2: "2",2
> 3:
>   A B
> 1 1 1
> 2 2 2
>
> However, if colClasses is used then it seems that it does not:
>
> > read.csv(stdin(), colClasses = "numeric")
> 0: A,B
> 1: "1",1
> 2: "2",2
> 3:
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,
> na.strings, :
>   scan() expected 'a real', got '"1"'
>
> Is this really intended? I would have expected that a csv file
> in which each field is surrounded with double quotes is acceptable
> in both cases. This may be documented as is yet seems undesirable
> from both a consistency viewpoint and the viewpoint that it should
> be possible to double quote fields in a csv file.

Well, the default for colClasses is NA, for which ?read.csv says:

  [...]
  Possible values are 'NA' (when 'type.convert' is used),
  [...]

and then ?type.convert says:

  This is principally a helper function for 'read.table'. Given a
  character vector, it attempts to convert it to logical, integer,
  numeric or complex, and failing that converts it to factor unless
  'as.is = TRUE'. The first type that can accept all the non-missing
  values is chosen.

It would seem that type 'logical' won't accept integer (naively one
might expect 1 --> TRUE, but see experiment below), so the first
acceptable type for "1" is integer, and that is what happens.
So it is indeed documented (in the R[ecursive] sense of "documented" :))

However, presumably when colClasses is used then type.convert() is
not called. In that case R finds itself asked to assign a character
entity to a destination which it has been told shall be numeric, and
so it fails. The default for as.is is as.is = !stringsAsFactors, but
?read.csv says that stringsAsFactors "is overridden bu [sic] 'as.is'
and 'colClasses', both of which allow finer control.", so that
wouldn't come to the rescue either.

Experiment:

  X <- logical(10)
  class(X)
  # [1] "logical"
  X[1] <- 1
  X
  # [1] 1 0 0 0 0 0 0 0 0 0
  class(X)
  # [1] "numeric"

so R has converted X from class 'logical' to class 'numeric' on
being asked to assign a number to a logical; but in this case its
hands were not tied by colClasses.

Or am I missing something?!!

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 14-Jun-09 Time: 21:21:22
------------------------------ XFMail ------------------------------
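
For illustration, a minimal sketch of the default behaviour discussed
above, together with one possible workaround: read every column as
character and convert explicitly afterwards. The inline text= input and
the names csv_text, d1 and d2 are assumptions made for this example and
are not part of the original report; it also assumes a version of R in
which read.csv() accepts a text= argument.

```r
# Two-column CSV in which the values of column A are double-quoted.
csv_text <- 'A,B\n"1",1\n"2",2'

# Default colClasses = NA: columns are read as character and then passed
# through type.convert(), so the quoted digits in A end up as integers.
d1 <- read.csv(text = csv_text)
str(d1)

# type.convert() applied directly to the same character values;
# as.is = TRUE keeps non-convertible input as character, not factor.
type.convert(c("1", "2"), as.is = TRUE)

# Possible workaround: force every column to character on input,
# then convert each column to numeric explicitly.
d2 <- read.csv(text = csv_text, colClasses = "character")
d2[] <- lapply(d2, as.numeric)
str(d2)
```

The explicit conversion sidesteps the question of how scan() treats
quotes in non-character fields, at the cost of an extra pass over the
data; as.numeric() will yield NA (with a warning) for any field that is
not a number.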