There is, however, an important distinction.

Quoting from ?TRUE  (or ?logical):

'TRUE' and 'FALSE' are reserved words denoting logical constants
     in the R language, whereas 'T' and 'F' are global variables whose
     initial values set to these.  All four are 'logical(1)' vectors.

 TRUE <- 3
Error in TRUE <- 3 : invalid (do_set) left-hand side to assignment

In other words, the rule is
  T is TRUE unless otherwise defined by the user
(ditto for F)

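The same rule apparently applies to input read from a file: a column that contains only T's and F's is read as logical unless the user says otherwise. Supplying colClasses is then an example of "otherwise defined by the user."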
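For anyone who wants to see this for themselves, here is a minimal sketch. It assumes a reasonably current version of R, where read.csv() accepts a text= argument and a named colClasses vector. The names csv_text, default_read and fixed_read are made up for illustration, and the data are just the one-row example from the original question.

```r
# Hypothetical one-row input modelled on the file in the original question.
csv_text <- '"CandidateName","NSN","Ethnicity","dob","gender"
"Smith, Mary Jane",111222333,"E","2/25/1989","F"'

# Default behaviour: a column whose entries are all "F" (or "T")
# is converted to logical, even though the field was quoted.
default_read <- read.csv(text = csv_text)
class(default_read$gender)

# Naming just the one column in colClasses keeps it as character;
# the other columns are still converted automatically.
fixed_read <- read.csv(text = csv_text,
                       colClasses = c(gender = "character"))
class(fixed_read$gender)
fixed_read$gender
```

With the default call the gender column should come back as logical, and with colClasses it should stay as the string "F". Naming only the one column is a convenience; a full vector of classes, one per column, should work as well.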

I think it's logical (pun not particularly intended) and consistent (though perhaps not ideal, but that's another question...)

-Don


At 5:37 PM -0500 2/28/10, Gabor Grothendieck wrote:
It is strange.  Even in R itself T and F are not guaranteed to be TRUE
and FALSE.

 T <- 1:3
 T
[1] 1 2 3


On Sun, Feb 28, 2010 at 4:55 PM, Rolf Turner <r.tur...@auckland.ac.nz> wrote:

 I had occasion recently to read in a one-line *.csv file that
 looked like:

 "CandidateName","NSN","Ethnicity","dob","gender"
 "Smith, Mary Jane",111222333,"E","2/25/1989","F"

 That "F" (for female) in the last field got transformed to
 FALSE.  Apparently read.csv (and hence read.table) are inferring
 that if the entries of a file are all F's and T's then the
 field is interpreted as logical.

 If I change the file to

 "CandidateName","NSN","Ethnicity","dob","gender"
 "Smith, Mary Jane",111222333,"E","2/25/1989","F"
 "Mingdinkler, Melvin Queue",999888777,"01/04/1942","M"

 then the read functions correctly interpret the last field
 as being character.

 The translation of "F" into FALSE resulted in some mysterious
 contretemps in further analysis, which it took me a while to
 track down.

 I solved the problem by putting in a colClasses argument in my
 call to read.csv().  But I really think that the read functions
 are being too clever by half here.  If field entries are surrounded
 by quotes, shouldn't they be left as character?  Even if they are
 all F's and T's?

 Furthermore using F's and T's to represent TRUE's and FALSE's is
 bad practice anyway.  Since FALSE and TRUE are reserved words it
 would make sense for the read function to assume that a field is
 logical if it consists entirely of these words.  But T's and F's
 .... I don't think so.

 I would argue that this behaviour should be changed.  I can see no
 downside to such a change.

        cheers,

                Rolf Turner


 ______________________________________________
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



--
---------------------------------
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
m...@llnl.gov
