At 3:50 PM -0700 5/3/10, John Kane wrote:
I think that you are correct. R has the annoying habit of converting character data to factors when you don't want it to while it is importing data. This is because the option "stringsAsFactors" is set to TRUE for some weird historical reasons.



Well, "annoying" is in the eye of the beholder. The reason is not weird at all; the original S language, upon which R is based, was designed first for statistical analysis. When the language was expanded to include advanced modeling capabilities (linear models, generalized linear models, and more) it became apparent that factors are the appropriate form for using categorical data in such models. It is still the "R Project for Statistical Computing" (see the R home page), so the default is unchanged.

Hence, when users get factors when they were expecting numbers, it's virtually always because they have some non-numeric character strings mixed in with the data. R then defaults to interpreting the column as categorical data, represented as a factor.
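
For illustration, here is a minimal sketch of that situation. The data and the stray "n/a" entry are made up, and stringsAsFactors is passed explicitly so the result does not depend on whatever default your session uses. It shows the column arriving as a factor, and two possible ways to get numbers back.

```r
# Hypothetical data: a numeric column with one stray non-numeric entry ("n/a")
txt <- "id,value
1,3.5
2,n/a
3,7.25"

# Set stringsAsFactors explicitly so the result does not depend on the default
d <- read.csv(text = txt, stringsAsFactors = TRUE)
class(d$value)   # a factor, because "n/a" cannot be read as a number

# Careful: as.numeric() on a factor returns the internal level codes,
# not the original values
as.numeric(d$value)

# Convert via character instead; the stray entry becomes NA
# (as.numeric would normally warn about the coercion)
v <- suppressWarnings(as.numeric(as.character(d$value)))

# Alternatively, declare the stray string as a missing-value marker on import
d2 <- read.csv(text = txt, na.strings = "n/a")
class(d2$value)  # numeric
```

Which of the two fixes is better depends on the data: na.strings is cleaner when the offending strings are known in advance, while the as.character() route also helps you find them (look at the levels of the factor first).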

-Don
--
---------------------------------
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
m...@llnl.gov

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
