At 3:50 PM -0700 5/3/10, John Kane wrote:
I think that you are correct. R has the annoying habit of converting character data to factors when you don't want it to while it is importing data. This is because the option "stringsAsFactors" is set to TRUE for some weird historical reasons.
Well, "annoying" is in the eye of the beholder. The reason is not weird at all; the original S language, upon which R is based, was designed first for statistical analysis. When the language was expanded to include advanced modeling capabilities (linear models, generalized linear models, and more) it became apparent that factors are the appropriate form for using categorical data in such models. It is still the "R Project for Statistical Computing" (see the R home page), so the default is unchanged.
Hence, when users get factors when they were expecting numbers, it's virtually always because they have some non-numeric character strings mixed in with the data. R then defaults to interpreting the whole column as categorical data, represented as a factor.
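A small sketch of what that looks like in practice may help. The data and the "n/a" marker below are made up purely for illustration, and stringsAsFactors is passed explicitly so the result does not depend on whatever the default is in a given R session:

```r
# Hypothetical data: one entry in an otherwise numeric column is the string "n/a"
txt <- "id,value
1,3.5
2,n/a
3,7.25"

# stringsAsFactors is set explicitly here; the default differs across R versions
d <- read.csv(text = txt, stringsAsFactors = TRUE)
class(d$value)    # "factor", because "n/a" cannot be read as a number

# Pitfall: as.numeric() on a factor returns the internal level codes,
# not the numbers that were in the file
as.numeric(d$value)

# Convert via character instead; the non-numeric entry becomes NA
# (as.numeric() warns about the coercion, suppressed here)
v <- suppressWarnings(as.numeric(as.character(d$value)))
v

# Alternatively, tell read.csv() which string marks a missing value,
# so the column is read as numeric in the first place
d2 <- read.csv(text = txt, na.strings = "n/a")
class(d2$value)   # "numeric"
```

If the stray strings really are missing-value markers, the na.strings route is probably the cleaner one, since the column never becomes a factor at all. If they are typos or unexpected text, it is worth finding and fixing them in the source data rather than silently turning them into NA.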
-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
m...@llnl.gov

R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help