On Fri, Aug 17, 2012 at 07:34:35PM +0100, Rui Barradas wrote:
> Hello,
>
> No, factors may use less memory. System dependent?
>
> > x <-sample(c("small","medium","large"),1e4,rep=TRUE)
> > y <- factor(x)
> > object.size(x)
> 80184 bytes
> > object.size(y)
> 40576 bytes
> >
> > sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=Portuguese_Portugal.1252 LC_CTYPE=Portuguese_Portugal.1252
> [3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C
> [5] LC_TIME=Portuguese_Portugal.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] Rcapture_1.2-0 xts_0.8-0 zoo_1.7-7
>
> loaded via a namespace (and not attached):
> [1] chron_2.3-39 fortunes_1.4-2 grid_2.15.1 lattice_0.20-6 tools_2.15.1
>
>
> And I agree with what Steve said, stringsAsFactors = FALSE saves hours
> of debugging time.
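On the "System dependent?" question, a short sketch that may help when checking it on another machine. It rests on my understanding that a character vector holds one pointer per element while a factor holds one integer code per element, so the ratio should follow the pointer size of the build. The seed and the spelled-out replace= argument are my additions, and the exact byte counts will differ between R versions, so none are shown.

```r
set.seed(1)  # arbitrary seed, only so the run is repeatable
x <- sample(c("small", "medium", "large"), 1e4, replace = TRUE)
y <- factor(x)

# Pointer size of this R build: 8 on 64-bit, 4 on 32-bit.
.Machine$sizeof.pointer

# A factor is an integer vector of codes plus a "levels" attribute.
typeof(y)          # "integer"
levels(y)

# Compare the two representations on this system.
object.size(x)
object.size(y)
```

On a 64-bit build the factor should come out at roughly half the size, as in the quoted output. On a 32-bit build I would expect the two to be close, since pointers and integers are then both 4 bytes.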
Hi.

I use stringsAsFactors = FALSE quite frequently. If there is a discussion on R-devel about whether this should be the default, I would support it.

Factors are very useful and sometimes necessary, but they are hard to manipulate. As Jeff Newmiller said, a good strategy is to prepare the data as character type and convert to a factor once the data are complete. Users should know how to use factors, but the strategy "convert to a factor eventually" is more consistent with not having stringsAsFactors = TRUE as the default.

Petr Savicky.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
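
P.S. A minimal sketch of the "convert to a factor eventually" strategy mentioned above. The data frame d, its column size, and the values in it are made up for illustration and do not come from the thread.

```r
# Hypothetical data with a typo in one value; keep the column as character.
d <- data.frame(size = c("small", "medium", "large", "medum"),
                stringsAsFactors = FALSE)

# While the column is character, fixing a value is plain assignment.
d$size[d$size == "medum"] <- "medium"

# Appending a previously unseen value needs no special handling either.
d <- rbind(d, data.frame(size = "huge", stringsAsFactors = FALSE))

# Convert once the data are complete, with an explicit level order.
d$size <- factor(d$size, levels = c("small", "medium", "large", "huge"))

# For contrast: assigning a value that is not a level of an existing
# factor yields NA (with a warning, suppressed here).
f <- factor(c("small", "large"))
suppressWarnings(f[1] <- "tiny")
f
```

The last three lines show the kind of silent NA that makes early conversion awkward, since the same edit on a character vector would simply store the new string.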