Hi:

> is there a difference between the "as.factor" and "factor" commands
> and also between "as.data.frame" and "data.frame"?

The as.* functions coerce an object from one class to another, so as.factor() can be used, for example, to coerce a character vector to a factor. The factor() function is used to *[re]define* a factor variable. [Sometimes it is useful or necessary to redefine a factor: for example, a subset of a data frame may use only some of the levels of a particular factor. Without a redefinition, all of the levels from the original data frame are retained by default, which may not be what you want.]

Similarly, as.data.frame() coerces an object of another type (e.g., table, matrix) to a data frame, whereas data.frame() is used to define a data frame from whatever variables and types you wish, subject to the constraint that each variable has the same number of rows. Usually the variables are atomic vectors (character, numeric, integer) or factors, but it is possible to include other types of objects in a data frame.

HTH,
Dennis

On Sat, Jan 22, 2011 at 7:11 AM, analys...@hotmail.com <analys...@hotmail.com> wrote:

> On Jan 22, 9:50 am, Berwin A Turlach <ber...@maths.uwa.edu.au> wrote:
> > On Sat, 22 Jan 2011 06:16:43 -0800 (PST)
> > "analys...@hotmail.com" <analys...@hotmail.com> wrote:
> >
> > > (1)
> > >
> > > > a = c("a","b")
> > > > mode(a)
> > > [1] "character"
> > > > b = c(1,2)
> > > > mode(b)
> > > [1] "numeric"
> > > > c = data.frame(a,b)
> > > > mode(c$a)
> > > [1] "numeric"
> >
> > R> str(c)
> > 'data.frame': 2 obs. of 2 variables:
> > $ a: Factor w/ 2 levels "a","b": 1 2
> > $ b: num 1 2
> >
> > Character vectors are turned into factors by default by data.frame().
> >
> > OTOH:
> >
> > R> c = data.frame(a,b, stringsAsFactors=FALSE)
> > R> mode(c$a)
> > [1] "character"
> >
> > > (2)
> > >
> > > > a = c("a","a","b","b","c")
> > > > levels(as.factor(a))
> > > [1] "a" "b" "c"
> > > > levels(as.factor(a[1:3]))
> > > [1] "a" "b"
> > > > a = as.factor(a)
> > > > levels(a)
> > > [1] "a" "b" "c"
> > > > levels(a[1:3])
> > > [1] "a" "b" "c"
> >
> > Subsetting factors does not get rid of no-longer used levels by default.
> >
> > OTOH:
> >
> > R> levels(a[1:3, drop=TRUE])
> > [1] "a" "b"
> >
> > or
> >
> > R> levels(factor(a[1:3]))
> > [1] "a" "b"
> >
> > HTH.
> >
> > Cheers,
> >
> > Berwin
>
> Thanks for both responses.
>
> is there a difference between the "as.factor" and "factor" commands
> and also between "as.data.frame" and "data.frame"?
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
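
For anyone reading this thread later, here is a small sketch of the coercion versus (re)definition distinction discussed above. The data are made up, and the object names (x, f, part, g, m, d1, d2) are only illustrative. One assumption worth flagging: the default of stringsAsFactors in data.frame() changed to FALSE in R 4.0.0, so the sketch sets it explicitly rather than relying on the default.

```r
# Made-up data for illustration only.
x <- c("a", "a", "b", "b", "c")

f <- as.factor(x)          # coerce a character vector to a factor
part <- f[1:3]             # subsetting keeps all of the original levels
levels(part)               # "a" "b" "c"
levels(as.factor(part))    # already a factor, returned unchanged: "a" "b" "c"
levels(factor(part))       # redefined, unused level dropped: "a" "b"

# factor() also lets you state the levels and their order explicitly.
g <- factor(x, levels = c("c", "b", "a"))
levels(g)                  # "c" "b" "a"

# as.data.frame() coerces an existing object, here a 3 x 2 matrix.
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("u", "v")))
d1 <- as.data.frame(m)

# data.frame() builds a data frame from separate variables of equal length.
# stringsAsFactors is set explicitly because its default differs across R versions.
d2 <- data.frame(id = 1:5, grp = x, stringsAsFactors = TRUE)
str(d2)
```

In short, as.factor() leaves an existing factor as it is, while factor() rebuilds it from the values actually present, which is why only the latter drops the unused level.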